CN107341171A - Method and system for extracting data (gene) feature templates and applying the templates - Google Patents

Info

Publication number
CN107341171A
Authority
CN
China
Prior art keywords
verb
noun
template
class
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710303290.0A
Other languages
Chinese (zh)
Other versions
CN107341171B (en
Inventor
刘洪利
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to CN201710303290.0A priority Critical patent/CN107341171B/en
Publication of CN107341171A publication Critical patent/CN107341171A/en
Application granted granted Critical
Publication of CN107341171B publication Critical patent/CN107341171B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2452 Query translation
    • G06F16/24522 Translation of natural language queries to structured queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

A method for extracting a data association feature pattern or template. Step one: perform language identification and preprocessing on the data resource, apply part-of-speech tagging to mark the nouns and verbs of each sentence, and apply syntactic analysis to mark the subject, predicate and object of each sentence. Step two: from the sentence set, extract the words labeled as both subject and noun, the words labeled as both predicate and verb, and the words labeled as both object and noun, obtaining respectively a set of nouns serving as subjects, a set of verbs serving as predicates and a set of nouns serving as objects, together with the subject-predicate and predicate-object association relations they hold in their sentences. Step three: count the cumulative word frequencies of the subject nouns, predicate verbs and object nouns respectively, and mark them as the weight feature values of the associated phrases in subject noun set : predicate verb set and predicate verb set : object noun set, that is, noun : verb ≈ word frequency ratio n : v and verb : noun ≈ word frequency ratio v : n2.

Description

Method and system for extracting data (gene) feature templates and applying the templates
Technical field
The present invention relates to fields such as data mining, text mining, natural language processing and artificial intelligence, and in particular to a method for making and using data association feature patterns or templates based on natural language processing and text mining, and to methods and systems for intelligent business and intelligent social networking that apply the templates.
Background art
"Data is exploding, yet information is scarce." Put simply, data is just symbols. Data has no meaning in itself; the meaning of data is its semantics. Only data that has been given meaning can be used, and at that point the data has been converted into information. Semantics is the bridge between computer representation and the real world.
The Internet resource environment itself is also developing toward semantic, structured and intelligent forms.
Over the past ten to twenty years, vast amounts of information have come to be stored as electronic documents, and the number of such documents has grown explosively. According to a survey conducted jointly by Merrill Lynch and Gartner, 85% of business data is stored in a more or less unordered way, and the survey states that this disorderly data doubles every 18 months. Text is the most basic and most commonly used information carrier; it records the accumulation and progress of human knowledge and carries the core value of people's daily social activities, state affairs, public services, business activities and social work. Text processing therefore plays a particularly important role in computer language processing. Today all sectors hold that "knowledge is power", and knowledge derives from data and information; if society, governments, enterprises and individuals can mine the value behind text data efficiently and effectively, they can make better decisions, improve operating efficiency and raise quality of life.
Natural language processing technology has made major progress in morphological and syntactic research. By comparison, research on semantics, pragmatics and contextual knowledge has remained a bottleneck that is hard to cross, and the difficulty centers on eliminating semantic ambiguity at the level of the sentence or even the whole article. Possibly because of the perspective inherited from its earliest application, machine translation (an application that is still not very successful), people's understanding and handling of information granularity in natural language processing has been somewhat off target. It should be noted that, among the many sentences in a text, not every sentence yields valuable, meaningful material.
Summary of the invention
Although the present invention extracts the key data elements and features of a text at the information granularity of the phrases in a sentence, it takes the functions and values of social function subjects as the granularity of the objects that data mining is meant to recognize. It holds that the thinking and behavior of people (including the units and individuals that carry out economic value-creating activities) are the source of information, and that they map onto, and correlate positively in value with, people's daily production, research, business, consumption, entertainment and cultural activities; the only difference is that the means and the result are a corresponding amount of symbolic data generated by digital means. Modern society rarely conducts information activities in handwriting on paper; under e-commerce and the mobile Internet, people carry out almost all social information exchange digitally. Conversely, it should therefore be possible to mine the value feature elements of people's original, raw text data (not secondary statistics) and of behavioral Internet of Things data, and to find the thinking/behavior value-driving data (gene) information of society that corresponds to people's daily production, research, business and digital consumption, entertainment and cultural activities.
An economy is in essence a value system, and a culture is in essence a system of values. Data is a record of, or is synchronized with, thinking and behavior: information source → natural language electronic document data resource.
Fig. 1 shows the social subjects and the data resources produced by their information sources.
The main functions and values of each social subject are:
1. the individual has the function and value of thinking and labor;
2. the group of individuals has the function and value of learning and consumption;
3. the unit has the function and value of manufacturing production resources, living resources and service resources;
4. the unit complex has the comprehensive function and value of manufacturing, from production resources to living resources.
The daily activities and actions of people are driven by value and by values. In information and in data, thinking and behavior are mainly expressed through the semantics of verbs: thinking/behavior value (values) drive → information source → verb (value drive) → natural language electronic document data resource. The transfer of function and value information between these social subjects appears in the data as verb-noun association phrases of differing (value): centered on verbs weighted by their intensity of use (frequency), verb-noun chains are passed along through the subject, predicate and object structure of natural language sentences, and these chains describe the data value chain (gene) information.
The information activities of the social subjects produce the following data resources:
1. personal data resources;
2. sample data resources of groups of individuals;
3. unit (department) data resources;
4. unit complex sample data resources.
Numerous personal data resources are pooled to form the sample data resource of a group of individuals (data held in common); numerous unit (department) data resources are pooled to form the unit complex sample data resource (data sharing). Text mining finds an implicit pattern p in a set C of a large number of texts. If C is regarded as the input and p as the output, the process of text mining is a mapping from input to output: C → p. Information extraction technology uses computers to find, in the massive electronic documents and behavioral Internet of Things data of these social subjects, the feature values (data value genes) that meet the needs or the value demands of the social subjects, and desensitizes (masks) any data that involves personal (private) information; for example, the personal e-mail address embbiz@126.com is desensitized to e***iz@126.com.
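The e-mail desensitization example can be reproduced with a short sketch. The text gives only the single example embbiz@126.com → e***iz@126.com, so the rule used here (keep the first character and the last two characters of the local part, and replace each hidden character with one asterisk) is an assumption, and the function name `mask_email` and its parameters are hypothetical.

```python
def mask_email(address: str, keep_head: int = 1, keep_tail: int = 2) -> str:
    """Mask the middle of the local part of an e-mail address.

    Assumed rule: keep `keep_head` leading and `keep_tail` trailing
    characters of the local part, one '*' per hidden character.
    """
    local, sep, domain = address.partition("@")
    if not sep:
        raise ValueError("not an e-mail address: " + address)
    hidden = len(local) - keep_head - keep_tail
    if hidden <= 0:
        # Local part too short to keep anything: hide all of it.
        return "*" * len(local) + sep + domain
    head = local[:keep_head]
    tail = local[len(local) - keep_tail:]
    return head + "*" * hidden + tail + sep + domain


print(mask_email("embbiz@126.com"))  # e***iz@126.com
```

Very short local parts are hidden completely in this sketch, which is a conservative choice rather than something the text specifies.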
Based on the above understanding, and to achieve the above purpose, according to the first aspect of the present invention:
1. A method, based on natural language processing technology, for extracting and making a data association feature pattern or template is provided, characterized in that the verb-noun (value) chain data association pattern or template is made by performing the following steps:
(a) Step 1: perform language identification and preprocessing and part-of-speech tagging on the text data of the (subject) data resource, marking the nouns and verbs of each sentence;
and perform syntactic analysis on the text data, marking the subject, predicate and object of each sentence, wherein the subject of a passive-voice sentence may be labeled as the object;
wherein coreference resolution is applied to subject nouns or object nouns;
(b) Step 2: from the sentence set, extract the words labeled as both subject and noun, the words labeled as both predicate and verb, and the words labeled as both object and noun, obtaining respectively the set of nouns serving as subjects, the set of verbs serving as predicates and the set of nouns serving as objects, together with the subject-predicate and predicate-object association relations they hold in their sentences, subject noun set : predicate verb set / predicate verb set : object noun set, that is, the noun : verb and verb : noun phrase combinations and the (one-to-one) subject-predicate / predicate-object association relations between them;
(c) Step 3: count the cumulative word frequencies of the subject nouns, predicate verbs and object nouns respectively, and mark them as the weight feature values that measure the (one-to-one) associated phrases in subject noun set : predicate verb set / predicate verb set : object noun set, that is,
subject noun word frequency n : predicate verb word frequency v /
predicate verb word frequency v : object noun word frequency n2 (the word frequencies v, n, n2 are positive integers),
obtaining, for the (subject) data resource, the weighted association relations:
the set of noun : verb ≈ word frequency ratio n : v and
the set of verb : noun ≈ word frequency ratio v : n2;
the high-frequency phrases in these sets are selected to form the verb-noun (value) chain data association pattern or template (phrase set).
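Steps 2 and 3 can be illustrated with a minimal sketch. It assumes that Step 1 (part-of-speech tagging and syntactic analysis) has already been done by some external tool, so the input format of `(word, part_of_speech, role)` tuples, the names `pick` and `build_template`, and the `min_freq` threshold standing in for "high-frequency" are all hypothetical.

```python
from collections import Counter

# Hypothetical pre-annotated input: one list per sentence of
# (word, part_of_speech, syntactic_role) tuples produced by Step 1.
SENTENCES = [
    [("company", "NOUN", "subject"), ("sells", "VERB", "predicate"), ("phones", "NOUN", "object")],
    [("company", "NOUN", "subject"), ("sells", "VERB", "predicate"), ("tablets", "NOUN", "object")],
    [("customer", "NOUN", "subject"), ("buys", "VERB", "predicate"), ("phones", "NOUN", "object")],
]


def pick(sentence, pos, role):
    """Return the first word labeled with both the given POS and role."""
    for word, p, r in sentence:
        if p == pos and r == role:
            return word
    return None


def build_template(sentences, min_freq=1):
    """Collect noun:verb and verb:noun pairs with their word frequencies."""
    subj_freq, verb_freq, obj_freq = Counter(), Counter(), Counter()
    sv_pairs, vo_pairs = set(), set()
    for sentence in sentences:
        subj = pick(sentence, "NOUN", "subject")
        verb = pick(sentence, "VERB", "predicate")
        obj = pick(sentence, "NOUN", "object")
        if verb is None:
            continue  # no predicate verb, nothing to associate
        verb_freq[verb] += 1
        if subj is not None:
            subj_freq[subj] += 1
            sv_pairs.add((subj, verb))
        if obj is not None:
            obj_freq[obj] += 1
            vo_pairs.add((verb, obj))
    # noun : verb ≈ n : v and verb : noun ≈ v : n2, filtered by min_freq.
    noun_verb = {(n, v): (subj_freq[n], verb_freq[v]) for n, v in sv_pairs
                 if min(subj_freq[n], verb_freq[v]) >= min_freq}
    verb_noun = {(v, o): (verb_freq[v], obj_freq[o]) for v, o in vo_pairs
                 if min(verb_freq[v], obj_freq[o]) >= min_freq}
    return noun_verb, verb_noun


noun_verb, verb_noun = build_template(SENTENCES)
print(noun_verb[("company", "sells")])  # (2, 2)
print(verb_noun[("sells", "tablets")])  # (2, 1)
```

Coreference resolution and the relabeling of passive-voice subjects are left out of the sketch; they would change which words reach `pick`, not how the frequencies are counted.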
2. A preferred scheme based on the described method, characterized in that:
Step 4: merge the noun of the same name that repeats between the (one-to-one) association phrases noun : verb and the (one-to-one) association phrases verb : noun, together with its word frequencies:
...
... : verb : same noun
same noun : verb : ...
...
The two occurrences of the same noun are joined to form a multi-link association phrase chain, giving ... : verb : (merged same) noun : verb : (merged same) noun : verb : ...; in this way some of the one-to-one phrase chains (sets) are connected in series into multi-link, multi-dimensional alternating verb/noun phrase chains ... : verb : noun : verb : noun : ..., in which verbs and nouns alternate as the linking nodes, and which may even form a closed-loop association phrase chain whose head and tail are linked;
the word frequencies of the subject noun and the object noun are merged into n + n2, giving the weights of the alternating, cyclic multi-link association phrases ... : verb : noun : ... ≈ word frequency ratio ... : v : (n + n2) : ... of the verb-noun (value) chain data association pattern or template (phrase set; the word frequencies v, n, n2 are positive integers).
That is, the verb-noun (value) chain data association pattern or template (phrase set) may take two forms: a one-to-one phrase chain (noun : verb ≈ word frequency ratio n : v, or verb : noun ≈ word frequency ratio v : n2) or a multi-link phrase chain (... : verb : noun : verb : ... ≈ word frequency ratio ... : v : (n + n2) : v : ...).
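The merge in Step 4 can be sketched as a join on the shared noun. The dictionary layout (pair of words mapped to a pair of frequencies) and the function name `merge_links` are assumptions made for illustration, and the sketch joins only one verb : noun link to one noun : verb link rather than building arbitrarily long or closed-loop chains.

```python
def merge_links(verb_noun, noun_verb):
    """Join verb:noun and noun:verb pairs that share the same noun.

    verb_noun maps (verb, noun) -> (v, n2); noun_verb maps (noun, verb) -> (n, v).
    Returns a list of ((verb, noun, verb), (v, n + n2, v)) chains.
    """
    chains = []
    for (v1, obj), (v1_freq, n2) in sorted(verb_noun.items()):
        for (subj, v2), (n, v2_freq) in sorted(noun_verb.items()):
            if obj == subj:
                # Same noun as object and as subject: merge its frequencies.
                chains.append(((v1, obj, v2), (v1_freq, n + n2, v2_freq)))
    return chains


verb_noun = {("makes", "phones"): (3, 4)}
noun_verb = {("phones", "attract"): (2, 5)}
print(merge_links(verb_noun, noun_verb))
# [(('makes', 'phones', 'attract'), (3, 6, 5))]
```

Longer chains could be built by repeating the join on the last verb of each chain, which is where closed loops would need to be detected.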
3. A further preferred scheme based on the above method, characterized in that:
with the aid of natural language processing tools such as corpora, digital dictionaries and ontology knowledge bases, the adverbs modifying the predicate verb of each sentence are marked, and when the adverbs are extracted their word frequencies are also counted cumulatively, giving:
the associated weights of adverb : verb : noun, that is, adverb : verb : noun ≈ word frequency ratio a : v : n2, or noun : adverb : verb ≈ word frequency ratio n : a : v, as the verb-noun (value) chain data association pattern or template (the word frequencies v, n, n2 are positive integers; a is a natural number and may be 0);
or giving the multi-link phrase chain:
the associated weights of the alternating, cyclic phrases ... : adverb : verb : noun : adverb : ..., that is, ... : adverb : verb : noun : ... ≈ word frequency ratio ... : a : v : (n + n2) : a : ..., as the verb-noun (value) chain data association pattern or template (the word frequencies v, n, n2 are positive integers; a is a natural number and may be 0); wherein the adverb may be empty.
4. According to the second aspect of the present invention, a method for comparing verb-noun (value) chain data association patterns or templates (phrase sets) is provided, characterized in that the following steps are performed:
(a) Step 1: the phrases of the verb-noun (value) chain data association patterns or templates (sets of phrase chains) made from two different (subject) data resources are compared with each other;
(b) Step 2: if the comparison finds
the same verb,
the same noun,
the same verb and the same noun,
the same noun and the same verb,
or the same multi-link alternating verb/noun phrase chain,
wherein, if adverbs are present, the comparison may also include the same adverb, that is,
the same noun, the same adverb and the same verb,
the same adverb, the same verb and the same noun,
proceed to the next step;
(c) Step 3: compare the word frequency ratios of the identical phrases;
(d) Step 4: output the result:
One, the word frequency ratios are equal:
same verb and same noun: the word frequency ratios v : n2 are equal, and the templates match;
same noun and same verb: the word frequency ratios n : v are equal, and the templates match;
same multi-link alternating verb/noun phrase: word frequency ratio ... : v : (n + n2) : v : ..., where
1. the components n and n2 are each equal;
2. the totals (n + n2) are equal;
and the templates match;
Two, the word frequency ratios are not equal:
same verb:
display the associated high-frequency nouns in ranked order;
same noun:
display the associated high-frequency verbs in ranked order;
same verb and same noun:
in each of the two templates, the proportion of the noun word frequency to the verb word frequency is inversely proportional to the word frequency ratio v : n2; display the difference between the word frequency ratios;
same noun and same verb:
in each of the two templates, the proportion of the noun word frequency to the verb word frequency is directly proportional to the word frequency ratio n : v;
display the difference between the word frequency ratios;
same multi-link alternating verb/noun phrase:
display the differences between the noun and verb word frequency ratios;
wherein, in the verb-noun (value) chain data association template made from a unit (department) data resource, the nouns (set) with high word frequency may be selected and compared against the verb-noun (value) chain data association templates (set) of a sample data resource of a group of individuals; the verb-noun (value) chain data association template containing a successfully matched noun can serve as a template of the matching relation between what the unit supplies and what the group of individuals demands, with the (high-frequency) noun of the unit (department) as its theme;
wherein, from the verb-noun (value) chain data association template made from a unit (department) data resource, the nouns with high word frequency are selected and compared against the verb-noun (value) chain data association templates made from the unit complex sample data resource of numerous units, or from all data resources, so as to obtain the relative position and standing of the unit (department), in terms of the difference in verb word frequency on the (high-frequency) noun-verb action (value) chain links, within the unit complex sample data resource as a whole or within all data resources as a whole.
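The comparison of one-to-one phrases in Steps 1 to 4 can be sketched as follows. The template layout (word pair mapped to frequency pair) and the name `compare_templates` are hypothetical; exact fractions are used so that ratios such as 2 : 4 and 1 : 2 compare as equal, and the returned difference is one possible reading of "display the difference between the word frequency ratios".

```python
from fractions import Fraction


def compare_templates(a, b):
    """Compare two templates mapping (word1, word2) -> (freq1, freq2).

    Only pairs present in both templates are compared. The result maps
    each shared pair to the difference of the two frequency ratios;
    a difference of 0 means the ratios are equal (templates match there).
    """
    report = {}
    for pair in a.keys() & b.keys():
        ratio_a = Fraction(*a[pair])  # frequencies are positive integers
        ratio_b = Fraction(*b[pair])
        report[pair] = ratio_a - ratio_b
    return report


unit = {("sells", "phones"): (2, 4), ("buys", "phones"): (1, 2)}
group = {("sells", "phones"): (1, 2), ("buys", "phones"): (3, 2), ("ships", "boxes"): (1, 1)}
result = compare_templates(unit, group)
print(result[("sells", "phones")] == 0)  # True: equal ratio, match
print(result[("buys", "phones")])        # -1: ratios differ
```

Ranking the associated high-frequency nouns or verbs for partially matching phrases would be a separate sort over the same dictionaries.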
5. According to the third aspect of the present invention, a method for making class templates from verb-noun (value) chain data association templates, and for comparing class templates, is provided, characterized in that:
One, a class template (data clues + processing rules + feature value set) is extracted and made by performing the following steps:
(a) Step 1: using natural language processing tools such as corpora and ontology knowledge bases, classify or cluster all the nouns in the verb-noun (value) chain data association templates (set) of the (subject) data resource; on the principle of retaining at least one adjacent phrase (that is, the adjacent verb) on each side of the template position of each same-class noun, select that part (including the same-class noun) as a verb-noun (value) chain data association template segment; divide out the verb-noun (value) chain segments (set) of the same class to form a classification/clustering template, that is, each class template may comprise verb-noun (value) chain segments (a set) of shorter phrase chains; and name the class:
Class name a { ... : verb 1 : same-class noun 1 : verb 1 : ... ≈ word frequency ratio ... : v : (n + n2) : v : ...; ... : verb 2 : same-class noun 2 : verb 2 : ... ≈ word frequency ratio ... : v : (n + n2) : v : ...; ... verb-noun (value) chain segments (the word frequencies v, n, n2 are positive integers) }
wherein the same method may also be applied to verbs that are identical, similar, synonymous or near-synonymous, to divide out class templates of similar verbs;
(b) Step 2: for the verb-noun (value) chain segment set of one class, the original phrases (all the verbs and nouns of the segments) serve as one part, the clues, and are combined into the clue (A) set;
the ordering rules of the verb-noun (value) chain segments, obtained on the (same) principle of retaining at least one verb on each side of a same-class noun, serve as one part, the rules, and are combined into the rule (B) set;
the word frequencies of the verb-noun (value) chain segments of the same-class nouns serve as one part, the statistical feature values, and are combined into the feature value (C) set;
in simplified and in expanded form respectively:
class template (data clues + processing rules + feature value set) = { class template data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...) + processing rules: phrase chain segment ordering rules (... : verb 1 : same-class noun 1 : verb 1 : ...; ... : verb 2 : same-class noun 2 : verb 2 : ...) + feature value set (... : verb 1 : same-class noun 1 : verb 1 : ... ≈ word frequency ratio ... : v : (n + n2) : v : ...; ... : verb 2 : same-class noun 2 : verb 2 : ... ≈ word frequency ratio ... : v : (n + n2) : v : ...; ... phrase chain segment word frequency ratios (the word frequencies v, n, n2 are positive integers)) }
wherein other classification rules may also be set and, together with the verb-noun (value) chain data association template classification rules, make up the class template rules;
wherein a unit (department) class template may be made from the verb-noun (value) chain data association template of a unit (department) data resource;
wherein a class template for a group of individuals may be extracted and made from the verb-noun (value) chain data association template of the data resource of a group of individuals;
wherein a unit complex class template may be extracted and made from the verb-noun (value) chain data association template of a unit complex sample data resource.
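The segment selection of Step 1 can be sketched as below, assuming a chain is stored as a list that alternates verb, noun, verb, ... with a parallel list of word frequencies. The `NOUN_CLASS` dictionary is a hypothetical stand-in for the corpus or ontology lookup, and `class_segments` keeps exactly one verb on each side of a classified noun, the minimum the text requires.

```python
# Hypothetical noun -> class lookup standing in for an ontology knowledge base.
NOUN_CLASS = {"phones": "device", "tablets": "device", "coffee": "drink"}


def class_segments(chain, freqs, noun_class):
    """Cut (verb, noun, verb) segments around each classified noun.

    `chain` alternates verb, noun, verb, ...; `freqs` holds the matching
    word frequencies. Returns class name -> list of (segment, frequencies).
    """
    out = {}
    # Odd indices are nouns; stop early so a right-hand verb always exists.
    for i in range(1, len(chain) - 1, 2):
        cls = noun_class.get(chain[i])
        if cls is None:
            continue  # noun does not belong to any known class
        segment = (tuple(chain[i - 1:i + 2]), tuple(freqs[i - 1:i + 2]))
        out.setdefault(cls, []).append(segment)
    return out


chain = ["make", "phones", "attract", "customers", "buy", "tablets", "need"]
freqs = [3, 6, 5, 4, 2, 3, 1]
print(class_segments(chain, freqs, NOUN_CLASS))
# {'device': [(('make', 'phones', 'attract'), (3, 6, 5)), (('buy', 'tablets', 'need'), (2, 3, 1))]}
```

In the terms of Step 2, the words of the segments would form the clue set, the segment tuples in order the rule set, and the frequency tuples the feature value set.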
Two, class templates (data clues + processing rules + feature value set) are compared by performing the following steps, given
the original class template { data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...) + processing rules: phrase chain segments (... : verb 1 : same-class noun 1 : verb 1 : ...; ... : verb 2 : same-class noun 2 : verb 2 : ...) + feature value set (... : verb 1 : same-class noun 1 : verb 1 : ... ≈ word frequency ratio ... : v : (n + n2) : v : ...; ... : verb 2 : same-class noun 2 : verb 2 : ... ≈ word frequency ratio ... : v : (n + n2) : v : ...; ... phrase chain segment word frequency ratios (the word frequencies v, n, n2 are positive integers)) }
and the target class template to be compared (data clues + processing rules + feature value set):
(a) Step 1: following the same rules as the original class template, extract from the data clues of the target class template to be compared (data clues + processing rules + feature value set) the same-class nouns that appear in the data clues of the original class template;
if no identical same-class noun is extracted, return to the start;
if all of the above same-class nouns are extracted, data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...), the data clues match, and the method proceeds to the next step;
(b) Step 2: classify according to the same classification/clustering rules as the original class template to obtain the phrase chain segment ordering rules of the same class:
in the verb-noun (value) chain data association template (set) of the (subject) data resource of the target to be compared, classify or cluster all the nouns; on the principle of retaining the same number of adjacent phrases on each side of the template position of each same-class noun, select the same part as verb-noun (value) chain segments; divide out the verb-noun (value) chain segment set of the same class to form a classification/clustering template; and name the class, obtaining:
target class name A, phrase ordering rules (phrase chain segment ordering) { ... : verb 11 : same-class noun 11 : verb 11 : ...; ... : verb 22 : same-class noun 22 : verb 22 : ...; ... phrase chain segments }
which are compared with
the phrase chain segment ordering in the phrase segment ordering rules of the original class name a (... : verb 1 : same-class noun 1 : verb 1 : ...; ... : verb 2 : same-class noun 2 : verb 2 : ...; ... data association template phrase chain segments);
if, in the phrase chain segment ordering of the processing rules, ... : verb 11 : same-class noun 11 : verb 11 : ... is consistent with ... : verb 1 : same-class noun 1 : verb 1 : ...; ... : verb 22 : same-class noun 22 : verb 22 : ... is consistent with ... : verb 2 : same-class noun 2 : verb 2 : ...; and so on, so that all phrase chain segment orderings of the processing rules are consistent, proceed to the next step;
if the phrase chain segment ordering rules do not match, return to the start;
(c) Step 3: following the same processing rules as the original class template, compare the word frequency ratios of the segments in the two class templates that belong to the same class, with the same name and the same phrase chain segment ordering:
the feature value set of the target class template { ... : verb 11 : same-class noun 11 : verb 11 : ... ≈ word frequency ratio ... : v : (n + n2) : v : ...; ... : verb 22 : same-class noun 22 : verb 22 : ... ≈ word frequency ratio ... : v : (n + n2) : v : ...; ... word frequency ratios (the word frequencies v, n, n2 are positive integers) }
is compared with
the feature value set of the original class template { ... : verb 1 : same-class noun 1 : verb 1 : ... ≈ word frequency ratio ... : v : (n + n2) : v : ...; ... : verb 2 : same-class noun 2 : verb 2 : ... ≈ word frequency ratio ... : v : (n + n2) : v : ...; ... word frequency ratios (the word frequencies v, n, n2 are positive integers) };
if
the word frequency ratio ... : v : (n + n2) : v : ... of ... : verb 11 : same-class noun 11 : verb 11 : ... ≈ that of ... : verb 1 : same-class noun 1 : verb 1 : ...;
the word frequency ratio ... : v : (n + n2) : v : ... of ... : verb 22 : same-class noun 22 : verb 22 : ... ≈ that of ... : verb 2 : same-class noun 2 : verb 2 : ...;
and so on, so that the feature value word frequency ratios are all identical or approximately equal, then the feature value sets match;
when the data clues of the class template (data clues + processing rules + feature value set) match, the phrase chain segment orderings of the processing rules are consistent and the word frequency ratios of the feature value sets are equal, the result is that the two class templates match;
if the feature value word frequency ratios are not equal, the class templates fail to match;
wherein, if other classification rules are also used, whether the comparison succeeds is also determined according to those rules; the whole template matches only if every part matches, and the failure of any one partial match causes the whole template match to fail.
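The three-stage comparison (clues, then segment ordering, then word frequency ratios) can be sketched as a sequence of early exits. The dictionary keys `clues`, `segments` and `freqs`, the function names, and the 10% tolerance used for "approximately equal" are assumptions; the text does not say how close two ratios must be, nor whether "consistent" ordering means identical words, which this sketch takes it to mean.

```python
import math


def ratios_close(fa, fb, rel_tol=0.1):
    """True if two frequency tuples describe approximately the same ratio."""
    if len(fa) != len(fb):
        return False
    # Normalize by the first frequency so 3:6:5 and 6:12:10 compare equal.
    return all(math.isclose(x / fa[0], y / fb[0], rel_tol=rel_tol)
               for x, y in zip(fa, fb))


def compare_class_templates(orig, target, rel_tol=0.1):
    """Return 'match' or the name of the first stage that failed."""
    # Stage 1: every clue word of the original must appear in the target.
    if not orig["clues"] <= target["clues"]:
        return "clue mismatch"
    # Stage 2: the phrase chain segments must appear in the same order.
    if orig["segments"] != target["segments"]:
        return "rule mismatch"
    # Stage 3: word frequency ratios must be equal or approximately equal.
    pairs = zip(orig["freqs"], target["freqs"])
    if not all(ratios_close(a, b, rel_tol) for a, b in pairs):
        return "feature mismatch"
    return "match"


orig = {"clues": {"make", "phones", "attract"},
        "segments": [("make", "phones", "attract")],
        "freqs": [(3, 6, 5)]}
target = {"clues": {"make", "phones", "attract", "buy"},
          "segments": [("make", "phones", "attract")],
          "freqs": [(6, 12, 10)]}
print(compare_class_templates(orig, target))  # match
```

Any additional classification rules would slot in as further stages with the same all-or-nothing behavior.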
6. According to the fourth aspect of the present invention, a method for making custom templates from verb-noun (value) chain data association templates is provided, characterized in that the following steps are performed:
(a) Step 1: from the verb-noun (value) chain data association template set of the theme (party A) data resource, select the list of nouns with high word frequency (for example, noun A, noun B, ...);
(b) Step 2: match the nouns of the high-frequency noun list against the nouns in the verb-noun (value) chain data association template set of the target (party B) data resource
(for example, noun A is compared with each noun in ... : verb C : noun C : verb A : noun A : verb D : noun D : ... ≈ word frequency ratio ... : v : (n + n2) : v : (n + n2) : v : (n + n2) : ...);
(c) Step 3: at the position of the successfully matched noun of the same name in the verb-noun (value) chain data association template of the target (party B) data resource (for example, the position of noun A in the template ... : noun C : verb A : noun A (matched position) : verb D : noun D : ... ≈ word frequency ratio ... : (n + n2) : v : (n + n2) : v : (n + n2) : ...), select at least one verb and one noun on the alternating verb/noun phrase chain to the left, to the right, or on both sides;
the alternating verb-noun association chains selected to the left, to the right or on both sides of that position (not including the matched noun itself) (for example, noun C : verb A ≈ word frequency ratio (n + n2) : v and verb D : noun D ≈ word frequency ratio v : (n + n2)) serve as the data association custom template (set) between the theme (party A) data resource and the target (party B) data resource;
wherein the verb-noun (value) chain data association template of a unit (department) data resource, used as the theme (party A) data resource, and the verb-noun (value) chain data association template of the data resource of a group of individuals, used as the target (party B) data resource, may be used to make a unit supply / individual group demand custom template;
wherein the verb-noun (value) chain data association template of a unit (department) data resource, used as the theme (party A) data resource, and the verb-noun (value) chain data association template of a unit complex sample data resource, used as the target (party B) data resource, may be used to make a unit supply / unit complex value chain and supply chain custom template;
wherein the verb-noun (value) chain data association template of a personal data resource, used as the theme (party A) data resource, and the verb-noun (value) chain data association template of the data resource of a group of individuals, used as the target (party B) data resource, may be used to make an individual / group learning and social contact custom template.
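Steps 2 and 3 of the custom template method can be sketched as a neighbor lookup around each matched noun. The sketch assumes the target chain is stored as a list alternating noun, verb, noun, ... with a parallel list of word frequencies; the function name `custom_template` and the example words are hypothetical, and exactly one noun : verb link on the left and one verb : noun link on the right are taken, the minimum the text allows.

```python
def custom_template(high_freq_nouns, chain, freqs):
    """Collect the links next to each noun shared with the theme template.

    `chain` alternates noun, verb, noun, ... (party B template) and
    `freqs` holds the matching word frequencies. The matched noun itself
    is excluded from the returned links.
    """
    links = []
    for i, word in enumerate(chain):
        if i % 2 or word not in high_freq_nouns:
            continue  # only noun positions (even indices) can match
        if i >= 2:  # link to the left: noun : verb
            links.append((tuple(chain[i - 2:i]), tuple(freqs[i - 2:i])))
        if i + 2 < len(chain):  # link to the right: verb : noun
            links.append((tuple(chain[i + 1:i + 3]), tuple(freqs[i + 1:i + 3])))
    return links


chain = ["factory", "ships", "phones", "reach", "customers"]
freqs = [5, 2, 7, 3, 4]
print(custom_template({"phones"}, chain, freqs))
# [(('factory', 'ships'), (5, 2)), (('reach', 'customers'), (3, 4))]
```

Here "phones" plays the role of noun A, "factory : ships" the role of noun C : verb A, and "reach : customers" the role of verb D : noun D.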
7. According to a fifth aspect of the present invention, there is provided an intelligent system for data mining based on a personal mobile device, comprising a corpus, an ontology knowledge base, and the like, characterized in that the personal mobile device includes an output/input synchronization module and a template feature extraction and production module, wherein:
(1) the output/input synchronization module of the personal mobile device is configured to synchronously or asynchronously copy and collect the personal data resources output or input via the input method, camera, shared memory, cache, temporary file cache, application (APP) records saved to local temporary files, open network interfaces, navigation API, and the like on the personal mobile device, and to provide them to the template feature extraction and production module; wherein the synchronized data may be preprocessed by data desensitization (bleaching) or by extracting feature values from images;
(2) the template feature extraction and production module performs data mining and feature value extraction on the personal data resources synchronized on the personal mobile device, and produces personal data models, patterns, or templates;
Wherein, data models, patterns, or templates of the personal data resources on the personal mobile device may be extracted and produced;
Wherein, verb-noun (value) chain data association templates and class templates may be extracted and produced;
The following steps are performed to produce the verb-noun (value) chain data association template:
(a) Step 1: Preprocess the text data by identifying its language, perform part-of-speech tagging, and mark the nouns and verbs of each sentence;
and perform syntactic analysis, marking the subject, predicate, and object of each sentence;
Wherein, the subject of a passive-voice sentence may be marked as the object;
Wherein, depending on the type of main body of the data resource, coreference resolution is performed on subject nouns or object nouns;
(b) Step 2: Extract the phrases marked both as subject and as noun, the phrases marked both as predicate and as verb, and the phrases marked both as object and as noun, together with the subject-predicate and predicate-object association relations they hold in their respective sentences:
This yields, respectively, the set of noun phrases serving as subjects, the set of verb phrases serving as predicates, and the set of noun phrases serving as objects, unified as subject noun set : predicate verb set / predicate verb set : object noun set, i.e.
noun : verb /
verb : noun
phrase combinations and the (one-to-one) subject-predicate / predicate-object association relations between them;
(c) Step 3: Count (during extraction) and mark the word frequency of each phrase, as an index measuring the weight of phrases having a one-to-one chain association relation,
Noun n: verb v and
Verb v: noun n2 (word frequency v, n, n2 are positive integers);
(d) Step 4: Where the trailing noun of a verb : noun one-to-one chain association pair and the leading noun of a noun : verb one-to-one chain association pair are the same noun phrase, merge that repeated noun and add its word frequencies, joining the two pairs into a multi-link association chain:
...
...: verb: identical noun
Identical noun: verb: ...
...
This yields a multi-link phrase chain ...: verb : (merged identical) noun : verb : ..., whose link nodes alternate between verbs and nouns, i.e. a multi-link, multi-dimensional alternating verb/noun phrase chain ...: verb : noun : verb : noun : ...; the verb : noun and noun : verb association links are thereby connected in series repeatedly, and may even form an end-to-end, interlinked closed-loop alternating verb/noun association phrase chain,
giving the verb-noun (value) chain data association template with the weight index of its cyclically alternating links: ...: verb : noun : ... ≈ word frequency ratio ...: v : (n+n2) : ... (the word frequencies v, n, n2 are positive integers);
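As a non-limiting illustration, steps 2 to 4 above may be sketched as follows, assuming that tagging and parsing (step 1) have already reduced each sentence to a (subject, verb, object) triple. The function names `count_links` and `merge_chains` and the sample sentences are assumptions made for this sketch.

```python
from collections import Counter


def count_links(triples):
    """Steps 2-3: collect noun:verb and verb:noun links and word frequencies."""
    subj = Counter()    # n: frequency of each noun as a subject
    obj = Counter()     # n2: frequency of each noun as an object
    verb = Counter()    # v: frequency of each verb as a predicate
    noun_verb = set()   # subject-predicate links
    verb_noun = set()   # predicate-object links
    for s, v, o in triples:
        verb[v] += 1
        if s is not None:
            subj[s] += 1
            noun_verb.add((s, v))
        if o is not None:
            obj[o] += 1
            verb_noun.add((v, o))
    return subj, obj, verb, noun_verb, verb_noun


def merge_chains(subj, obj, verb, noun_verb, verb_noun):
    """Step 4: join verb:noun and noun:verb links that share the same noun."""
    chains = []
    for v1, noun in sorted(verb_noun):
        for noun2, v2 in sorted(noun_verb):
            if noun == noun2:
                words = (v1, noun, v2)
                # word frequency ratio v : (n + n2) : v
                ratio = (verb[v1], subj[noun] + obj[noun], verb[v2])
                chains.append((words, ratio))
    return chains


triples = [
    ("teacher", "teaches", "student"),
    ("student", "reads", "book"),
    ("student", "reads", "paper"),
]
chains = merge_chains(*count_links(triples))
```

In this example "student" occurs once as an object and twice as a subject, so the merged chain teaches : student : reads carries the word frequency ratio 1 : 3 : 2.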
Wherein, the following steps are further performed to produce the class template (data clues + processing rules + feature value set):
(a) Step 1: Using natural language processing tools such as a corpus and an ontology knowledge base, classify/cluster all the noun phrases in the verb-noun (value) chain data association template (set) of the personal (group) data resource; for nouns of the same class, following the principle of retaining at least one adjacent phrase (i.e. the adjacent verb) on each side of the noun's position in the template, select a part (containing the same-class noun) of the verb-noun (value) chain data association template as a fragment; group the fragments into verb-noun (value) chain fragments (sets) of the same class, forming classification/clustering templates, and name each class:
Class name a { ...: verb 1 : same-class noun 1 : verb 1 : ... ≈ word frequency ratio ...: v : (n+n2) : v : ...; ...: verb 2 : same-class noun 2 : verb 2 : ... ≈ word frequency ratio ...: v : (n+n2) : v : ...; ... verb-noun (value) chain fragments (the word frequencies v, n, n2 are positive integers) }
Wherein, by the same method, identical, similar, synonymous, or near-synonymous verb phrases may also be used to divide class templates of similar verbs;
(b) Step 2: For each set of verb-noun (value) chain fragments of the same class, the original phrases (all verbs and nouns of the fragments) serve as part of the clues, composing the clue (first) set,
the fragment ordering rules of the matching verb-noun (value) chain fragments, obtained under the (same) principle of retaining at least one adjacent verb phrase on each side of the same-class noun's position in the template, serve as part of the rules, composing the rule (second) set,
and the word frequencies of the verb-noun (value) chain fragments of the same-class nouns serve as part of the statistical feature values, composing the feature value (third) set;
The simplified expression and the detailed expanded expression are, respectively:
Class template (data clues + processing rules + feature value set) = { class template data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...) + processing rules: phrase chain fragment ordering rules (...: verb 1 : same-class noun 1 : verb 1 : ...; ...: verb 2 : same-class noun 2 : verb 2 : ...) + feature value set (...: verb 1 : same-class noun 1 : verb 1 : ... ≈ word frequency ratio ...: v : (n+n2) : v : ...; ...: verb 2 : same-class noun 2 : verb 2 : ... ≈ word frequency ratio ...: v : (n+n2) : v : ...; ... phrase chain fragment word frequency ratios (the word frequencies v, n, n2 are positive integers)) }
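As a non-limiting illustration, the grouping of chain fragments into a class template with its three sets may be sketched as follows. The `ClassTemplate` structure, the function `make_class_templates`, and the `noun_class` mapping (a hypothetical stand-in for classification by a corpus or ontology knowledge base) are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class ClassTemplate:
    name: str
    clues: list      # clue (first) set: every verb and noun in the fragments
    rules: list      # rule (second) set: ordered verb : noun : verb fragments
    features: list   # feature value (third) set: word frequency ratio per fragment


def make_class_templates(fragments, noun_class):
    """Group (words, ratio) fragments by the class of their middle noun."""
    templates = {}
    for words, ratio in fragments:
        _, noun, _ = words
        cls = noun_class.get(noun)
        if cls is None:
            continue  # noun not covered by the (hypothetical) classification
        t = templates.setdefault(cls, ClassTemplate(cls, [], [], []))
        t.rules.append(words)
        t.features.append(ratio)
        for w in words:
            if w not in t.clues:
                t.clues.append(w)
    return templates


fragments = [
    (("buy", "apple", "eat"), (2, 5, 3)),
    (("pick", "pear", "eat"), (1, 4, 3)),
    (("drive", "car", "park"), (2, 2, 1)),
]
noun_class = {"apple": "fruit", "pear": "fruit", "car": "vehicle"}
templates = make_class_templates(fragments, noun_class)
```

Each fragment keeps one verb on either side of the same-class noun, which corresponds to the minimum retention principle stated in step 1.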
Wherein, other feature data of the same kind, such as geographic location and time, may also be extracted and classified together with the verb-noun (value) chain data association template, forming a more complex class template of the following form:
{ Class template data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...; A <north latitude N1", east longitude E1">, B <north latitude N2", east longitude E2">, C <north latitude N3", east longitude E3">, D <north latitude N4", east longitude E4"> ...) + processing rules: phrase chain fragments (...: verb 1 : same-class noun 1 : verb 1 : ...; ...: verb 2 : same-class noun 2 : verb 2 : ...; place ordering rule A : B : C : D ...) + feature value set (...: verb 1 : same-class noun 1 : verb 1 : ... ≈ word frequency ratio ...: v : (n+n2) : v : ...; ...: verb 2 : same-class noun 2 : verb 2 : ... ≈ word frequency ratio ...: v : (n+n2) : v : ...; ... phrase chain fragment word frequency ratios (the word frequencies v, n, n2 are positive integers); place-time feature values A: time1, B: time2, C: time3, D: time4 ...) };
Wherein, identical or similar verb phrases (synonyms, near-synonyms) may also be used to divide class templates of similar verbs;
From the personal sample data resource synchronized by the personal mobile device, the personal class template (data clues + processing rules + feature value set) composed of verb-noun (value) chain data association templates is obtained; wherein, because the data is in the first person, the subject noun is omitted by default.
8. Based on the above identity attribute matching and comparison system, a preferred scheme is further provided, characterized in that the data mining common platform server includes: a template feature extraction and production module, a template matching module, a template library, and a matching result feedback and message communication module; wherein different weights may be set for the extracted word frequencies according to how strongly the personal data resources output or input via the input method, camera, shared memory, cache, temporary file cache, application (APP) records saved to local temporary files, open network interfaces, navigation API, and the like correlate with the individual; wherein the server or PC of a unit includes a common data mining sharing companion module:
(1) the common data mining sharing companion module is configured to gather the electronic document data resources, in particular text data, of the unit (department) on the server or PC of the unit (department) and share them, manually or automatically, to the data mining common platform server, where the data resources of numerous units converge to form the office comprehensive sample data resource;
Wherein, data preprocessing is performed before sharing to exclude electronic documents whose content is identical, or whose content is similar and whose time is identical; the synchronized data may also be preprocessed by data desensitization;
(2) the template feature extraction and production module of the data mining common platform server performs data mining on data resources and produces data models, patterns, or templates;
Wherein, the personal data resources synchronized from numerous personal mobile devices are pooled to form personal group data, on which data mining and feature value extraction are performed to produce personal group data models, patterns, or templates; from the personal group sample data resource converged from personal mobile devices, the personal group class templates (data clues + processing rules + feature value set), composed of numerous serialized verb-noun (value) chain data association templates, are obtained;
Wherein, data mining is performed, and data patterns or templates are produced, on: the unit (department) data resource; the office comprehensive sample data resource converged from numerous units; the personal data resource synchronized by a mobile device; the personal group sample data resource converged from mobile devices; the mixed data resource of the personal group sample data resource and the unit (department) data resource (used for matching unit supply templates with personal demand templates); the mixed data resource of the personal data resource and the unit data resource (used for matching personal speciality templates with unit post templates); and the total data resource;
Wherein, verb-noun (value) chain data association templates and class templates may be produced and obtained;
Wherein, the following steps may be performed to produce the unit supply-personal group demand class template:
(d) Step 1: Compare the class names in the unit (department) class template (fragment) set of the unit (department) data resource with the class names in the personal group class template (fragment) set of the personal group sample data resource serving as the comparison object;
(e) Step 2: If class names match, obtain each shared class name together with the two sets to which it belongs, namely the unit (department) class templates (fragments) and the personal group class templates (fragments); directly adopt the shared class name and the personal group class template (fragment) set of the comparison object to compose the unit supply-personal group demand class template;
The result is: unit supply-personal group demand class template = shared unit class name + set of personal group class templates of the same name (data clues + processing rules + feature value set);
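As a non-limiting illustration, the two steps above amount to intersecting the class names of the two template sets and keeping the personal group side. The function name `supply_demand_templates` and the placeholder template contents are assumptions made for this sketch.

```python
def supply_demand_templates(unit_templates, group_templates):
    """Keep the personal group class template for every class name
    that also appears among the unit (department) class templates."""
    shared = sorted(set(unit_templates) & set(group_templates))
    return {name: group_templates[name] for name in shared}


# Hypothetical template sets keyed by class name.
unit_templates = {"fruit": "unit-fruit-template", "tools": "unit-tools-template"}
group_templates = {"fruit": "group-fruit-template", "travel": "group-travel-template"}
result = supply_demand_templates(unit_templates, group_templates)
```

Only "fruit" is shared by both sets, so the resulting supply-demand class template holds the personal group template stored under that name.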
(3) the template matching module compares the feature values of the data templates of data resources of different main bodies;
Wherein, the feature value comparison method for verb-noun (value) chain data association templates and class templates may be used for mutual matching and comparison of feature values among: the verb-noun (value) chain data association templates and class templates of a unit (department), of the office comprehensive sample, of an individual, and of the personal group, as well as the holographic verb-noun (value) chain data association templates and class templates;
Wherein, comparison of verb-noun (value) chain data association templates performs the following steps:
(a) Step 1: The phrases of the verb-noun (value) chain data association patterns or templates (phrase sets) produced from two different (main body) data resources are compared with each other;
(b) Step 2: The comparison yields the identical phrases: identical verbs; identical nouns; identical verb with identical noun; identical noun with identical verb; or identical alternating verb/noun multi-link phrase chains; wherein, if adverbs are present, identical adverbs may be added to the comparison, i.e. identical noun, identical adverb, and identical verb, or identical adverb, identical verb, and identical noun;
(c) Step 3: Compare the word frequency ratios of the identical phrases;
(d) Step 4: Output the results:
First, results where the word frequency ratios of the identical phrases are equal:
Identical verb and identical noun: the word frequency ratios v : n2 are equal;
Identical noun and identical verb: the word frequency ratios n : v are equal;
Identical alternating verb/noun multi-link phrase chain: word frequency ratio ...: v : (n+n2) : v : ...,
1. the components (n, n2) are each equal;
2. the totals (n+n2) are equal;
In these cases the comparison of the verb-noun (value) chain data association templates succeeds;
Second, results where the word frequency ratios of the identical phrases are unequal:
Identical verb:
Display the ranking of the associated high-frequency nouns;
Identical noun:
Display the ranking of the associated high-frequency verbs;
Identical verb and identical noun:
The share of the noun word frequency relative to the verb word frequency is inversely proportional to the word frequency ratio; display the difference in word frequency ratio;
Identical noun and identical verb:
The share of the noun word frequency relative to the verb word frequency is directly proportional to the word frequency ratio; display the difference in word frequency ratio;
Identical alternating verb/noun multi-link phrase chain:
Display the difference in the noun and verb word frequency ratios;
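As a non-limiting illustration, the comparison of identical phrase pairs by word frequency ratio may be sketched as follows. The function name `compare_pairs` and the sample templates are assumptions made for this sketch, which also assumes that ratios are compared as proportions (so that 3 : 6 is treated as equal to 1 : 2).

```python
from fractions import Fraction


def compare_pairs(template_a, template_b):
    """Each template maps a (phrase, phrase) pair to its (freq, freq) ratio.

    Returns the pairs whose ratios are equal, and the ratio difference
    for the pairs whose ratios differ.
    """
    equal, different = [], {}
    for pair in sorted(set(template_a) & set(template_b)):
        ratio_a = Fraction(*template_a[pair])  # frequencies are positive integers
        ratio_b = Fraction(*template_b[pair])
        if ratio_a == ratio_b:
            equal.append(pair)
        else:
            different[pair] = ratio_a - ratio_b
    return equal, different


a = {("eat", "apple"): (3, 6), ("read", "book"): (2, 5)}
b = {("eat", "apple"): (1, 2), ("read", "book"): (4, 5), ("drive", "car"): (1, 1)}
equal, different = compare_pairs(a, b)
```

Pairs present in only one template, such as drive : car here, are not identical phrases and are therefore left out of the comparison.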
Wherein, verb-noun (value) chain data association templates produced from personal data resources may be compared with each other on a selected topic noun (for example, interest, hobby, speciality, etc.), the matching yielding the approximate verb-noun value chain association relations of individuals matched on that topic;
Alternatively, for a given noun, the verb-noun (value) chain data association template produced from the personal data resource may be compared with the verb-noun (value) chain data association template produced from the personal group sample data resource, yielding the individual's position and standing, within the group data resource as a whole, in terms of the difference in verb word frequency on the (given) noun-verb action (value) chain link (interest, hobby, speciality, etc.);
Wherein, high-frequency noun phrases are selected from the verb-noun (value) chain data association template produced from the unit (department) data resource and compared with the verb-noun (value) chain data association template produced from the office comprehensive sample data resource of numerous units, or from the total data resource; this yields the unit's (department's) position and standing, within the office comprehensive sample data resource as a whole or the total data resource as a whole, in terms of the difference in verb word frequency on the (high-frequency) noun-verb action (value) chain link;
Wherein, high-frequency noun phrases are selected from the verb-noun (value) chain data association template produced from the unit (department) data resource and compared with the verb-noun (value) chain data association template of the personal group sample data resource; the verb-noun (value) chain data association template containing a successfully matched noun may serve as the verb-noun (value) chain data association template of the matching relation between the unit's (department's) supply, themed on that (high-frequency) noun, and the personal group demand;
Wherein, the original class template { data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...) + processing rules: phrase chain fragments (...: verb 1 : same-class noun 1 : verb 1 : ...; ...: verb 2 : same-class noun 2 : verb 2 : ...) + feature value set (...: verb 1 : same-class noun 1 : verb 1 : ... ≈ word frequency ratio ...: v : (n+n2) : v : ...; ...: verb 2 : same-class noun 2 : verb 2 : ... ≈ word frequency ratio ...: v : (n+n2) : v : ...; ... phrase chain fragment word frequency ratios (the word frequencies v, n, n2 are positive integers)) }
is compared with the object class template (data clues + processing rules + feature value set);
Comparison of class templates (data clues + processing rules + feature value set) performs the following steps:
(a) Step 1: Following the same rules as the original class template, extract from the data clues of the class template (data clues + processing rules + feature value set) of the object (main body) to be compared the phrases with the same names as those in the data clues of the original class template;
If the same-name phrases are not extracted, return to the start;
If all of the above same-name phrases are extracted, the data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...) are matched successfully; proceed to the next step;
(b) Step 2: Classify according to the same classification/clustering rules as the original class template, obtaining the ordering of phrase chain fragments of the same class:
In the verb-noun (value) chain data association template (set) of the data resource of the object (main body) to be compared, classify/cluster all the noun phrases; following the principle of retaining the same number of adjacent phrases on each side of the same-class noun's position in the template, select the same part as verb-noun (value) chain fragments, group them into verb-noun (value) chain fragment sets of the same class, form classification/clustering templates, and name each class, obtaining:
Object class name A processing rules (phrase chain fragments) { ...: verb 11 : same-class noun 11 : verb 11 : ...; ...: verb 22 : same-class noun 22 : verb 22 : ...; ... phrase chain fragments }
which is compared with
original class name a processing rule phrase chain fragments (...: verb 1 : same-class noun 1 : verb 1 : ...; ...: verb 2 : same-class noun 2 : verb 2 : ...; ... data association template phrase chain fragments)
in terms of the phrase chain fragment ordering rules;
If, for the phrase chain fragment ordering rules, ...: verb 11 : same-class noun 11 : verb 11 : ... matches ...: verb 1 : same-class noun 1 : verb 1 : ...; ...: verb 22 : same-class noun 22 : verb 22 : ... matches ...: verb 2 : same-class noun 2 : verb 2 : ...; and so on, so that all phrase chain fragment ordering rules match, proceed to the next step;
If any phrase chain fragment of the processing rules does not match, return to the start;
(c) Step 3: Following the same ordering rules as the original class template, compare the word frequency ratios of the identically ordered phrase chain fragments of the same-named class in the two class templates:
The feature value set of the object class template { ...: verb 11 : same-class noun 11 : verb 11 : ... ≈ word frequency ratio ...: v : (n+n2) : v : ...; ...: verb 22 : same-class noun 22 : verb 22 : ... ≈ word frequency ratio ...: v : (n+n2) : v : ...; ... word frequency ratios (the word frequencies v, n, n2 are positive integers) }
and
the feature value set of the original class template { ...: verb 1 : same-class noun 1 : verb 1 : ... ≈ word frequency ratio ...: v : (n+n2) : v : ...; ...: verb 2 : same-class noun 2 : verb 2 : ... ≈ word frequency ratio ...: v : (n+n2) : v : ...; ... word frequency ratios (the word frequencies v, n, n2 are positive integers) }
are compared; if:
...: verb 11 : same-class noun 11 : verb 11 : ... ≈ word frequency ratio ...: v : (n+n2) : v : ... ≈ ...: verb 1 : same-class noun 1 : verb 1 : ...;
...: verb 22 : same-class noun 22 : verb 22 : ... ≈ word frequency ratio ...: v : (n+n2) : v : ... ≈ ...: verb 2 : same-class noun 2 : verb 2 : ...
and so on, all feature value word frequency ratios being identical or approximately equal, then the feature value sets are matched successfully;
With the data clues of the class template (data clues + processing rules + feature value set) matched successfully, the phrase chain fragment ordering rules consistent, and the word frequency ratios of the feature value sets equal, the final result is that the two class templates are matched successfully;
If the feature value word frequency ratios are unequal, class template matching fails;
Wherein, if other classification rules are set to extract other feature data, whether the matching comparison succeeds is determined according to those other rules; only when everything matches does the whole template match succeed, and any partial matching failure causes the whole template match to fail;
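As a non-limiting illustration, the three comparison steps (data clues, fragment ordering rules, feature values) may be sketched as follows. The dictionary layout, the function name `match_class_templates`, and the `tolerance` parameter used to express "approximately equal" are assumptions made for this sketch, which compares phrases by exact name and normalizes each ratio to shares before comparing.

```python
def match_class_templates(original, candidate, tolerance=0.1):
    """Return True only if clues, rules and feature values all match."""
    # Step 1: every clue phrase of the original must be found in the candidate.
    if not set(original["clues"]) <= set(candidate["clues"]):
        return False
    # Step 2: the phrase chain fragment ordering rules must be consistent.
    if original["rules"] != candidate["rules"]:
        return False
    # Step 3: word frequency ratios must be equal or approximately equal.
    for ratio_o, ratio_c in zip(original["features"], candidate["features"]):
        total_o, total_c = sum(ratio_o), sum(ratio_c)
        for x, y in zip(ratio_o, ratio_c):
            if abs(x / total_o - y / total_c) > tolerance:
                return False
    return True


original = {
    "clues": ["buy", "apple", "eat"],
    "rules": [("buy", "apple", "eat")],
    "features": [(2, 5, 3)],
}
same_shape = dict(original, features=[(4, 10, 6)])
skewed = dict(original, features=[(8, 1, 1)])
```

Any single failing step ends the comparison, which mirrors the statement above that one partial matching failure causes the whole template match to fail.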
Wherein, the template matching module may also be included on the personal mobile device;
(4) the template library stores the templates of the data resources of the various main bodies,
Wherein, it stores template sets such as: the verb-noun (value) chain data association templates and class templates of units (departments), of the office comprehensive sample, of individuals, and of the personal group; the holographic verb-noun (value) chain data association templates and class templates; the verb-noun (value) chain data association templates of the mixed data resource of units (departments) and the personal group; and the verb-noun (value) chain data association templates of the mixed data resource of individuals and units (department posts);
(5) the matching result feedback and message communication module feeds back the message data of each successful template comparison to the device of the corresponding data resource main body, and supports interactive message communication between them; wherein it is also used to feed back, to the device of the corresponding data resource main body, the message data of successful matches from the comparison module for verb-noun (value) chain data association templates or class templates (data clues + processing rules + feature value set);
Wherein, the output/input synchronization module and the template feature extraction and production module of the personal mobile device may run on an independent (secure) chip processor on the personal mobile device.
9. According to a sixth aspect of the present invention, there is provided an intelligent system, based on a personal mobile device, that applies custom templates or class templates (data clues + processing rules + feature value set) of verb-noun (value) chain data association templates, characterized in that the personal mobile device includes: an output/input data synchronization module, a template clue filtering module, a template matching and comparison module, a personal clue library, a template library, and a module for outputting and displaying the corresponding data service content; wherein,
(1) the output/input data synchronization module is configured to synchronously copy and collect the data output or input via the input method, camera, shared memory, cache, temporary file cache, application (APP) records saved to local temporary files, open network interfaces, navigation API, and the like on the personal mobile device;
Wherein, the synchronized data may be preprocessed by data desensitization (bleaching) or by extracting feature values from images;
(2) the template clue filtering module filters the data collected by the above output/input data synchronization module, comparing it one item at a time against all the verb, noun, and other phrases in the custom templates produced from verb-noun (value) chain data association templates in the template library, or against the clue data in the class templates (data clues + processing rules + feature value set); the result data of successful matches are recorded in the personal clue library, together with the cumulative number of matches;
Wherein, depending on how strongly the personal data resources output or input via the input method, camera, shared memory, cache, temporary file cache, application (APP) records saved to local temporary files, open network interfaces, navigation API, and the like correlate with the individual, different weights may be set for the filtered data when accumulating word frequencies or records;
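As a non-limiting illustration, clue filtering with source-dependent weights may be sketched as follows. The weight values in `SOURCE_WEIGHTS`, the function name `filter_clues`, and the whitespace tokenization are assumptions made for this sketch; text in languages written without spaces would first need word segmentation.

```python
from collections import Counter

# Hypothetical weights: sources more closely tied to the individual count more.
SOURCE_WEIGHTS = {"input_method": 1.0, "camera": 0.5, "navigation_api": 0.3}


def filter_clues(records, template_words, clue_library=None):
    """Accumulate weighted match counts for template phrases found in the data.

    records: iterable of (source, text) pairs collected by the synchronization module
    template_words: set of verbs and nouns taken from the templates
    """
    clue_library = Counter() if clue_library is None else clue_library
    for source, text in records:
        weight = SOURCE_WEIGHTS.get(source, 0.1)  # assumed default for other sources
        for word in text.split():
            if word in template_words:
                clue_library[word] += weight
    return clue_library


records = [("input_method", "buy apple today"), ("camera", "apple pie")]
library = filter_clues(records, {"buy", "apple"})
```

Passing an existing `clue_library` back into the function lets the cumulative counts grow across successive synchronization batches.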
(3) the template matching and comparison module compares the templates of the template library with the templates extracted from the personal clue library;
Wherein, the template library includes, without limitation, the custom templates of verb-noun (value) chain data association templates for unit supply-personal group demand (for example, noun third : verb first ≈ word frequency ratio n : v, and verb fourth : noun fourth ≈ word frequency ratio v : n2), which are compared with the verb-noun (value) chain data association templates of the same name produced by extraction from the personal clue library (for example, noun third : verb first ≈ word frequency ratio n : v, and verb fourth : noun fourth ≈ word frequency ratio v : n2); if the word frequency ratios of the same-name phrases are identical or approximately equal, the unit supply-personal group demand custom template is matched successfully; otherwise matching fails;
Wherein, given that unit supply-personal group demand class template = shared unit class name + set of personal group class templates of the same name (data clues + processing rules + feature value set), the unit supply-personal group demand class template in the template library is compared with the class template of the same name (data clues + processing rules + feature value set) produced by extraction from the personal clue library,
The unit supply-personal group demand class template { data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...) + processing rules: phrase chain fragment ordering (...: verb 1 : same-class noun 1 : verb 1 : ...; ...: verb 2 : same-class noun 2 : verb 2 : ...) + feature value set (...: verb 1 : same-class noun 1 : verb 1 : ... ≈ word frequency ratio ...: v : (n+n2) : v : ...; ...: verb 2 : same-class noun 2 : verb 2 : ... ≈ word frequency ratio ...: v : (n+n2) : v : ...; ... phrase chain fragment word frequency ratios (the word frequencies v, n, n2 are positive integers)) }
is compared with the class template of the same name (data clues + processing rules + feature value set) produced by extraction from the personal clue library of the comparison object;
The following class template comparison steps are performed:
(a) Step 1: Following the same rules as the unit supply-personal group demand class template, extract from the data clues of the class template (data clues + processing rules + feature value set) of the object (main body) to be compared the phrases with the same names as those in the data clues of the unit supply-personal group demand class template;
If the same-name phrases are not extracted, return to the start;
If all of the above same-name phrases are extracted, the data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...) are matched successfully; proceed to the next step;
(b) Step 2: Classify according to the same classification/clustering rules as the unit supply-personal group demand class template, obtaining phrase chain fragments of the same class:
In the verb-noun (value) chain data association template (set) of the data resource of the object (main body) to be compared, classify/cluster all the noun phrases; following the principle of retaining the same number of adjacent phrases on each side of the same-class noun's position in the template, select the same part as verb-noun (value) chain fragments, group them into verb-noun (value) chain fragment sets of the same class, form classification/clustering templates, and name each class, obtaining:
Object class name A processing rules (phrase chain fragments) { ...: verb 11 : same-class noun 11 : verb 11 : ...; ...: verb 22 : same-class noun 22 : verb 22 : ...; ... phrase chain fragments }
which is compared with
unit supply-personal group demand class template a processing rule phrase chain fragments (...: verb 1 : same-class noun 1 : verb 1 : ...; ...: verb 2 : same-class noun 2 : verb 2 : ...; ... data association template phrase chain fragments)
in terms of the phrase chain fragment ordering rules;
If, for the phrase chain fragment ordering of the processing rules, ...: verb 11 : same-class noun 11 : verb 11 : ... matches ...: verb 1 : same-class noun 1 : verb 1 : ...; ...: verb 22 : same-class noun 22 : verb 22 : ... matches ...: verb 2 : same-class noun 2 : verb 2 : ...; and so on, so that the phrase chain fragment ordering of all processing rules matches, proceed to the next step;
If any phrase chain fragment ordering of the processing rules does not match, return to the start;
(c) Step 3: Following the same processing rules as the unit supply-personal group demand class template, compare the word frequency ratios of the identically ordered phrase chain fragments of the same-named class in the two class templates:
The feature value set of the object class template { ...: verb 11 : same-class noun 11 : verb 11 : ... ≈ word frequency ratio ...: v : (n+n2) : v : ...; ...: verb 22 : same-class noun 22 : verb 22 : ... ≈ word frequency ratio ...: v : (n+n2) : v : ...; ... word frequency ratios (the word frequencies v, n, n2 are positive integers) }
and
the feature value set of the unit supply-personal group demand class template { ...: verb 1 : same-class noun 1 : verb 1 : ... ≈ word frequency ratio ...: v : (n+n2) : v : ...; ...: verb 2 : same-class noun 2 : verb 2 : ... ≈ word frequency ratio ...: v : (n+n2) : v : ...; ... word frequency ratios (the word frequencies v, n, n2 are positive integers) }
are compared;
if:
...: verb 11 : same-class noun 11 : verb 11 : ... ≈ word frequency ratio ...: v : (n+n2) : v : ... ≈ ...: verb 1 : same-class noun 1 : verb 1 : ...;
...: verb 22 : same-class noun 22 : verb 22 : ... ≈ word frequency ratio ...: v : (n+n2) : v : ... ≈ ...: verb 2 : same-class noun 2 : verb 2 : ...
and so on, all feature value word frequency ratios being identical or approximately equal, then the feature value sets are matched successfully;
With the data clues of the same-name class template (data clues + processing rules + feature value set) produced by extraction from the personal clue library matched successfully, the phrase chain fragment ordering of the processing rules consistent, and the word frequency ratios of the feature value sets equal, the final result is that the two class templates are matched successfully;
If the feature value word frequency ratios are unequal, class template matching fails;
A successful match of either the unit supply-personal group demand custom template or the same-name personal group class template constitutes a successful template matching comparison;
(4) an output and display module for the corresponding data service content, which, after a template has been matched successfully in the template matching and comparison module, outputs and displays the configured data service content on the personal mobile device;
(5) a template library, which stores the set of unit supply / personal group demand custom templates and the set of unit supply / personal group demand class templates made from verb-noun (value) chain data correlation templates;
wherein the unit supply / personal group demand custom templates may be updated through the personal mobile device, and templates such as personal / group learning association custom templates may be downloaded to the template library;
(6) a personal clue library, composed of the clue data obtained by filtering the data synchronized by the input data synchronization module against all verbs, nouns and other phrases of the custom templates and class templates of the verb-noun (value) chain data correlation templates.
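The comparison rule of the template matching and comparison module can be sketched as follows. This is a minimal Python sketch and not the claimed implementation: the function names, the dictionary layout and the 5% relative tolerance used for "approximately equal" are assumptions made for illustration, since the text does not define a tolerance.

```python
from math import isclose

def normalize(freqs):
    """Scale a chain of word frequencies so that they sum to 1."""
    total = sum(freqs)
    return [f / total for f in freqs]

def ratios_match(freqs_a, freqs_b, rel_tol=0.05):
    """True if two frequency chains have approximately equal ratios.

    rel_tol is an assumed tolerance; the source only says "approximately equal".
    """
    if len(freqs_a) != len(freqs_b):
        return False
    return all(isclose(a, b, rel_tol=rel_tol)
               for a, b in zip(normalize(freqs_a), normalize(freqs_b)))

def class_templates_match(target, candidate, rel_tol=0.05):
    """Compare two class templates given as {phrase_chain: frequencies}.

    Matching requires the same phrase chains in the same order (the
    processing rule) and approximately equal word frequency ratios for
    every chain (the feature value set).
    """
    if list(target) != list(candidate):
        return False
    return all(ratios_match(target[c], candidate[c], rel_tol) for c in target)

# Hypothetical templates: verb : same-class noun : verb with v : (n+n2) : v.
target = {("educate", "schoolboy", "like"): (2000, 280, 10000)}
same_ratio = {("educate", "schoolboy", "like"): (4000, 560, 20000)}
other_ratio = {("educate", "schoolboy", "like"): (1000, 250, 9000)}

print(class_templates_match(target, same_ratio))   # True
print(class_templates_match(target, other_ratio))  # False
```

Normalizing each chain before comparison makes the check independent of the absolute size of the data resource, which is what comparing ratios rather than raw frequencies implies.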
10. According to a seventh aspect of the present invention, there is provided an intelligent system that performs data mining based on a personal mobile device and uses verb-noun (value) chain data correlation templates, characterized in that the personal mobile device includes: an input method software data synchronization module, a verb filtering and template generation module, a template sending management and matching result feedback interactive communication module, a personalized template library, and a verb library;
the data mining common platform server includes: a template receiving management and matching result feedback interactive communication module, a template comparison module, and a template library;
wherein the personal mobile device includes:
(1) an input method software data synchronization module, which synchronously copies and collects the data entered through the input method on the personal mobile device;
(2) a verb filtering and template generation module, which filters the data collected by the input method software data synchronization module against the verbs in the verb library one by one, and performs the following steps to generate personalized templates:
(a) Step 1: filter the text data collected by the input method software data synchronization module against the set of all commonly used verbs in the verb library, one verb at a time;
(b) Step 2: perform part-of-speech tagging on each sentence in which a filtering verb matched, and mark the nouns in the sentence;
also perform syntactic analysis on each such sentence and, as far as possible, mark the subject, predicate and object of the sentence;
determine whether the filtered verb is the predicate verb;
wherein punctuation marks may be added automatically to input method text data collected at separate times, and coreference resolution may be performed on subject nouns or object nouns;
(c) Step 3: if the filtered verb is the predicate verb, extract from the sentence the phrases marked as both predicate and verb, the phrases marked as both subject and noun, and the phrases marked as both object and noun, together with their one-to-one subject-predicate / predicate-object association relations in the respective sentences: subject noun : predicate verb / predicate verb : object noun;
(d) Step 4: save the noun : verb / verb : noun phrase combinations extracted in step 3, together with the (one-to-one) subject-predicate / predicate-object association relations between them, to the personalized template library;
The data collected by the input method software data synchronization module is then compared one by one against the verb : noun / noun : verb phrase combinations in the personalized template library; the matching word frequency of each phrase is recorded and marked as an index of the weight of two phrases that have a one-to-one chain association relation, i.e. verb word frequency v : noun word frequency n2 / noun word frequency n : verb word frequency v (the word frequencies v, n, n2 are positive integers), giving the verb-noun (value) chain data correlation template (set) of verb : noun ≈ word frequency ratio v : n2 / noun : verb ≈ word frequency ratio n : v;
(3) a template sending management and matching result feedback interactive communication module, with which the user of the personal mobile device displays and manages the personalized template library, sends templates to the data mining common platform server for comparison with the templates of specified entity data resources, and manages interactive communication between the user and the corresponding parties;
(4) a personalized template library, which stores the set of personalized templates generated by the verb filtering and template generation module;
(5) a verb library, which stores commonly used verbs;
wherein commonly used Chinese verbs include, but are not limited to, the following:
Verbs expressing actions and behavior: say, look, walk, listen, laugh, take, fly, run, eat, sing, drink, knock, sit, shout, stare, kick, hear, touch, criticize, publicize, safeguard, learn, study, carry out, start, stop, forbid
Verbs expressing existence, change and disappearance: die, have, equal, occur, evolve, develop, grow, perish, exist, eliminate
Verbs expressing mental activity: think, love, hate, fear, miss, intend, like, hope, dread, worry, dislike, feel, ponder
Verbs expressing judgment (copulas): is, be
Verbs expressing possibility, willingness and necessity (auxiliary verbs): can, be able to, may, be allowed to, wish, be willing to, consent, dare, must, ought to, should, deserve, be worth, would rather
Verbs expressing direction (directional verbs): up, down, in, out, back, open, over, rise, come, come up, come down, come in, come out, come back, come open, come over, rise up, go, go up, go down, go in, go out, go back, go away, go over
Verbs expressing development, for example: grow, wither, germinate, bear fruit, spawn;
For plans, systems, schemes, documents and the like:
work out, formulate, draw up, draft, authorize, audit, examine, transmit, deliver, submit, report, issue, put on record, archive, present opinions
For information and data:
investigate, study, collect, sort out, analyze, generalize, analyze, summarize, provide, report, feed back, convey, notify, publish, maintain and manage
For a given piece of work (superior level):
preside over, organize, guide, arrange, coordinate, instruct, supervise, manage, assign, control, take lead responsibility for, examine and approve, authorize, sign and issue, ratify, assess
Thinking behavior:
research, analyze, assess, develop, suggest, propose, participate, recommend, plan
Direct action:
organize, carry out, execute, guide, lead, control, supervise, use, produce, participate, clarify, explain, provide, assist
Superior behavior:
permit, approve, define, decide, guide, establish, plan, supervise, determine
Management behavior:
achieve, assess, control, coordinate, ensure, identify, maintain, supervise
Expert behavior:
analyze, assist, promote, liaise, suggest, recommend, support, assess, evaluate
Subordinate behavior:
check, verify, collect, obtain, submit, make
Other: maintain, keep, establish, develop, prepare, process, execute, receive, arrange, monitor, report, manage, confirm, conceptualize, cooperate, collaborate, obtain, verify, inspect, contact, design, test, construct, modify, write, draft, guide, pass on, translate, operate, guarantee, prevent, solve, introduce, pay, calculate, revise, undertake, negotiate, consult, interview, refuse, veto, monitor, predict, compare, delete, use
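Step 1 of the verb filtering and template generation module, and the later recording of matching word frequencies against saved phrase pairs, can be sketched as follows. This is a minimal Python sketch under stated assumptions: the three-verb library, the function names and the whitespace tokenization are hypothetical stand-ins, and real Chinese input would need a word segmenter. Part-of-speech tagging and syntactic analysis (steps 2 and 3) are not shown.

```python
import re
from collections import Counter

# Tiny stand-in for the verb library (assumption); the real one holds the verbs listed above.
VERB_LIBRARY = {"educate", "like", "study"}

def split_sentences(text):
    """Split text on common Western and Chinese sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"[.!?。！？]", text) if s.strip()]

def filter_by_verbs(text, verbs=VERB_LIBRARY):
    """Step 1: return (sentence, verb) for every sentence containing a library verb."""
    hits = []
    for sentence in split_sentences(text):
        words = sentence.lower().split()
        hits.extend((sentence, verb) for verb in sorted(verbs) if verb in words)
    return hits

def pair_frequencies(text, pairs):
    """Record the word frequency of both members of each saved phrase pair."""
    counts = Counter(re.findall(r"\w+", text.lower()))
    return {pair: (counts[pair[0]], counts[pair[1]]) for pair in pairs}

# Toy input using base verb forms so that exact word matching works.
text = "Teacher educate schoolboy. Schoolboy like time. Teacher like time. Nice weather."
hits = filter_by_verbs(text)

# Pairs as they might be saved in the personalized template library (hypothetical).
saved_pairs = [("teacher", "educate"), ("educate", "schoolboy")]
freqs = pair_frequencies(text, saved_pairs)

print(hits)
print(freqs)
```

Only sentences that contain a library verb are passed on, so the more expensive tagging and parsing of steps 2 and 3 run on a reduced set of sentences.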
wherein the data mining common platform server includes:
(1) a template receiving management and matching result feedback interactive communication module, which receives the personalized templates sent from the personal mobile device and forwards them to the template comparison module for comparison with the specified templates in the template library;
it also feeds the matching result data back to the personal device and carries out interactive communication between the personal device and the device of the entity that owns the compared template;
(2) a template comparison module, which receives the personalized template and compares it with the templates of the entity data resources specified from the template library, and feeds the matching result back to the personal device through the template receiving management and matching result feedback interactive communication module;
(3) a template library, which stores the templates of various entity data resources;
wherein the templates include, but are not limited to, template sets such as the verb-noun (value) chain data correlation templates and class templates of a unit (department), of an office complex sample, of an individual, and of a personal group, as well as holographic verb-noun (value) chain data correlation templates and class templates.
The effects of the present invention are: data gains value, and information becomes intelligent.
1. For businesses and institutions, patterns (that is, valuable information and knowledge) are refined from large volumes of unstructured data (such as Word, PDF, text document extracts, XML files, etc.), and distribution models, patterns and templates of the upstream, downstream and peripheral supply chains and extended value chains are established, providing simple and convenient data support for guiding businesses or institutions through a strategic transformation toward internet thinking.
2. Non-professional staff of small and medium-sized enterprises, and makers in the mass entrepreneurship and innovation movement, generally need training in professional knowledge, such as professional market research and product analysis, which further lengthens the time needed to grasp market demand. With the needs of small and medium-sized businesses or institutions as the theme, data mining is performed on their users or service objects to obtain the unit supply / personal group demand feature templates of those users or services, so that intelligent analysis can be performed more broadly and products and services can be recommended in a targeted manner during consumption, playing the role of a marketing expert data support system.
3. The information source entity and the consumption demand creating entity are integrated in the personal intelligent mobile terminal, which extracts the individual's own data patterns and shares and matches them on a massive, socialized scale with related institutions or other people, so that the individual accurately finds counterparts for specialties, interests, hobbies, values and services, or for learning, socializing and collaboration; supported by the massive personal "common property data" value chain, this intelligently improves the cultural quality of learning and social life and raises the economic benefit of collaboration and exchange at work.
Brief description of the drawings
Other features and advantages of the present invention will become clearer from the following description of preferred embodiments, which explains the principles of the invention by way of example with reference to the accompanying drawings.
Fig. 1 is a schematic diagram of the social function entities produced by information sources and their data resources;
Fig. 2 is a flow chart of one embodiment of the method for extracting and making data correlation feature value patterns or templates;
Fig. 3 is a flow chart of another embodiment of the method for extracting and making data correlation feature value patterns or templates;
Fig. 4 is a flow chart of one embodiment of the method for comparing data correlation feature value patterns or templates;
Fig. 5 is a flow chart of one embodiment of the method for making class templates from verb-noun (value) chain data correlation templates;
Fig. 6 is a flow chart of one embodiment of the method for comparing class templates;
Fig. 7 is a flow chart of a method, according to one embodiment of the present invention, for making custom templates from verb-noun (value) chain data correlation templates;
Fig. 8 is a system structure diagram of one embodiment of the intelligent system that performs data mining based on a personal mobile device;
Fig. 9 is a system structure diagram of another embodiment of the intelligent system that performs data mining based on a personal mobile device;
Fig. 10 is a flow chart of the method by which a further embodiment of the intelligent system that performs data mining based on a personal mobile device makes unit supply / personal group demand class templates;
Fig. 11 is a system structure diagram of one embodiment of the intelligent system that is based on a personal mobile device and uses custom templates or class templates (data clues + processing rule + feature value set) of verb-noun (value) chain data correlation templates;
Fig. 12 is a system structure diagram of one embodiment of the intelligent system that performs data mining based on a personal mobile device and uses verb-noun (value) chain data correlation templates;
Fig. 13 is a flow chart of the verb filtering and personalized template generation method of one embodiment of the intelligent system that performs data mining based on a personal mobile device and uses verb-noun (value) chain data correlation templates;
Fig. 14 is a hardware and system structure diagram of a general-purpose computer or microcontroller.
Detailed description of the embodiments
For clarity and conciseness, not all features of an actual embodiment are described in this specification. It should be understood, however, that in developing any such actual embodiment many embodiment-specific decisions must be made to achieve the developer's specific goals, for example compliance with system-related and business-related constraints, and that these constraints may vary from one embodiment to another. It should also be noted that, to avoid obscuring the present invention with unnecessary detail, the drawings show only the device structures and/or processing steps closely related to the solution of the present invention, and omit other details of little relevance to it.
According to a specific Chinese-language embodiment of the method for extracting and making data correlation feature value patterns or templates, and with reference to Fig. 2, the following steps are performed to make verb-noun (value) chain data correlation patterns or templates:
S201, step 1: determine the language; mark the nouns and verbs of each sentence; mark the subject, predicate and object of each sentence.
ICTCLAS, the Chinese lexical analysis system based on a multi-layer hidden Markov model developed by the Institute of Computing Technology of the Chinese Academy of Sciences, may be used to segment the input document and to tag the nouns and verbs of each sentence with their parts of speech.
A syntactic analysis tool is also applied to the text data to mark the subject, predicate and object of each sentence.
The input text undergoes operations such as word segmentation, part-of-speech tagging, named entity recognition and dependency parsing. Dependency parsing means parsing a sentence into a tree structure in which the core verb of the sentence is at the center and governs the other words, every other word depends directly on some word, and no word depends on two or more other words at the same time. Named entity recognition means identifying the words in the text that denote real-world entity concepts. Coreference resolution is used to restore the entities referred to by pronouns and similar expressions; depending on the type of entity that owns the data resource, coreference resolution is performed on subject nouns or object nouns.
The subject of a passive-voice sentence may be marked as the object;
Because these operations are not closely related to the gist of the present invention and can be performed with the prior art, they are not described in detail here.
S202, step 2: extract the overlapping subject nouns, predicate verbs and object nouns, and the subject-predicate / predicate-object relations.
From the set of sentences, extract the phrases marked as both subject and noun, the phrases marked as both predicate and verb, and the phrases marked as both object and noun, obtaining respectively the set of nouns serving as subjects, the set of predicate verbs, and the set of nouns serving as objects, together with their corresponding subject-predicate / predicate-object association relations in the sentences: subject noun set : predicate verb set / predicate verb set : object noun set, i.e. the noun : verb / verb : noun phrase combinations and the (one-to-one) subject-predicate / predicate-object association relations between them;
Instance data such as that of Tables 1, 2 and 3 below can be obtained:
Table 1
Table 2
Table 3
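The extraction of step S202 can be sketched as follows, assuming that segmentation, part-of-speech tagging and syntactic analysis have already been done by an external tool. This is a minimal Python sketch: the Token structure, the tag values ("n", "v", "subj", "pred", "obj") and the function name are hypothetical choices for illustration, not the tag set of any particular tool.

```python
from typing import NamedTuple

class Token(NamedTuple):
    word: str
    pos: str    # assumed tags: "n" noun, "v" verb, anything else ignored
    role: str   # assumed tags: "subj", "pred", "obj", or "" for no role

def extract_pairs(sentence):
    """Return (subject noun : predicate verb) and (predicate verb : object noun) pairs.

    A word is kept only where the part-of-speech mark and the grammatical
    mark overlap, e.g. a word marked as both noun and subject.
    """
    subjects = [t.word for t in sentence if t.pos == "n" and t.role == "subj"]
    predicates = [t.word for t in sentence if t.pos == "v" and t.role == "pred"]
    objects = [t.word for t in sentence if t.pos == "n" and t.role == "obj"]
    subj_pred = [(s, p) for s in subjects for p in predicates]
    pred_obj = [(p, o) for p in predicates for o in objects]
    return subj_pred, pred_obj

# "Teacher strictly educates schoolboy", already annotated by hand.
sentence = [
    Token("teacher", "n", "subj"),
    Token("strictly", "d", ""),
    Token("educate", "v", "pred"),
    Token("schoolboy", "n", "obj"),
]
subj_pred, pred_obj = extract_pairs(sentence)
print(subj_pred)  # [('teacher', 'educate')]
print(pred_obj)   # [('educate', 'schoolboy')]
```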
S203, step 3: accumulate word frequency statistics; take the high-frequency items as patterns/templates.
Accumulate the word frequencies of the subject nouns, predicate verbs and object nouns respectively, and mark them as the weight feature values of the phrases that have a (one-to-one) association relation in subject noun set : predicate verb set / predicate verb set : object noun set,
that is, subject noun word frequency n : predicate verb word frequency v
and predicate verb word frequency v : object noun word frequency n2 (the word frequencies v, n, n2 are positive integers),
obtaining, for the (entity) data resource, the association relation weights
noun : verb ≈ word frequency ratio n : v (set) and
verb : noun ≈ word frequency ratio v : n2 (set);
the high-frequency phrases chosen from these sets form the verb-noun (value) chain data correlation pattern or template (set).
Marking the word frequencies on the instance data of Tables 1, 2 and 3 gives the phrase word frequency data (sets) of Tables 4, 5 and 6 below, where v denotes the cumulative word frequency of the verb, n the cumulative word frequency of the subject noun, and n2 the cumulative word frequency of the object noun.
Part of speech | Subject noun | Predicate verb | Object noun
Phrase | Teacher | Educate | Schoolboy
Word frequency | n = 120 | v = 2000 | n2 = 150
Grammar | Subject | Predicate |
 | | Predicate | Object
Table 4
Part of speech | Subject noun | Predicate verb | Object noun
Phrase | Schoolboy | Like | Time
Word frequency | n = 130 | v = 10,000 | n2 = 200
Grammar | Subject | Predicate |
 | | Predicate | Object
Table 5
Part of speech | Subject noun | Predicate verb | Object noun
Phrase | Teacher | Like | Time
Word frequency | n = 120 | v = 10,000 | n2 = 200
Grammar | Subject | Predicate |
 | | Predicate | Object
Table 6
From the instance data of Table 4, the verb-noun (value) chain data correlation pattern or template of the (entity) data resource, with its association relation weights, can be obtained:
Table 7
Table 8
Expanded, Tables 7 and 8 express the verb-noun (value) chain data correlation pattern or template (set) example as:
noun : verb = teacher : educate ≈ word frequency ratio n : v ≈ 120 : 2000
verb : noun = educate : schoolboy ≈ word frequency ratio v : n2 ≈ 2000 : 150
Other verb-noun (value) chain data correlation pattern or template instance data can be obtained by the same method; the high-frequency phrases chosen from the set form the verb-noun (value) chain data correlation pattern or template (phrase set).
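The cumulative word frequency statistics of step S203 can be sketched as follows. This is a minimal Python sketch: the triple layout (subject, verb, object), the function name and the `min_freq` threshold that stands in for "high frequency" are assumptions for illustration; the text does not state how the high-frequency cut-off is chosen.

```python
from collections import Counter

def build_templates(triples, min_freq=2):
    """Build noun:verb and verb:noun templates with their word frequency ratios.

    triples: (subject noun, predicate verb, object noun); subject or object
    may be None when the sentence has none.
    Returns ({(noun, verb): (n, v)}, {(verb, noun): (v, n2)}).
    """
    n = Counter(s for s, _, _ in triples if s)    # subject noun frequency n
    v = Counter(p for _, p, _ in triples)         # predicate verb frequency v
    n2 = Counter(o for _, _, o in triples if o)   # object noun frequency n2
    noun_verb, verb_noun = {}, {}
    for s, p, o in triples:
        if s and min(n[s], v[p]) >= min_freq:
            noun_verb[(s, p)] = (n[s], v[p])
        if o and min(v[p], n2[o]) >= min_freq:
            verb_noun[(p, o)] = (v[p], n2[o])
    return noun_verb, verb_noun

# Toy data; the frequencies are far smaller than those of Tables 4 to 6.
triples = (
    [("teacher", "educate", "schoolboy")] * 2
    + [("schoolboy", "like", "time")] * 3
    + [("teacher", "like", "time"), ("dog", "bark", None)]
)
noun_verb, verb_noun = build_templates(triples)
print(noun_verb)
print(verb_noun)
```

In this toy run the pair dog : bark occurs only once and is dropped by the assumed threshold, while teacher : educate is kept with the ratio n : v = 3 : 2.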
According to another specific Chinese-language embodiment of the method for extracting and making data correlation feature value patterns or templates, and with reference to Fig. 3, the following steps are performed to make verb-noun (value) chain data correlation patterns or templates:
Step S301 is the same as S201 of Fig. 2, step S302 the same as S202 of Fig. 2, and step S303 the same as S203 of Fig. 2;
S304, step 4: merge same-named nouns.
Merge the noun phrase that is repeated at the end of a fixed (one-to-one) verb : noun association phrase and at the start of a fixed (one-to-one) noun : verb association phrase, joining the two phrases into a multi-link association phrase chain:
...
...: verb: identical noun
Identical noun: verb: ...
...
Merge the same-named nouns and word frequencies of Tables 4, 5 and 6:
Verb | Word frequency | Noun | Word frequency (subject + object)
Like | v = 10,000 | Time | n + n2 = 100 + 200
Educate | v = 2000 | Teacher | n + n2 = 120 + 300
 | | Schoolboy | n + n2 = 130 + 150
... | ... | ... | ...
Table 9
Merging and connecting the same-named nouns gives the multi-link phrase chain ...: verb: (merged identical) noun: verb: (merged identical) noun: verb: ..., so that the partial one-to-one phrase chains (set) are connected in series, with verb/noun phrases as alternating hinge nodes, into a multi-link, multi-dimensional chain of alternating verb/noun phrases ...: verb: noun: verb: noun: ...; the chain may even form a closed loop of alternating verb/noun phrases whose head and tail are linked,
giving the alternating cyclic ...: verb: noun: ... link association phrase weights ≈ word frequency ratio ...: v: (n+n2): ...
as the verb-noun (value) chain data correlation pattern or template (the word frequencies v, n, n2 are positive integers).
Table 9 is merged into Table 10:
Table 10
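The merging of same-named nouns in step S304 can be sketched as follows. This is a minimal Python sketch: the function name and dictionary layout are assumptions for illustration, and only a single verb : noun : verb link is built rather than a full multi-link or closed-loop chain.

```python
def merge_chains(verb_noun, noun_verb):
    """Join verb:noun and noun:verb pairs that share the same noun.

    verb_noun: {(verb, noun): (v, n2)}   noun counted as object
    noun_verb: {(noun, verb): (n, v)}    noun counted as subject
    Returns {(verb, noun, verb): (v, n + n2, v)}.
    """
    chains = {}
    for (verb_1, noun), (f_v1, f_n2) in verb_noun.items():
        for (noun_b, verb_2), (f_n, f_v2) in noun_verb.items():
            if noun == noun_b:
                chains[(verb_1, noun, verb_2)] = (f_v1, f_n + f_n2, f_v2)
    return chains

# Frequencies taken from Tables 4 and 5.
verb_noun = {("educate", "schoolboy"): (2000, 150)}
noun_verb = {("schoolboy", "like"): (130, 10000)}

chains = merge_chains(verb_noun, noun_verb)
print(chains)  # {('educate', 'schoolboy', 'like'): (2000, 280, 10000)}
```

The merged noun carries the sum n + n2 = 130 + 150 = 280, which is the middle term of the word frequency ratio v : (n+n2) : v.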
Natural language processing aids such as corpora, digital dictionaries and ontology knowledge bases may be used to mark the adverbs that modify the predicate verb of each sentence; the cumulative word frequency of each adverb is also counted when it is extracted, giving:
adverb : verb : noun with the association weights adverb : verb : noun ≈ word frequency ratio a : v : n2
Obtainable instance data:
strict : educate : schoolboy ≈ word frequency ratio 1500 : 2000 : 150
Table 11
Or
noun : adverb : verb ≈ word frequency ratio n : a : v as the verb-noun (value) chain data correlation pattern or template (the word frequencies v, n, n2 are positive integers; a is a natural number and may be 0);
Obtainable instance data:
teacher : strict : educate ≈ word frequency ratio 120 : 1500 : 2000
Table 12
Or the multi-link phrase chain is obtained:
the alternating cyclic ...: adverb: verb: noun: ... phrases with the association weights ...: adverb: verb: noun: ... ≈ word frequency ratio ...: a: v: (n+n2): ... as the verb-noun (value) chain data correlation pattern or template (the word frequencies v, n, n2 are positive integers; a is a natural number and may be 0); the adverb may be empty.
Instance data:
...: strict : educate : schoolboy : ... ≈ word frequency ratio 1500 : 2000 : (130+150)
Table 13
According to a specific Chinese-language embodiment of the method for comparing verb-noun (value) chain data correlation patterns or templates, and with reference to Fig. 4, the following steps are performed:
S401, step 1: compare the templates with one another (compare phrase combinations or sets of phrase chains).
The verb-noun (value) chain data correlation patterns or templates (phrase sets or phrase chain sets) made from the different (entity) data resources A, B, C and D are compared with one another phrase by phrase;
if there is no identical phrase, return to the start;
S402, step 2: if there are identical phrases or phrase chains,
obtain:
an identical verb,
an identical noun,
an identical verb and identical noun,
an identical noun and identical verb,
or an identical multi-link chain of alternating verb/noun phrases;
if adverbs are present, the comparison may also include identical adverbs, i.e. identical noun, identical adverb and identical verb, or identical adverb, identical verb and identical noun;
S403, step 3: compare word frequency ratios.
The word frequency ratios of the identical phrases or phrase chains are compared.
S404, step 4: if the word frequency ratios of the identical phrases or phrase chains are equal:
Identical verb and identical noun, with equal word frequency ratio v : n2.
Instance data: comparison of like : time = word frequency ratio v : n2
Table 14
Sample result data:
1. Template A: verb like : noun time ≈ word frequency ratio v : n2 ≈ 10,000 : 200 ≈ 50 : 1
Template B: verb like : noun time ≈ word frequency ratio v : n2 ≈ 120,000 : 1000 ≈ 120 : 1
Template C: verb like : noun time ≈ word frequency ratio v : n2 ≈ 1,000,000 : 20,000 ≈ 50 : 1
2. Templates A and C have the same word frequency ratio, i.e. the same degree of liking time, so the templates match successfully.
Identical noun and identical verb, with equal word frequency ratio n : v.
Instance data: teacher : educate = word frequency ratio n : v
Table 15
Sample result data:
1. Template A:
noun teacher : verb educate ≈ word frequency ratio n : v ≈ 120 : 2000 ≈ 3 : 25
Template B:
noun teacher : verb educate ≈ word frequency ratio n : v ≈ 20,000 : 300,000 ≈ 1 : 15
Template C:
noun teacher : verb educate ≈ word frequency ratio n : v ≈ 4800 : 40,000 ≈ 3 : 25
2. Templates A and C have the same word frequency ratio, so the templates match successfully.
Identical multi-link chain of alternating verb/noun phrases, with identical word frequency ratios:
the word frequency ratio of the identical multi-link chain of alternating verb/noun phrases is ...: v: (n+n2): v: ...
Instance data: ...: teacher : educate : schoolboy : like : time : ... = word frequency ratio ...: v: (n+n2): v: ...
Table 16
Sample result data:
1. Template A:
teacher : educate : schoolboy : like : time ≈ word frequency ratio (n+n2) : v : (n+n2) : v : (n+n2) ≈ (120+300) : 2000 : (130+150) : 10,000 : (100+200) ≈ 420 : 2000 : 280 : 10,000 : 300 ≈ 21 : 100 : 14 : 500 : 15
Template B:
teacher : educate : schoolboy : like : time ≈ word frequency ratio (n+n2) : v : (n+n2) : v : (n+n2) ≈ (150+350) : 1000 : (120+130) : 9000 : (140+170) ≈ 500 : 1000 : 250 : 9000 : 310 ≈ 50 : 100 : 25 : 900 : 31
Template C:
teacher : educate : schoolboy : like : time ≈ word frequency ratio (n+n2) : v : (n+n2) : v : (n+n2) ≈ (440+400) : 4000 : (210+350) : 20,000 : (400+200) ≈ 840 : 4000 : 560 : 20,000 : 600 ≈ 21 : 100 : 14 : 500 : 15
Template D:
teacher : educate : schoolboy : like : time ≈ word frequency ratio (n+n2) : v : (n+n2) : v : (n+n2) ≈ (360+900) : 6000 : (390+450) : 30,000 : (300+600) ≈ 1260 : 6000 : 840 : 30,000 : 900 ≈ 21 : 100 : 14 : 500 : 15
Result:
Template A matches template C on the totals (n+n2);
Template A matches template D fully on the components (n, n2);
the templates match successfully.
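The chain comparison above can be reproduced with a short calculation. This is a minimal Python sketch: the function names and the list layout (an integer for a verb frequency, an (n, n2) tuple for a noun) are assumptions for illustration. Ratios are reduced by their greatest common divisor, so only exactly equal ratios count as a match here.

```python
from functools import reduce
from math import gcd

def reduce_ratio(freqs):
    """Divide all terms of a ratio by their greatest common divisor."""
    g = reduce(gcd, freqs)
    return tuple(f // g for f in freqs)

def totals(chain):
    """Collapse each noun node (n, n2) to n + n2; verb nodes stay as they are."""
    return [sum(x) if isinstance(x, tuple) else x for x in chain]

def components(chain):
    """Keep n and n2 of each noun node as separate terms."""
    flat = []
    for x in chain:
        flat.extend(x if isinstance(x, tuple) else (x,))
    return flat

def match_totals(a, b):
    return reduce_ratio(totals(a)) == reduce_ratio(totals(b))

def match_components(a, b):
    return reduce_ratio(components(a)) == reduce_ratio(components(b))

# teacher : educate : schoolboy : like : time, data of templates A to D above.
A = [(120, 300), 2000, (130, 150), 10000, (100, 200)]
B = [(150, 350), 1000, (120, 130), 9000, (140, 170)]
C = [(440, 400), 4000, (210, 350), 20000, (400, 200)]
D = [(360, 900), 6000, (390, 450), 30000, (300, 600)]

print(reduce_ratio(totals(A)))   # (21, 100, 14, 500, 15)
print(reduce_ratio(totals(B)))   # (50, 100, 25, 900, 31)
print(match_totals(A, C))        # True
print(match_components(A, C))    # False
print(match_components(A, D))    # True
```

Template C matches template A only after each noun's subject and object frequencies are summed, whereas template D is an exact threefold multiple of template A in every component.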
S405: if the word frequency ratios of the identical phrases are not equal, the results are:
Identical verb, word frequency comparison:
Embodiment data: comparison of the identical verb "like"
Table 17
Sample result data:
1. Templates A and B have no common object noun of "like", and no common subject noun of "like" either;
2. the word frequency of template B is 100 times that of template A;
3. the high-frequency nouns related to the verb "like" in template A are:
teacher (120+300=420) / time (100+200=300)
schoolboy (130+150=280) / art (n+n2) ...
4. the high-frequency nouns related to the verb "like" in template B are:
civil servant (2000+2000=4000) / film (3000+1300=4300)
white-collar worker (1500+2000=3500) / online shopping (n+n2) ...
Identical noun, word frequency comparison:
Embodiment data: comparison of the identical noun "schoolboy"
Table 18
Sample result data:
1. Templates A and B have no identical verb for "schoolboy";
2. the word frequency of "schoolboy" in template B is 10 times that in template A;
3. the high-frequency verbs related to the noun "schoolboy" in template A are:
educate (2000) / like (10,000) / persuade by patient analysis (v) / be good at (v) ...
4. the high-frequency verbs related to the noun "schoolboy" in template B are:
exercise (18,000) / study (140,000) / love (v) / deeply love (v) ...
Identical verb and identical noun, word frequency ratio comparison:
Instance data of Table 11: comparison of like : time = word frequency ratio v : n2
Result: for template A and template B,
the word frequency ratios are 50 : 1 (1 in 50 occurrences of "like" concerns "time") and 120 : 1 (1 in 120 occurrences of "like" concerns "time"), which shows that within "like" in template A, "time" accounts for a larger share;
that is, the share of the noun word frequency in the verb word frequency of each template is inversely proportional to its word frequency ratio v : n2.
The word frequency ratio difference values of the instance data are displayed:
Template A word frequency ratio: | 50 | 1
Phrase chain | Like | Time
Template B word frequency ratio: | 120 | 1
Table 19
Identical noun and identical verb, word frequency ratio comparison:
Embodiment data of Table 12: teacher : educate = word frequency ratio n : v
For template A and template B,
the word frequency ratios are 3 : 25 (3 in 25 occurrences of "educate" concern "teacher") and 1 : 15 (1 in 15 occurrences of "educate" concerns "teacher"), which shows that within "educate" in template A, "teacher" accounts for a larger share;
that is, the share of the noun word frequency in the verb word frequency of each template is directly proportional to its word frequency ratio n : v.
The word frequency ratio difference values of the instance data are displayed:
Template A word frequency ratio: | 3 | 25
Phrase chain | Teacher | Educate
Template B word frequency ratio: | 1 | 15
Table 17
Identical multi-link chain of alternating verb/noun phrases, word frequency ratio comparison:
Embodiment data of Table 13,
the word frequency ratio of the identical multi-link chain of alternating verb/noun phrases is ...: v: (n+n2): v: ...
Example:
...: teacher : educate : schoolboy : like : time : ... = word frequency ratio ...: v: (n+n2): v: ...
Template A does not match template B:
Template A: teacher : educate : schoolboy : like : time ≈ 21 : 100 : 14 : 500 : 15
Template B: teacher : educate : schoolboy : like : time ≈ 50 : 100 : 25 : 900 : 31
The word frequency ratio difference values of the instance data are displayed:
Table 20
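One way to compute difference values for two chains that do not match is sketched below. This is a minimal Python sketch under a stated assumption: the difference value is taken to be a per-node subtraction after both chains are rescaled so that a chosen reference node equals 100. The text does not define how the difference value is calculated, and the function names are hypothetical.

```python
def scale_to_reference(freqs, ref_index, ref_value=100):
    """Rescale a ratio chain so that the reference node equals ref_value."""
    return [f * ref_value / freqs[ref_index] for f in freqs]

def ratio_differences(labels, chain_a, chain_b, ref_index):
    """Per-node difference (B minus A) of two chains on a common scale."""
    a = scale_to_reference(chain_a, ref_index)
    b = scale_to_reference(chain_b, ref_index)
    return {label: y - x for label, x, y in zip(labels, a, b)}

labels = ["teacher", "educate", "schoolboy", "like", "time"]
template_a = [420, 2000, 280, 10000, 300]   # reduces to 21 : 100 : 14 : 500 : 15
template_b = [500, 1000, 250, 9000, 310]    # reduces to 50 : 100 : 25 : 900 : 31

# Use "educate" (index 1) as the reference node.
diffs = ratio_differences(labels, template_a, template_b, ref_index=1)
print(diffs)
```

With "educate" as the common reference, the largest gap between the two templates appears at the verb "like".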
According to the Chinese invented using verb-noun (value) chain data correlation template construct class template method Embodiment, with reference to shown in Fig. 5, following steps are performed to illustrate:
S501 step 1:(the same class noun of classification/cluster, left and right is adjacent respectively to retain a verb, obtains template fragment)
By natural language processing instruments such as corpus, ontologies storehouses, to the verb of (main body first) data resource- In noun (value) chain data correlation template set (template a, template b, template c, template d, template e... can be included), own Noun phrase classify/cluster, and respectively retains at least one phrase (i.e. phase to template position where same class noun or so is adjacent Adjacent verb) principle, a part of verb-noun (value) chain data correlation template fragment is chosen, specifically, to same class noun Place template position or so is adjacent respectively retain several phrases can, but at least to retain an adjacent verb phrases, divide Go out the data correlation template fragment set of generic verb-noun (value) chain, form higher level classification/cluster template, That is, verb-noun (value) the chain data correlation template (segment) that each class template may include less chain is gathered, and right Classification is named:
Item name a ...: verb 1: with class noun 1: verb 1: ... ≈ word frequency ratio ...: v: (n+n2): v: ...;... : verb 2: with class noun 2: verb 2: ... ≈ word frequency ratio ...: v: (n+n2): v: ...;... data correlation template fragment (word frequency V, n, n2 are positive integers) }
Table 21
Here, same-class noun 1 = schoolboy and same-class noun 2 = schoolgirl, and their common class is: student.
In Table 19, a part of each verb-noun (value) chain data correlation template is chosen as a fragment, on the principle that at least one phrase (i.e. the adjacent verb) is retained on each of the left and right sides of the same-class nouns of the student class (schoolboy and schoolgirl):
Table 22
Processing rule phrase fragments (... : verb 1 : same-class noun 1 : verb 1 : ...; ... : verb 2 : same-class noun 2 : verb 2 : ...; ... data correlation template phrase chain fragments)
Characteristic value collection (... : verb 1 : same-class noun 1 : verb 1 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... : verb 2 : same-class noun 2 : verb 2 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... word frequency ratios (the word frequencies v, n, n2 are positive integers))
From the instance data of Table 20, the following is obtained:
Student class template processing data { education : schoolboy : like ≈ word frequency ratio v : (n+n2) : v ≈ 2000 : (130+150) : 10,000; love : schoolgirl : net purchase ≈ word frequency ratio v : (n+n2) : v ≈ 1900 : (190+200) : 13,000 }
By the same method, identical or similar verb phrases (synonyms and near-synonyms) can also be used to divide out class templates of similar verbs; that is, a verb class template includes the set of verb-noun (value) chain data correlation templates of multiple similar verbs.
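Step S501 can be illustrated with a minimal Python sketch. It is only one possible reading of the step, under stated assumptions: each template is represented as a list of (phrase, word frequency) pairs, the noun classification is a plain dictionary standing in for the corpus or ontology lookup, and exactly one neighbour is retained on each side. The names `extract_fragments` and `NOUN_CLASS` are hypothetical; the frequencies are the instance values quoted in the text, with (n+n2) already summed.

```python
from typing import Dict, List, Tuple

# One chain element: (phrase, word frequency). Chains alternate verbs and nouns.
Chain = List[Tuple[str, int]]


def extract_fragments(chain: Chain, noun_class: Dict[str, str],
                      target_class: str) -> List[Chain]:
    """Return [left verb, same-class noun, right verb] fragments of one template."""
    fragments = []
    for i, (phrase, _freq) in enumerate(chain):
        has_both_neighbours = 0 < i < len(chain) - 1
        if noun_class.get(phrase) == target_class and has_both_neighbours:
            # Retain exactly one adjacent phrase on each side of the noun.
            fragments.append(chain[i - 1:i + 2])
    return fragments


# Assumed classification; in the text it would come from a corpus or ontology library.
NOUN_CLASS = {"schoolboy": "student", "schoolgirl": "student"}

template_a = [("teacher", 420), ("education", 2000), ("schoolboy", 280),
              ("like", 10000), ("time", 300)]
template_b = [("love", 1900), ("schoolgirl", 390), ("net purchase", 13000)]

student_fragments = (extract_fragments(template_a, NOUN_CLASS, "student")
                     + extract_fragments(template_b, NOUN_CLASS, "student"))
print(student_fragments)
```

A noun at either end of a chain yields no fragment in this sketch, because the principle requires a retained verb on both sides.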
S502 step 2: Form the class template (data clues + processing rule + characteristic value collection)
For a set of verb-noun (value) chain data correlation template fragments of the same class: the original phrases (verbs, nouns or adverbs) are combined as clues to form the clue set (first); the method of obtaining and matching template fragments, on the principle that at least one verb phrase is retained on each of the left and right sides of the template position where a same-class noun is located, is combined as rules to form the rule set (second); and the word frequency ratios of the same-class verb-noun (value) chain data correlation template fragments are combined as statistical characteristic parameters to form the characteristic value set (third).
In simplified form: class template (data clues + processing rule + characteristic value collection). In expanded form:
Class template { data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...) + processing rule phrase fragments (... : verb 1 : same-class noun 1 : verb 1 : ...; ... : verb 2 : same-class noun 2 : verb 2 : ...; ... data correlation template phrase chain fragments) + characteristic value collection (... : verb 1 : same-class noun 1 : verb 1 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... : verb 2 : same-class noun 2 : verb 2 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... word frequency ratios (the word frequencies v, n, n2 are positive integers)) }
Table 23
The student class template is made from the data of Table 20 and Table 21:
Clue (first) set: education, schoolboy, like, cherish, schoolgirl, net purchase
Rule (second) set: { template fragments are obtained and compared on the principle that one verb is retained and matched on each of the left and right sides of the same-class nouns (schoolboy, schoolgirl): education : schoolboy : like; love : schoolgirl : net purchase }
Characteristic value (third) set: { education : schoolboy : like ≈ word frequency ratio v : (n+n2) : v ≈ 2000 : (130+150) : 10,000; love : schoolgirl : net purchase ≈ word frequency ratio v : (n+n2) : v ≈ 1900 : (190+200) : 13,000 }
In simplified form: class template (data clues + processing rule + characteristic value collection)
The final result is: student class template { data clues (education, schoolboy, like, cherish, schoolgirl, net purchase) + processing rule (education : schoolboy : like; love : schoolgirl : net purchase) + characteristic value collection (education : schoolboy : like ≈ word frequency ratio v : (n+n2) : v ≈ 2000 : (130+150) : 10,000; love : schoolgirl : net purchase ≈ word frequency ratio v : (n+n2) : v ≈ 1900 : (190+200) : 13,000) }
By the same method, further class templates can be made from the other templates (template c, template d, template e, ...) of the (main body first) data resource, which together constitute the class template (set) of the (main body first) data resource.
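The three-part structure of step S502 can be sketched as a small Python data structure. This is an illustrative assumption, not the patent's own representation: `ClassTemplate` and `build_class_template` are hypothetical names, the clue set is derived here simply by collecting the phrases of each fragment in order, and the characteristic values are stored as raw frequency tuples with (n+n2) already summed.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Fragment = List[Tuple[str, int]]  # (phrase, word frequency): verb, noun, verb
Rule = Tuple[str, ...]


@dataclass
class ClassTemplate:
    name: str
    clues: List[str] = field(default_factory=list)                    # set "first"
    rules: List[Rule] = field(default_factory=list)                   # set "second"
    features: Dict[Rule, Tuple[int, ...]] = field(default_factory=dict)  # set "third"


def build_class_template(name: str, fragments: List[Fragment]) -> ClassTemplate:
    """Assemble data clues + processing rule + characteristic value collection."""
    template = ClassTemplate(name)
    for fragment in fragments:
        phrases = tuple(phrase for phrase, _ in fragment)
        freqs = tuple(freq for _, freq in fragment)
        template.rules.append(phrases)       # ordered phrase chain fragment
        template.features[phrases] = freqs   # its word frequency ratio
        for phrase in phrases:               # clue set keeps first-seen order
            if phrase not in template.clues:
                template.clues.append(phrase)
    return template


student = build_class_template("student", [
    [("education", 2000), ("schoolboy", 280), ("like", 10000)],
    [("love", 1900), ("schoolgirl", 390), ("net purchase", 13000)],
])
print(student.clues)
print(student.features)
```

Keeping the rules as ordered tuples means the same object can serve both as the matching rule and as the key of its word frequency ratio.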
Other classification data, such as geographical location information and time information, can also be used together with the verb-noun (value) chain data correlation template classification rules to jointly constitute the class template rules. For example, with geographical location information and time information included:
Class template { data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...; A <north latitude N1", east longitude E1">, B <north latitude N2", east longitude E2">, C <north latitude N3", east longitude E3">, D <north latitude N4", east longitude E4">) + processing rule: phrase chain fragments (... : verb 1 : same-class noun 1 : verb 1 : ...; ... : verb 2 : same-class noun 2 : verb 2 : ...; place order rule A : D : C : B) + characteristic value collection (... : verb 1 : same-class noun 1 : verb 1 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... : verb 2 : same-class noun 2 : verb 2 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... phrase chain fragment word frequency ratios (the word frequencies v, n, n2 are positive integers); place time values A: time1, B: time2, C: time3, D: time4) }
Embodiment data:
Student class template of the Tongzhou branch campus of Beijing Polytechnical University { data clues (education, schoolboy, like, cherish, schoolgirl, net purchase; A <north latitude 39.8N1", east longitude 116.6E1">, B <north latitude 39.8N2", east longitude 116.6E2">, C <north latitude 39.8N3", east longitude 116.6E3">, D <north latitude 39.8N4", east longitude 116.6E4">) + processing rule (education : schoolboy : like; love : schoolgirl : net purchase; place order rule A : B : C : D) + characteristic value collection (education : schoolboy : like ≈ word frequency ratio v : (n+n2) : v ≈ 2000 : (130+150) : 10,000; love : schoolgirl : net purchase ≈ word frequency ratio v : (n+n2) : v ≈ 1900 : (190+200) : 13,000; place time values A: am8:00-pm16:30, B: pm16:30-18:30, C: pm18:30-17:30, D: pm22:00-am8:00) }
Here, A <north latitude 39.8N1", east longitude 116.6E1">, B <north latitude 39.8N2", east longitude 116.6E2">, C <north latitude 39.8N3", east longitude 116.6E3"> and D <north latitude 39.8N4", east longitude 116.6E4"> are the longitude and latitude of the teaching building, the sports ground, the campus leisure area and the dormitory, respectively, of the Tongzhou branch campus of Beijing Polytechnical University; the place order rule A : B : C : D is the ordering rule for the flow of students from place to place; the place time values A: am8:00-pm16:30, B: pm16:30-18:30, C: pm18:30-17:30, D: pm22:00-am8:00 are statistics of the residence time at the respective places;
The class template of a unit (department) can be made from the verb-noun (value) chain data correlation templates of the unit (department) data resource;
The class template of a personal group can be extracted and made from the verb-noun (value) chain data correlation templates of the personal group data resource;
The class template of the office complex can be extracted and made from the verb-noun (value) chain data correlation templates of the office complex sample data resource;
Further, the class template comparison method is described with reference to Fig. 6, using the following steps:
Before starting comparison:
(main body first) data resource student class template { (education, schoolboy, like, cherish, schoolgirl, net purchase) + (education : schoolboy : like; love : schoolgirl : net purchase) + (education : schoolboy : like ≈ word frequency ratio v : (n+n2) : v ≈ 2000 : (130+150) : 10,000; love : schoolgirl : net purchase ≈ word frequency ratio v : (n+n2) : v ≈ 1900 : (190+200) : 13,000) }
The content of the student class template of the comparison object (main body second) data resource is unknown: student class template (data clues + processing rule + characteristic value collection).
Start:
S601 step 1: Following the same rules as the original class template, the phrases with the same names as those in the data clues of the original class template, i.e. the phrases (education, schoolboy, like, cherish, schoolgirl, net purchase), are extracted from the verb-noun (value) chain data correlation template (set) of the comparison object (main body second) data resource;
If no identical phrases are extracted, return to the start;
If all of the above same-name phrases are successfully extracted, then the student class template of the comparison object (main body second) data resource is { data clues = (education, schoolboy, like, cherish, schoolgirl, net purchase) + processing rule + characteristic value collection }
Proceed to the next step;
S602 step 2: Classify according to the same classification/clustering rules as the original class template, and obtain the phrase chain fragments of the same class:
All noun phrases in the verb-noun (value) chain data correlation template (set) of the comparison object (main body second) data resource are classified/clustered. On the principle that the same number of phrases is retained on each of the left and right sides of the template position where a same-class noun is located, the same part of each verb-noun (value) chain data correlation template is chosen as a fragment. This divides out sets of verb-noun (value) chain data correlation template fragments of the same class and forms a higher-level classification/cluster template; that is, each class template may include a set of verb-noun (value) chain data correlation templates (fragments) with fewer chain links, and each class is given a name:
Comparison object (main body second) data resource:
Object category name A rule { verb 11 : same-class noun 11 : verb 11; verb 22 : same-class noun 22 : verb 22 }
This is compared with the original (main body first) data resource:
(main body first) data resource student class template { (education, schoolboy, like, cherish, schoolgirl, net purchase) + (education : schoolboy : like; love : schoolgirl : net purchase) + (education : schoolboy : like ≈ word frequency ratio v : (n+n2) : v ≈ 2000 : (130+150) : 10,000; love : schoolgirl : net purchase ≈ word frequency ratio v : (n+n2) : v ≈ 1900 : (190+200) : 13,000) }
with respect to the ordering of the data correlation template (phrase chain) fragments in the rule;
If the data correlation template (phrase chain) fragment verb 11 : same-class noun 11 : verb 11 is consistent with education : schoolboy : like, and verb 22 : same-class noun 22 : verb 22 is consistent with love : schoolgirl : net purchase, then the processing rules are consistent;
Student class template of the comparison object (main body second) data resource { data clues = (education, schoolboy, like, cherish, schoolgirl, net purchase) + processing rule = (category name: student; template (phrase chain) fragment ordering: verb 11 : same-class noun 11 : verb 11 = education : schoolboy : like; verb 22 : same-class noun 22 : verb 22 = love : schoolgirl : net purchase) + characteristic value collection }
Proceed to the next step;
If any phrase does not match, return to the start;
S603 step 3: Extract the word frequency ratios and compare whether they are equal
Following the same processing rules as the original class template, the word frequency ratios of the same-class, same-name phrase chain fragment orderings in the two class templates are compared:
The word frequency ratio characteristic values are compared:
Comparison object (main body second) category name A characteristic value collection { verb 11 : same-class noun 11 : verb 11 ≈ word frequency ratio v : (n+n2) : v; verb 22 : same-class noun 22 : verb 22 ≈ word frequency ratio v : (n+n2) : v }
is compared with
the original (main body first) data resource student class template { (education, schoolboy, like, cherish, schoolgirl, net purchase) + (education : schoolboy : like; love : schoolgirl : net purchase) + (education : schoolboy : like ≈ word frequency ratio v : (n+n2) : v ≈ 2000 : (130+150) : 10,000; love : schoolgirl : net purchase ≈ word frequency ratio v : (n+n2) : v ≈ 1900 : (190+200) : 13,000) }
If
... : verb 11 : same-class noun 11 : verb 11 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ... ≈ education : schoolboy : like ≈ word frequency ratio v : (n+n2) : v ≈ 2000 : (130+150) : 10,000;
... : verb 22 : same-class noun 22 : verb 22 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ... ≈ love : schoolgirl : net purchase ≈ word frequency ratio v : (n+n2) : v ≈ 1900 : (190+200) : 13,000
that is, if the word frequency ratios are identical or approximately equal, then the class templates are matched successfully.
In brief:
For the comparison object (main body second) data resource,
student class template { data clues = (education, schoolboy, like, cherish, schoolgirl, net purchase) + processing rule = (template (phrase chain) fragment ordering: verb 11 : same-class noun 11 : verb 11 = education : schoolboy : like; verb 22 : same-class noun 22 : verb 22 = love : schoolgirl : net purchase) + characteristic value collection = (verb 11 : same-class noun 11 : verb 11 ≈ word frequency ratio v : (n+n2) : v ≈ verb 1 : same-class noun 1 : verb 1 ≈ education : schoolboy : like ≈ word frequency ratio v : (n+n2) : v ≈ 2000 : (130+150) : 10,000;
verb 22 : same-class noun 22 : verb 22 ≈ word frequency ratio v : (n+n2) : v ≈ verb 2 : same-class noun 2 : verb 2 ≈ love : schoolgirl : net purchase ≈ word frequency ratio v : (n+n2) : v ≈ 1900 : (190+200) : 13,000) }
The comparison of the two student class templates (main body first and main body second) is matched successfully;
Otherwise, if the word frequency ratios are not equal, the two class templates fail to match, and the process returns to the start;
If other classification rules are set, the success or failure of the matching comparison is also determined according to those rules. The whole template matches only if every part matches; the failure of any one partial match causes the whole template match to fail.
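The three comparison steps S601 to S603 can be sketched in Python as follows. The text does not define "approximately equal", so this sketch makes an explicit assumption: each word frequency ratio is normalised by its own total, and two ratios match when every normalised share differs by at most a relative tolerance `rel_tol` (10% by default). The dictionary layout and the names `ratios_match` and `class_templates_match` are hypothetical.

```python
from typing import Sequence, Tuple

Rule = Tuple[str, ...]


def ratios_match(a: Sequence[int], b: Sequence[int], rel_tol: float = 0.1) -> bool:
    """Compare two word frequency ratios after normalising each by its own total."""
    if len(a) != len(b) or sum(a) == 0 or sum(b) == 0:
        return False
    shares_a = [x / sum(a) for x in a]
    shares_b = [y / sum(b) for y in b]
    return all(abs(x - y) <= rel_tol * max(x, y) for x, y in zip(shares_a, shares_b))


def class_templates_match(original: dict, candidate: dict, rel_tol: float = 0.1) -> bool:
    # S601: every clue phrase of the original must be found in the candidate.
    if not set(original["clues"]) <= set(candidate["clues"]):
        return False
    # S602: the phrase chain fragments must agree, in the same order.
    if list(original["rules"]) != list(candidate["rules"]):
        return False
    # S603: every fragment's word frequency ratio must be approximately equal.
    return all(
        rule in candidate["features"]
        and ratios_match(original["features"][rule], candidate["features"][rule], rel_tol)
        for rule in original["rules"]
    )


RULES = [("education", "schoolboy", "like"), ("love", "schoolgirl", "net purchase")]
CLUES = [phrase for rule in RULES for phrase in rule]

first = {"clues": CLUES, "rules": RULES,
         "features": {RULES[0]: (2000, 280, 10000), RULES[1]: (1900, 390, 13000)}}
# Hypothetical second subject: twice the volume, same proportions.
second = {"clues": CLUES, "rules": RULES,
          "features": {RULES[0]: (4000, 560, 20000), RULES[1]: (3800, 780, 26000)}}
# Hypothetical third subject: "schoolboy" is far more frequent.
third = {"clues": CLUES, "rules": RULES,
         "features": {RULES[0]: (2000, 5000, 10000), RULES[1]: (1900, 390, 13000)}}

print(class_templates_match(first, second))
print(class_templates_match(first, third))
```

Under this assumed tolerance the second subject matches and the third does not. Any additional rule, such as a place order rule, would be one more conjunct in the final check, which reflects the statement that a single partial failure fails the whole template.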
An embodiment of the method of the invention for making custom templates from verb-noun (value) chain data correlation templates is described with reference to Fig. 7, using the following steps:
Using the verb-noun (value) chain data correlation templates of a unit (department) data resource, which may serve as the subject (first) data resource, and the verb-noun (value) chain data correlation templates of a personal group data resource, which may serve as the target (second) data resource, a unit supply-personal group need custom template is made:
(d) S701 step 1: From the verb-noun (value) chain data correlation template set of the unit (department) data resource, choose a list of nouns with high word frequency;
The instance data is: schoolboy, young man, ...
(e) S702 step 2: The noun "schoolboy" from the high-frequency noun list is matched and compared against the nouns in the verb-noun (value) chain data correlation template set of the personal group data resource of example Table 24;
Table 24
(f) S703 step 3: At the position of the successfully matched same-name noun in the verb-noun (value) chain data correlation template of the personal group data resource (in the embodiment, schoolboy in the template ... : training in rotation : teacher : education : same-name noun schoolboy : like : time : ... ≈ word frequency ratio ... : 3000 : (120+300) : 2000 : (130+150) : 10,000 : (100+200) : ...), one verb and one noun are chosen on the alternating verb/noun phrase chain to the left, to the right, or on both sides;
teacher : education ≈ word frequency ratio (120+300) : 2000
like : time ≈ word frequency ratio 10,000 : (100+200)
The alternating verb-noun phrase association chains chosen at that position to the left, to the right, or on both sides (not including the same-name noun schoolboy):
teacher : education ≈ word frequency ratio (120+300) : 2000
like : time ≈ word frequency ratio 10,000 : (100+200)
become the data correlation custom template (set) of the unit (department) data resource and the personal group data resource.
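Steps S701 to S703 can be sketched in Python as follows. This is a minimal illustration under assumptions: the personal group template is a list of (phrase, word frequency) pairs with (n+n2) already summed, the high-frequency noun list is given rather than computed, and one verb plus one noun (two phrases) is taken on each side. The function name `custom_template` and the `span` parameter are hypothetical.

```python
from typing import List, Tuple

Chain = List[Tuple[str, int]]  # alternating verb/noun phrases with word frequencies


def custom_template(chain: Chain, noun: str, span: int = 2) -> List[Chain]:
    """Take `span` phrases on each side of the same-name noun, excluding the noun."""
    for i, (phrase, _freq) in enumerate(chain):
        if phrase == noun:
            left = chain[max(0, i - span):i]
            right = chain[i + 1:i + 1 + span]
            return [segment for segment in (left, right) if segment]
    return []  # S702 found no same-name noun: nothing to extract


# Personal group template from the embodiment (frequencies as quoted in the text).
group_chain = [("training in rotation", 3000), ("teacher", 420), ("education", 2000),
               ("schoolboy", 280), ("like", 10000), ("time", 300)]

# S701: high-frequency nouns chosen from the unit (department) templates.
high_frequency_nouns = ["schoolboy", "young man"]

# S702 + S703: match each noun and cut the neighbouring verb-noun pairs.
templates = {noun: custom_template(group_chain, noun) for noun in high_frequency_nouns}
print(templates["schoolboy"])
```

For "schoolboy" the sketch yields the two association chains of the embodiment, teacher : education and like : time; "young man" does not occur in this chain and yields an empty result.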
A specific embodiment of the intelligent system of the invention, which carries out data mining based on a personal mobile device, is described with reference to the schematic diagram shown in Fig. 8:
The intelligent system includes a corpus, an ontology library, etc., and is characterized in that the personal mobile device further includes a personal mobile device output/input synchronization module and a template feature extraction module:
(1) The personal mobile device output/input synchronization module, specifically comprising the input method software data synchronization module 1-1 and the geographical location information synchronization module 1-3 of smart mobile phone 1, is used to synchronously copy and collect the personal data resource of text input on smart mobile phone 1 and output data such as the geographical location information of the smart mobile phone navigation API, and to provide them to the template feature extraction and making module 2-1;
A pretreatment module 1-2 may be used to desensitize the synchronized data and to filter out useless and duplicate data;
(2) The template feature extraction and making module 2-1 uses natural language processing technology to carry out data mining and characteristic value extraction on the personal data resource synchronized by the input method software data synchronization module 1-1 on smart mobile phone 1, and makes personal data patterns or templates;
The data patterns or templates of the personal data resource can be made on smart mobile phone 1;
Verb-noun (value) chain data correlation templates and class templates can be made:
The steps shown in Fig. 2 and Fig. 3 are performed to make the verb-noun (value) chain data correlation template; they are described in detail on pages 28 to 33 and are not repeated here.
The steps shown in Fig. 5 are then performed to make and obtain the class template (data clues + processing rule + characteristic value collection); they are described in detail on pages 40 to 44 and are not repeated here.
Identical or similar verb phrases (synonyms and near-synonyms) can also be used to divide out class templates of similar verbs;
Other same-class feature data, such as geographical location and time, can also be extracted and, together with the classification data of the verb-noun (value) chain data correlation templates, form a more complex class template of the following form:
Class template { data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...; A <north latitude N1", east longitude E1">, B <north latitude N2", east longitude E2">, C <north latitude N3", east longitude E3">, D <north latitude N4", east longitude E4">) + processing rule: phrase chain fragments (... : verb 1 : same-class noun 1 : verb 1 : ...; ... : verb 2 : same-class noun 2 : verb 2 : ...; place order rule A : D : C : B) + characteristic value collection (... : verb 1 : same-class noun 1 : verb 1 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... : verb 2 : same-class noun 2 : verb 2 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... phrase chain fragment word frequency ratios (the word frequencies v, n, n2 are positive integers); place time values A: time1, B: time2, C: time3, D: time4) }
Embodiment data:
Student class template of the Tongzhou branch campus of Beijing Polytechnical University { data clues (education, schoolboy, like, cherish, schoolgirl, net purchase; A <north latitude 39.8N1", east longitude 116.6E1">, B <north latitude 39.8N2", east longitude 116.6E2">, C <north latitude 39.8N3", east longitude 116.6E3">, D <north latitude 39.8N4", east longitude 116.6E4">) + processing rule (education : schoolboy : like; love : schoolgirl : net purchase; place order rule A : B : C : D) + characteristic value collection (education : schoolboy : like ≈ word frequency ratio v : (n+n2) : v ≈ 2000 : (130+150) : 10,000; love : schoolgirl : net purchase ≈ word frequency ratio v : (n+n2) : v ≈ 1900 : (190+200) : 13,000; place time values A: am8:00-pm16:30, B: pm16:30-18:30, C: pm18:30-17:30, D: pm22:00-am8:00) }
Here, A <north latitude 39.8N1", east longitude 116.6E1">, B <north latitude 39.8N2", east longitude 116.6E2">, C <north latitude 39.8N3", east longitude 116.6E3"> and D <north latitude 39.8N4", east longitude 116.6E4"> are the teaching building, the sports ground, the campus leisure area and the dormitory, respectively, of the Tongzhou branch campus of Beijing Polytechnical University; the place order rule A : B : C : D is the rule for the flow of students from place to place;
The place time values A: am8:00-pm16:30, B: pm16:30-18:30, C: pm18:30-17:30, D: pm22:00-am8:00 are statistics of the residence time at the respective places;
From the personal sample data resource synchronized by the personal mobile device, a personal class template (data clues + processing rule + characteristic value collection) composed of verb-noun (value) chain data correlation templates is obtained; because the input is in the first person, the subject noun is omitted by default;
From the personal group sample data resource converged from personal mobile devices, a personal group class template (data clues + processing rule + characteristic value collection) composed of numerous series of verb-noun (value) chain data correlation templates is obtained.
A specific embodiment of the intelligent system of the invention, which carries out data mining based on a personal mobile device, is described with reference to the schematic diagram shown in Fig. 9:
The server or PC 3 of a unit includes the common data mining sharing companion module 3-1, and the data mining common platform server 2 includes: the template feature extraction module 2-1, the template matching module 2-2, the template library 2-3, and the matching result feedback and message communication module 2-4, wherein:
(1) The output/input personal data synchronization module, specifically the input method software data synchronization module 1-1, is used to copy and collect, synchronously or asynchronously, the personal data resource input through the input method on the personal mobile device (i.e. smart mobile phone 1), and to transfer it (optionally after the pretreatment module 1-2 has desensitized the synchronized data) to the data mining common platform server 2;
(2) The common data mining sharing companion module 3-1 is used to converge and share, manually or automatically, the unit (department) electronic document data resource, in particular text data, on the server or PC 3 of the unit (department) to the data mining common platform server 2; the data resources of numerous units converge to form the office complex sample data resource;
Before sharing, data preprocessing is carried out by pretreatment module 3-2 to exclude electronic documents whose content is identical, or whose content is similar and whose time is identical;
(3) The template feature extraction and making module 2-1 is used to carry out data mining on the data resources and to make data patterns or templates;
The personal data resources synchronously downloaded from numerous personal mobile devices are pooled together to form personal group data, on which data mining and characteristic value extraction are carried out to make personal group data models, patterns or templates. From the personal group sample data resource converged from personal mobile devices, a personal group class template (data clues + processing rule + characteristic value collection) composed of numerous series of verb-noun (value) chain data correlation templates is obtained;
Data mining is carried out, and data patterns or templates are made, on all of the following data resources: the unit (department) data resource; the office complex sample data resource converged from numerous units; the personal data resource synchronized from a mobile device; the personal group sample data resource; the blended data resource of the personal group sample data resource converged from mobile devices and the unit (department) data resource (used for matching unit supply templates with personal demand templates); and the blended data resource of the personal data resource and the unit data resource (used for matching personal speciality templates with unit post templates);
Verb-noun (value) chain data correlation templates and class templates can be made and obtained;
By performing the steps shown in Fig. 10, a unit supply-personal group need class template can be made. The student class template of the comparison object, the personal group data, is: { data clues (education, schoolboy, like, cherish, schoolgirl, net purchase) + processing rule (education : schoolboy : like; love : schoolgirl : net purchase) + characteristic value collection (education : schoolboy : like ≈ word frequency ratio v : (n+n2) : v ≈ 2000 : (130+150) : 10,000; love : schoolgirl : net purchase ≈ word frequency ratio v : (n+n2) : v ≈ 1900 : (190+200) : 13,000) }
(f) S1001 step 1: Comparison of class template category names:
The category name "student" in the unit (department) class template (fragment) set of the unit (department) data resource is compared with the category names in the personal group class template (fragment) set of the personal group sample data resource;
(g) S1002 step 2: If an identical category name exists, the content of the comparison object's class template is quoted directly, giving the unit supply-personal group need same-name class template:
The category name "student" is matched successfully, and the same-name category name "student" is found to belong to both sets: the unit (department) class template (fragment) set and the personal group class template (fragment) set;
The same-name category name "student" and the personal group class template (fragment) set of the comparison object to which it belongs are selected and quoted directly to compose the unit supply-personal group need class template. The result is: unit supply-personal group need class template = the unit's same-name category name + the personal group's same-name class template (data clues + processing rule + characteristic value collection) set: "student" class template { data clues (education, schoolboy, like, cherish, schoolgirl, net purchase) + processing rule (education : schoolboy : like; love : schoolgirl : net purchase) + characteristic value collection (education : schoolboy : like ≈ word frequency ratio v : (n+n2) : v ≈ 2000 : (130+150) : 10,000; love : schoolgirl : net purchase ≈ word frequency ratio v : (n+n2) : v ≈ 1900 : (190+200) : 13,000) }
(4) The template matching module 2-2 is used to compare the characteristic values of the data templates of the data resources of different subjects;
The characteristic value comparison method for verb-noun (value) chain data correlation templates and class templates can be used for the mutual matching comparison of characteristic values among: the verb-noun (value) chain data correlation templates and class templates of a unit (department); those of the office complex sample; those of an individual; those of a personal group; and the holographic verb-noun (value) chain data correlation templates and class templates;
The template comparison of verb-noun (value) chain data correlation performs the steps shown in Fig. 4, which are described in detail from line 1 of page 34 to line 7 of page 40 and are not repeated here.
Depending on the type of subject of the data resource, the subject noun can be a topic word counted by natural language processing, or a subject noun selected semi-manually;
From the verb-noun (value) chain data correlation templates made from personal data resources, certain noun phrases (for example, interest, hobby, speciality, etc.) can be selected and compared with one another, and matching obtains the approximately matching verb-noun value chain association relations between individuals; alternatively, for a given noun, the verb-noun (value) chain data correlation template made from a personal data resource can be compared with that made from the personal group sample data resource, to obtain the positioning and situation of the individual within the personal group data resource as a whole, in terms of the difference in verb word frequency (for interest, hobby, speciality, etc.) on the (given) noun-verb action (value) chain links;
From the verb-noun (value) chain data correlation templates made from a unit (department) data resource, noun phrases with high word frequency are selected and compared with the verb-noun (value) chain data correlation templates made from the office complex sample data resource of numerous units, or from all data resources, to obtain the positioning and situation of the unit (department) within the office complex sample data resource as a whole, or all data resources as a whole, in terms of the difference in verb word frequency on the (high-frequency) noun-verb action (value) chain links;
From the verb-noun (value) chain data correlation templates made from a unit (department) data resource, noun phrases with high word frequency are selected and compared with the verb-noun (value) chain data correlation templates of the personal group sample data resource; the verb-noun (value) chain data correlation template in which a successfully matched noun is located can serve as the template of the matching relationship between the unit (department) supply, with that (high-frequency) noun as its theme, and the personal group need;
The class template (data clues + processing rule + characteristic value collection) comparison performs the steps shown in Fig. 6, which are described in detail on pages 44 to 47 and are not repeated here.
A template matching module can also be included on the personal mobile device, for point-to-point template comparison between personal mobile devices without going through the server;
(5) The template library 2-3 is used to preserve the templates of the data resources of the various subjects;
It preserves template sets including: the verb-noun (value) chain data correlation templates and class templates of a unit (department); those of the office complex sample; those of an individual; those of a personal group; the holographic verb-noun (value) chain data correlation templates and class templates; the verb-noun (value) chain data correlation templates of the blended data resource of a unit (department) and a personal group; the verb-noun (value) chain data correlation templates of the blended data resource of an individual and a unit (department post); and so on;
(6) The matching result feedback and message communication module 2-4 is used to feed back the message data of each successful template comparison to the corresponding data resource subject devices, and for interactive message communication between them;
It is also used to feed back the message data of successful matches of verb-noun (value) chain data correlation templates or class templates (data clues + processing rule + characteristic value collection) in the comparison module to the corresponding data resource subject devices, in particular through interactive communication with the corresponding matching result feedback and message communication module 1-4 of smart mobile phone 1.
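The category name comparison of steps S1001 and S1002 in the Fig. 10 procedure above can be sketched in Python as a lookup by name. This assumes, for illustration only, that both template sets are dictionaries keyed by category name; the function name `supply_demand_templates` and the placeholder unit-side content are hypothetical.

```python
from typing import Dict


def supply_demand_templates(unit_templates: Dict[str, dict],
                            group_templates: Dict[str, dict]) -> Dict[str, dict]:
    """S1001: compare category names. S1002: quote the personal group's template."""
    return {
        name: group_templates[name]      # content is taken from the comparison object
        for name in unit_templates       # category names supplied by the unit side
        if name in group_templates       # keep only same-name categories
    }


# Unit (department) side: only the category names matter for this step.
unit_templates = {"student": {}, "teacher": {}}

# Personal group side: the student class template from the embodiment.
group_templates = {
    "student": {
        "clues": ["education", "schoolboy", "like", "cherish", "schoolgirl", "net purchase"],
        "rules": [("education", "schoolboy", "like"), ("love", "schoolgirl", "net purchase")],
        "features": {
            ("education", "schoolboy", "like"): (2000, 130 + 150, 10000),
            ("love", "schoolgirl", "net purchase"): (1900, 190 + 200, 13000),
        },
    },
}

result = supply_demand_templates(unit_templates, group_templates)
print(sorted(result))
```

Only "student" appears in both sets, so the resulting unit supply-personal group need class template contains that single category with the personal group's content.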
A specific Chinese-language embodiment of an intelligent system of the invention, based on a personal mobile device and using custom templates made from verb-noun (value) chain data correlation templates or class templates (data clues + processing rules + characteristic value set), is described below with reference to the schematic diagram in Figure 11:
The personal mobile device, smartphone 1, comprises: an input/output data synchronization module, i.e. input method software data synchronization module 1-1; template clue filtering module 1-2; template matching and comparison module 1-3; personal clue library 1-6; template library 1-5; and output display module 1-4 for the corresponding data service content; wherein:
(1) Input method software data synchronization module 1-1, for synchronously copying and collecting the data entered through the input method on smartphone 1;
(2) Template clue filtering module 1-2, which takes the data collected by input method software data synchronization module 1-1 and filters it, item by item, against all phrases (verbs, nouns, etc.) in the custom templates made from verb-noun (value) chain data correlation templates in template library 1-5, or against the clue data in the class templates (data clues + processing rules + characteristic value set); the successfully matched result data are recorded to form personal clue library 1-6, and the accumulated number of matches is recorded;
(3) Template matching and comparison module 1-3, for comparing the templates in template library 1-5 with the templates extracted from personal clue library 1-6;
For the custom templates made from verb-noun (value) chain data correlation templates in template library 1-5, the example data are: noun C : verb A ≈ word frequency ratio n : v, i.e. teacher : education ≈ word frequency ratio (120+300) : 2000; and verb D : noun D ≈ word frequency ratio v : n2, i.e. like : time ≈ word frequency ratio 10,000 : (100+200). These are compared, respectively, with the verb-noun (value) chain data correlation templates of the same name made directly from personal clue library 1-6: noun C : verb A, i.e. teacher : education ≈ word frequency ratio n : v; and verb D : noun D, i.e. like : time ≈ word frequency ratio v : n2;
If the word frequency ratios of the same-name phrases are identical or approximately equal, the unit supply-personal group demand custom template is matched successfully; otherwise the match fails;
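The ratio comparison described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation of the invention: the text does not say how close two ratios must be to count as approximately equal, so the relative tolerance `rel_tol`, the function names `ratio_matches` and `match_template`, and the personal counts are hypothetical. Only the template figures (120+300) : 2000 and 10,000 : (100+200) are taken from the example.

```python
import math

def ratio_matches(template_ratio, personal_ratio, rel_tol=0.1):
    """Return True when two word frequency ratios are equal or approximately equal.

    Each ratio is a (left, right) pair of positive integers, e.g. (n, v).
    rel_tol is an assumed tolerance; the source does not define "approximately".
    """
    t = template_ratio[0] / template_ratio[1]
    p = personal_ratio[0] / personal_ratio[1]
    return math.isclose(t, p, rel_tol=rel_tol)

def match_template(custom, personal, rel_tol=0.1):
    """Succeed only if every same-name phrase pair is present with a close ratio."""
    return all(
        pair in personal and ratio_matches(ratio, personal[pair], rel_tol)
        for pair, ratio in custom.items()
    )

# Custom template held in the template library (figures from the example).
custom_template = {
    ("teacher", "education"): (120 + 300, 2000),  # noun : verb ≈ n : v
    ("like", "time"): (10_000, 100 + 200),        # verb : noun ≈ v : n2
}

# Hypothetical counts from a personal clue library, invented for illustration.
personal_template = {
    ("teacher", "education"): (21, 100),
    ("like", "time"): (1_000, 31),
}

print(match_template(custom_template, personal_template))
```

A stricter or looser tolerance changes which personal templates count as a match, so in practice the threshold would have to be chosen for the data at hand.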
The method for making custom templates from verb-noun (value) chain data correlation templates follows the steps shown in Fig. 7, which have already been described in detail.
The comparison of class templates (data clues + processing rules + characteristic value set) follows the steps shown in Figure 6, described in detail from page 44, line 18 to page 47, line 9; it can be referenced directly and is not repeated here.
Since the class template of unit supply-personal group demand is the set combining the unit's same-name class name with the personal group's same-name class templates (data clues + processing rules + characteristic value set), the unit supply-personal group demand class template in template library 1-5 is compared with the same-name class template (data clues + processing rules + characteristic value set) extracted and made from personal clue library 1-6.
The same-name class template (data clues + processing rules + characteristic value set) is extracted and made from personal clue library 1-6 following the steps shown in Fig. 5, which are described in detail from page 40, line 8 to page 44, line 17; they can be referenced directly and are not repeated here.
(4) Output display module 1-4 for the corresponding data service content: after a template comparison in template matching and comparison module 1-3 succeeds, the corresponding preset data service content, generally the unit's supply data, is output and displayed on smartphone 1;
(5) Template library 1-5, for storing the set of unit supply-personal group demand custom templates made from verb-noun (value) chain data correlation templates and the set of unit supply-personal group demand class templates;
Through smartphone 1, updated templates such as unit supply-personal group demand custom templates and individual-team learning association custom templates can be downloaded into template library 1-5;
(6) Personal clue library 1-6: the data obtained after the synchronized data from the input/output data synchronization module are filtered against all phrases (verbs, nouns, etc.) of the custom templates made from verb-noun (value) chain data correlation templates, and against the clue data in the class templates, form personal clue library 1-6.
A Chinese-language embodiment of an intelligent system of the invention, which performs data mining based on a personal mobile device using verb-noun (value) chain data correlation templates, is described below with reference to the schematic diagram in Figure 12:
The personal mobile device, smartphone 1, comprises: input method software data synchronization module 1-1; verb filtering and template generation module 1-2; template sending management and matching result feedback interactive communication module 1-3; personalized template library 1-4; and verb library 1-5;
The data mining public platform server 2 comprises: template receiving management and matching result feedback interactive communication module 2-1, template matching module 2-2, and template library 2-3;
Wherein, the personal mobile device, smartphone 1, comprises:
(1) Input method software data synchronization module 1-1, for synchronously copying and collecting the data entered through the input method on smartphone 1;
(2) Verb filtering and template generation module 1-2, which filters the data collected by input method software data synchronization module 1-1 against the verbs in verb library 1-5 in turn, and generates the personalized template by performing the following steps, described in detail with reference to Figure 13:
(e) S1301 step 1: filter the synchronized data with the verb library
The text data collected by input method software data synchronization module 1-1 are filtered in turn against the set of all common verbs in verb library 1-5, including "like";
(f) S1302 step 2: annotate the grammar of the sentence containing the matched verb "like" and identify the predicate verb
Part-of-speech tagging is performed on the sentence containing the successfully matched filter verb "like", marking the nouns in the sentence;
The grammar of that sentence is also analyzed, marking (as far as possible) its subject, predicate, and object;
It is then judged whether the matched verb "like" is the predicate verb, that is, whether the filter verb is marked as the predicate;
Alternatively, punctuation marks are added automatically to the text data collected from the input method according to time intervals, and reference resolution is performed for subject nouns or object nouns;
(g) S1303 step 3: if the matched verb "like" is the predicate verb, extract from the sentence the phrase marked as both predicate and verb, the phrase marked as both subject and noun, and the phrase marked as both object and noun, together with their one-to-one subject-predicate / predicate-object association in the sentence: subject noun : predicate verb / predicate verb : object noun; see Table 25 below.
Table 25
(h) S1304 step 4: the noun : verb / verb : noun phrase combinations extracted in step 3, together with the fixed (one-to-one) subject-predicate / predicate-object associations between them, are saved to personalized template library 1-4;
The data collected by input method software data synchronization module 1-1 are then filtered in turn against the verb : noun / noun : verb phrase combinations in personalized template library 1-4; the matching word frequency of each phrase is recorded and marked as an index of the weight of two phrases linked by a one-to-one chain association, verb word frequency v : noun word frequency n2 / noun word frequency n : verb word frequency v (word frequencies v, n, n2 are positive integers),
yielding the verb-noun (value) chain data correlation templates (set) verb : noun ≈ word frequency ratio v : n2 / noun : verb ≈ word frequency ratio n : v;
The example data according to Table 25 are:
schoolboy : like ≈ word frequency ratio 130 : 10,000 / like : time ≈ word frequency ratio 10,000 : 200
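Steps S1301 to S1304 can be illustrated with a small Python sketch. It is a simplified illustration under stated assumptions rather than the module itself: part-of-speech tagging and syntactic analysis are assumed to have been done already by some external parser, so each sentence arrives as a record with `subject`, `predicate` and `object` fields. The names `VERB_LIBRARY` and `generate_personal_template` and the sample sentences are hypothetical.

```python
from collections import Counter

# Stand-in for verb library 1-5 (hypothetical, much smaller than a real one).
VERB_LIBRARY = {"like", "learn", "study"}

# Hypothetical sentences, already annotated by an external parser.
parsed_sentences = [
    {"subject": "schoolboy", "predicate": "like", "object": "time"},
    {"subject": "schoolboy", "predicate": "like", "object": "time"},
    {"subject": "teacher", "predicate": "visit", "object": "school"},
    {"subject": None, "predicate": "like", "object": "music"},
]

def generate_personal_template(sentences, verb_library):
    """Filter by library verb, keep subject/object pairs, count word frequencies."""
    subj_freq, verb_freq, obj_freq = Counter(), Counter(), Counter()
    subj_pairs, obj_pairs = set(), set()
    for s in sentences:
        verb = s["predicate"]
        if verb not in verb_library:  # S1301/S1302: predicate must be a library verb
            continue
        verb_freq[verb] += 1
        if s["subject"]:              # S1303: subject noun : predicate verb
            subj_freq[s["subject"]] += 1
            subj_pairs.add((s["subject"], verb))
        if s["object"]:               # S1303: predicate verb : object noun
            obj_freq[s["object"]] += 1
            obj_pairs.add((verb, s["object"]))
    # S1304: attach word frequency ratios n : v and v : n2 to each saved pair.
    noun_verb = {(n, v): (subj_freq[n], verb_freq[v]) for n, v in subj_pairs}
    verb_noun = {(v, n2): (verb_freq[v], obj_freq[n2]) for v, n2 in obj_pairs}
    return noun_verb, verb_noun

noun_verb, verb_noun = generate_personal_template(parsed_sentences, VERB_LIBRARY)
print(noun_verb)
print(verb_noun)
```

The sketch counts frequencies only within the filtered sentences; the text instead describes a second filtering pass over all collected input, which would be a straightforward extension.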
(3) Template sending management and matching result feedback interactive communication module 1-3, through which the user of smartphone 1 manages and displays personalized template library 1-4 and sends templates to data mining public platform server 2, where, via template receiving management and matching result feedback interactive communication module 2-1, they are compared with the templates of the specified subject's data resource; the module also manages the interactive communication between the user and the corresponding subject;
(4) Personalized template library 1-4, for storing the set of personalized templates generated by verb filtering and template generation module 1-2;
(5) Verb library 1-5, for storing common verbs;
Common Chinese verbs include, but are not limited to, the following:
Verbs expressing actions and behavior: say, see, walk, listen, laugh, take, fly, run, eat, sing, drink, knock, sit, shout, stare, kick, smell, touch, criticize, publicize, safeguard, learn, study, carry out, start, stop, forbid
Verbs expressing existence, change, and disappearance: be at, die, have, equal, occur, evolve, develop, grow, perish, exist, eliminate
Verbs expressing psychological activity: think, love, hate, fear, miss, intend, like, hope, be afraid, worry, dislike, feel, ponder
Verbs expressing judgment: be (in its several Chinese forms)
Auxiliary verbs expressing possibility, willingness, and necessity: can, be able to, may, be allowed to, wish, be willing, consent, dare, should, ought to, be fit to, be worth, would rather
Directional verbs: up, down, in, out, return, open, cross, come up, come down, come in, come out, come back, come, go, go up, go down, go in, go out, go back, go over
Verbs expressing development: e.g. grow, wither, germinate, bear fruit, spawn;
For plans, systems, schemes, documents, etc.:
Work out, work out, draft, draft, authorize, audit, examine, transmit, deliver, submit, report, assign, put on record, archive, present opinions
For information, data:
Investigate, study, collect, arrange, analyze, conclude, analyze, summarize, provide, report, feed back, pass on, notify, send out Cloth, maintenance management
On a certain work (higher level):
Preside over, organize, instruct, arrange, coordinate, direct, supervise, manage, distribute, control, take lead responsibility, examine and approve, authorize, sign and issue, ratify, assess
Thinking behavior:
Research, analysis, assess, development, suggest, proposal, participate in, recommend, plan
Direct action:
Organize, carry out, performing, instructing, leading, controlling, supervising, use, production, participate in, illustrate, explaining, providing, assisting
Higher level's behavior:
License, ratify, define, determining, instructing, establishing, planning, supervising, determining
Administration behaviour:
Reach, assess, control, coordinate, ensure, identify, keep, supervise
Expert's behavior:
Analyze, assist, promote, get in touch with, suggest, recommend, support, assess, evaluate
Subordinate's behavior:
Check, check, collect, obtain, submit, make
Other:
Maintain, keep, establish, develop, prepare, process, perform, receive, arrange, monitor, report, manage, confirm, conceptualize, cooperate, cooperate, obtain, check, check, contact, design, test, build, change, write, draft, guide, transmit, translate, operate, ensure, prevent, solve, introduce, pay, calculate, revise, undertake, negotiate, confer, interview, refuse, veto, monitor, predict, compare, delete, use
Wherein, data mining common platform server 2 includes:
(1) Template receiving management and matching result feedback interactive communication module 2-1, for receiving the personalized templates sent by template sending management and matching result feedback interactive communication module 1-3 on smartphone 1 and forwarding them to the template matching module for comparison with the specified templates in the template library;
It feeds the matching result data back to smartphone 1 and carries out interactive communication between smartphone 1 and the device of the subject whose template was compared;
(2) Template matching module 2-2, for comparing the received personalized template with the template of the specified subject's data resource from the template library, and for feeding the matching result back to smartphone 1 through template receiving management and matching result feedback interactive communication module 2-1;
(3) Template library 2-3, for storing the templates of the various subjects' data resources;
The templates include, but are not limited to, template sets such as the verb-noun (value) chain data correlation templates and class templates of units (departments), of office complex samples, of individuals, and of personal groups, as well as holographic verb-noun (value) chain data correlation templates and class templates.
Although several embodiments of the invention have been described with reference to the accompanying drawings, those of ordinary skill in the art can make various variations or modifications within the scope of the appended claims. For example, other personal mobile devices may be substituted, including smartphones, navigation devices, Internet-of-Vehicles devices, Internet-of-Things mobile devices, and the like.
Personal mobile devices, smartphones, navigation devices, Internet-of-Vehicles devices, Internet-of-Things mobile devices, the data mining public platform server, unit servers and PCs, cloud servers, corpus and ontology knowledge base services, and so on all have a conventional computer system, microcontroller system, or embedded system structure comprising a system bus, CPU, memory, and input/output interfaces, as shown in Figure 14.
The embodiments described in this specification are merely illustrative; the various examples do not limit the substance of the invention, and a person of ordinary skill in the art may, after reading the specification, modify or vary the specific embodiments described above without departing from the spirit and scope of the invention.
The basic principles of the invention have been described above in connection with specific embodiments. It should be noted, however, that those skilled in the art will understand that all or any of the steps or components of the method and apparatus of the invention can be implemented in hardware, firmware, software, or a combination thereof, in any computing device (including a processor, storage medium, etc.) or network of computing devices; having read the description of the invention, those skilled in the art can achieve this using their basic circuit design knowledge or basic programming skills. The invention also provides a program product storing machine-readable instruction codes. When read and executed by a machine, the instruction codes can perform the method according to the embodiments of the invention described above. Accordingly, the storage medium carrying the program product that stores the machine-readable instruction codes is also included in the disclosure of the invention. The storage medium includes, but is not limited to, floppy disks, optical disks, magneto-optical disks, memory cards, memory sticks, and the like. Where the invention is implemented by software or firmware, a program constituting the software is installed from a storage medium or network onto a computer with a dedicated hardware structure (for example, the general-purpose computer shown in Figure 14), which is capable of performing various functions when the various programs are installed.

Claims (10)

1. A method, based on natural language processing technology, for extracting and making a data correlation characteristic value pattern or template, characterized in that making the verb-noun (value) chain data correlation pattern or template comprises the following steps:
(a) Step 1: the text data of the (subject's) data resource are preprocessed to determine the language and tagged for part of speech, marking the nouns and verbs of each sentence;
syntactic analysis is also performed on the text data, marking the subject, predicate, and object of each sentence;
wherein the subject of a passive-voice sentence may be marked as the object;
wherein reference resolution is performed for subject nouns or object nouns;
(b) Step 2: from the set of sentences, extract the phrases marked as both subject and noun, the phrases marked as both predicate and verb, and the phrases marked as both object and noun, obtaining respectively the set of nouns serving as subjects, the set of verbs serving as predicates, and the set of nouns serving as objects, together with the subject-predicate / predicate-object association feature relations corresponding to them in the sentences, subject noun set : predicate verb set / predicate verb set : object noun set, that is, the noun : verb / verb : noun phrase combinations and the (one-to-one) subject-predicate / predicate-object association feature relations between them;
(c) Step 3: count the accumulated word frequencies of the subject nouns, predicate verbs, and object nouns respectively, and mark them as a measure of the weight characteristic value of the phrases in the (one-to-one) association subject noun set : predicate verb set / predicate verb set : object noun set, i.e.
subject noun word frequency n : predicate verb word frequency v /
predicate verb word frequency v : object noun word frequency n2 (word frequencies v, n, n2 are positive integers),
obtaining, for the (subject's) data resource, the association-weighted sets:
the set of noun : verb ≈ word frequency ratio n : v, and
the set of verb : noun ≈ word frequency ratio v : n2;
the high-frequency phrases in these sets are selected to form the verb-noun (value) chain data correlation pattern or template (phrase set).
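As an illustration of steps 1 to 3, the following Python sketch builds the two ratio sets from sentences that have already been annotated, applies the optional passive-voice rule, and keeps only the highest-frequency phrases. Several details are assumptions made for the sketch and are not fixed by the claim: the record fields (`subject`, `predicate`, `object`, `passive`, `agent`), the function name `build_ratio_sets`, and the ranking of phrases by the sum of their two word frequencies with a cut-off `top_k`.

```python
from collections import Counter

def build_ratio_sets(annotated, top_k=1):
    """Return (noun:verb -> (n, v), verb:noun -> (v, n2)) for the top_k phrases."""
    n_freq, v_freq, n2_freq = Counter(), Counter(), Counter()
    nv_pairs, vn_pairs = set(), set()
    for s in annotated:
        subj, verb, obj = s.get("subject"), s["predicate"], s.get("object")
        if s.get("passive"):
            # Optional rule of step 1: the passive-voice subject is marked as the
            # object; the agent, if the parser supplies one, takes the subject role.
            subj, obj = s.get("agent"), subj
        v_freq[verb] += 1
        if subj:
            n_freq[subj] += 1
            nv_pairs.add((subj, verb))
        if obj:
            n2_freq[obj] += 1
            vn_pairs.add((verb, obj))
    nv = {(n, v): (n_freq[n], v_freq[v]) for n, v in nv_pairs}
    vn = {(v, n2): (v_freq[v], n2_freq[n2]) for v, n2 in vn_pairs}

    def top(ratios):
        # Assumed ranking: highest combined word frequency first, ties by name.
        ranked = sorted(ratios, key=lambda pair: (-sum(ratios[pair]), pair))
        return {pair: ratios[pair] for pair in ranked[:top_k]}

    return top(nv), top(vn)

# Hypothetical annotated sentences, invented for illustration.
annotated = [
    {"subject": "teacher", "predicate": "educate", "object": "student"},
    {"subject": "teacher", "predicate": "educate", "object": "student"},
    {"subject": "student", "predicate": "educate", "passive": True, "agent": "teacher"},
    {"subject": "student", "predicate": "like", "object": "time"},
]

noun_verb, verb_noun = build_ratio_sets(annotated, top_k=1)
print(noun_verb, verb_noun)
```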
2. The method, based on natural language processing technology, for making a verb-noun (value) chain data correlation pattern or template according to claim 1, characterized in that:
Step 4: merge the (one-to-one) noun : verb association phrases with the (one-to-one) verb : noun association phrases at the same-name nouns that repeat at the end of one phrase and the start of the next, combining their word frequencies:
...
... : verb : identical noun
identical noun : verb : ...
...
The two same-name nouns are joined to form a multi-link association phrase chain, giving ... : verb : (merged identical) noun : verb : (merged identical) noun : verb : ..., so that the one-to-one phrase chains (set) are locally connected in series into a multi-link, multi-dimensional alternating verb/noun phrase chain ... : verb : noun : verb : noun : ... with verb/noun phrases alternating as the linking nodes; this may even form a multi-link closed-loop association phrase chain of alternating verbs and nouns whose head and tail are linked;
The word frequencies of the subject noun and the object noun are merged as n+n2, giving the association phrase weights of the alternating, cyclic ... : verb : noun : ... links, i.e. the verb-noun (value) chain data correlation pattern or template ... : verb : noun : ... ≈ word frequency ratio ... : v : (n+n2) : ... (phrase set; word frequencies v, n, n2 are positive integers),
That is, the verb-noun (value) chain data correlation pattern or template (phrase set) can take two forms: a one-to-one phrase chain (noun : verb ≈ word frequency ratio n : v, or verb : noun ≈ word frequency ratio v : n2), or a multi-link phrase chain (... : verb : noun : verb ... ≈ word frequency ratio ... : v : (n+n2) : v : ...).
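The merge in step 4 amounts to joining a verb : noun phrase and a noun : verb phrase on their shared noun and adding the two noun word frequencies. A minimal Python sketch of a single such join is given here; the function name `merge_links` and all counts are hypothetical, and longer or closed-loop chains would require repeating the join, which the sketch does not attempt.

```python
def merge_links(verb_noun, noun_verb):
    """Join verb:noun and noun:verb phrases that share the same noun.

    verb_noun maps (verb, noun) -> (v, n2); noun_verb maps (noun, verb) -> (n, v).
    The result maps (verb, noun, verb) -> (v, n + n2, v).
    """
    chains = {}
    for (v1, noun_obj), (v1_freq, n2) in verb_noun.items():
        for (noun_subj, v2), (n, v2_freq) in noun_verb.items():
            if noun_obj == noun_subj:  # same-name noun repeated front and back
                chains[(v1, noun_obj, v2)] = (v1_freq, n + n2, v2_freq)
    return chains

# Hypothetical one-to-one phrase chains with their word frequency ratios.
verb_noun = {("educate", "student"): (2000, 300)}
noun_verb = {
    ("student", "like"): (130, 10_000),
    ("teacher", "educate"): (420, 2000),  # no verb:noun phrase ends in "teacher"
}

chains = merge_links(verb_noun, noun_verb)
print(chains)
```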
3. The method, based on natural language processing technology, for making a verb-noun (value) chain data correlation pattern or template according to claim 1 or 2, characterized in that:
with the aid of natural language processing tools such as corpora, digital dictionaries, and ontology knowledge bases for comprehensive analysis, the adverbs modifying the predicate verb of each sentence may be marked, and when the adverbs are extracted their accumulated word frequencies are also counted, obtaining:
the association-weighted verb-noun (value) chain data correlation pattern or template adverb : verb : noun ≈ word frequency ratio a : v : n2, or noun : adverb : verb ≈ word frequency ratio n : a : v (word frequencies v, n, n2 are positive integers; a is a natural number and may be 0);
or obtaining the multi-link phrase chain:
the verb-noun (value) chain data correlation pattern or template with the association weights of the alternating, cyclic phrases ... : adverb : verb : noun : adverb : ..., i.e. ... : adverb : verb : noun : ... ≈ word frequency ratio ... : a : v : (n+n2) : a : ... (word frequencies v, n, n2 are positive integers; a is a natural number and may be 0); wherein the adverb may be empty.
4. A method for comparing verb-noun (value) chain data correlation patterns or templates (phrase sets), characterized by performing the following steps:
(a) Step 1: the verb-noun (value) chain data correlation patterns or templates (phrase chain sets) made from two different (subject) data resources are compared with each other phrase by phrase,
(b) Step 2: if the comparison finds:
an identical verb,
an identical noun,
an identical verb and identical noun,
an identical noun and identical verb,
or an identical alternating verb/noun multi-link phrase chain,
wherein, if adverbs are present, the comparison of identical adverbs may also be added, i.e.
an identical noun, identical adverb, and identical verb,
an identical adverb, identical verb, and identical noun,
proceed to the next step;
(c) Step 3: compare the word frequency ratios of the identical phrases;
(d) Step 4: output the result:
First, results where the word frequency ratios are equal:
identical verb and identical noun: the word frequency ratios v : n2 are equal, the template match succeeds;
identical noun and identical verb: the word frequency ratios n : v are equal, the template match succeeds;
identical alternating verb/noun multi-link phrases: word frequency ratio ... : v : (n+n2) : v : ...,
(1) the components (n, n2) are each equal;
(2) the totals (n+n2) are equal;
the template match succeeds;
Second, results where the word frequency ratios are not equal:
identical verb:
display the ranking of the associated high-frequency nouns;
identical noun:
display the ranking of the associated high-frequency verbs;
identical verb and identical noun:
the proportion of noun word frequency to verb word frequency in each of the two templates is inversely proportional to its word frequency ratio; display the word frequency ratio difference value;
identical noun and identical verb:
the proportion of noun word frequency to verb word frequency in each of the two templates is directly proportional to its word frequency ratio; display the word frequency ratio difference value;
identical alternating verb/noun multi-link phrases:
display the difference values of the noun and verb word frequency ratios;
Wherein, from the verb-noun (value) chain data correlation templates made from a unit's (department's) data resource, the high-frequency noun phrases (set) may be selected and compared with the verb-noun (value) chain data correlation templates (set) from a personal group sample data resource; the verb-noun (value) chain data correlation templates containing the successfully matched identical nouns can serve as the verb-noun (value) chain data correlation templates of the matching relationship between unit supply and personal group demand, with the (high-frequency) nouns of that unit (department) as the theme;
Wherein, from the verb-noun (value) chain data correlation templates made from a unit's (department's) data resource, the high-frequency noun phrases are selected and compared with the verb-noun (value) chain data correlation templates made from the office complex sample data resource of numerous units or from all data resources, which yields the relative position and status, expressed as difference values of verb word frequency in the (high-frequency) noun-verb action (value) chain links, of that unit (department) within the office complex sample data resource as a whole or within all data resources as a whole.
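The comparison and its two kinds of output can be sketched for the simplest case of verb : noun phrases. This Python sketch is an illustration under assumptions, not the claimed method in full: it handles only one-to-one verb : noun phrases, uses exact equality of ratios via `fractions.Fraction`, reports the difference of the two ratios as the "difference value", and ranks the nouns associated with a shared verb by their largest word frequency. The function name `compare_templates` and all counts are hypothetical.

```python
from fractions import Fraction

def compare_templates(a, b):
    """Compare two templates that map (verb, noun) -> (v, n2).

    Returns (pair_results, noun_rankings): for each shared phrase either
    "match" or the ratio difference, and for each shared verb its associated
    nouns ordered from highest to lowest word frequency.
    """
    pair_results = {}
    for pair in sorted(a.keys() & b.keys()):
        ratio_a, ratio_b = Fraction(*a[pair]), Fraction(*b[pair])
        pair_results[pair] = "match" if ratio_a == ratio_b else ratio_a - ratio_b

    noun_rankings = {}
    for verb in sorted({v for v, _ in a} & {v for v, _ in b}):
        best = {}
        for template in (a, b):
            for (v, noun), (_, n2) in template.items():
                if v == verb:
                    best[noun] = max(best.get(noun, 0), n2)
        noun_rankings[verb] = sorted(best, key=lambda noun: (-best[noun], noun))
    return pair_results, noun_rankings

# Hypothetical templates from a unit and from a personal group sample.
unit_template = {
    ("like", "time"): (10_000, 300),
    ("like", "music"): (10_000, 50),
    ("learn", "math"): (200, 50),
}
group_template = {
    ("like", "time"): (1_000, 30),
    ("like", "sport"): (1_000, 80),
    ("learn", "math"): (300, 100),
}

pair_results, noun_rankings = compare_templates(unit_template, group_template)
print(pair_results)
print(noun_rankings)
```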
5. A method for making class templates from verb-noun (value) chain data correlation templates, and a method for comparing class templates,
characterized in that,
First, the extraction and making of a class template (data clues + processing rules + characteristic value set) comprises the following steps:
(a) Step 1: using natural language processing tools such as corpora and ontology knowledge bases, classify/cluster all noun phrases in the verb-noun (value) chain data correlation templates (set) of a (subject's) data resource; following the principle of retaining at least one adjacent phrase (i.e. the adjacent verb) on each side of a same-class noun's position in its template, select the verb-noun (value) chain data correlation template fragments containing the same-class nouns, divide them into verb-noun (value) chain segments (set) of the same class, and form the classification/cluster template, i.e. each class template may contain several shorter verb-noun (value) chain segments (set); then name the class:
class name a { ... : verb 1 : same-class noun 1 : verb 1 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... : verb 2 : same-class noun 2 : verb 2 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... verb-noun (value) chain segments (word frequencies v, n, n2 are positive integers) }
Wherein the same method may also be applied to verb phrases that are identical, similar, synonymous, or near-synonymous, to divide class templates of same-class verbs;
(b) Step 2: for the verb-noun (value) chain segment set of the same class, the original phrases (all verbs and nouns of the segments) serve as one part of the clues, combined into the clue set (first),
the ordering rule of the verb-noun (value) chain segments, obtained under the principle of retaining at least one (identical) verb phrase on each side of a same-class noun, serves as one part of the rules, combined into the rule set (second),
the word frequencies of the verb-noun (value) chain segments of the same-class nouns serve as one part of the statistical characteristic values, combined into the characteristic value set (third),
expressed in simplified form and in expanded form, respectively:
class template (data clues + processing rules + characteristic value set) = { class template data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...) + processing rules: phrase chain segment ordering rule (... : verb 1 : same-class noun 1 : verb 1 : ...; ... : verb 2 : same-class noun 2 : verb 2 : ...) + characteristic value set (... : verb 1 : same-class noun 1 : verb 1 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... : verb 2 : same-class noun 2 : verb 2 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... phrase chain segment word frequency ratios (word frequencies v, n, n2 are positive integers)) }
Wherein other classification rules may also be set, which together with the verb-noun (value) chain data correlation template classification rules form the class template rules;
Wherein unit (department) class templates may be made from the verb-noun (value) chain data correlation templates of a unit's (department's) data resource;
Wherein personal group class templates may be extracted and made from the verb-noun (value) chain data correlation templates of a personal group's data resource;
Wherein office complex class templates may be extracted and made from the verb-noun (value) chain data correlation templates of an office complex sample data resource;
Second, the comparison of class templates (data clues + processing rules + characteristic value set) comprises the following steps:
The original class template { data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...) + processing rules: phrase chain segments (... : verb 1 : same-class noun 1 : verb 1 : ...; ... : verb 2 : same-class noun 2 : verb 2 : ...) + characteristic value set (... : verb 1 : same-class noun 1 : verb 1 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... : verb 2 : same-class noun 2 : verb 2 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... phrase chain segment word frequency ratios (word frequencies v, n, n2 are positive integers)) }
is compared with the target class template (data clues + processing rules + characteristic value set);
(a) Step 1: following the same rules as the original class template, extract from the data clues of the target class template (data clues + processing rules + characteristic value set) the same-name phrases found in the original class template's data clues;
if no identical same-name phrases are extracted, return to the start;
if all of the above same-name phrases are extracted, the data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...) are matched successfully; proceed to the next step;
(b) Step 2: classify according to the same classification/clustering rules as the original class template, obtaining the ordering rule of the same-class phrase chain segments:
classify/cluster all noun phrases in the verb-noun (value) chain data correlation templates (set) of the target (subject) data resource to be compared; following the principle of retaining the same number of adjacent phrases on each side of a same-class noun's position in its template, select the same portion of verb-noun (value) chain segments, divide them into a verb-noun (value) chain segment set of the same class, form the classification/cluster template, and name the class, obtaining:
the phrase ordering processing rule (phrase chain segments) of target class name A { ... : verb 11 : same-class noun 11 : verb 11 : ...; ... : verb 22 : same-class noun 22 : verb 22 : ...; ... phrase chain segments }
which is compared with
the phrase segment ordering processing rule of original class name a (... : verb 1 : same-class noun 1 : verb 1 : ...; ... : verb 2 : same-class noun 2 : verb 2 : ...; ... data correlation template phrase chain segments)
with respect to the ordering of the phrase chain segments;
if the phrase chain segment orderings of the processing rules match, i.e. ... : verb 11 : same-class noun 11 : verb 11 : ... is consistent with ... : verb 1 : same-class noun 1 : verb 1 : ...; ... : verb 22 : same-class noun 22 : verb 22 : ... is consistent with ... : verb 2 : same-class noun 2 : verb 2 : ...; and so on, with all phrase chain segment orderings of the processing rules consistent, proceed to the next step;
if the phrase chain segment ordering processing rules do not match, return to the start;
(c) Step 3: Following the same processing rule as the original class template, compare, for the same classes in the two class templates (those with the same name and the same phrase chain segment ordering), the word frequency ratios:
Compare the characteristic value set of the object class template {...: verb 11: same-class noun 11: verb 11: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ...: verb 22: same-class noun 22: verb 22: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ... word frequency ratios (word frequencies v, n, n2 are positive integers)}
with
the characteristic value set of the original class template {...: verb 1: same-class noun 1: verb 1: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ...: verb 2: same-class noun 2: verb 2: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ... word frequency ratios (word frequencies v, n, n2 are positive integers)};
If:
...: verb 11: same-class noun 11: verb 11: ... ≈ word frequency ratio ...: v: (n+n2): v: ... ≈ ...: verb 1: same-class noun 1: verb 1: ...;
...: verb 22: same-class noun 22: verb 22: ... ≈ word frequency ratio ...: v: (n+n2): v: ... ≈ ...: verb 2: same-class noun 2: verb 2: ...;
... and so on, the word frequency ratios of the characteristic values are all identical or approximately equal, then the characteristic value sets are matched successfully;
When, in the class templates (data clues + processing rule + characteristic value set), the data clues are matched successfully, the phrase chain segment orderings of the processing rules are consistent, and the word frequency ratios of the characteristic value sets are equal, the result is that the two class templates are matched successfully;
If the word frequency ratios of the characteristic values are not equal, the class template matching fails;
Wherein, if other classification rules are also used, matching success or failure is additionally determined according to those rules; the whole template is matched successfully only when every part is matched successfully, and the failure of any partial match causes the whole template matching to fail.
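The three comparison steps above (data clues, then processing rule, then characteristic values) can be illustrated with a minimal Python sketch. It is not the claimed implementation: the dictionary layout, the function names, and the tolerance `tol` used for "approximately equal" are assumptions made for illustration, and the segments are assumed to have already been mapped to their same-class labels.

```python
def normalize(counts):
    """Turn raw word frequencies into proportions so that 2:6:2 equals 1:3:1."""
    total = sum(counts)
    return [c / total for c in counts]


def ratios_close(a, b, tol=0.05):
    """True when two word frequency ratios are equal or approximately equal."""
    if len(a) != len(b):
        return False
    return all(abs(x - y) <= tol for x, y in zip(normalize(a), normalize(b)))


def match_class_templates(original, candidate, tol=0.05):
    """Return (matched, stage); stage is 'ok' or the name of the failed step."""
    # Step 1: all clue phrases of the original template must be found.
    if not set(original["clues"]) <= set(candidate["clues"]):
        return False, "data clues"
    # Step 2: the phrase chain segments must agree in content and order.
    if original["rule"] != candidate["rule"]:
        return False, "processing rule"
    # Step 3: every segment's word frequency ratio must be (about) equal.
    for segment in original["rule"]:
        if not ratios_close(original["values"][segment],
                            candidate["values"][segment], tol):
            return False, "characteristic values"
    return True, "ok"


# Hypothetical example data, not taken from the source.
original = {
    "clues": ["borrow", "book", "read"],
    "rule": [("borrow", "book", "read")],
    "values": {("borrow", "book", "read"): (2, 6, 2)},
}
candidate = {
    "clues": ["borrow", "book", "read", "return"],
    "rule": [("borrow", "book", "read")],
    "values": {("borrow", "book", "read"): (1, 3, 1)},
}
print(match_class_templates(original, candidate))
```

Returning the name of the failed stage mirrors the rule that the failure of any partial match fails the whole template.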
6. A method of making custom templates using verb-noun (value) chain data correlation templates, characterized by performing the following steps:
(a) Step 1: From the verb-noun (value) chain data correlation template set of the theme (first) data resource, select a list of high-word-frequency nouns (for example, noun A, noun B, ...);
(b) Step 2: Match and compare the nouns of the high-frequency noun list against the nouns in the verb-noun (value) chain data correlation template set of the target (second) data resource;
(For example, noun A is compared with noun C, noun A, and noun D in ...: verb C: noun C: verb A: noun A: verb D: noun D: ... ≈ word frequency ratio ...: v: (n+n2): v: (n+n2): v: (n+n2): ...;)
(c) Step 3: At the position of the successfully matched same-name noun in the verb-noun (value) chain data correlation template of the target (second) data resource (for example, the position of the same-name noun A in the template ...: noun C: verb A: noun A (same-name position): verb D: noun D: ... ≈ word frequency ratio ...: (n+n2): v: (n+n2): v: (n+n2): ...), select at least one verb and one noun along the alternating verb/noun phrase chain to the left, to the right, or on both sides;
The alternating verb-noun phrase association chain selected to the left, to the right, or on both sides of that position (not including the same-name noun itself), for example noun C: verb A ≈ word frequency ratio (n+n2): v and verb D: noun D ≈ word frequency ratio v: (n+n2), becomes the data correlation custom template (set) of the theme (first) data resource and the target (second) data resource;
Wherein the verb-noun (value) chain data correlation template of a unit (department) data resource, serving as the theme (first) data resource, may be used with the verb-noun (value) chain data correlation template of a personal group data resource, serving as the target (second) data resource, to make a unit supply-personal group need custom template;
Wherein the verb-noun (value) chain data correlation template of a unit (department) data resource, serving as the theme (first) data resource, may be used with the verb-noun (value) chain data correlation template of the office complex sample data resource, serving as the target (second) data resource, to make a unit supply-office complex value chain/supply chain custom template;
Wherein the verb-noun (value) chain data correlation template of a personal data resource, serving as the theme (first) data resource, may be used with the verb-noun (value) chain data correlation template of a personal group data resource, serving as the target (second) data resource, to make an individual-group learning and communication custom template.
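The three steps of this claim can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the `(word, frequency)` list used to represent a chain, and the `span` parameter (two positions per side, i.e. one verb and one noun on a strictly alternating chain) are hypothetical and do not come from the source.

```python
from collections import Counter


def top_nouns(noun_freq, k):
    """Step 1: return the k nouns with the highest word frequency."""
    return [noun for noun, _ in Counter(noun_freq).most_common(k)]


def custom_template(theme_nouns, target_chain, span=2):
    """Steps 2 and 3: collect the neighbours of every matched noun.

    target_chain is a list of (word, frequency) pairs in chain order.
    The matched noun itself is excluded from the result.
    """
    pieces = []
    for i, (word, _) in enumerate(target_chain):
        if word in theme_nouns:
            left = target_chain[max(0, i - span):i]
            right = target_chain[i + 1:i + 1 + span]
            pieces.extend(p for p in (left, right) if p)
    return pieces


# Hypothetical example data, not taken from the source.
theme = top_nouns({"course": 9, "exam": 5, "desk": 1}, k=2)
chain = [("teacher", 4), ("offers", 3), ("course", 7),
         ("attracts", 2), ("student", 6)]
print(custom_template(theme, chain))
```

Each returned piece keeps its word frequencies, so the word frequency ratio of the custom template can be read directly from it.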
7. An intelligent system for data mining based on personal mobile devices, comprising a corpus, an ontology knowledge base, and the like, characterized in that the personal mobile device comprises an output/input synchronization module and a template feature extraction and making module, wherein:
(1) the output/input synchronization module of the personal mobile device is used to replicate and collect, synchronously or asynchronously, the personal data resources output or input through the input method, camera, shared memory, caches, temporary file caches, temporary files saved locally by application (APP) records, open network interfaces, navigation APIs, and the like on the personal mobile device, and to provide them to the template feature extraction and making module; wherein the synchronized data may be pre-processed for data desensitization and bleaching or for extracting characteristic values from images;
(2) the template feature extraction and making module performs data mining and characteristic value extraction on the personal data resources synchronized on the personal mobile device, and makes personal data models, patterns, or templates;
Wherein the data models, patterns, or templates of the personal data resource may be made on the personal mobile device;
Wherein verb-noun (value) chain data correlation templates and class templates may be made;
The following steps are performed to make a verb-noun (value) chain data correlation template:
(a) Step 1: Pre-process the text data by identifying its language, perform part-of-speech tagging, and mark the nouns and verbs of each sentence;
Perform syntactic analysis and mark the subject, predicate, and object of each sentence;
Wherein the subject of a passive-voice sentence may be marked as the object;
Wherein coreference resolution is performed on subject nouns or object nouns according to the type of subject of the data resource;
(b) Step 2: Extract the phrases marked as both subject and noun, the phrases marked as both predicate and verb, and the phrases marked as both object and noun, together with their corresponding subject-predicate and predicate-object association relations within each sentence:
This yields the phrase set of nouns serving as subjects, the phrase set of verbs serving as predicates, and the phrase set of nouns serving as objects, that is, subject noun set: predicate verb set and predicate verb set: object noun set, i.e. the phrase combinations
noun: verb and
verb: noun, together with the (one-to-one) subject-predicate and predicate-object association relations between them;
(c) Step 3: Count (during extraction) and mark the word frequency of each phrase, as the index measuring the weight of the phrases in the one-to-one chain association relation:
noun n: verb v and
verb v: noun n2 (word frequencies v, n, n2 are positive integers);
(d) Step 4: Merge the repeated noun phrases, and their word frequencies, at the end of the one-to-one verb: noun association phrases and at the start of the one-to-one noun: verb association phrases, and connect the two phrases into a multi-link association chain:
...
...: verb: identical noun
identical noun: verb: ...
...
This yields the multi-link phrase chain ...: verb: (merged identical) noun: verb: ..., in which verbs and nouns are the alternating link nodes, i.e. the multi-link, multi-dimensional alternating verb/noun phrase chain ...: verb: noun: verb: noun: ...; the verb: noun and noun: verb association phrase links are thereby connected in series, cycle after cycle, and may even form a closed-loop multi-link alternating verb/noun association phrase chain linked end to end;
The weight index of the alternately cycling ...: verb: noun: ... link association phrases is thereby obtained, i.e. the verb-noun (value) chain data correlation template ...: verb: noun: ... ≈ word frequency ratio ...: v: (n+n2): ... (word frequencies v, n, n2 are positive integers);
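Steps 2 to 4 can be sketched in Python as below. The sketch assumes that step 1 (tagging, parsing, coreference resolution) has already produced (subject, verb, object) triples; that upstream parser is not shown. The function name and the choice to count frequencies per word and per role are assumptions, since the source does not fix these details.

```python
from collections import Counter


def build_chain_segments(triples):
    """Build verb: noun: verb segments with word frequencies v: (n+n2): v.

    triples is an iterable of (subject_noun, verb, object_noun); a missing
    subject or object is given as None.
    """
    verb_freq, subj_freq, obj_freq = Counter(), Counter(), Counter()
    noun_verb, verb_noun = set(), set()
    for subj, verb, obj in triples:
        verb_freq[verb] += 1                 # word frequency v
        if subj:
            subj_freq[subj] += 1             # word frequency n
            noun_verb.add((subj, verb))      # noun: verb link
        if obj:
            obj_freq[obj] += 1               # word frequency n2
            verb_noun.add((verb, obj))       # verb: noun link
    segments = {}
    # Merge links that share the same noun: verb: noun + noun: verb.
    for v1, noun in verb_noun:
        for subj, v2 in noun_verb:
            if subj == noun:
                segments[(v1, noun, v2)] = (
                    verb_freq[v1],
                    subj_freq[noun] + obj_freq[noun],  # merged n + n2
                    verb_freq[v2],
                )
    return segments


# Hypothetical parsed sentences, not taken from the source.
triples = [
    ("student", "reads", "book"),
    ("book", "teaches", "lesson"),
    ("student", "reads", "book"),
    ("teacher", "writes", "book"),
]
print(build_chain_segments(triples))
```

Longer chains follow by joining segments that end and begin with the same verb; the sketch stops at three-word segments to keep the merge rule visible.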
Wherein the following steps are further performed to make and obtain a class template (data clues + processing rule + characteristic value set):
(a) Step 1: Using natural language processing tools such as a corpus and an ontology knowledge base, classify/cluster all noun phrases in the verb-noun (value) chain data correlation template (set) of the personal (group) data resource; following the principle of retaining at least one adjacent phrase (i.e. the adjacent verb) on each of the left and right of the template position of each same-class noun, select that part of the verb-noun (value) chain data correlation template (including the same-class noun) as a template fragment, divide out the verb-noun (value) chain segments (set) of the same category, form a classification/clustering template, and name the category:
Category name a {...: verb 1: same-class noun 1: verb 1: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ...: verb 2: same-class noun 2: verb 2: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ... verb-noun (value) chain segments (word frequencies v, n, n2 are positive integers)}
Wherein the same method may also be applied to verb phrases that are identical or similar (synonyms and near-synonyms), dividing out class templates of same-class verbs;
(b) Step 2: For the set of verb-noun (value) chain segments of the same category, use the original phrases (all verbs and nouns of the segments) as clues, forming the clue (first) set;
Use the principle of retaining (the same number of, and at least one) adjacent verb phrases on each of the left and right of the template position of the same-class noun, together with the verb-noun (value) chain segment ordering rule, as rules, forming the rule (second) set;
Use the word frequencies of the verb-noun (value) chain segments of the same-class nouns as statistical characteristic values, forming the characteristic value (third) set;
The simplified expression and the expanded expression are, respectively:
Class template (data clues + processing rule + characteristic value set) = {class template data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...) + processing rule: phrase chain segment ordering rule (...: verb 1: same-class noun 1: verb 1: ...; ...: verb 2: same-class noun 2: verb 2: ...) + characteristic value set (...: verb 1: same-class noun 1: verb 1: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ...: verb 2: same-class noun 2: verb 2: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ... phrase chain segment word frequency ratios (word frequencies v, n, n2 are positive integers))}
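A class template of this form can be assembled roughly as in the following Python sketch. The `noun_class` dictionary stands in for the corpus and ontology knowledge base lookup, and the field names and the alphabetical ordering used for the processing rule are assumptions chosen for illustration only.

```python
def make_class_template(segments, noun_class, category):
    """Group verb: noun: verb segments whose noun belongs to one category.

    segments maps (verb, noun, verb) to word frequencies (v, n+n2, v);
    noun_class maps a noun to its category name.
    """
    chosen = sorted(seg for seg in segments
                    if noun_class.get(seg[1]) == category)
    return {
        "name": category,
        # Data clues: every verb and noun that occurs in the segments.
        "clues": sorted({word for seg in chosen for word in seg}),
        # Processing rule: the segments in a fixed order.
        "rule": chosen,
        # Characteristic value set: word frequencies per segment.
        "values": {seg: segments[seg] for seg in chosen},
    }


# Hypothetical example data, not taken from the source.
segments = {
    ("reads", "book", "teaches"): (2, 4, 1),
    ("drives", "car", "needs"): (1, 2, 1),
    ("buys", "magazine", "shows"): (3, 5, 2),
}
noun_class = {"book": "reading", "magazine": "reading", "car": "transport"}
template = make_class_template(segments, noun_class, "reading")
print(template["clues"])
```

Keeping the three parts as separate fields lets a later comparison stop at the first part that fails to match.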
Wherein other homogeneous characteristic data, such as geographic position and time, may also be extracted and classified together with the verb-noun (value) chain data correlation template, forming a more complex class template of the following form:
{Class template data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...; A<north latitude N1", east longitude E1">, B<north latitude N2", east longitude E2">, C<north latitude N3", east longitude E3">, D<north latitude N4", east longitude E4"> ...) + processing rule: phrase chain segments (...: verb 1: same-class noun 1: verb 1: ...; ...: verb 2: same-class noun 2: verb 2: ...; place ordering rule A: B: C: D ...) + characteristic value set (...: verb 1: same-class noun 1: verb 1: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ...: verb 2: same-class noun 2: verb 2: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ... phrase chain segment word frequency ratios (word frequencies v, n, n2 are positive integers); place-time characteristic values A: time1, B: time2, C: time3, D: time4 ...)};
Wherein verb phrases that are identical or similar (synonyms and near-synonyms) may likewise be used to divide out class templates of same-class verbs;
From the personal sample data resource synchronized by the personal mobile device, verb-noun (value) chain data correlation templates are obtained and form the personal class template (data clues + processing rule + characteristic value set); wherein, because the data are in the first person, the subject noun is omitted by default.
8. The intelligent system for data mining based on personal mobile devices according to claim 7, characterized in that the data mining common platform server comprises: a template feature extraction module, a template matching module, a template library, and a matching result feedback and message communication module; wherein different weights may be set for the extracted word frequencies according to how strongly the personal data resources output or input through the input method, camera, shared memory, caches, temporary file caches, temporary files saved locally by application (APP) records, open network interfaces, navigation APIs, and the like correlate with the individual subject; and wherein the server or PC of a unit comprises a public data mining sharing companion module:
(1) the public data mining sharing companion module is used to gather and share, manually or automatically, the electronic document data resources of the unit (department), in particular text data, from the server or PC of the unit (department) to the data mining common platform server, where the data resources of numerous units converge to form the office complex sample data resource;
Wherein the data are pre-processed before sharing to exclude electronic documents whose content is identical, or whose content is similar and whose time is identical; the synchronized data may also be pre-processed for data desensitization;
(2) the template feature extraction and making module on the data mining common platform server is used to perform data mining on data resources and make data models, patterns, or templates;
Wherein the personal data resources synchronously downloaded from numerous personal mobile devices are pooled to form personal group data, on which data mining and characteristic value extraction are performed to make personal group data models, patterns, or templates; from the personal group sample data resource gathered from personal mobile devices, personal group class templates (data clues + processing rule + characteristic value set) composed of numerous series of verb-noun (value) chain data correlation templates are obtained;
Wherein data mining is performed, and data patterns or templates are made, on: unit (department) data resources; the office complex sample data resource gathered from numerous units; personal data resources synchronized from mobile devices; the personal group sample data resource gathered from mobile devices; the mixed data resource of the personal group sample data resource and unit (department) data resources (used for matching unit supply templates with personal need templates); the mixed data resource of personal data resources and unit data resources (used for matching personal speciality templates with unit post templates); and the total data resource;
Wherein verb-noun (value) chain data correlation templates and class templates may be made and obtained;
Wherein the following steps may be performed to make a unit supply-personal group need class template:
(d) Step 1: Compare the category names of the unit (department) class template (segment) set of the unit (department) data resource against the category names in the personal group class template (segment) set of the personal group sample data resource that serves as the comparison object;
(e) Step 2: If category names are matched successfully, obtain the same-name category names and the two sets to which they belong, namely the unit (department) class templates (segments) and the personal group class templates (segments);
Select and directly quote the above same-name category names and the personal group class template (segment) set of the comparison object to which they belong, forming the unit supply-personal group need class template;
The result is: unit supply-personal group need class template = set of {same-name unit category name + same-name personal group class template (data clues + processing rule + characteristic value set)};
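The two steps above amount to intersecting the category names of the two template sets and keeping the personal group side. A minimal Python sketch, assuming each set is held as a dictionary from category name to class template (a hypothetical layout not given in the source):

```python
def supply_need_templates(unit_templates, group_templates):
    """Keep the personal group class templates whose category name also
    appears among the unit (department) class templates."""
    shared_names = unit_templates.keys() & group_templates.keys()
    return {name: group_templates[name] for name in sorted(shared_names)}


# Hypothetical example data; the template bodies are abbreviated to strings.
unit_templates = {"training": "unit template T", "catering": "unit template C"}
group_templates = {"training": "group template T", "travel": "group template V"}
print(supply_need_templates(unit_templates, group_templates))
```

Only the category name is taken from the unit side, which matches the stated result: same-name unit category name plus same-name personal group class template.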
(3) the template matching module is used to compare the characteristic values of the data templates of the data resources of different subjects;
Wherein the characteristic value comparison method for verb-noun (value) chain data correlation templates and class templates may be used for mutually matching and comparing the corresponding characteristic values among the verb-noun (value) chain data correlation templates and class templates of a unit (department), of the office complex sample, of an individual, of the personal group, and of the whole (holographic) data;
Wherein the comparison of verb-noun (value) chain data correlation templates performs the following steps:
(a) Step 1: Compare, phrase by phrase, the verb-noun (value) chain data correlation patterns or templates (phrase sets) made from two different (subject) data resources;
(b) Step 2: The phrase comparison yields the identical items: identical verbs, identical nouns, identical verb and identical noun, identical noun and identical verb, or identical alternating verb/noun multi-link phrase chains; wherein, if adverbs are present, identical adverbs may also be included in the comparison, i.e. identical noun, identical adverb and identical verb, or identical adverb, identical verb and identical noun;
(c) Step 3: Compare the word frequency ratios of the identical phrases;
(d) Step 4: Output the result:
First, results where the word frequency ratios of identical phrases are equal:
Identical verb and identical noun: the word frequency ratios v: n2 are equal;
Identical noun and identical verb: the word frequency ratios n: v are equal;
Identical alternating verb/noun multi-link phrases: word frequency ratio ...: v: (n+n2): v: ..., where
1. the components (n, n2) are all equal;
2. the totals (n+n2) are equal;
The comparison of the verb-noun (value) chain data correlation templates is then matched successfully;
Second, results where the word frequency ratios of identical phrases are not equal:
Identical verb:
display the ranking of the associated high-frequency nouns;
Identical noun:
display the ranking of the associated high-frequency verbs;
Identical verb and identical noun:
the share of the verb word frequency is inversely proportional to the noun word frequency in the word frequency ratio; display the difference value of the word frequency ratio;
Identical noun and identical verb:
the share of the verb word frequency is directly proportional to the noun word frequency in the word frequency ratio; display the difference value of the word frequency ratio;
Identical alternating verb/noun multi-link phrases:
display the difference values of the noun and verb word frequency ratios;
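Steps 1 to 4 for two-word phrases can be illustrated with the Python sketch below. It uses exact fractions so that 2:4 and 1:2 count as equal ratios; the function names and the dictionary layout (phrase pair mapped to its two word frequencies) are assumptions, and the ranking outputs for single identical verbs or nouns are not covered.

```python
from fractions import Fraction


def compare_pair(freq_a, freq_b):
    """Compare the word frequency ratio of one identical phrase pair.

    Each argument is (first_word_frequency, second_word_frequency);
    both frequencies are positive integers, as stated in the claim.
    """
    ratio_a = Fraction(freq_a[0], freq_a[1])
    ratio_b = Fraction(freq_b[0], freq_b[1])
    return {"equal": ratio_a == ratio_b, "difference": ratio_a - ratio_b}


def compare_templates(template_a, template_b):
    """Steps 1 to 3: find identical phrase pairs and compare their ratios."""
    shared = template_a.keys() & template_b.keys()
    return {pair: compare_pair(template_a[pair], template_b[pair])
            for pair in shared}


# Hypothetical example data, not taken from the source.
template_a = {("read", "book"): (2, 4), ("write", "code"): (3, 1)}
template_b = {("read", "book"): (1, 2), ("write", "code"): (1, 1),
              ("eat", "rice"): (1, 1)}
result = compare_templates(template_a, template_b)
print(result[("write", "code")])
```

A pair whose ratios are equal counts toward a successful match; otherwise the reported difference corresponds to the difference value that the claim says should be displayed.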
Wherein, among the verb-noun (value) chain data correlation templates made from personal data resources, nouns of a certain topic (for example, interests, hobbies, specialities) may be selected and compared with one another, and matching yields the approximate verb-noun value chain association relations of individuals matching that theme;
Alternatively, according to a given noun, the verb-noun (value) chain data correlation template made from a personal data resource is compared with the verb-noun (value) chain data correlation template made from the personal group sample data resource, obtaining the difference values, positioning, and situation of the individual's verb word frequency level in the (given) noun-verb action (value) chain link (interests, hobbies, specialities, and the like) relative to the group data resource as a whole;
Wherein, from the verb-noun (value) chain data correlation template made from a unit (department) data resource, high-word-frequency noun phrases are selected and compared with the verb-noun (value) chain data correlation templates made from the office complex sample data resource gathered from numerous units, or from the total data resource, obtaining the difference values, positioning, and situation of the unit's (department's) verb word frequency level in the (high-frequency) noun-verb action (value) chain link relative to the office complex sample data resource as a whole or the total data resource as a whole;
Wherein, from the verb-noun (value) chain data correlation template made from a unit (department) data resource, high-word-frequency noun phrases are selected and compared with the verb-noun (value) chain data correlation template of the personal group sample data resource; the verb-noun (value) chain data correlation template containing a successfully matched noun can serve as the template of the matching relation between the unit's (department's) supply on the (high-frequency) noun theme and the personal group need;
Wherein the
original class template {data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...) + processing rule: phrase chain segments (...: verb 1: same-class noun 1: verb 1: ...; ...: verb 2: same-class noun 2: verb 2: ...) + characteristic value set (...: verb 1: same-class noun 1: verb 1: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ...: verb 2: same-class noun 2: verb 2: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ... phrase chain segment word frequency ratios (word frequencies v, n, n2 are positive integers))}
is compared with the object class template (data clues + processing rule + characteristic value set);
The class template (data clues + processing rule + characteristic value set) comparison performs the following steps:
(a) Step 1: Following the same rule as the original class template, extract from the data clues of the object class template (data clues + processing rule + characteristic value set) of the object (subject) to be compared the noun phrases that are identical to those in the data clues of the original class template;
If no identical noun phrase is extracted, return to the start;
If all of the above identical noun phrases are extracted, the data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...) are matched successfully, and the method proceeds to the next step;
(b) Step 2: Classify according to the same classification/clustering rule as the original class template, obtaining the ordering of the phrase chain segments of the same category:
In the verb-noun (value) chain data correlation template (set) of the (subject) data resource of the object to be compared, classify/cluster all noun phrases; following the principle of retaining the same number of adjacent phrases on the left and on the right of the template position of each same-class noun, select the same portion of the verb-noun (value) chain segments, divide them into sets of verb-noun (value) chain segments of the same category, form a classification/clustering template, and name the category, obtaining:
Object category name A, processing rule (phrase chain segments) {...: verb 11: same-class noun 11: verb 11: ...; ...: verb 22: same-class noun 22: verb 22: ...; ... phrase chain segments}
Compare this with the phrase chain segment ordering processing rule in:
Original category name a, processing rule phrase segments {...: verb 1: same-class noun 1: verb 1: ...; ...: verb 2: same-class noun 2: verb 2: ...; ... data correlation template phrase chain segments};
If, in the phrase chain segment ordering processing rules, ...: verb 11: same-class noun 11: verb 11: ... matches ...: verb 1: same-class noun 1: verb 1: ...; ...: verb 22: same-class noun 22: verb 22: ... matches ...: verb 2: same-class noun 2: verb 2: ...; and so on, so that the phrase chain segment ordering processing rules all match, proceed to the next step;
If any phrase chain segment of the processing rules does not match, return to the start;
(c) Step 3: Following the same ordering processing rule as the original class template, compare, for the same classes in the two class templates (those with the same name and the same phrase chain segment ordering), the word frequency ratios:
Compare the characteristic value set of the object class template {...: verb 11: same-class noun 11: verb 11: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ...: verb 22: same-class noun 22: verb 22: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ... word frequency ratios (word frequencies v, n, n2 are positive integers)}
with
the characteristic value set of the original class template {...: verb 1: same-class noun 1: verb 1: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ...: verb 2: same-class noun 2: verb 2: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ... word frequency ratios (word frequencies v, n, n2 are positive integers)};
If:
...: verb 11: same-class noun 11: verb 11: ... ≈ word frequency ratio ...: v: (n+n2): v: ... ≈ ...: verb 1: same-class noun 1: verb 1: ...;
...: verb 22: same-class noun 22: verb 22: ... ≈ word frequency ratio ...: v: (n+n2): v: ... ≈ ...: verb 2: same-class noun 2: verb 2: ...;
... and so on, the word frequency ratios of the characteristic values are all identical or approximately equal, then the characteristic value sets are matched successfully;
When, in the class templates (data clues + processing rule + characteristic value set), the data clues are matched successfully, the phrase chain segment ordering processing rules are consistent, and the word frequency ratios of the characteristic value sets are equal, the final result is that the two class templates are matched successfully;
If the word frequency ratios of the characteristic values are not equal, the class template matching fails;
Wherein, if other classification rules are set to extract other characteristic data, matching success or failure is additionally determined according to those rules; the whole template is matched successfully only when every part is matched successfully, and the failure of any partial match causes the whole template matching to fail;
Wherein a template matching module may also be included on the personal mobile device;
(4) the template library is used to store the templates of the data resources of the various subjects,
Wherein it stores template sets such as: the verb-noun (value) chain data correlation templates and class templates of units (departments), of the office complex sample, of individuals, of the personal group, and of the whole (holographic) data; the verb-noun (value) chain data correlation templates of the mixed data resource of units (departments) and the personal group; and the verb-noun (value) chain data correlation templates of the mixed data resource of individuals and units (department posts);
(5) the matching result feedback and message communication module is used to feed back the message data of each successful template comparison to the devices of the corresponding data resource subjects, and for interactive message communication between them; wherein it is also used to feed back the message data of successful matches in the comparison module for verb-noun (value) chain data correlation templates or class templates (data clues + processing rule + characteristic value set) to the devices of the corresponding data resource subjects;
Wherein the output/input synchronization module and the template feature extraction and making module of the personal mobile device may run on a separate (security chip) processor on the personal mobile device.
9. An intelligent system based on personal mobile devices that uses custom templates of verb-noun (value) chain data correlation templates or class templates (data clues + processing rule + characteristic value set), characterized in that the personal mobile device comprises: an output/input data synchronization module, a template clue filtering module, a template matching and comparison module, a personal cue library, a template library, and a module for outputting and displaying the corresponding agent service content; wherein,
(1) the output/input data synchronization module is used to synchronously replicate and collect the data output or input through the input method, camera, shared memory, caches, temporary file caches, temporary files saved locally by application (APP) records, open network interfaces, navigation APIs, and the like on the personal mobile device;
Wherein the synchronized data may be pre-processed for data desensitization and bleaching or for extracting characteristic values from images;
(2) the template clue filtering module is used to filter the data collected by the above output/input data synchronization module, item by item in sequence, against all phrases such as verbs and nouns in the custom templates made from verb-noun (value) chain data correlation templates in the template library, or against the clue data in the class templates (data clues + processing rule + characteristic value set); the successfully matched result data are recorded in the personal cue library, together with the cumulative number of matches;
Wherein different weights may be set for accumulating or recording the word frequencies of the filtered data, according to how strongly the personal data resources output or input through the input method, camera, shared memory, caches, temporary file caches, temporary files saved locally by application (APP) records, open network interfaces, navigation APIs, and the like correlate with the individual subject;
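The filtering and weighted accumulation described for this module can be sketched in Python as follows. The source names the data sources but gives no weight values, so the numbers in `SOURCE_WEIGHTS`, the default weight, and the function and variable names are all hypothetical.

```python
from collections import Counter

# Hypothetical weights: how strongly each source reflects the individual.
SOURCE_WEIGHTS = {"input_method": 1.0, "navigation_api": 0.8, "camera": 0.5}
DEFAULT_WEIGHT = 0.1


def filter_clues(records, template_clues, cue_library=None):
    """Record every collected word that matches a template clue.

    records is an iterable of (source_name, words); the personal cue
    library is modelled as a Counter of weighted cumulative matches.
    """
    cue_library = Counter() if cue_library is None else cue_library
    for source, words in records:
        weight = SOURCE_WEIGHTS.get(source, DEFAULT_WEIGHT)
        for word in words:
            if word in template_clues:
                cue_library[word] += weight
    return cue_library


# Hypothetical example data, not taken from the source.
records = [
    ("input_method", ["buy", "book", "today"]),
    ("camera", ["book"]),
]
library = filter_clues(records, template_clues={"buy", "book"})
print(library)
```

Passing an existing `cue_library` lets later batches add to the same cumulative counts, which corresponds to recording the accumulated number of matches.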
(3) the template matching and comparison module is used to compare the templates of the template library with the templates extracted from the personal cue library;
Wherein the template library includes, but is not limited to, the verb-noun (value) chain data correlation custom templates of the unit supply-personal group need custom template (for example, noun C: verb A ≈ word frequency ratio n: v, and verb D: noun D ≈ word frequency ratio v: n2), which are compared with the same-name verb-noun (value) chain data correlation templates extracted and made from the personal cue library (for example, noun C: verb A ≈ word frequency ratio n: v', and verb D: noun D ≈ word frequency ratio v: n2); if the word frequency ratios of the same-name phrases are identical or approximately equal, the unit supply-personal group need custom template is matched successfully; otherwise the matching fails;
Wherein, given that unit supply-personal group need class template = set of {same-name unit category name + same-name personal group class template (data clues + processing rule + characteristic value set)}, the unit supply-personal group need class template in the template library is compared with the same-name class template (data clues + processing rule + characteristic value set) extracted and made from the personal cue library:
Unit supply-personal group need class template {data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...) + processing rule: phrase chain segment ordering (...: verb 1: same-class noun 1: verb 1: ...; ...: verb 2: same-class noun 2: verb 2: ...) + characteristic value set (...: verb 1: same-class noun 1: verb 1: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ...: verb 2: same-class noun 2: verb 2: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ... phrase chain segment word frequency ratios (word frequencies v, n, n2 are positive integers))}
is compared with the same-name class template (data clues + processing rule + characteristic value set) extracted and made from the personal cue library of the comparison object;
The following class template comparison steps are performed:
(a) Step 1: following the same rules as the unit supply-personal group demand class template, extract from the data clues of the class template (data clues + processing rule + feature value set) of the object (subject) to be compared the phrases that also appear in the data clues of the unit supply-personal group demand class template;
If no identical phrases are extracted, return to the start;
If all of the above phrases are extracted, the data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...) are matched successfully, and the procedure enters the next step;
(b) Step 2: classify according to the same classification/clustering rule as the unit supply-personal group demand class template, obtaining phrase chain segments of the same category:
In the verb-noun (value) chain data association templates (set) of the data resource of the object (subject) to be compared, all noun phrases are classified/clustered; following the principle of retaining the same number of phrases adjacent on the left and right of each same-class noun's position in the template, identical partial verb-noun (value) chain segments are selected and grouped into sets of same-category verb-noun (value) chain segments, which form a classification/clustering template; the category is named, giving:
{ object category name A, processing rule (phrase chain segments): ... : verb 11 : same-class noun 11 : verb 11 : ... ; ... : verb 22 : same-class noun 22 : verb 22 : ... ; ... phrase chain segments }
This is compared with the phrase chain segment ordering in the processing rule of
class template a of unit supply-personal group demand (... : verb 1 : same-class noun 1 : verb 1 : ... ; ... : verb 2 : same-class noun 2 : verb 2 : ... ; ... data association template phrase chain segments);
If, in the phrase chain segment ordering of the processing rule, ... : verb 11 : same-class noun 11 : verb 11 : ... matches ... : verb 1 : same-class noun 1 : verb 1 : ... ; ... : verb 22 : same-class noun 22 : verb 22 : ... matches ... : verb 2 : same-class noun 2 : verb 2 : ... ; and so on, so that all phrase chain segment orderings of the processing rule match, enter the next step;
If any phrase chain segment ordering of the processing rule does not match, return to the start;
(c) Step 3: following the same processing rule as the unit supply-personal group demand class template, compare the word frequency ratios of the identically named phrase chain segment orderings contained in the two class templates:
The feature value set of the object class template under comparison { ... : verb 11 : same-class noun 11 : verb 11 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ... ; ... : verb 22 : same-class noun 22 : verb 22 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ... ; ... word frequency ratios (the word frequencies v, n, n2 are positive integers) }
is compared with
the feature value set of the unit supply-personal group demand class template { ... : verb 1 : same-class noun 1 : verb 1 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ... ; ... : verb 2 : same-class noun 2 : verb 2 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ... ; ... word frequency ratios (the word frequencies v, n, n2 are positive integers) };
If:
... : verb 11 : same-class noun 11 : verb 11 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ... ≈ ... : verb 1 : same-class noun 1 : verb 1 : ... ;
... : verb 22 : same-class noun 22 : verb 22 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ... ≈ ... : verb 2 : same-class noun 2 : verb 2 : ...
... and so on, with all feature value word frequency ratios identical or approximately equal, the feature value sets are matched successfully;
That is, in the class template of the same name (data clues + processing rule + feature value set) extracted from the personal clue library, the data clues match, the phrase chain segment ordering of the processing rule is consistent, and the word frequency ratios of the feature value set are equal, so the final result is that the two class templates are matched successfully;
If the feature value word frequency ratios are not equal, the class template match fails;
A successful match of either the unit supply-personal group demand custom template or the personal group class template of the same name constitutes a successful template matching comparison;
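The three comparison steps can be sketched roughly as below. This is a simplified illustration, not the claimed implementation: the `ClassTemplate` structure, the function names, and the tolerance are assumptions, and the sketch presumes that the words of the object's template have already been mapped to the same class labels as the unit template (the classification/clustering of step 2 is not shown).

```python
from dataclasses import dataclass
from math import isclose


@dataclass
class ClassTemplate:
    clues: set         # data clues: verbs and same-class nouns
    segments: list     # processing rule: ordered phrase chain segments (tuples)
    frequencies: list  # feature values: one tuple of word frequencies per segment


def normalise(freqs):
    """Scale word frequencies so that ratios can be compared term by term."""
    total = sum(freqs)
    return [f / total for f in freqs]


def class_templates_match(supply, candidate, rel_tol=0.1):
    # Step 1: every data clue of the supply-demand template must be found.
    if not supply.clues <= candidate.clues:
        return False
    # Step 2: the phrase chain segments must appear in the same order.
    if supply.segments != candidate.segments:
        return False
    # Step 3: word frequency ratios of corresponding segments must be close.
    for a, b in zip(supply.frequencies, candidate.frequencies):
        if len(a) != len(b):
            return False
        pairs = zip(normalise(a), normalise(b))
        if not all(isclose(x, y, rel_tol=rel_tol) for x, y in pairs):
            return False
    return True
```

Failing any step returns `False` immediately, which corresponds to "return to the start" in steps 1 and 2 and to a failed class template match in step 3.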
(4) an output and display module for the corresponding data service content, for outputting and displaying the configured corresponding data service content on the personal mobile device after the template matching and comparison module has matched templates successfully;
(5) a template library, for storing the set of unit supply-personal group demand custom templates and the set of unit supply-personal group demand class templates, both built from verb-noun (value) chain data association templates;
Wherein, the personal mobile device is allowed to update the unit supply-personal group demand custom templates, and templates such as individual-group learning association custom templates can be downloaded to the template library;
(6) a personal clue library, composed of the data obtained after the data synchronized by the input/output data synchronization module has been filtered for clue data matching the phrases (all verbs, nouns, etc.) of the custom templates and class templates of the verb-noun (value) chain data association templates.
10. An intelligent system that performs data mining based on a personal mobile device using verb-noun (value) chain data association templates, characterized in that the personal mobile device includes: an input method software data synchronization module, a verb filtering and template generation module, a template sending management and matching result feedback interactive communication module, a personalized template library, and a verb library;
The data mining public platform server includes: a template receiving management and matching result feedback interactive communication module, a template comparison module, and a template library;
Wherein, the personal mobile device includes:
(1) an input method software data synchronization module, for synchronously copying and collecting the data entered through the input method on the personal mobile device;
(2) a verb filtering and template generation module, which filters the data collected by the above input method software data synchronization module against the verbs in the verb library in turn, and performs the following steps to generate personalized templates:
(a) Step 1: filter the text data collected by the input method software data synchronization module, in turn, with the set of all common verbs in the verb library;
(b) Step 2: perform part-of-speech tagging on each sentence containing a successfully matched filter verb, marking the nouns in the sentence;
Also analyze the grammar of each sentence containing a successfully matched filter verb and, as far as possible, mark the subject, predicate, and object of the sentence;
Judge whether the filtered verb is the predicate verb;
Wherein, optionally, punctuation marks are automatically added to the text data collected from the input method, which is separated in time, and reference resolution is performed for subject nouns or object nouns;
(c) Step 3: if the filtered verb is the predicate verb, extract the phrase of the sentence that is labeled as both predicate and verb, the phrase labeled as both subject and noun, and the phrase labeled as both object and noun, together with their one-to-one subject-predicate/predicate-object association in the respective sentence: subject noun : predicate verb / predicate verb : object noun;
(d) Step 4: save the noun : verb / verb : noun phrase combinations extracted in step 3, and the (one-to-one) subject-predicate/predicate-object associations between them, in the personalized template library;
The data collected by the input method software data synchronization module is then filtered, in turn, against the verb : noun / noun : verb phrase combinations in the personalized template library; the matching word frequency of each phrase is recorded and marked as an index measuring the weight of two phrases that have a one-to-one chain association, verb word frequency v : noun word frequency n2 / noun word frequency n : verb word frequency v (the word frequencies v, n, n2 are positive integers), which yields the verb-noun (value) chain data association templates (set) of verb : noun ≈ word frequency ratio v : n2 / noun : verb ≈ word frequency ratio n : v;
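A rough sketch of steps 1 to 4 is given below. It assumes that part-of-speech tagging and grammatical analysis have already been done by some upstream tool, so each sentence arrives as a (subject noun, predicate verb, object noun) triple; the verb library contents and the function name are illustrative only, and word frequencies are counted within the kept sentences rather than by re-filtering the raw text.

```python
from collections import Counter

# Stand-in for the common-verb library (illustrative values only).
VERB_LIBRARY = {"buy", "eat", "read"}


def build_personal_template(parsed_sentences, verb_library=VERB_LIBRARY):
    """Build a mapping from phrase pair to word-frequency pair.

    parsed_sentences: iterable of (subject_noun, predicate_verb, object_noun)
    triples; None marks a role that could not be labeled.
    """
    pairs = set()
    word_freq = Counter()
    for subject, verb, obj in parsed_sentences:
        # Steps 1-2: keep only sentences whose predicate verb is in the library.
        if verb not in verb_library:
            continue
        word_freq[verb] += 1
        # Step 3: record the one-to-one associations around the predicate verb.
        if subject is not None:
            word_freq[subject] += 1
            pairs.add((subject, verb))  # noun : verb
        if obj is not None:
            word_freq[obj] += 1
            pairs.add((verb, obj))      # verb : noun
    # Step 4: attach the word frequencies (n : v or v : n2) to each association.
    return {pair: (word_freq[pair[0]], word_freq[pair[1]]) for pair in pairs}
```

The resulting dictionary has the same shape as the "noun : verb ≈ word frequency ratio n : v" entries described in the claim, with the ratio left unreduced.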
(3) a template sending management and matching result feedback interactive communication module, with which the user of the personal mobile device manages the personalized template library, displays templates and manages sending them to the data mining public platform server for comparison with the templates of a specified subject data resource, and manages the user's interactive communication with the corresponding party;
(4) a personalized template library, for storing the set of personalized templates generated by the verb filtering and template generation module;
(5) a verb library, for storing common verbs;
Wherein, common Chinese verbs include, but are not limited to, the following:
Expressing actions: say, look, walk, listen, laugh, take, fly, run, eat, sing, drink, knock, sit, shout, stare, kick, hear, touch, criticize, publicize, safeguard, learn, study, carry out, start, stop, forbid
Expressing existence, change, or disappearance: die, have, equal, occur, evolve, develop, grow, perish, exist, eliminate
Expressing mental activity: think, love, hate, fear, miss, intend, like, hope, be afraid, worry, dislike, feel, ponder
Expressing judgment: is, be
Expressing possibility, wish, or necessity (auxiliary verbs): can, be able to, will, may, wish, be willing, consent, dare, ought to, should, deserve, be worth, would rather
Expressing direction (directional verbs): up, down, in, out, back, open, over, come up, come down, come in, come out, come back, open up, come over, get up, go, go up, go down, go in, go out, go back, go over
Expressing development: for example, grow, wither, germinate, bear fruit, spawn;
For plans, systems, schemes, documents, etc.:
Work out, formulate, draft, draw up, authorize, audit, examine, transmit, deliver, submit, report, assign, put on record, archive, offer opinions
For information and data:
Investigate, study, collect, arrange, analyze, conclude, summarize, provide, report, feed back, pass on, notify, issue, maintain and manage
For a given piece of work (superior level):
Preside over, organize, instruct, arrange, coordinate, direct, supervise, manage, distribute, control, take lead responsibility, examine and approve, authorize, sign and issue, approve, assess
Thinking behavior:
Research, analyze, assess, develop, suggest, propose, participate, recommend, plan
Direct action:
Organize, carry out, perform, instruct, lead, control, supervise, use, produce, participate, illustrate, explain, provide, assist
Superior's behavior:
License, ratify, define, determine, instruct, establish, plan, supervise, decide
Administrative behavior:
Reach, assess, control, coordinate, ensure, identify, keep, supervise
Expert's behavior:
Analyze, assist, promote, contact, suggest, recommend, support, assess, evaluate
Subordinate's behavior:
Check, verify, collect, obtain, submit, make
Other:
Maintain, keep, establish, develop, prepare, process, execute, receive, arrange, monitor, report, operate, confirm, conceptualize, cooperate, collaborate, obtain, check, verify, contact, design, test, build, change, write, draft, guide, transmit, translate, operate, ensure, prevent, solve, introduce, pay, calculate, revise, undertake, negotiate, confer, interview, refuse, veto, monitor, forecast, compare, delete, use
Wherein, the data mining public platform server includes:
(1) a template receiving management and matching result feedback interactive communication module, for receiving the personalized templates sent from the personal mobile device and forwarding them to the template comparison module, where they are compared with the specified templates of the template library;
The matching result data is fed back to the personal device, and interactive communication between the personal device and the device of the compared template's subject can be carried out;
(2) a template comparison module, for receiving a personalized template and comparing it with the template of the specified subject data resource from the template library, and for feeding the matching result back to the personal device through the template receiving management and matching result feedback interactive communication module;
(3) a template library, for storing the templates of various subject data resources;
Wherein, the templates include, but are not limited to, template sets such as the verb-noun (value) chain data association templates and class templates of a unit (department), of unit complex samples, of an individual, and of a personal group, as well as holographic verb-noun (value) chain data association templates and class templates.
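The server-side flow (receive a personalized template, compare it with the template of a specified subject, feed the result back) might be sketched as follows. The class name, function name, message fields, and tolerance are all hypothetical; the claim does not specify a storage layout, a message format, or a transport.

```python
from math import isclose


class TemplateLibrary:
    """Stores one template per subject (unit, individual, group, ...)."""

    def __init__(self):
        self._templates = {}

    def save(self, subject_id, template):
        self._templates[subject_id] = template

    def get(self, subject_id):
        return self._templates.get(subject_id)


def handle_upload(library, personal_template, subject_id, rel_tol=0.1):
    """Compare an uploaded template with the specified subject's template.

    Templates map a phrase pair to a pair of word frequencies; the returned
    dictionary stands in for the feedback sent to the personal device.
    """
    target = library.get(subject_id)
    if target is None:
        return {"subject": subject_id, "matched": False}
    matched = all(
        phrase in personal_template
        and isclose(
            freqs[0] / freqs[1],
            personal_template[phrase][0] / personal_template[phrase][1],
            rel_tol=rel_tol,
        )
        for phrase, freqs in target.items()
    )
    return {"subject": subject_id, "matched": matched}
```

Keeping the comparison inside one function mirrors the split in the claim between the receiving/feedback module, which routes templates, and the comparison module, which decides the match.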
CN201710303290.0A 2017-05-03 2017-05-03 Method for extracting data feature template and method and system for applying template Active CN107341171B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710303290.0A CN107341171B (en) 2017-05-03 2017-05-03 Method for extracting data feature template and method and system for applying template

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710303290.0A CN107341171B (en) 2017-05-03 2017-05-03 Method for extracting data feature template and method and system for applying template

Publications (2)

Publication Number Publication Date
CN107341171A true CN107341171A (en) 2017-11-10
CN107341171B CN107341171B (en) 2021-07-27

Family

ID=60220082

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710303290.0A Active CN107341171B (en) 2017-05-03 2017-05-03 Method for extracting data feature template and method and system for applying template

Country Status (1)

Country Link
CN (1) CN107341171B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109657013A (en) * 2018-11-30 2019-04-19 杭州数澜科技有限公司 A kind of systematization generates the method and system of label
CN110471597A (en) * 2019-07-25 2019-11-19 北京明略软件系统有限公司 A kind of data mask method and device, computer readable storage medium
CN110738033A (en) * 2018-07-03 2020-01-31 百度在线网络技术(北京)有限公司 Report template generation method, device and storage medium
CN111428508A (en) * 2018-12-24 2020-07-17 微软技术许可有限责任公司 Style customizable text generation
CN111859858A (en) * 2020-07-22 2020-10-30 智者四海(北京)技术有限公司 Method and device for extracting relationship from text
US20210019569A1 (en) * 2019-07-16 2021-01-21 Ancestry.Com Operations Inc. Extraction of genealogy data from obituaries

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030055625A1 (en) * 2001-05-31 2003-03-20 Tatiana Korelsky Linguistic assistant for domain analysis methodology
CN101814067A (en) * 2009-01-07 2010-08-25 张光盛 System and methods for quantitative assessment of information in natural language contents
CN103186633A (en) * 2011-12-31 2013-07-03 北京百度网讯科技有限公司 Method for extracting structured information as well as method and device for searching structured information
CN106104524A (en) * 2013-12-20 2016-11-09 国立研究开发法人情报通信研究机构 Complex predicate template collection device and be used for its computer program
CN106484675A (en) * 2016-09-29 2017-03-08 北京理工大学 Fusion distributed semantic and the character relation abstracting method of sentence justice feature

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110738033A (en) * 2018-07-03 2020-01-31 百度在线网络技术(北京)有限公司 Report template generation method, device and storage medium
CN110738033B (en) * 2018-07-03 2023-09-19 百度在线网络技术(北京)有限公司 Report template generation method, device and storage medium
CN109657013A (en) * 2018-11-30 2019-04-19 杭州数澜科技有限公司 A kind of systematization generates the method and system of label
CN111428508A (en) * 2018-12-24 2020-07-17 微软技术许可有限责任公司 Style customizable text generation
US20210019569A1 (en) * 2019-07-16 2021-01-21 Ancestry.Com Operations Inc. Extraction of genealogy data from obituaries
US11537816B2 (en) * 2019-07-16 2022-12-27 Ancestry.Com Operations Inc. Extraction of genealogy data from obituaries
US20230109073A1 (en) * 2019-07-16 2023-04-06 Ancestry.Com Operations Inc. Extraction of genealogy data from obituaries
US11797774B2 (en) * 2019-07-16 2023-10-24 Ancestry.Com Operations Inc. Extraction of genealogy data from obituaries
CN110471597A (en) * 2019-07-25 2019-11-19 北京明略软件系统有限公司 A kind of data mask method and device, computer readable storage medium
CN111859858A (en) * 2020-07-22 2020-10-30 智者四海(北京)技术有限公司 Method and device for extracting relationship from text
CN111859858B (en) * 2020-07-22 2024-03-01 智者四海(北京)技术有限公司 Method and device for extracting relation from text

Also Published As

Publication number Publication date
CN107341171B (en) 2021-07-27


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
Address after: 101100 room 1901, unit 1, Beijing ONE3 building, Zhongshan Avenue, Tongzhou District, Beijing.
Applicant after: Liu Hongli
Address before: 101100 Beijing Tongzhou District Xinhua Street East End Gucheng Road West Garden District 1 District 7 Building 342 room.
Applicant before: Liu Hongli
DD01 Delivery of document by public notice
Addressee: Liu Hongli
Document name: Notification of Passing Examination on Formalities
GR01 Patent grant