CN107341171A - Method for extracting data (gene) feature templates, and method and system for applying the templates
- Publication number
- CN107341171A, CN201710303290.0A
- Authority
- CN
- China
- Prior art keywords
- verb
- noun
- template
- class
- data
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2452—Query translation
- G06F16/24522—Translation of natural language queries to structured queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Machine Translation (AREA)
Abstract
A method for extracting a data-association feature pattern or template. Step one: pre-process the data resource by identifying its language, apply part-of-speech tagging to mark the nouns and verbs of each sentence, and apply syntactic analysis to mark the subject, predicate and object of each sentence. Step two: from the sentence set, extract the phrases marked as both subject and noun, the phrases marked as both predicate and verb, and the phrases marked as both object and noun, obtaining a set of subject nouns, a set of predicate verbs and a set of object nouns, together with the subject-predicate and predicate-object association relations these phrases have in their sentences. Step three: count the cumulative word frequencies of the subject nouns, predicate verbs and object nouns, and annotate them as feature values measuring the weight of the associated phrases in subject noun set : predicate verb set and predicate verb set : object noun set, that is, noun : verb ≈ word-frequency ratio n : v, and verb : noun ≈ word-frequency ratio v : n2.
Description
Technical field
The present invention relates to fields such as data mining, text mining, natural language processing and artificial intelligence. In particular, it relates to a method, based on natural language processing and text mining, for making and using data-association feature patterns or templates, and to methods and systems for intelligent business and intelligent social networking that apply the templates.
Background art
"Data is exploding, yet information is scarce." Put simply, data are symbols. Data have no meaning in themselves; the meaning of data is their semantics. Only data that have been given meaning can be used, and at that point the data are converted into information. Semantics is the path that connects computer representation with the real world. The Internet resource environment itself is also developing toward semantic, structured and intelligent forms.
Over the past ten to twenty years, a large amount of information has come to be stored as electronic documents, and the number of these documents has grown explosively. According to a survey conducted jointly by Merrill Lynch and Gartner, 85% of business data is stored in a more or less unordered manner, and the survey states that this unordered data doubles every 18 months. Text is the most basic and most commonly used information carrier. It records the accumulation and progress of human knowledge and carries the core value of people's daily social activities, state affairs, public services, business activities and other activities of society. Text processing and its techniques are therefore particularly important in computer language processing. "Knowledge is power" is now widely accepted, and knowledge comes from data and information. If society, governments, enterprises and individuals can mine the value behind text data efficiently and effectively, they can make better decisions, improve operating efficiency and raise the quality of life.
Natural language processing has made major progress in lexical and syntactic research. By comparison, research on semantics, pragmatics and contextual knowledge has remained a bottleneck that is hard to get past; the difficulty lies in eliminating semantic ambiguity at the level of the sentence or even the whole article. Possibly because of the narrow perspective inherited from early machine translation applications (which are still not very successful), people's understanding and grasp of information granularity in natural language processing has been somewhat off. It should be noted that not every sentence in a text yields valuable, meaningful material.
Summary of the invention
The present invention extracts key data-element features from text using the phrases of a sentence as the information granularity, but it takes the function and value granularity of social functional subjects as the object of cognition for data mining. It holds that the thinking and behavior of people (including units and individuals engaged in activities that create economic value) are the source of information, and that this source maps onto, and is positively correlated in value with, people's daily activities such as production, scientific research, commerce, consumption, entertainment and culture. The only difference is that the means and the result are now digital: symbols are generated and a corresponding volume of data is produced. Modern society seldom conducts information activities in writing on paper; in an e-commerce and mobile Internet environment, almost all social information exchange is carried out digitally. Conversely, it should be possible to mine the value feature elements of people's original text data (not secondary statistics) and of behavioral Internet of Things data, and so find the value-driving data (gene) information of social thinking and behavior that corresponds to people's daily production, research and commercial activities and to their digital consumption, entertainment and cultural activities.
An economy is essentially a value system and a culture is essentially a system of values. Data are a record of, or are synchronized with, thinking and behavior: information source → natural-language electronic document data resources.
Fig. 1 is a schematic diagram of the social subjects and the data resources produced by their information sources.
The main function and value of each social subject are:
1. an individual has the function and value of thinking and working;
2. a personal group has the function and value of learning and consuming;
3. a unit has the function and value of manufacturing and producing production resources, living resources or service resources;
4. a unit complex has the comprehensive function and value of manufacturing everything from production resources to living resources.
The daily activities and actions of people are driven by value and by values, and in information and data, thinking and behavior are expressed mainly through the semantics of verbs: thinking/behavior value (values) drive → information source → verb (value drive) → natural-language electronic document data resources. The transfer of function and value information between these social subjects appears in the data as verb-noun association phrases of differing value, weighted by the intensity (frequency) of verb use. Through the subject, predicate and object structure of natural-language sentences, these phrases form verb-noun chains that describe data value chain (gene) information.
The information activities of the social subjects produce the following data resources:
1. personal data resources;
2. personal group sample data resources;
3. unit (department) data resources;
4. unit complex sample data resources.
Here, many personal data resources pooled together form the personal group sample data resource (data held in common), and many unit (department) data resources pooled together form the unit complex sample data resource (shared data). Text mining finds an implicit pattern p in a large set of texts C. If C is regarded as the input and p as the output, text mining is a mapping from input to output: C → p. Information extraction uses computer technology to find, in the massive electronic documents and behavioral Internet of Things data of these social subjects, the feature values (data value genes) that meet a social subject's needs or value demands, and desensitizes (masks) any personal (private) data involved; for example, the personal e-mail address embbiz@126.com is desensitized to e***iz@126.com.
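The desensitization example can be sketched as follows. This is only an illustration under an assumed rule (keep the first character and the last two characters of the local part and replace the rest with three asterisks), chosen because it reproduces the example address; the text does not state the actual masking rule, and the function name `mask_email` is hypothetical.

```python
def mask_email(address: str, keep_head: int = 1, keep_tail: int = 2) -> str:
    """Mask the local part of an e-mail address (assumed rule, see lead-in)."""
    local, sep, domain = address.partition("@")
    if not sep:
        raise ValueError("not an e-mail address: missing '@'")
    if len(local) <= keep_head + keep_tail:
        # Too short to keep any characters safely: hide the whole local part.
        return "***@" + domain
    return local[:keep_head] + "***" + local[-keep_tail:] + "@" + domain


print(mask_email("embbiz@126.com"))  # e***iz@126.com
```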
Based on the above understanding, and to achieve the above object, according to a first aspect of the present invention:
1. A method based on natural language processing for extracting and making a data-association feature pattern or template is provided, characterized in that making the verb-noun (value) chain data-association pattern or template comprises the following steps:
(a) Step 1: pre-process the text data of the (subject's) data resource by identifying its language, then apply part-of-speech tagging to mark the nouns and verbs of each sentence;
apply syntactic analysis to the text data to mark the subject, predicate and object of each sentence, wherein the subject of a passive-voice sentence may be marked as the object;
wherein coreference resolution is performed on the subject nouns or object nouns;
(b) Step 2: from the sentence set, extract the phrases marked as both subject and noun, the phrases marked as both predicate and verb, and the phrases marked as both object and noun, obtaining respectively the set of subject nouns, the set of predicate verbs and the set of object nouns, together with the subject-predicate and predicate-object association relations they have in their sentences: subject noun set : predicate verb set and predicate verb set : object noun set, that is, the noun : verb and verb : noun phrase combinations and the (one-to-one) subject-predicate and predicate-object association relations between them;
(c) Step 3: count the cumulative word frequencies of the subject nouns, predicate verbs and object nouns, and annotate them as feature values measuring the weight of the (one-to-one) associated phrases in subject noun set : predicate verb set and predicate verb set : object noun set, that is,
subject noun word frequency n : predicate verb word frequency v, and
predicate verb word frequency v : object noun word frequency n2 (the word frequencies v, n and n2 are positive integers),
obtaining, for the (subject's) data resource, the weighted association relations:
the set of noun : verb ≈ word-frequency ratio n : v, and
the set of verb : noun ≈ word-frequency ratio v : n2;
the high-frequency phrases in these sets are chosen to form the verb-noun (value) chain data-association pattern or template (a phrase set).
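Steps 2 and 3 can be illustrated with a minimal sketch. It assumes that step 1 (language identification, part-of-speech tagging, parsing and coreference resolution) has already been done by some external tool and has produced one (subject noun, predicate verb, object noun) triple per sentence; it also assumes that "cumulative word frequency" means the number of times a word occurs in its role. The names `build_pair_templates` and `high_frequency` and the `min_freq` threshold are hypothetical.

```python
from collections import Counter


def build_pair_templates(triples):
    """Build noun:verb and verb:noun pair templates from parsed sentences.

    `triples` is an iterable of (subject_noun, predicate_verb, object_noun);
    subject or object may be None when the sentence has none.
    """
    subj_freq, verb_freq, obj_freq = Counter(), Counter(), Counter()
    noun_verb_pairs, verb_noun_pairs = set(), set()
    for subj, verb, obj in triples:
        verb_freq[verb] += 1
        if subj is not None:
            subj_freq[subj] += 1
            noun_verb_pairs.add((subj, verb))
        if obj is not None:
            obj_freq[obj] += 1
            verb_noun_pairs.add((verb, obj))
    # Attach the word-frequency ratio n : v or v : n2 to each one-to-one pair.
    noun_verb = {(s, v): (subj_freq[s], verb_freq[v]) for s, v in noun_verb_pairs}
    verb_noun = {(v, o): (verb_freq[v], obj_freq[o]) for v, o in verb_noun_pairs}
    return noun_verb, verb_noun


def high_frequency(template, min_freq):
    """Keep only pairs whose two word frequencies both reach `min_freq`."""
    return {pair: f for pair, f in template.items() if min(f) >= min_freq}


triples = [
    ("company", "sells", "phones"),
    ("company", "sells", "laptops"),
    ("customer", "buys", "phones"),
]
noun_verb, verb_noun = build_pair_templates(triples)
print(noun_verb[("company", "sells")])  # (2, 2)
print(verb_noun[("buys", "phones")])    # (1, 2)
print(high_frequency(verb_noun, 2))     # {('sells', 'phones'): (2, 2)}
```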
2. A preferred scheme based on the described method, characterized in that:
Step 4: merge the identically named nouns, and their word frequencies, that appear at the end of verb : noun (one-to-one) association phrases and at the start of noun : verb (one-to-one) association phrases:
...
... : verb : identical noun
identical noun : verb : ...
...
The two occurrences of the identically named noun are joined to form a multi-link association phrase chain, giving ... : verb : (merged identical) noun : verb : (merged identical) noun : verb : ..., so that some of the one-to-one phrase chains (sets) are connected in series into a multi-link, multi-dimensional chain in which verbs and nouns alternate as linking nodes, ... : verb : noun : verb : noun : ...; the chain may even join head to tail and form a closed-loop multi-link association phrase chain of alternating verbs and nouns;
the word frequencies of the subject noun and the object noun are merged as n + n2, giving the weights of the alternating links ... : verb : noun : ... ≈ word-frequency ratio ... : v : (n+n2) : ... of the verb-noun (value) chain data-association pattern or template (a phrase set; the word frequencies v, n and n2 are positive integers).
That is, a verb-noun (value) chain data-association pattern or template (phrase set) can take two forms: a one-to-one phrase chain (noun : verb ≈ word-frequency ratio n : v, or verb : noun ≈ word-frequency ratio v : n2), or a multi-link phrase chain (... : verb : noun : verb : ... ≈ word-frequency ratio ... : v : (n+n2) : v : ...).
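The merging of step 4 can be sketched for a single link. The example below assumes the pair templates are held as dictionaries mapping a (word, word) pair to its two word frequencies, which is an illustrative data layout rather than one given in the text; `link_through_nouns` is a hypothetical name. Longer chains would follow by repeating the same join on the resulting links.

```python
def link_through_nouns(verb_noun, noun_verb):
    """Join verb:noun and noun:verb pairs that share a noun.

    verb_noun maps (verb, object_noun) -> (v, n2);
    noun_verb maps (subject_noun, verb) -> (n, v).
    Returns verb : noun : verb links weighted v : (n + n2) : v.
    """
    links = []
    for (v1, noun_as_obj), (f_v1, n2) in sorted(verb_noun.items()):
        for (noun_as_subj, v2), (n, f_v2) in sorted(noun_verb.items()):
            if noun_as_obj == noun_as_subj:
                # Merge the two occurrences of the noun and add their frequencies.
                links.append(((v1, noun_as_obj, v2), (f_v1, n + n2, f_v2)))
    return links


verb_noun = {("buys", "phones"): (1, 2), ("makes", "phones"): (3, 2)}
noun_verb = {("phones", "need"): (4, 5)}
for words, freqs in link_through_nouns(verb_noun, noun_verb):
    print(" : ".join(words), "≈", " : ".join(map(str, freqs)))
# buys : phones : need ≈ 1 : 6 : 5
# makes : phones : need ≈ 3 : 6 : 5
```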
3. A further preferred scheme based on the above method, characterized in that:
with the aid of natural language processing tools such as corpora, digital dictionaries and ontology libraries, the adverbs modifying the predicate verb of each sentence may also be marked, and their cumulative word frequency counted as they are extracted, obtaining:
the association weights adverb : verb : noun ≈ word-frequency ratio a : v : n2, or noun : adverb : verb ≈ word-frequency ratio n : a : v, of the verb-noun (value) chain data-association pattern or template (the word frequencies v, n and n2 are positive integers; a is a natural number and may be 0);
or the multi-link phrase chain:
the association weights of the alternating chain ... : adverb : verb : noun : adverb : ... ≈ word-frequency ratio ... : a : v : (n+n2) : a : ... of the verb-noun (value) chain data-association pattern or template (the word frequencies v, n and n2 are positive integers; a is a natural number and may be 0); wherein the adverb may be empty.
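How an empty adverb leads to a = 0 can be shown with a small sketch. It assumes each parsed sentence has already been reduced to an (adverb or None, predicate verb, object noun) record; the record layout and the name `adverb_verb_noun` are hypothetical.

```python
from collections import Counter


def adverb_verb_noun(records):
    """Weight adverb : verb : noun phrases as a : v : n2.

    `records` is an iterable of (adverb_or_None, verb, object_noun).
    A missing adverb is kept as None and given frequency a = 0.
    """
    adv_freq, verb_freq, noun_freq = Counter(), Counter(), Counter()
    seen = set()
    for adverb, verb, noun in records:
        if adverb is not None:
            adv_freq[adverb] += 1
        verb_freq[verb] += 1
        noun_freq[noun] += 1
        seen.add((adverb, verb, noun))
    return {
        (adverb, verb, noun): (
            adv_freq[adverb] if adverb is not None else 0,
            verb_freq[verb],
            noun_freq[noun],
        )
        for adverb, verb, noun in seen
    }


records = [
    ("quickly", "ships", "orders"),
    (None, "ships", "orders"),
    ("quickly", "answers", "questions"),
]
weights = adverb_verb_noun(records)
print(weights[("quickly", "ships", "orders")])  # (2, 2, 2)
print(weights[(None, "ships", "orders")])       # (0, 2, 2)
```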
4. According to a second aspect of the present invention, a method for comparing verb-noun (value) chain data-association patterns or templates (phrase sets) is provided, characterized by the following steps:
(a) Step 1: compare, phrase by phrase, the verb-noun (value) chain data-association patterns or templates (sets of phrase chains) made from two different (subjects') data resources;
(b) Step 2: if the comparison finds
an identical verb,
an identical noun,
an identical verb and identical noun,
an identical noun and identical verb,
or an identical multi-link phrase chain of alternating verbs and nouns,
(where adverbs are present, identical adverbs may be added to the comparison, that is,
an identical noun, identical adverb and identical verb, or
an identical adverb, identical verb and identical noun),
proceed to the next step;
(c) Step 3: compare the word-frequency ratios of the identical phrases;
(d) Step 4: output the result.
One, the word-frequency ratios are equal:
identical verb and identical noun: word-frequency ratio v : n2 equal, template match succeeds;
identical noun and identical verb: word-frequency ratio n : v equal, template match succeeds;
identical multi-link phrase of alternating verbs and nouns: word-frequency ratio ... : v : (n+n2) : v : ..., where either
1. the components n and n2 are each equal, or
2. the totals (n+n2) are equal,
template match succeeds.
Two, the word-frequency ratios are not equal:
identical verb:
display the associated high-frequency nouns in rank order;
identical noun:
display the associated high-frequency verbs in rank order;
identical verb and identical noun:
in each of the two templates, the share of the noun word frequency relative to the verb word frequency is inversely proportional to the word-frequency ratio v : n2; display the difference between the word-frequency ratios;
identical noun and identical verb:
in each of the two templates, the share of the noun word frequency relative to the verb word frequency is directly proportional to the word-frequency ratio n : v; display the difference between the word-frequency ratios;
identical multi-link phrase of alternating verbs and nouns:
display the differences between the noun and verb word-frequency ratios.
Wherein, from the verb-noun (value) chain data-association template made from a unit (department) data resource, the high-frequency noun phrases (a set) may be selected and compared with the verb-noun (value) chain data-association template (set) made from a personal group sample data resource; the template containing the successfully matched identical noun can serve as a verb-noun (value) chain data-association template of the matching relation between the unit's supply and the personal group's demand, with the (high-frequency) noun of the unit (department) as its theme.
Wherein, from the verb-noun (value) chain data-association template made from a unit (department) data resource, the high-frequency noun phrases may be selected and compared with the verb-noun (value) chain data-association template made from the unit complex sample data resource of many units, or from all data resources; this yields the relative difference in verb word frequency on the (high-frequency) noun-verb action (value) chain links, and hence the position and standing of the unit (department) within the unit complex sample data resource as a whole or within all data resources as a whole.
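The comparison of one-to-one phrases can be sketched as follows, assuming each template is a dictionary mapping a (word, word) pair to its two positive word frequencies (an illustrative layout, not one given in the text). Exact rational arithmetic is used here so that 2 : 4 and 1 : 2 count as the same ratio; `compare_pair_templates` is a hypothetical name, and the ranked display of high-frequency words is left out.

```python
from fractions import Fraction


def compare_pair_templates(template_a, template_b):
    """Compare word-frequency ratios of the phrases two templates share.

    Each template maps (word1, word2) -> (freq1, freq2), both positive.
    Returns, per shared phrase, ("match", 0) or ("differ", ratio_a - ratio_b).
    """
    report = {}
    for pair in sorted(template_a.keys() & template_b.keys()):
        ratio_a = Fraction(*template_a[pair])
        ratio_b = Fraction(*template_b[pair])
        if ratio_a == ratio_b:
            report[pair] = ("match", Fraction(0))
        else:
            report[pair] = ("differ", ratio_a - ratio_b)
    return report


unit = {("sells", "phones"): (2, 4), ("buys", "phones"): (1, 3)}
group = {("sells", "phones"): (1, 2), ("buys", "phones"): (2, 3), ("makes", "cars"): (5, 1)}
report = compare_pair_templates(unit, group)
print(report[("sells", "phones")][0])  # match
print(report[("buys", "phones")])      # ('differ', Fraction(-1, 3))
```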
5. According to a third aspect of the present invention, a method is provided for making class templates from verb-noun (value) chain data-association templates, and for comparing class templates, characterized in that:
One, extracting and making a class template (data clues + processing rules + feature value set) comprises the following steps:
(a) Step 1: using natural language processing tools such as corpora and ontology libraries, classify or cluster all noun phrases in the verb-noun (value) chain data-association template (set) of the (subject's) data resource; for each noun of a class, following the principle of retaining at least one adjacent phrase (that is, the adjacent verb) on each side of the noun's position in the template, select the part of the template that contains the same-class noun as a verb-noun (value) chain segment; group the segments of the same class into a set to form a classification/cluster template, so that each class template may include a number of shorter verb-noun (value) chain segments (a set); and name the class:
{class name a: ... : verb 1 : same-class noun 1 : verb 1 : ... ≈ word-frequency ratio ... : v : (n+n2) : v : ...; ... : verb 2 : same-class noun 2 : verb 2 : ... ≈ word-frequency ratio ... : v : (n+n2) : v : ...; ... verb-noun (value) chain segments (the word frequencies v, n and n2 are positive integers)}
wherein, by the same method, identical, similar, synonymous or near-synonymous verb phrases may also be used to form class templates of similar verbs;
(b) Step 2: for the set of verb-noun (value) chain segments of the same class, the original phrases (all verbs and nouns of the segments) are used as one part of the clues and combined into the clue set (first);
the ordering of the verb-noun (value) chain segments, obtained under the principle of retaining at least one (identical) verb phrase on each side of the same-class noun, is used as one part of the rules and combined into the rule set (second);
the word frequencies of the verb-noun (value) chain segments of the same-class nouns are used as one part of the statistical feature values and combined into the feature value set (third).
In short form and in expanded form respectively:
{class template (data clues + processing rules + feature value set) = class template data clues (..., verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2, ...) + processing rules: phrase chain segment ordering (... : verb 1 : same-class noun 1 : verb 1 : ...; ... : verb 2 : same-class noun 2 : verb 2 : ...) + feature value set (... : verb 1 : same-class noun 1 : verb 1 : ... ≈ word-frequency ratio ... : v : (n+n2) : v : ...; ... : verb 2 : same-class noun 2 : verb 2 : ... ≈ word-frequency ratio ... : v : (n+n2) : v : ...; ... phrase chain segment word-frequency ratios (the word frequencies v, n and n2 are positive integers))}
wherein other classification rules may also be set and, together with the verb-noun (value) chain data-association template classification rules, make up the class template rules;
wherein a unit (department) class template may be made from the verb-noun (value) chain data-association template of a unit (department) data resource;
wherein a personal group class template may be extracted and made from the verb-noun (value) chain data-association template of a personal group data resource;
wherein a unit complex class template may be extracted and made from the verb-noun (value) chain data-association template of a unit complex sample data resource.
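The three parts of a class template (data clues, processing rules, feature value set) can be sketched as a small data structure. The example assumes the chain segments are already available as verb : noun : verb links with their frequencies, and that a noun-to-class lexicon stands in for the corpus or ontology library; both the lexicon and the name `build_category_templates` are hypothetical, and repeated clue words are stored once for brevity.

```python
from collections import defaultdict


def build_category_templates(links, noun_category):
    """Group verb:noun:verb segments into class templates.

    links: list of ((verb, noun, verb), (v, n_total, v)) segments.
    noun_category: noun -> class name (stand-in for an ontology lookup).
    Each class template holds clues (words), rules (segment order)
    and features (word-frequency ratios).
    """
    templates = defaultdict(lambda: {"clues": [], "rules": [], "features": []})
    for words, freqs in links:
        category = noun_category.get(words[1])
        if category is None:
            continue  # noun not covered by the assumed lexicon
        template = templates[category]
        for word in words:
            if word not in template["clues"]:
                template["clues"].append(word)
        template["rules"].append(words)      # segment ordering
        template["features"].append(freqs)   # v : (n+n2) : v
    return dict(templates)


links = [
    (("buys", "phone", "repairs"), (1, 6, 2)),
    (("sells", "laptop", "ships"), (3, 4, 1)),
    (("eats", "apple", "grows"), (2, 2, 2)),
]
noun_category = {"phone": "electronics", "laptop": "electronics", "apple": "food"}
templates = build_category_templates(links, noun_category)
print(templates["electronics"]["clues"])
# ['buys', 'phone', 'repairs', 'sells', 'laptop', 'ships']
```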
Two, comparing class templates (data clues + processing rules + feature value set) comprises the following steps, given
the source class template {data clues (..., verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2, ...) + processing rules: phrase chain segments (... : verb 1 : same-class noun 1 : verb 1 : ...; ... : verb 2 : same-class noun 2 : verb 2 : ...) + feature value set (... : verb 1 : same-class noun 1 : verb 1 : ... ≈ word-frequency ratio ... : v : (n+n2) : v : ...; ... : verb 2 : same-class noun 2 : verb 2 : ... ≈ word-frequency ratio ... : v : (n+n2) : v : ...; ... phrase chain segment word-frequency ratios (the word frequencies v, n and n2 are positive integers))}
and the target class template to be compared (data clues + processing rules + feature value set):
(a) Step 1: following the same rules as the source class template, extract from the data clues of the target class template to be compared (data clues + processing rules + feature value set) the same-class noun phrases that appear in the data clues of the source class template;
if no identical same-class noun phrases are extracted, return to the start;
if all of the above same-class noun phrases are extracted, that is, the data clues (..., verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2, ...), the data clues match and the method proceeds to the next step;
(b) Step 2: classify according to the same classification/clustering rules as the source class template, obtaining the phrase chain segment ordering rules of the same class:
in the verb-noun (value) chain data-association template (set) of the (subject's) data resource of the target to be compared, classify or cluster all noun phrases; following the principle of retaining the same number of adjacent phrases on each side of the same-class noun's position in the template, select the corresponding verb-noun (value) chain segments, group the segments of the same class into a set to form a classification/cluster template, and name the class, obtaining:
{target class name A: phrase ordering rules (ordering of phrase chain segments) ... : verb 11 : same-class noun 11 : verb 11 : ...; ... : verb 22 : same-class noun 22 : verb 22 : ...; ... phrase chain segments}
which is compared with the ordering of the phrase chain segments in
{source class name a: phrase segment ordering rules (... : verb 1 : same-class noun 1 : verb 1 : ...; ... : verb 2 : same-class noun 2 : verb 2 : ...; ... data-association template phrase chain segments)};
if the phrase chain segment ordering ... : verb 11 : same-class noun 11 : verb 11 : ... matches ... : verb 1 : same-class noun 1 : verb 1 : ..., and ... : verb 22 : same-class noun 22 : verb 22 : ... matches ... : verb 2 : same-class noun 2 : verb 2 : ..., and so on until all phrase chain segment orderings of the processing rules match, proceed to the next step;
if the phrase chain segment ordering rules do not match, return to the start;
(c) Step 3: following the same processing rules as the source class template, compare the word-frequency ratios of the phrase chain segments in the two class templates that have the same class name and the same segment ordering:
the feature value set of the target class template {... : verb 11 : same-class noun 11 : verb 11 : ... ≈ word-frequency ratio ... : v : (n+n2) : v : ...; ... : verb 22 : same-class noun 22 : verb 22 : ... ≈ word-frequency ratio ... : v : (n+n2) : v : ...; ... word-frequency ratios (the word frequencies v, n and n2 are positive integers)}
is compared with
the feature value set of the source class template {... : verb 1 : same-class noun 1 : verb 1 : ... ≈ word-frequency ratio ... : v : (n+n2) : v : ...; ... : verb 2 : same-class noun 2 : verb 2 : ... ≈ word-frequency ratio ... : v : (n+n2) : v : ...; ... word-frequency ratios (the word frequencies v, n and n2 are positive integers)};
if
the word-frequency ratio ... : v : (n+n2) : v : ... of ... : verb 11 : same-class noun 11 : verb 11 : ... ≈ that of ... : verb 1 : same-class noun 1 : verb 1 : ...;
the word-frequency ratio ... : v : (n+n2) : v : ... of ... : verb 22 : same-class noun 22 : verb 22 : ... ≈ that of ... : verb 2 : same-class noun 2 : verb 2 : ...;
and so on, with all feature value word-frequency ratios identical or approximately equal, the feature value sets match.
When the data clues of the class templates (data clues + processing rules + feature value set) match, the phrase chain segment orderings of the processing rules are consistent and the word-frequency ratios of the feature value sets are equal, the result is that the two class templates match;
if the feature value word-frequency ratios are not equal, the class template match fails;
wherein, if other classification rules are also used, whether the match succeeds is determined according to those rules as well: the whole template match succeeds only if every part matches, and the failure of any partial match causes the whole template match to fail.
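The three-stage comparison (clues, then rules, then feature values) can be sketched as below. It assumes each class template is a dictionary with "clues", "rules" and "features" entries, compares segments literally rather than by class membership, and treats "approximately equal" as a per-component tolerance on normalized ratios; these choices and the name `match_category_templates` are assumptions made for illustration.

```python
from fractions import Fraction


def _normalize(freqs):
    """Scale a frequency tuple to sum to 1 so that ratios become comparable."""
    total = sum(freqs)
    return tuple(Fraction(f, total) for f in freqs)


def match_category_templates(source, target, tolerance=Fraction(0)):
    """Return 'match' or the name of the first stage that fails."""
    # Stage 1: every clue word of the source must appear in the target.
    if not set(source["clues"]) <= set(target["clues"]):
        return "clues mismatch"
    # Stage 2: the phrase chain segments must be ordered the same way.
    if list(source["rules"]) != list(target["rules"]):
        return "rules mismatch"
    # Stage 3: word-frequency ratios must agree within the tolerance.
    if len(source["features"]) != len(target["features"]):
        return "features mismatch"
    for fs, ft in zip(source["features"], target["features"]):
        if len(fs) != len(ft):
            return "features mismatch"
        if any(abs(a - b) > tolerance for a, b in zip(_normalize(fs), _normalize(ft))):
            return "features mismatch"
    return "match"


source = {"clues": ["buys", "phone", "repairs"],
          "rules": [("buys", "phone", "repairs")],
          "features": [(1, 6, 2)]}
same_ratio = dict(source, features=[(2, 12, 4)])
other_ratio = dict(source, features=[(1, 6, 5)])
print(match_category_templates(source, same_ratio))   # match
print(match_category_templates(source, other_ratio))  # features mismatch
```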
6. According to a fourth aspect of the present invention, a method is provided for making a custom template from verb-noun (value) chain data-association templates, characterized by the following steps:
(a) Step 1: from the verb-noun (value) chain data-association template set of the theme (A) data resource, choose a list of high-frequency nouns (for example, noun A, noun B, ...);
(b) Step 2: match the nouns of this high-frequency noun list against the nouns in the verb-noun (value) chain data-association template set of the target (B) data resource;
(for example, noun A is compared with noun C, noun A and noun D in ... : verb C : noun C : verb A : noun A : verb D : noun D : ... ≈ word-frequency ratio ... : v : (n+n2) : v : (n+n2) : v : (n+n2) : ...;)
(c) Step 3: at the position of the successfully matched noun in the verb-noun (value) chain data-association template of the target (B) data resource (for example, the matched noun A in ... : noun C : verb A : matched noun A : verb D : noun D : ... ≈ word-frequency ratio ... : (n+n2) : v : (n+n2) : v : (n+n2) : ...), choose at least one verb and one noun along the alternating verb/noun phrase chain to the left, to the right, or on both sides;
the alternating verb-noun association chains chosen to the left, right or both sides of that position, excluding the matched noun itself (for example, noun C : verb A ≈ word-frequency ratio (n+n2) : v, and verb D : noun D ≈ word-frequency ratio v : (n+n2)), serve as the data-association custom template (set) between the theme (A) data resource and the target (B) data resource;
wherein, using the verb-noun (value) chain data-association template of a unit (department) data resource as the theme (A) data resource, and the verb-noun (value) chain data-association template of a personal group data resource as the target (B) data resource, a unit supply / personal group demand custom template may be made;
wherein, using the verb-noun (value) chain data-association template of a unit (department) data resource as the theme (A) data resource, and the verb-noun (value) chain data-association template of a unit complex sample data resource as the target (B) data resource, a unit supply / unit complex value chain and supply chain custom template may be made;
wherein, using the verb-noun (value) chain data-association template of a personal data resource as the theme (A) data resource, and the verb-noun (value) chain data-association template of a personal group data resource as the target (B) data resource, an individual / group learning and social contact custom template may be made.
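The three steps of making a custom template can be sketched as follows. The sketch assumes the target chain is stored as an alternating list of (word, part of speech, frequency) items and takes exactly one verb and one noun on each side of the matched noun; the layout, the `top_k` parameter and the name `custom_template` are hypothetical.

```python
def custom_template(theme_noun_freq, target_chain, top_k=1):
    """Pick neighbor segments around theme nouns found in a target chain.

    theme_noun_freq: noun -> word frequency in the theme (A) template.
    target_chain: alternating list of (word, pos, freq), pos in {"noun", "verb"}.
    Returns the (noun, verb) segment to the left and the (verb, noun)
    segment to the right of each matched noun, excluding the noun itself.
    """
    # Step 1: choose the high-frequency nouns of the theme resource.
    top_nouns = sorted(theme_noun_freq, key=lambda n: (-theme_noun_freq[n], n))[:top_k]
    segments = []
    for noun in top_nouns:
        # Step 2: find the identically named noun in the target chain.
        for i, (word, pos, _freq) in enumerate(target_chain):
            if pos != "noun" or word != noun:
                continue
            # Step 3: take one verb and one noun on each side where available.
            left = target_chain[max(i - 2, 0):i]
            right = target_chain[i + 1:i + 3]
            if len(left) == 2:
                segments.append(tuple(left))
            if len(right) == 2:
                segments.append(tuple(right))
    return segments


theme = {"phone": 9, "case": 2}
chain = [("customer", "noun", 5), ("buys", "verb", 2), ("phone", "noun", 7),
         ("needs", "verb", 3), ("charger", "noun", 4)]
for segment in custom_template(theme, chain):
    print(" : ".join(word for word, _pos, _freq in segment))
# customer : buys
# needs : charger
```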
7. According to a fifth aspect of the present invention, an intelligent system for data mining based on a personal mobile device is provided, including corpora, ontology libraries and the like, characterized in that the personal mobile device includes an input/output synchronization module and a template feature extraction and making module, wherein:
(1) the input/output synchronization module of the personal mobile device synchronously or asynchronously copies and collects the personal data resources input and output through the input method, camera, shared memory, caches, temporary file caches, local temporary files recorded and saved by application apps, open network interfaces, navigation APIs and the like on the personal mobile device, and provides them to the template feature extraction and making module; the synchronized data may be pre-processed by desensitizing (masking) it or by extracting feature values from images;
(2) the template feature extraction and making module performs data mining and feature value extraction on the personal data resources synchronized on the personal mobile device, and makes the data model, pattern or template of the personal data;
wherein the data model, pattern or template of the personal data resource may be extracted and made on the personal mobile device;
wherein verb-noun (value) chain data-association templates and class templates may be extracted and made;
The following steps are performed to make the verb-noun (value) chain data correlation template:
(a) Step 1: Preprocess the text data by language identification and part-of-speech tagging, marking the nouns and verbs of each sentence;
and perform syntactic analysis, marking the subject, predicate, and object of each sentence;
Wherein, the subject of a passive-voice sentence may be marked as the object;
Wherein, coreference resolution of subject nouns or object nouns is performed according to the type of entity that the data resource belongs to;
(b) Step 2: Extract the words marked as both subject and noun, the words marked as both predicate and verb, and the words marked as both object and noun, together with the subject-predicate and predicate-object association relations they hold in their respective sentences:
This yields the set of nouns serving as subjects, the set of verbs serving as predicates, and the set of nouns serving as objects, combined as subject-noun set : predicate-verb set and predicate-verb set : object-noun set, i.e. the word pairs
noun : verb
verb : noun
and the (one-to-one) subject-predicate and predicate-object association relations between them;
(c) Step 3: Count the word frequency of each word (during extraction) and annotate it, as an index measuring the weight of words that hold a one-to-one chain association relation:
noun n : verb v and
verb v : noun n2 (the word frequencies v, n, n2 are positive integers);
(d) Step 4: Merge the nouns, and their word frequencies, that repeat between the noun : verb one-to-one chain association pairs and the verb : noun one-to-one chain association pairs, and connect the two pairs into a multi-link association chain:
...
... : verb : same noun
same noun : verb : ...
...
This yields the multi-link phrase chain ... : verb : (merged same) noun : verb : ..., whose link nodes alternate between verbs and nouns, i.e. a multi-link, multi-dimensional alternating verb/noun phrase chain ... : verb : noun : verb : noun : ...; the verb : noun and noun : verb association links are thereby connected in series, cyclically, and may even join end to end to form a closed-loop alternating verb/noun multi-link association phrase chain;
The result is the verb-noun (value) chain data correlation template, in which the weight index of the alternating chain ... : verb : noun : ... is ≈ the word frequency ratio ... : v : (n+n2) : ... (the word frequencies v, n, n2 are positive integers);
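Steps 1 to 4 above can be illustrated with a minimal Python sketch. It assumes the tagging and parsing of step 1 have already reduced each sentence to a (subject noun, predicate verb, object noun) triple; the sample triples and the function names are hypothetical and are not part of the claimed method.

```python
from collections import Counter

# Hypothetical input: each sentence already reduced by tagging and parsing
# to a (subject noun, predicate verb, object noun) triple.
TRIPLES = [
    ("teacher", "assigns", "student"),
    ("student", "reads", "book"),
    ("student", "reads", "paper"),
]

def extract_pairs(triples):
    """Step 2: collect noun:verb and verb:noun pairs (with pair counts)."""
    noun_verb = Counter((s, v) for s, v, _ in triples)
    verb_noun = Counter((v, o) for _, v, o in triples)
    return noun_verb, verb_noun

def word_frequencies(triples):
    """Step 3: word frequency v of each verb, n of each subject noun,
    and n2 of each object noun."""
    verbs = Counter(v for _, v, _ in triples)
    as_subject = Counter(s for s, _, _ in triples)  # n
    as_object = Counter(o for _, _, o in triples)   # n2
    return verbs, as_subject, as_object

def build_chains(triples):
    """Step 4: join verb:noun and noun:verb pairs that share a noun into
    verb:noun:verb links, each with its ratio v : (n + n2) : v."""
    noun_verb, verb_noun = extract_pairs(triples)
    verbs, as_subject, as_object = word_frequencies(triples)
    chains = {}
    for (v_in, noun) in verb_noun:
        for (subject, v_out) in noun_verb:
            if subject == noun:
                merged = as_subject[noun] + as_object[noun]
                chains[(v_in, noun, v_out)] = (verbs[v_in], merged, verbs[v_out])
    return chains

chains = build_chains(TRIPLES)
```

Here "student" is the only noun that appears both as an object and as a subject, so it is the only node at which two pairs are merged into a longer link.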
Wherein, the following steps are further performed to make the class template (data clues + processing rules + characteristic value set):
(a) Step 1: Using natural language processing tools such as the corpus and the ontology knowledge base, classify/cluster all noun phrases in the verb-noun (value) chain data correlation template (set) of the personal (group) data resource; for each noun of a class, on the principle of retaining at least one adjacent phrase (i.e. the adjacent verb) on each side of the noun's position in the template, select the part of the template (containing the same-class noun) that forms a verb-noun (value) chain data correlation template fragment; group the fragments into same-class verb-noun (value) chain fragments (sets) to form a classification/cluster template, and name the class:
class name a {... : verb 1 : same-class noun 1 : verb 1 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... : verb 2 : same-class noun 2 : verb 2 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... verb-noun (value) chain fragments (the word frequencies v, n, n2 are positive integers)}
Wherein, by the same method, identical or similar verb phrases (synonyms and near-synonyms) may also be used to divide class templates of similar verbs;
(b) Step 2: For the set of verb-noun (value) chain fragments of the same class, take the original phrases (all verbs and nouns of the fragments) as clues, forming the clue (first) set;
take as rules the ordering of the matched verb-noun (value) chain fragments, obtained on the (same) principle of retaining at least one adjacent verb phrase on each side of the same-class noun's position in the template, forming the rule (second) set;
take the word frequencies of the verb-noun (value) chain fragments of the same-class nouns as statistical characteristic values, forming the characteristic value (third) set;
The simplified expression and the expanded expression are, respectively:
class template (data clues + processing rules + characteristic value set) = class template {data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...) + processing rules: phrase chain fragment ordering rules (... : verb 1 : same-class noun 1 : verb 1 : ...; ... : verb 2 : same-class noun 2 : verb 2 : ...) + characteristic value set (... : verb 1 : same-class noun 1 : verb 1 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... : verb 2 : same-class noun 2 : verb 2 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... phrase chain fragment word frequency ratios (the word frequencies v, n, n2 are positive integers))}
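As an illustration of the three-part class template described above, the following Python sketch groups verb : noun : verb fragments by the class of their noun and fills the clue, rule, and characteristic value sets. The noun-to-class mapping stands in for the corpus or ontology lookup and, like the sample data and names, is a hypothetical assumption.

```python
from dataclasses import dataclass, field

@dataclass
class ClassTemplate:
    name: str
    clues: set = field(default_factory=set)       # first set: verbs and nouns of the fragments
    rules: list = field(default_factory=list)     # second set: fragment orderings (verb, noun, verb)
    features: dict = field(default_factory=dict)  # third set: fragment -> word frequency ratio

def build_class_templates(fragments, noun_class):
    """Group fragments by noun class.

    fragments:  {(verb_in, noun, verb_out): (v, n + n2, v)}
    noun_class: {noun: class name}, a stand-in for ontology-based classification
    """
    templates = {}
    for fragment, ratio in fragments.items():
        _, noun, _ = fragment
        cls = noun_class.get(noun)
        if cls is None:
            continue  # noun not classified: leave it out of every class template
        template = templates.setdefault(cls, ClassTemplate(cls))
        template.clues.update(fragment)
        template.rules.append(fragment)
        template.features[fragment] = ratio
    return templates

# Hypothetical sample data.
FRAGMENTS = {
    ("buy", "apple", "eat"): (2, 5, 3),
    ("buy", "pear", "eat"): (1, 2, 1),
    ("drive", "car", "park"): (4, 6, 2),
}
NOUN_CLASS = {"apple": "fruit", "pear": "fruit", "car": "vehicle"}

templates = build_class_templates(FRAGMENTS, NOUN_CLASS)
```

Keeping the three sets as separate fields mirrors the text: the clues are only used for lookup, the rules for ordering checks, and the features for the word frequency ratio comparison.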
Wherein, other feature data of the same kind, such as geographic position and time, may also be extracted and classified together with the verb-noun (value) chain data correlation template to form a more complex class template of the following form:
class template {data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...; A <north latitude N1", east longitude E1">, B <north latitude N2", east longitude E2">, C <north latitude N3", east longitude E3">, D <north latitude N4", east longitude E4"> ...) + processing rules: phrase chain fragments (... : verb 1 : same-class noun 1 : verb 1 : ...; ... : verb 2 : same-class noun 2 : verb 2 : ...; place ordering rule A : B : C : D ...) + characteristic value set (... : verb 1 : same-class noun 1 : verb 1 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... : verb 2 : same-class noun 2 : verb 2 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... phrase chain fragment word frequency ratios (the word frequencies v, n, n2 are positive integers); place-time characteristic values A: time1, B: time2, C: time3, D: time4 ...)};
Wherein, identical or similar verb phrases (synonyms and near-synonyms) may also be used to divide class templates of similar verbs;
From the personal sample data resource synchronized by the personal mobile device, the personal class template (data clues + processing rules + characteristic value set) composed of verb-noun (value) chain data correlation templates is obtained; wherein, because the data are in the first person, the subject noun is left as the default (omitted).
8. Based on the above identity attribute matching and comparison system, a preferred scheme is further provided, characterized in that the data mining common platform server includes: a template feature extraction module, a template comparison module, a template library, and a matching result feedback and message communication module; wherein different weights may be set for the extracted word frequencies according to how strongly the personal data resources input and output through the input method, camera, shared memory, cache, temporary file cache, local temporary files saved from application (APP) records, open network interfaces, navigation APIs, etc. relate to the individual; wherein the server or PC of a unit includes a common data mining sharing companion module:
(1) the common data mining sharing companion module is used to converge and share, manually or automatically, the electronic document data resources of the unit (department) held on the unit's (department's) server or PC, in particular text data, to the data mining common platform server; the data resources of many units converge to form the comprehensive office sample data resource;
Wherein, data preprocessing is performed before sharing to exclude electronic documents whose content is identical, or whose content is similar and whose time is the same; the synchronized data may also be preprocessed by data desensitization;
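The exclusion rule in the preprocessing step above can be sketched in Python as follows. The text does not say how similarity is measured, so the use of `difflib.SequenceMatcher`, the 0.9 threshold, and the (text, timestamp) record shape are all assumptions made for illustration.

```python
import hashlib
from difflib import SequenceMatcher

def deduplicate(docs, similarity_threshold=0.9):
    """Drop documents with identical content, or with similar content
    and the same timestamp. docs is a list of (text, timestamp) pairs."""
    kept = []
    seen_hashes = set()
    for text, timestamp in docs:
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in seen_hashes:
            continue  # identical content, regardless of time
        near_duplicate = any(
            kept_time == timestamp
            and SequenceMatcher(None, text, kept_text).ratio() >= similarity_threshold
            for kept_text, kept_time in kept
        )
        if near_duplicate:
            continue  # similar content with the same time
        seen_hashes.add(digest)
        kept.append((text, timestamp))
    return kept

# Hypothetical sample documents.
DOCS = [
    ("quarterly report draft", "2017-05-01"),
    ("quarterly report draft", "2017-05-02"),   # identical content
    ("quarterly report draft.", "2017-05-01"),  # similar content, same time
    ("quarterly report draft!", "2017-06-01"),  # similar content, different time
    ("meeting minutes", "2017-05-01"),
]

unique_docs = deduplicate(DOCS)
```

The pairwise similarity check is quadratic in the number of kept documents, which is acceptable for a sketch but would need an index or hashing scheme at scale.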
(2) the template feature extraction and making module of the data mining common platform server performs data mining on the data resources and makes data models, patterns, or templates;
Wherein, the personal data resources synchronized from many personal mobile devices are pooled to form personal group data, on which data mining and feature value extraction are performed to make a personal group data model, pattern, or template; from the personal group sample data resource converged from personal mobile devices, a personal group class template (data clues + processing rules + characteristic value set) composed of many serialized verb-noun (value) chain data correlation templates is obtained;
Wherein, data mining is performed, and data patterns or templates are made, on: the unit (department) data resources; the comprehensive office sample data resource converged from many units; the personal data resources synchronized from mobile devices; the personal group sample data resource converged from mobile devices; the mixed data resource of the personal group sample data resource and the unit (department) data resources (used for matching unit supply templates with personal demand templates); the mixed data resource of personal data resources and unit data resources (used for matching personal specialty templates with unit post templates); and the total data resource;
Wherein, verb-noun (value) chain data correlation templates and class templates may be made and obtained;
Wherein, the following steps are performed to make the unit supply-personal group demand class template:
(d) Step 1: Compare the class names in the unit (department) class template (fragment) set of the unit (department) data resource with the class names in the personal group class template (fragment) set of the personal group sample data resource serving as the comparison object;
(e) Step 2: If class names match, obtain the shared class names and the two sets that belong to them, of unit (department) class templates (fragments) and of personal group class templates (fragments); directly take the shared class names and the personal group class template (fragment) set of the comparison object that belongs to them to compose the unit supply-personal group demand class template;
The result is: unit supply-personal group demand class template = the set of (shared unit class name + same-named personal group class template (data clues + processing rules + characteristic value set));
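The two steps above amount to intersecting the class names of the two template sets and keeping the personal group side. A minimal Python sketch, assuming each set is a dictionary keyed by class name (the dictionary shape and sample values are hypothetical):

```python
def supply_demand_templates(unit_templates, group_templates):
    """Step 1: find class names present in both sets.
    Step 2: for each shared name, adopt the personal group class template."""
    shared_names = unit_templates.keys() & group_templates.keys()
    return {name: group_templates[name] for name in sorted(shared_names)}

# Hypothetical sample data: class name -> class template (any representation).
UNIT_TEMPLATES = {"fruit": "unit fruit template", "vehicle": "unit vehicle template"}
GROUP_TEMPLATES = {"fruit": "group fruit template", "music": "group music template"}

supply_demand = supply_demand_templates(UNIT_TEMPLATES, GROUP_TEMPLATES)
```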
(3) the template comparison module is used to compare the characteristic values of the data templates of data resources of different entities;
Wherein, the characteristic value comparison method for verb-noun (value) chain data correlation templates and class templates may be used for mutual matching and comparison of characteristic values among the verb-noun (value) chain data correlation templates and class templates of a unit (department), of the comprehensive office sample, of an individual, and of a personal group, and the holographic verb-noun (value) chain data correlation templates and class templates;
Wherein, the comparison of verb-noun (value) chain data correlation templates performs the following steps:
(a) Step 1: Compare, phrase by phrase, the verb-noun (value) chain data correlation patterns or templates (phrase sets) made from two different (entity) data resources;
(b) Step 2: The identical phrases found by the comparison may be: identical verbs; identical nouns; an identical verb with an identical noun; an identical noun with an identical verb; or identical alternating verb/noun multi-link phrase chains; wherein, if adverbs are present, identical adverbs may be added to the comparison, i.e. identical noun, identical adverb and identical verb, or identical adverb, identical verb and identical noun;
(c) Step 3: Compare the word frequency ratios of the identical phrases;
(d) Step 4: Output the result:
One, the word frequency ratios of the identical phrases are equal:
identical verb and identical noun: the word frequency ratios v : n2 are equal;
identical noun and identical verb: the word frequency ratios n : v are equal;
identical alternating verb/noun multi-link phrase chains: in the word frequency ratio ... : v : (n+n2) : v : ...,
1. the components (n, n2) are each equal;
2. the totals (n+n2) are equal;
the comparison of the verb-noun (value) chain data correlation templates succeeds;
Two, the word frequency ratios of the identical phrases are not equal:
identical verbs:
display the ranking of the associated high-frequency nouns;
identical nouns:
display the ranking of the associated high-frequency verbs;
identical verb and identical noun:
the share of the noun word frequency relative to the verb word frequency is inversely proportional to the word frequency ratio; display the difference in word frequency ratio;
identical noun and identical verb:
the share of the noun word frequency relative to the verb word frequency is directly proportional to the word frequency ratio; display the difference in word frequency ratio;
identical alternating verb/noun multi-link phrase chains:
display the difference in the word frequency ratios of the nouns and verbs;
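The comparison steps and the two kinds of output above can be sketched for the simplest case, verb : noun pairs, as follows. Each template is assumed to be a dictionary from a (verb, noun) pair to its frequencies (v, n2); exact fractions are used so that proportional ratios such as 2 : 4 and 1 : 2 compare equal. The data shapes and names are hypothetical.

```python
from fractions import Fraction

def compare_pair_templates(template_a, template_b):
    """Steps 1-2: find the identical (verb, noun) phrases.
    Steps 3-4: compare their word frequency ratios v : n2 and report,
    per phrase, whether they are equal and by how much they differ."""
    results = {}
    for pair in template_a.keys() & template_b.keys():
        ratio_a = Fraction(*template_a[pair])
        ratio_b = Fraction(*template_b[pair])
        results[pair] = {"equal": ratio_a == ratio_b, "difference": ratio_a - ratio_b}
    return results

# Hypothetical sample templates: (verb, noun) -> (v, n2).
TEMPLATE_A = {("read", "book"): (2, 4), ("write", "code"): (3, 1)}
TEMPLATE_B = {("read", "book"): (1, 2), ("write", "code"): (1, 1), ("play", "chess"): (1, 1)}

comparison = compare_pair_templates(TEMPLATE_A, TEMPLATE_B)
```

Phrases present in only one template, such as ("play", "chess") here, are simply left out of the result, since step 3 applies only to identical phrases.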
Wherein, the verb-noun (value) chain data correlation templates made from personal data resources may be compared with one another on a selected topic noun (for example, interest, hobby, specialty, etc.), and matching yields the similar verb-noun value chain association relations of individuals on the matched topic;
Or, for a given noun, the verb-noun (value) chain data correlation template made from the personal data resource may be compared with the verb-noun (value) chain data correlation template made from the personal group sample data resource, to obtain the individual's position and standing, expressed as a difference value of verb word frequency, in the (given) noun-verb action (value) chain links (interest, hobby, specialty, etc.) within the personal group data resource as a whole;
Wherein, high-frequency noun phrases are selected from the verb-noun (value) chain data correlation template made from a unit (department) data resource and compared with the verb-noun (value) chain data correlation template made from the comprehensive office sample data resource of many units, or from all data resources, to obtain the unit's (department's) position and standing, expressed as a difference value of verb word frequency, in the (high-frequency) noun-verb action (value) chain links within the comprehensive office sample data resource as a whole or within all data resources as a whole;
Wherein, high-frequency noun phrases are selected from the verb-noun (value) chain data correlation template made from a unit (department) data resource and compared with the verb-noun (value) chain data correlation template of the personal group sample data resource; the verb-noun (value) chain data correlation template containing a successfully matched noun may serve as the verb-noun (value) chain data correlation template of the matching relation between supply and personal group demand, with the unit's (department's) (high-frequency) noun as its topic;
Wherein, the original class template {data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...) + processing rules: phrase chain fragments (... : verb 1 : same-class noun 1 : verb 1 : ...; ... : verb 2 : same-class noun 2 : verb 2 : ...) + characteristic value set (... : verb 1 : same-class noun 1 : verb 1 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... : verb 2 : same-class noun 2 : verb 2 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... phrase chain fragment word frequency ratios (the word frequencies v, n, n2 are positive integers))}
is compared with the object class template (data clues + processing rules + characteristic value set);
The comparison of class templates (data clues + processing rules + characteristic value set) performs the following steps:
(a) Step 1: Following the same rules as the original class template, extract, from the data clues of the object class template (data clues + processing rules + characteristic value set) of the (entity) object to be compared, the phrases with the same names as those in the data clues of the original class template;
If no same-named phrases are extracted, return to the start;
If all of the above same-named phrases are extracted, the data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...) match successfully; proceed to the next step;
(b) Step 2: Classify according to the same classification/clustering rules as the original class template to obtain the ordering of the phrase chain fragments of the same class:
In the verb-noun (value) chain data correlation template (set) of the (entity) data resource of the object to be compared, classify/cluster all noun phrases; on the principle of retaining the same number of adjacent phrases on each side of the same-class noun's position in the template, select the same part of the verb-noun (value) chain fragments, group them into same-class verb-noun (value) chain fragment sets to form a classification/cluster template, and name the class, obtaining:
object class name A {processing rules (phrase chain fragments) ... : verb 11 : same-class noun 11 : verb 11 : ...; ... : verb 22 : same-class noun 22 : verb 22 : ...; ... phrase chain fragments}
which is compared with the phrase chain fragment ordering rules in
original class name a processing rule phrase fragments (... : verb 1 : same-class noun 1 : verb 1 : ...; ... : verb 2 : same-class noun 2 : verb 2 : ...; ... data correlation template phrase chain fragments);
If the phrase chain fragment ordering rule ... : verb 11 : same-class noun 11 : verb 11 : ... is consistent with ... : verb 1 : same-class noun 1 : verb 1 : ..., and ... : verb 22 : same-class noun 22 : verb 22 : ... is consistent with ... : verb 2 : same-class noun 2 : verb 2 : ..., and so on until all phrase chain fragment ordering rules are consistent, proceed to the next step;
If any phrase chain fragment of the processing rules does not match, return to the start;
(c) Step 3: Following the same ordering rules as the original class template, compare the word frequency ratios of the same-named, identically ordered phrase chain fragments of the same class in the two class templates:
the characteristic value set of the object class template being compared {... : verb 11 : same-class noun 11 : verb 11 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... : verb 22 : same-class noun 22 : verb 22 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... word frequency ratios (the word frequencies v, n, n2 are positive integers)}
is compared with
the characteristic value set of the original class template {... : verb 1 : same-class noun 1 : verb 1 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... : verb 2 : same-class noun 2 : verb 2 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... word frequency ratios (the word frequencies v, n, n2 are positive integers)};
If:
the word frequency ratio ... : v : (n+n2) : v : ... of ... : verb 11 : same-class noun 11 : verb 11 : ... ≈ that of ... : verb 1 : same-class noun 1 : verb 1 : ...;
the word frequency ratio ... : v : (n+n2) : v : ... of ... : verb 22 : same-class noun 22 : verb 22 : ... ≈ that of ... : verb 2 : same-class noun 2 : verb 2 : ...;
and so on, with all characteristic value word frequency ratios identical or approximately equal, the characteristic value sets match successfully;
When, in the class template (data clues + processing rules + characteristic value set), the data clues match successfully, the phrase chain fragment ordering rules are consistent, and the word frequency ratios of the characteristic value set are equal, the final result is that the two class templates match successfully;
If the characteristic value word frequency ratios are not equal, the class template match fails;
Wherein, if other classification rules are set to extract other feature data, whether the comparison succeeds is determined according to those rules; the whole template match succeeds only when every part matches, and the failure of any partial match causes the whole template match to fail;
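The three-stage class template comparison above (clues, then ordering rules, then word frequency ratios) can be sketched in Python. This sketch assumes both templates have already been mapped to class-level names, so that "verb 11 : same-class noun 11" and "verb 1 : same-class noun 1" share one key, and it tests exact proportionality rather than the approximate equality the text also allows. The dictionary layout and sample values are hypothetical.

```python
from fractions import Fraction

def normalize(ratio):
    """Scale a ratio such as (2, 6, 4) by its first term so that
    proportional ratios compare equal."""
    first = ratio[0]
    return tuple(Fraction(term, first) for term in ratio)

def match_class_templates(original, candidate):
    """Return (matched, stage), where stage names the step that decided the result."""
    # Step 1: every clue phrase of the original must be found in the candidate.
    if not original["clues"] <= candidate["clues"]:
        return False, "clues"
    # Step 2: the phrase chain fragment ordering rules must be consistent.
    if original["rules"] != candidate["rules"]:
        return False, "rules"
    # Step 3: the word frequency ratios of every fragment must agree.
    for fragment in original["rules"]:
        if normalize(original["features"][fragment]) != normalize(candidate["features"][fragment]):
            return False, "features"
    return True, "matched"

# Hypothetical sample templates.
FRAGMENT = ("buy", "fruit", "eat")
ORIGINAL = {"clues": {"buy", "fruit", "eat"}, "rules": [FRAGMENT], "features": {FRAGMENT: (1, 3, 2)}}
SAME_SHAPE = {"clues": {"buy", "fruit", "eat", "sell"}, "rules": [FRAGMENT], "features": {FRAGMENT: (2, 6, 4)}}
OTHER_RATIO = {"clues": {"buy", "fruit", "eat"}, "rules": [FRAGMENT], "features": {FRAGMENT: (2, 6, 5)}}

result_same = match_class_templates(ORIGINAL, SAME_SHAPE)
result_other = match_class_templates(ORIGINAL, OTHER_RATIO)
```

Returning the failing stage makes the "return to the start" branches of the text visible to a caller, which can then decide whether to try another candidate.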
Wherein, the template comparison module may also be included on the personal mobile device;
(4) the template library is used to store the templates of the data resources of the various entities,
Wherein, it stores template sets including: the verb-noun (value) chain data correlation templates and class templates of units (departments), of the comprehensive office sample, of individuals, and of personal groups; the holographic verb-noun (value) chain data correlation templates and class templates; the verb-noun (value) chain data correlation templates of the mixed data resource of units (departments) and personal groups; and the verb-noun (value) chain data correlation templates of the mixed data resource of individuals and units (department posts);
(5) the matching result feedback and message communication module is used to feed the message data of each successful template comparison back to the device of the corresponding data resource entity, and for interactive message communication between them; wherein it is also used to feed back to the device of the corresponding data resource entity the message data of successful matches from the comparison module for verb-noun (value) chain data correlation templates or class templates (data clues + processing rules + characteristic value set);
Wherein, an independent (secure) chip processor on the personal mobile device may be used to run the personal mobile device input/output synchronization module and the template feature extraction and making module.
9. According to the sixth aspect of the present invention, there is provided an intelligent system based on a personal mobile device that applies custom templates or class templates (data clues + processing rules + characteristic value set) of verb-noun (value) chain data correlation templates, characterized in that the personal mobile device includes: an input/output data synchronization module, a template clue filtering module, a template matching comparison module, a personal clue library, a template library, and a module for outputting and displaying the corresponding agent service content; wherein,
(1) the input/output data synchronization module is used to synchronously replicate and collect the data input and output through the input method, camera, shared memory, cache, temporary file cache, local temporary files saved from application (APP) records, open network interfaces, navigation APIs, etc. on the personal mobile device;
wherein the synchronized data may be preprocessed by data desensitization (bleaching) or by image feature value extraction;
(2) the template clue filtering module compares the data collected by the input/output data synchronization module, one item at a time, against all verbs, nouns, and other phrases in the custom templates made from verb-noun (value) chain data correlation templates in the template library, or against the clue data in the class templates (data clues + processing rules + characteristic value set); the successfully matched result data are recorded to build the personal clue library, and the cumulative number of matches is recorded;
Wherein, according to how strongly the personal data resources input and output through the input method, camera, shared memory, cache, temporary file cache, local temporary files saved from application (APP) records, open network interfaces, navigation APIs, etc. relate to the individual, different weights may be set for the accumulated word frequency or record count of the filtered data;
(3) the template matching comparison module compares the templates of the template library with the templates extracted from the personal clue library;
Wherein, the template library includes but is not limited to custom templates of verb-noun (value) chain data correlation templates for unit supply-personal group demand (for example, noun C : verb A ≈ word frequency ratio n : v, and verb D : noun D ≈ word frequency ratio v : n2), which are compared with the same-named verb-noun (value) chain data correlation templates made by extraction from the personal clue library (for example, noun C : verb A ≈ word frequency ratio n : v, and verb D : noun D ≈ word frequency ratio v : n2); if the word frequency ratios of the same-named phrases are identical or approximately equal, the unit supply-personal group demand custom template matches successfully; otherwise the match fails;
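The "identical or approximately equal" test on word frequency ratios can be sketched with a relative tolerance. The text does not define how close two ratios must be, so the 10% tolerance below is an assumed value, and the phrase names follow the noun C : verb A and verb D : noun D example only loosely.

```python
import math

def ratios_match(template_ratio, observed_ratio, rel_tol=0.1):
    """Compare two word frequency ratios a : b within a relative tolerance.
    Word frequencies are positive integers, so division is safe."""
    expected = template_ratio[0] / template_ratio[1]
    observed = observed_ratio[0] / observed_ratio[1]
    return math.isclose(expected, observed, rel_tol=rel_tol)

def custom_template_matches(template, observed, rel_tol=0.1):
    """Every phrase pair of the template must exist in the observed template
    with an identical or approximately equal ratio."""
    return all(
        pair in observed and ratios_match(ratio, observed[pair], rel_tol)
        for pair, ratio in template.items()
    )

# Hypothetical sample data: phrase pair -> word frequencies.
TEMPLATE = {("noun_c", "verb_a"): (3, 2), ("verb_d", "noun_d"): (1, 4)}
CLOSE = {("noun_c", "verb_a"): (31, 20), ("verb_d", "noun_d"): (2, 8)}
FAR = {("noun_c", "verb_a"): (3, 1), ("verb_d", "noun_d"): (1, 4)}

close_result = custom_template_matches(TEMPLATE, CLOSE)
far_result = custom_template_matches(TEMPLATE, FAR)
```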
Wherein, given that the unit supply-personal group demand class template = the set of (shared unit class name + same-named personal group class template (data clues + processing rules + characteristic value set)), the unit supply-personal group demand class template in the template library is compared with the same-named class template (data clues + processing rules + characteristic value set) made by extraction from the personal clue library,
the unit supply-personal group demand class template {data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...) + processing rules: phrase chain fragment ordering (... : verb 1 : same-class noun 1 : verb 1 : ...; ... : verb 2 : same-class noun 2 : verb 2 : ...) + characteristic value set (... : verb 1 : same-class noun 1 : verb 1 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... : verb 2 : same-class noun 2 : verb 2 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... phrase chain fragment word frequency ratios (the word frequencies v, n, n2 are positive integers))}
being compared with the same-named class template (data clues + processing rules + characteristic value set) made by extraction from the personal clue library of the comparison object;
The following class template comparison steps are performed:
(a) Step 1: Following the same rules as the unit supply-personal group demand class template, extract, from the data clues of the object class template (data clues + processing rules + characteristic value set) of the (entity) object to be compared, the phrases with the same names as those in the data clues of the unit supply-personal group demand class template;
If no same-named phrases are extracted, return to the start;
If all of the above same-named phrases are extracted, the data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...) match successfully; proceed to the next step;
(b) Step 2: Classify according to the same classification/clustering rules as the unit supply-personal group demand class template to obtain the phrase chain fragments of the same class:
In the verb-noun (value) chain data correlation template (set) of the (entity) data resource of the object to be compared, classify/cluster all noun phrases; on the principle of retaining the same number of adjacent phrases on each side of the same-class noun's position in the template, select the same part of the verb-noun (value) chain fragments, group them into same-class verb-noun (value) chain fragment sets to form a classification/cluster template, and name the class, obtaining:
object class name A {processing rules (phrase chain fragments) ... : verb 11 : same-class noun 11 : verb 11 : ...; ... : verb 22 : same-class noun 22 : verb 22 : ...; ... phrase chain fragments}
which is compared with the phrase chain fragment ordering rules in
unit supply-personal group demand class template a processing rule phrase fragments (... : verb 1 : same-class noun 1 : verb 1 : ...; ... : verb 2 : same-class noun 2 : verb 2 : ...; ... data correlation template phrase chain fragments);
If the phrase chain fragment ordering ... : verb 11 : same-class noun 11 : verb 11 : ... of the processing rules is consistent with ... : verb 1 : same-class noun 1 : verb 1 : ..., and ... : verb 22 : same-class noun 22 : verb 22 : ... is consistent with ... : verb 2 : same-class noun 2 : verb 2 : ..., and so on until all phrase chain fragment orderings of the processing rules are consistent, proceed to the next step;
If any phrase chain fragment ordering of the processing rules does not match, return to the start;
(c) Step 3: Following the same processing rules as the unit supply-personal group demand class template, compare the word frequency ratios of the same-named, identically ordered phrase chain fragments of the same class in the two class templates:
the characteristic value set of the object class template being compared {... : verb 11 : same-class noun 11 : verb 11 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... : verb 22 : same-class noun 22 : verb 22 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... word frequency ratios (the word frequencies v, n, n2 are positive integers)}
is compared with
the characteristic value set of the unit supply-personal group demand class template {... : verb 1 : same-class noun 1 : verb 1 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... : verb 2 : same-class noun 2 : verb 2 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... word frequency ratios (the word frequencies v, n, n2 are positive integers)};
If:
the word frequency ratio ... : v : (n+n2) : v : ... of ... : verb 11 : same-class noun 11 : verb 11 : ... ≈ that of ... : verb 1 : same-class noun 1 : verb 1 : ...;
the word frequency ratio ... : v : (n+n2) : v : ... of ... : verb 22 : same-class noun 22 : verb 22 : ... ≈ that of ... : verb 2 : same-class noun 2 : verb 2 : ...;
and so on, with all characteristic value word frequency ratios identical or approximately equal, the characteristic value sets match successfully;
When, in the same-named class template (data clues + processing rules + characteristic value set) made by extraction from the personal clue library, the data clues match successfully, the phrase chain fragment orderings of the processing rules are consistent, and the word frequency ratios of the characteristic value set are equal, the final result is that the two class templates match successfully;
If the characteristic value word frequency ratios are not equal, the class template match fails;
A successful match of either the unit supply-personal group demand custom template or the same-named personal group class template constitutes a successful template matching comparison;
(4) the module for outputting and displaying the corresponding data service content is used to output and display on the personal mobile device, after a template comparison in the template matching comparison module succeeds, the corresponding data service content that has been set;
(5) the template library is used to store the set of unit supply-personal group demand custom templates made from verb-noun (value) chain data correlation templates and the set of unit supply-personal group demand class templates;
Wherein, subject to permission given through the personal mobile device, updated unit supply-personal group demand custom templates, individual-group learning association custom templates, and other templates may be downloaded to the template library;
(6) the personal clue library is composed of the data obtained when the data synchronized by the input/output data synchronization module are filtered against all verbs, nouns, and other phrases of the custom templates of verb-noun (value) chain data correlation templates and against the clue data in the class templates.
10. According to the seventh aspect of the present invention, there is provided an intelligent system that performs data mining based on a personal mobile device and applies verb-noun (value) chain data correlation templates, characterized in that the personal mobile device includes: an input method software data synchronization module, a verb filtering and template generation module, a template sending management and matching result feedback interactive communication module, a personalized template library, and a verb library;
the data mining common platform server includes: a template receiving management and matching result feedback interactive communication module, a template comparison module, and a template library;
Wherein, the personal mobile device includes:
(1) an input method software data synchronization module, for synchronously copying and collecting the data entered through the input method on the personal mobile device;
(2) a verb filtering and template generation module, which compares the data collected by the input method software data synchronization module against the verbs in the verb library in turn, and performs the following steps to generate personalized templates:
(a) step 1: filter the text data collected by the input method software data synchronization module against the set of all common verbs in the verb library, in turn;
(b) step 2: perform part-of-speech tagging on each sentence containing a successfully matched filter verb, and mark the nouns in the sentence;
Also perform syntactic analysis on each such sentence and, as far as possible, mark the subject, predicate and object of the sentence;
Determine whether the filtered verb is the predicate verb;
Wherein, optionally, punctuation marks are automatically added to the time-separated text data collected from the input method, and reference resolution is performed on subject nouns or object nouns;
(c) step 3: if the filtered verb is the predicate verb, extract from the sentence the phrase marked as both predicate and verb, the phrase marked as both subject and noun, and the phrase marked as both object and noun, together with their one-to-one subject-predicate / predicate-object association within the sentence: subject noun : predicate verb / predicate verb : object noun;
(d) step 4: save the noun : verb / verb : noun phrase combinations extracted in step 3, and the (one-to-one) subject-predicate / predicate-object associations between them, to the personalized template library;
Compare the data collected by the input method software data synchronization module against the verb : noun / noun : verb phrase combinations in the personalized template library in turn; record and mark the matching word frequency of each phrase as the index that measures the weight of two phrases having a one-to-one chain association, namely verb word frequency v : noun word frequency n2 / noun word frequency n : verb word frequency v (word frequencies v, n, n2 are positive integers); and obtain the verb-noun (value) chain data correlation template (set) of verb : noun ≈ word frequency ratio v : n2 / noun : verb ≈ word frequency ratio n : v;
(3) a template sending management and matching result feedback interactive communication module, through which the user of the personal mobile device displays and manages the personalized template library, sends templates to the data mining public platform server for comparison with the templates of a specified subject's data resources, and manages interactive communication between the user and the corresponding party;
(4) a personalized template library, for storing the set of personalized templates generated by the verb filtering and template generation module;
(5) a verb library, for storing common verbs;
Wherein, common Chinese verbs include, but are not limited to, the following:
Represent action behavior:Say, see, walking, listening, laughing at, taking, circling in the air, running, eating, singing, drinking, striking, sitting, shouting, staring at, kicking, hearing,
Touch, criticize, publicizing, safeguarding, learning, studying, carry out, start, stopping, forbidding
Represent that change be present disappears:, it is dead, have, be equal to, occur, develop, develop, grow, it is dead, exist, eliminate
Represent psychological activity:Think, like, hating, primary, miss, intend, liking, wishing, evil primary, worry, be disagreeable, feeling, thinking
Represent to judge:It is, be, is
Representing may wish necessity (auxiliary verb):Can, can, meeting, can with, be willing to, be ready, agree, dare, should, should,
Match somebody with somebody, be worth, would rather
Represent to tend to (directional verb):It is upper and lower, into and out of, return, open, cross,, come up, get off, come in, come out, return
Come, come, come, go, up, go down, enter, go out it is main, go back, open, the past
Represent development verb:As grown, withering, germinateing, result, spawning;
For plan, system, scheme, file etc.:
Work out, work out, draft, draft, authorize, audit, examine, transmit, deliver, submit, report, assign, put on record, deposit
Shelves, present one's view
For information, data:
Investigate, study, collect, arrange, analyze, conclude, analyze, summarize, provide, report, feed back, pass on, notify, send out
Cloth, maintenance management
On a certain work (higher level):
Preside over, organize, instructing, arranging, coordinating, indicating, supervising, managing, distributing, controlling, take the lead it is responsible, examination & approval, authorization,
Sign and issue, ratify, assess
Thinking behavior:
Research, analysis, assess, development, suggest, proposal, participate in, recommend, plan
Direct action:
Organize, carry out, performing, instructing, leading, controlling, supervising, use, production, participate in, illustrate, explaining, providing, assisting
Higher level's behavior:
License, ratify, define, determining, instructing, establishing, planning, supervising, determining
Administration behaviour:
Reach, assess, control, coordinate, ensure, identify, keep, supervise
Expert's behavior:
Analyze, assist, promote, get in touch with, suggest, recommend, support, assess, evaluate
Subordinate's behavior:
Check, check, collect, obtain, submit, make
Other:Maintain, keep, establish, exploitation, prepare, processing, perform, reception, arrange, monitoring, report, manage, really
Recognize, generalities, cooperation, cooperation, acquisition, verification, inspection, contact, design, test, construction, change, write, drafting, guiding, passing
Pass, translate, operating, ensureing, preventing, solving, introducing, paying, calculating, revising, undertaking, negotiating, conferring, interviewing, refusing, be no
Certainly, monitor, predict, compare, delete, use
Wherein, the data mining public platform server includes:
(1) a template receiving management and matching result feedback interactive communication module, for receiving the personalized templates sent from the personal mobile device and forwarding them to the template comparison module for comparison with the specified templates in the template library;
It feeds the matching result data back to the personal device, and carries out interactive communication between the personal device and the device of the subject whose template was compared;
(2) a template comparison module, for receiving the personalized template and comparing it with the template of the specified subject's data resources from the template library; and for feeding the matching result back to the personal device through the template receiving management and matching result feedback interactive communication module;
(3) a template library, for storing the templates of the data resources of various subjects;
Wherein, the templates include, but are not limited to, template sets such as the verb-noun (value) chain data correlation templates and class templates of a unit (department), of an office complex sample, of an individual, of a personal group, and of holographic data.
The effects of the present invention are: giving data value and making information intelligent.
1. For businesses and institutions, patterns (that is, valuable information and knowledge) are extracted from large amounts of unstructured data (such as Word, PDF, text document excerpts, XML files and so on), and models, patterns and templates of the distribution of the upstream, downstream and peripheral supply chain and of the extended value chain are established, providing simple and convenient data support for guiding a business or institution through a strategic transformation toward Internet thinking.
2. Non-professional staff of small and medium-sized enterprises, and makers engaged in mass entrepreneurship and innovation, generally need professional training and must learn professional market research and product analysis, which further lengthens the time needed to grasp market demand. By taking the needs of the small or medium-sized business or institution as the theme and performing data mining on its users or service objects, the unit supply-personal group demand characteristic templates of those users or service objects are obtained, so that intelligent analysis and targeted recommendation of products and services can be carried out across a wider range of the consumption process, playing the role of a marketing expert data support system.
3. Information-source subjects and consumption-demand-creating subjects use personal intelligent mobile terminals to extract their own data patterns and to share and match them at scale, in a socialized way, with relevant institutions or other people, so as to accurately find objects for specialties, interests, hobbies, value and services, or for learning, social interaction and collaboration; and, supported by the value chain of massive personal "common property data", they intelligently improve the quality of learning, social and cultural life and the economic benefit of working collaboration and exchange.
Brief description of the drawings
Other features and advantages of the present invention are explained, together with its principle, by the following examples with reference to the accompanying drawings, and will become more apparent from the description of the preferred embodiments.
Fig. 1 shows a schematic diagram of the social function subjects produced by information sources and their data resources;
Fig. 2 shows a flow chart of one embodiment of the method of extracting and making a data correlation characteristic value pattern or template;
Fig. 3 shows a flow chart of another embodiment of the method of extracting and making a data correlation characteristic value pattern or template;
Fig. 4 shows a flow chart of one embodiment of the comparison method for data correlation characteristic value patterns or templates;
Fig. 5 shows a flow chart of one embodiment of the method of making class templates from verb-noun (value) chain data correlation templates;
Fig. 6 shows a flow chart of one embodiment of the class template comparison method;
Fig. 7 shows a flow chart of the method of making custom templates from verb-noun (value) chain data correlation templates according to one embodiment of the present invention;
Fig. 8 shows a system structure diagram of one embodiment of the intelligent system that performs data mining based on a personal mobile device;
Fig. 9 shows a system structure diagram of another embodiment of the intelligent system that performs data mining based on a personal mobile device;
Fig. 10 shows a flow chart of the method of making unit supply-personal group demand class templates in a further embodiment of the intelligent system that performs data mining based on a personal mobile device;
Fig. 11 shows a system structure diagram of one embodiment of the intelligent system that, based on a personal mobile device, uses custom templates or class templates (data clues + processing rules + characteristic value set) of verb-noun (value) chain data correlation templates;
Fig. 12 shows a system structure diagram of one embodiment of the intelligent system that performs data mining based on a personal mobile device and uses verb-noun (value) chain data correlation templates;
Fig. 13 shows a flow chart of the verb filtering and personalized template generation method of one embodiment of the intelligent system that performs data mining based on a personal mobile device and uses verb-noun (value) chain data correlation templates;
Fig. 14 shows a hardware and system structure diagram of a general-purpose computer or microcontroller.
Detailed description of the embodiments
For clarity and conciseness, not all features of an actual implementation are described in the specification. It should be understood, however, that in developing any such actual embodiment many implementation-specific decisions must be made to achieve the developer's specific goals, for example compliance with system-related and business-related constraints, and these constraints may vary from one implementation to another. It should also be noted that, to avoid obscuring the present invention with unnecessary detail, the drawings show only the device structures and/or processing steps closely related to the solution of the present invention, and other details of little relevance to the present invention are omitted.
According to a specific Chinese-language embodiment of the method of extracting and making a data correlation characteristic value pattern or template, and with reference to Fig. 2, the following steps are performed to make a verb-noun (value) chain data correlation pattern or template:
S201 step 1: determine the language; mark the nouns and verbs of each sentence; mark the subject, predicate and object of each sentence.
ICTCLAS, the Chinese lexical analysis system based on a multi-layer hidden Markov model developed by the Institute of Computing Technology of the Chinese Academy of Sciences, may be used to segment the input document and to tag the nouns and verbs of each sentence with their parts of speech.
A syntactic analysis tool is used at the same time to parse the text data and mark the subject, predicate and object of each sentence.
The input text undergoes operations such as word segmentation, part-of-speech tagging, named entity recognition and dependency parsing. Dependency parsing means parsing a sentence into a tree in which the core verb of the sentence occupies the central position and governs the other words; every other word depends directly on some word, and no word depends on two or more other words at the same time. Named entity recognition means identifying the words in the text that denote real-world entity concepts. Reference resolution is used to restore the entities that pronouns and similar expressions refer to; reference resolution of subject nouns or object nouns is carried out according to the type of subject that owns the data resource.
Wherein, the subject of a passive-voice sentence may be marked as the object;
Because these operations are not closely related to the gist of the present invention and can be carried out with the prior art, they are not described in detail here.
S202 step 2: extract the overlapping subject nouns, predicate verbs and object nouns, and the subject-predicate / predicate-object relations.
From the set of sentences, extract the phrases marked as both subject and noun, the phrases marked as both predicate and verb, and the phrases marked as both object and noun, obtaining respectively the set of nouns serving as subjects, the set of predicate verbs, and the set of nouns serving as objects, together with their subject-predicate / predicate-object associations within each sentence, namely subject noun set : predicate verb set / predicate verb set : object noun set; that is, the noun : verb / verb : noun phrase combinations and the (one-to-one) subject-predicate / predicate-object association between them;
Example data such as the following Table 1, Table 2 and Table 3 can be obtained:
Table 1
Table 2
Table 3
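The extraction of step S202 can be sketched as follows. This is only an illustrative sketch: it assumes that segmentation, part-of-speech tagging and syntactic analysis have already been done by an external tool, and the `Token` structure, the tag values ("n", "v", "subj", "pred", "obj") and the function name `extract_pairs` are hypothetical names introduced here, not part of the described method.

```python
from typing import NamedTuple

class Token(NamedTuple):
    text: str
    pos: str   # part of speech (assumed tags): "n" noun, "v" verb, other
    role: str  # grammatical role (assumed tags): "subj", "pred", "obj", ""

def extract_pairs(sentence):
    """Keep only phrases whose part of speech and role overlap as required,
    and return (subject noun, predicate verb) and (predicate verb, object noun) pairs."""
    subjects = [t.text for t in sentence if t.role == "subj" and t.pos == "n"]
    predicates = [t.text for t in sentence if t.role == "pred" and t.pos == "v"]
    objects = [t.text for t in sentence if t.role == "obj" and t.pos == "n"]
    noun_verb = [(s, p) for s in subjects for p in predicates]
    verb_noun = [(p, o) for p in predicates for o in objects]
    return noun_verb, verb_noun

# One already-tagged sentence, using the phrases of the example tables.
sentence = [
    Token("teacher", "n", "subj"),
    Token("education", "v", "pred"),
    Token("schoolboy", "n", "obj"),
]
noun_verb, verb_noun = extract_pairs(sentence)
print(noun_verb)  # [('teacher', 'education')]
print(verb_noun)  # [('education', 'schoolboy')]
```

A subject that is tagged as a pronoun rather than a noun yields no pair in this sketch, which is why reference resolution is applied beforehand.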
S203 step 3: accumulate word frequency statistics and take the high-frequency phrases as the pattern/template.
The word frequencies of the subject nouns, predicate verbs and object nouns are accumulated separately and marked as the characteristic value that measures the weight of the phrases having a (one-to-one) association, subject noun set : predicate verb set / predicate verb set : object noun set,
That is, subject noun word frequency n : predicate verb word frequency v
and predicate verb word frequency v : object noun word frequency n2 (word frequencies v, n, n2 are positive integers),
Obtaining, for the (subject's) data resource, the association weights
Noun : verb ≈ word frequency ratio n : v (set) and
Verb : noun ≈ word frequency ratio v : n2 (set),
The high-frequency phrases chosen from these sets form the verb-noun (value) chain data correlation pattern or template (set).
Marking the example data of Table 1, Table 2 and Table 3 with word frequencies gives the phrase word frequency data (sets) of Table 4, Table 5 and Table 6 below,
Wherein, v denotes the cumulative word frequency of the verb, n the cumulative word frequency of the subject noun, and n2 the cumulative word frequency of the object noun.
Part of speech | Subject noun | Predicate verb | Object noun |
Phrase | Teacher | Education | Schoolboy |
Word frequency | n=120 | v=2000 | n2=150 |
Grammar | Subject | Predicate | |
Grammar | | Predicate | Object |
Table 4
Part of speech | Subject noun | Predicate verb | Object noun |
Phrase | Schoolboy | Like | Time |
Word frequency | n=130 | v=10,000 | n2=200 |
Grammar | Subject | Predicate | |
Grammar | | Predicate | Object |
Table 5
Part of speech | Subject noun | Predicate verb | Object noun |
Phrase | Teacher | Like | Time |
Word frequency | n=120 | v=10,000 | n2=200 |
Grammar | Subject | Predicate | |
Grammar | | Predicate | Object |
Table 6
From the (subject's) data resource example data of Table 4, the following verb-noun (value) chain data correlation patterns or templates with association weights can be obtained:
Table 7
Table 8
Expanded, the verb-noun (value) chain data correlation pattern or template (set) example of Table 7 and Table 8 is expressed as:
Noun : verb = teacher : education ≈ word frequency ratio n : v ≈ 120 : 2000
Verb : noun = education : schoolboy ≈ word frequency ratio v : n2 ≈ 2000 : 150
Other verb-noun (value) chain data correlation pattern or template example data can be obtained in the same way; the high-frequency phrases chosen from the set form the verb-noun (value) chain data correlation pattern or template (phrase set).
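The word frequency accumulation of step S203 can be sketched as below. This is a simplified illustration rather than the method as implemented: it assumes the input is a list of (subject noun, predicate verb, object noun) triples already extracted in step S202, it counts one occurrence per triple, and the names `count_frequencies`, `build_template` and the threshold `min_verb_freq` are hypothetical.

```python
from collections import Counter

def count_frequencies(triples):
    """Accumulate word frequencies: n (subject nouns), v (predicate verbs), n2 (object nouns)."""
    n, v, n2 = Counter(), Counter(), Counter()
    for subject, verb, obj in triples:
        n[subject] += 1
        v[verb] += 1
        n2[obj] += 1
    return n, v, n2

def build_template(triples, min_verb_freq=1):
    """Return noun:verb -> (n, v) and verb:noun -> (v, n2), keeping only high-frequency verbs."""
    n, v, n2 = count_frequencies(triples)
    noun_verb, verb_noun = {}, {}
    for subject, verb, obj in triples:
        if v[verb] < min_verb_freq:  # assumed cut-off for "high-frequency"
            continue
        noun_verb[(subject, verb)] = (n[subject], v[verb])
        verb_noun[(verb, obj)] = (v[verb], n2[obj])
    return noun_verb, verb_noun

triples = [
    ("teacher", "education", "schoolboy"),
    ("teacher", "education", "schoolboy"),
    ("parent", "education", "schoolgirl"),
    ("schoolboy", "like", "time"),
]
noun_verb, verb_noun = build_template(triples, min_verb_freq=2)
print(noun_verb[("teacher", "education")])    # (2, 3)  i.e. n : v
print(verb_noun[("education", "schoolboy")])  # (3, 2)  i.e. v : n2
```

With the threshold set to 2, the single occurrence of "like" is dropped, which mirrors choosing only the high-frequency phrases for the template.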
According to another specific Chinese-language embodiment of the method of extracting and making a data correlation characteristic value pattern or template, and with reference to Fig. 3, the following steps are performed to make a verb-noun (value) chain data correlation pattern or template:
Wherein, step S301 is the same as S201 of Fig. 2, step S302 is the same as S202 of Fig. 2, and step S303 is the same as S203 of Fig. 2;
S304 step 4: merge same-named nouns.
Where a fixed (one-to-one) verb : noun association phrase ends with the same noun phrase that begins a fixed (one-to-one) noun : verb association phrase, merge the repeated noun and connect the two phrases into a multi-link association phrase chain:
...
... : verb : same noun
same noun : verb : ...
...
Merging the same-named nouns and word frequencies of Table 4, Table 5 and Table 6 gives:
Verb | Word frequency | Noun | Word frequency (subject + object) |
Like | v=10,000 | Time | n+n2=100+200 |
Education | v=2000 | Teacher | n+n2=120+300 |
 | | Schoolboy | n+n2=130+150 |
... | ... | ... | ... |
Table 9
Merging and connecting the same-named nouns gives a multi-link phrase chain ... : verb : (merged same) noun : verb : (merged same) noun : verb : ..., so that the partial one-to-one phrase chains (set) are joined in series, with verb and noun phrases as alternating hinge nodes, into a multi-link, multi-dimensional alternating verb/noun phrase chain ... : verb : noun : verb : noun : ...; the chain may even close on itself head to tail, forming a closed-loop multi-link association phrase chain of alternating verbs and nouns,
This gives the alternating, cyclic link association phrase weights ... : verb : noun : ... ≈ word frequency ratio ... : v : (n+n2) : ...
of the verb-noun (value) chain data correlation pattern or template (word frequencies v, n, n2 are positive integers)
Table 9 is merged into Table 10
Table 10
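Step S304 can be illustrated with the sketch below, which uses the word frequencies of Table 9. It is a minimal sketch under two assumptions that are not stated in the text: each noun links to a single verb and each verb to a single noun (a strict one-to-one mapping), and the function names `merge_noun_frequencies` and `link_chain` are hypothetical.

```python
def merge_noun_frequencies(subject_freq, object_freq):
    """Merge n (noun as subject) and n2 (same noun as object) into n + n2."""
    merged = dict(subject_freq)
    for noun, freq in object_freq.items():
        merged[noun] = merged.get(noun, 0) + freq
    return merged

def link_chain(noun_verb_pairs, verb_noun_pairs, start_noun):
    """Follow noun -> verb -> noun links from start_noun until a dead end or a closed loop."""
    next_verb = dict(noun_verb_pairs)  # assumes one verb per subject noun
    next_noun = dict(verb_noun_pairs)  # assumes one object noun per verb
    chain, seen, noun = [start_noun], {start_noun}, start_noun
    while noun in next_verb:
        verb = next_verb[noun]
        if verb not in next_noun:
            chain.append(verb)
            break
        noun = next_noun[verb]
        chain += [verb, noun]
        if noun in seen:  # the chain has closed on itself
            break
        seen.add(noun)
    return chain

# n and n2 values as listed in Table 9.
merged = merge_noun_frequencies(
    {"teacher": 120, "schoolboy": 130, "time": 100},
    {"teacher": 300, "schoolboy": 150, "time": 200},
)
chain = link_chain(
    [("teacher", "education"), ("schoolboy", "like")],
    [("education", "schoolboy"), ("like", "time")],
    "teacher",
)
print(merged)  # {'teacher': 420, 'schoolboy': 280, 'time': 300}
print(chain)   # ['teacher', 'education', 'schoolboy', 'like', 'time']
```

The shared noun "schoolboy" is what joins teacher : education : schoolboy and schoolboy : like : time into one chain.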
Wherein, natural language processing aids such as corpora, electronic dictionaries and ontology knowledge bases may be used for comprehensive analysis, to mark the adverbs modifying the predicate verb of each sentence; the cumulative word frequency a of each adverb is also counted when it is extracted, giving:
The association weight adverb : verb : noun ≈ word frequency ratio a : v : n2
Example data that can be obtained:
Strict : education : schoolboy ≈ word frequency ratio 1500 : 2000 : 150
Table 11
Or
The verb-noun (value) chain data correlation pattern or template noun : adverb : verb ≈ word frequency ratio n : a : v (word frequencies v, n, n2 are positive integers; a is a natural number and may be 0);
Example data that can be obtained:
Teacher : strict : education ≈ word frequency ratio 120 : 1500 : 2000
Table 12
Or the multi-link phrase chain:
The alternating, cyclic phrase association weights ... : adverb : verb : noun : ... ≈ word frequency ratio ... : a : v : (n+n2) : ... of the verb-noun (value) chain data correlation pattern or template (word frequencies v, n, n2 are positive integers; a is a natural number and may be 0); wherein, the adverb may be empty.
Example data:
... : strict : education : schoolboy : ... ≈ word frequency ratio 1500 : 2000 : (130+150)
Table 13
According to a specific Chinese-language embodiment of the comparison method for verb-noun (value) chain data correlation patterns or templates, and with reference to Fig. 4, the following steps are performed:
S401 step 1: compare templates with one another (compare phrase combinations or sets of phrase chains).
The verb-noun (value) chain data correlation patterns or templates (phrase sets or phrase chain sets) made from the data resources of the different subjects A, B, C and D (first, second, third and fourth) are compared with one another phrase by phrase,
If there is no identical phrase, return to the start;
S402 step 2: if there are identical phrases or phrase chains,
Obtain:
Identical verbs,
Identical nouns,
Identical verb and identical noun,
Identical noun and identical verb,
Or identical multi-link alternating verb/noun phrase chains,
Wherein, if there are adverbs, identical adverbs may also be included in the comparison, that is, identical noun, identical adverb and identical verb, or identical adverb, identical verb and identical noun;
S403 step 3: compare word frequency ratios;
The word frequency ratios of the identical phrases or phrase chains are compared.
S404 step 4: if the word frequency ratios of the identical phrases or phrase chains are equal:
Identical verb and identical noun: the word frequency ratios v : n2 are equal
Example data: like : time = comparison of word frequency ratio v : n2
Table 14
Example result data:
1. Template A: verb like : noun time ≈ word frequency ratio v : n2 ≈ 10,000 : 200 ≈ 50 : 1
Template B: verb like : noun time ≈ word frequency ratio v : n2 ≈ 120,000 : 1000 ≈ 120 : 1
Template C: verb like : noun time ≈ word frequency ratio v : n2 ≈ 1,000,000 : 20,000 ≈ 50 : 1
2. The word frequency ratios of template A and template C are the same, that is, the degree of liking time is the same; the templates match successfully.
Identical noun and identical verb: the word frequency ratios n : v are equal
Example data: teacher : education = comparison of word frequency ratio n : v
Table 15
Example result data:
1. Template A:
Noun teacher : verb education ≈ word frequency ratio n : v ≈ 120 : 2000 ≈ 3 : 25
Template B:
Noun teacher : verb education ≈ word frequency ratio n : v ≈ 20,000 : 300,000 ≈ 1 : 15
Template C:
Noun teacher : verb education ≈ word frequency ratio n : v ≈ 4800 : 40,000 ≈ 3 : 25
2. The word frequency ratios of template A and template C are the same, that is, the degree of association between teacher and education is the same; the templates match successfully.
Identical multi-link alternating verb/noun phrase chain with identical word frequency ratio:
Identical multi-link alternating verb/noun phrase chain, word frequency ratio ... : v : (n+n2) : v : ...
Example data: ... : teacher : education : schoolboy : like : time : ... = word frequency ratio ... : v : (n+n2) : v : ...
Table 16
Example result data:
1. Template A:
Teacher : education : schoolboy : like : time ≈ word frequency ratio (n+n2) : v : (n+n2) : v : (n+n2) ≈ (120+300) : 2000 : (130+150) : 10,000 : (100+200) ≈ 420 : 2000 : 280 : 10,000 : 300 ≈ 21 : 100 : 14 : 500 : 15
Template B:
Teacher : education : schoolboy : like : time ≈ word frequency ratio (n+n2) : v : (n+n2) : v : (n+n2) ≈ (150+350) : 1000 : (120+130) : 9000 : (140+170) ≈ 500 : 1000 : 250 : 9000 : 310 ≈ 50 : 100 : 25 : 900 : 31
Template C:
Teacher : education : schoolboy : like : time ≈ word frequency ratio (n+n2) : v : (n+n2) : v : (n+n2) ≈ (440+400) : 4000 : (210+350) : 20,000 : (400+200) ≈ 840 : 4000 : 560 : 20,000 : 600 ≈ 21 : 100 : 14 : 500 : 15
Template D:
Teacher : education : schoolboy : like : time ≈ word frequency ratio (n+n2) : v : (n+n2) : v : (n+n2) ≈ (360+900) : 6000 : (390+450) : 30,000 : (300+600) ≈ 1260 : 6000 : 840 : 30,000 : 900 ≈ 21 : 100 : 14 : 500 : 15
Result:
Template A matches template C on the totals (n+n2)
Template A matches template D fully on the components (n, n2)
The templates match successfully.
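The ratio comparison of steps S403 and S404 can be sketched by reducing each word frequency chain to lowest terms and comparing the results. This is a minimal sketch that tests exact equality only; the text also allows approximately equal ratios, which would need a tolerance that is not specified here. The function names `reduce_ratio` and `ratios_match` are hypothetical.

```python
from functools import reduce
from math import gcd

def reduce_ratio(freqs):
    """Divide every word frequency in the chain by their greatest common divisor."""
    divisor = reduce(gcd, freqs)
    return tuple(f // divisor for f in freqs)

def ratios_match(freqs_a, freqs_b):
    """Two chains match when their reduced word frequency ratios are identical."""
    return reduce_ratio(freqs_a) == reduce_ratio(freqs_b)

# teacher : education : schoolboy : like : time, totals taken from the example data.
template_a = (420, 2000, 280, 10000, 300)
template_b = (500, 1000, 250, 9000, 310)
template_c = (840, 4000, 560, 20000, 600)
template_d = (1260, 6000, 840, 30000, 900)

print(reduce_ratio(template_a))              # (21, 100, 14, 500, 15)
print(reduce_ratio(template_b))              # (50, 100, 25, 900, 31)
print(ratios_match(template_a, template_c))  # True
print(ratios_match(template_a, template_d))  # True
print(ratios_match(template_a, template_b))  # False
```

The same functions apply to two-element chains such as like : time, where 10,000 : 200 and 1,000,000 : 20,000 both reduce to 50 : 1.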
S405: the word frequency ratio results for the identical phrases are:
Identical verb, word frequency ratio:
Example data: comparison of the identical verb "like"
Table 17
Example result data:
1. Template A and template B have no common object noun of "like", and no common subject noun of "like" either;
2. The word frequency in template B is 100 times that in template A;
3. The high-frequency nouns related to the verb "like" in template A are:
Teacher (120+300=420) / time (100+200=300)
Schoolboy (130+150=280) / art (n+n2) ...
4. The high-frequency nouns related to the verb "like" in template B are:
Civil servant (2000+2000=4000) / film (3000+1300=4300)
White collar (1500+2000=3500) / online shopping (n+n2) ...
Identical noun, word frequency ratio:
Example data: comparison of the identical noun "schoolboy"
Table 18
Example result data:
1. Template A and template B have no identical verb for "schoolboy";
2. The word frequency of "schoolboy" in template B is 10 times that in template A;
3. The high-frequency verbs related to the noun "schoolboy" in template A are:
Education (2000) / like (10,000) / convince by patient analysis (v) / be good at (v) ...
4. The high-frequency verbs related to the noun "schoolboy" in template B are:
Temper (18,000) / study (140,000) / love (v) / have deep love for (v) ...
Identical verb and identical noun, word frequency ratio:
The example data of Table 11: like : time = comparison of word frequency ratio v : n2
Result: for template A and template B,
The word frequency ratios are 50 : 1 (1 in every 50 occurrences of "like" concerns "time") and 120 : 1 (1 in every 120 occurrences of "like" concerns "time"), which shows that "time" accounts for a larger share of "like" in template A;
That is, for the two templates, the share of the noun word frequency in the verb word frequency is inversely proportional to the word frequency ratio v : n2
The example data word frequency ratio difference is displayed as:
Template A word frequency ratio: | 50 | 1 |
Phrase chain | Like | Time |
Template B word frequency ratio: | 120 | 1 |
Table 19
Identical noun and identical verb, word frequency ratio:
The example data of Table 12, teacher : education = word frequency ratio n : v
For template A and template B,
The word frequency ratios are 3 : 25 (3 occurrences of "teacher" for every 25 of "education") and 1 : 15 (1 occurrence of "teacher" for every 15 of "education"), which shows that "teacher" accounts for a larger share of "education" in template A;
That is, for the two templates, the share of the noun word frequency relative to the verb word frequency is directly proportional to the word frequency ratio n : v
The example data word frequency ratio difference is displayed as:
Template A word frequency ratio: | 3 | 25 |
Phrase chain | Teacher | Education |
Template B word frequency ratio: | 1 | 15 |
Table 17
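The two proportionality statements can be checked with the example ratios. In the worked arithmetic below, the share symbol s is a notation introduced here for illustration only; it denotes the noun word frequency divided by the verb word frequency.

```latex
% Verb : noun ratio v : n2 (like : time). The noun share is n2 / v,
% so a larger ratio v : n2 means a smaller share (inverse relation).
\[
s = \frac{n_2}{v}: \qquad
s_A = \frac{1}{50} = 0.02, \qquad
s_B = \frac{1}{120} \approx 0.0083, \qquad
s_A > s_B
\]
% Noun : verb ratio n : v (teacher : education). The noun share is n / v,
% so a larger ratio n : v means a larger share (direct relation).
\[
s = \frac{n}{v}: \qquad
s_A = \frac{3}{25} = 0.12, \qquad
s_B = \frac{1}{15} \approx 0.0667, \qquad
s_A > s_B
\]
```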
Identical multi-link alternating verb/noun phrase chain, word frequency ratio:
The example data of Table 13,
Identical multi-link alternating verb/noun phrase chain, word frequency ratio ... : v : (n+n2) : v : ...
Example:
... : teacher : education : schoolboy : like : time : ... = word frequency ratio ... : v : (n+n2) : v : ...
Template A does not match template B:
Template A: teacher : education : schoolboy : like : time ≈ 21 : 100 : 14 : 500 : 15
Template B: teacher : education : schoolboy : like : time ≈ 50 : 100 : 25 : 900 : 31
The example data word frequency ratio difference is displayed as:
Table 20
According to a specific Chinese-language embodiment of the method of making class templates from verb-noun (value) chain data correlation templates, and with reference to Fig. 5, the following steps are performed:
S501 step 1: (classify/cluster the nouns of the same class, keep one adjacent verb on the left and on the right of each, and obtain template fragments)
Using natural language processing tools such as corpora and ontology knowledge bases, all noun phrases in the set of verb-noun (value) chain data correlation templates of the data resource of (subject A) (which may include template a, template b, template c, template d, template e, ...) are classified/clustered. Following the principle of keeping at least one phrase (that is, the adjacent verb) on each side of the template position of each same-class noun, part of each verb-noun (value) chain data correlation template is selected as a fragment. Specifically, several phrases may be kept on the left and right of the template position of a same-class noun, but at least one adjacent verb phrase must be kept on each side. The sets of verb-noun (value) chain data correlation template fragments belonging to the same class are thereby separated out, forming higher-level classification/cluster templates; that is, each class template may include a set of verb-noun (value) chain data correlation templates (fragments) with fewer links. The class is named:
Class name a { ... : verb 1 : same-class noun 1 : verb 1 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... : verb 2 : same-class noun 2 : verb 2 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... data correlation template fragments (word frequencies v, n, n2 are positive integers) }
Table 21
Wherein, the common class of same-class noun 1 = schoolboy and same-class noun 2 = schoolgirl is: student
In Table 19, following the principle of keeping at least one phrase (that is, the adjacent verb) on each side of the same-class nouns of "student" (schoolboy and schoolgirl), part of the verb-noun (value) chain data correlation templates is selected as fragments:
Table 22
Processing-rule phrase fragments (... : verb 1 : same-class noun 1 : verb 1 : ...; ... : verb 2 : same-class noun 2 : verb 2 : ...; ... data correlation template phrase chain fragments)
Characteristic value set (... : verb 1 : same-class noun 1 : verb 1 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... : verb 2 : same-class noun 2 : verb 2 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... word frequency ratios (word frequencies v, n, n2 are positive integers))
From the instance data of Table 20, the following is obtained:
Student class template processing data {education: schoolboy: like ≈ word frequency ratio v: (n+n2): v ≈ 2000: (130+150): 10,000; cherish: schoolgirl: net purchase ≈ word frequency ratio v: (n+n2): v ≈ 1900: (190+200): 13,000}
The same method can also be applied to verb phrases that are identical or similar (synonyms and near-synonyms) to form class templates of similar verbs; that is, a verb class template includes the verb-noun (value) chain data correlation template sets of multiple similar verbs.
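The fragment selection of step S501 can be sketched as follows. This is a minimal illustration, not the patented implementation: the `NOUN_CLASS` mapping stands in for the corpus/ontology lookup, the function name `extract_fragments` is hypothetical, each chain is assumed to be an already segmented list of (phrase, word frequency) pairs, and the noun frequency (n+n2) is assumed to be stored as its sum.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, int]  # (phrase, word frequency); representation is an assumption

# Hypothetical noun -> class mapping; the source obtains this from a corpus/ontology.
NOUN_CLASS: Dict[str, str] = {"schoolboy": "student", "schoolgirl": "student"}


def extract_fragments(chain: List[Token], noun_class: Dict[str, str],
                      target: str) -> List[List[Token]]:
    """Keep each noun of class `target` with one adjacent phrase on each side."""
    fragments = []
    for i, (phrase, _) in enumerate(chain):
        # A fragment needs a neighbour on both the left and the right.
        if noun_class.get(phrase) == target and 0 < i < len(chain) - 1:
            fragments.append(chain[i - 1:i + 2])
    return fragments


# Example chains built from the embodiment figures (130+150 = 280, 190+200 = 390).
chain_a = [("teacher", 420), ("education", 2000), ("schoolboy", 280),
           ("like", 10000), ("time", 300)]
chain_b = [("cherish", 1900), ("schoolgirl", 390), ("net purchase", 13000)]

student_fragments = (extract_fragments(chain_a, NOUN_CLASS, "student")
                     + extract_fragments(chain_b, NOUN_CLASS, "student"))
```

Retaining more than one phrase per side, which the text also permits, would only change the slice bounds.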
S502 step 2: form the class template (data clues + processing rules + characteristic value collection).
For a set of verb-noun (value) chain data correlation template fragments of the same class: the set of original phrases (verbs, nouns or adverbs) serves as the clue set (first); the method of obtaining and matching template fragments, on the principle of retaining at least one verb phrase on each of the left and right sides of the template position of a same-class noun, serves as the rule set (second); and the word frequencies of the same-class verb-noun (value) chain data correlation template fragments serve as statistical characteristic parameters forming the characteristic value set (third).
In short form: class template (data clues + processing rules + characteristic value collection). Expanded:
Class template {data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...) + processing rule phrase fragments (...: verb 1: same-class noun 1: verb 1: ...; ...: verb 2: same-class noun 2: verb 2: ...; ... data correlation template phrase chain fragments) + characteristic value collection (...: verb 1: same-class noun 1: verb 1: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ...: verb 2: same-class noun 2: verb 2: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ... word frequency ratios (the word frequencies v, n, n2 are positive integers))}
Table 23
Using the data of Table 20 and Table 21, the student class template is made:
Clue set (first): education, schoolboy, like, cherish, schoolgirl, net purchase
Rule set (second): {the principle of retaining and matching one verb on each of the left and right sides of the same-class nouns (schoolboy, schoolgirl), i.e. (education: schoolboy: like; cherish: schoolgirl: net purchase), is used as the rule for obtaining and comparing template fragments}
Characteristic value set (third): {education: schoolboy: like ≈ word frequency ratio v: (n+n2): v ≈ 2000: (130+150): 10,000; cherish: schoolgirl: net purchase ≈ word frequency ratio v: (n+n2): v ≈ 1900: (190+200): 13,000}
In short form: class template (data clues + processing rules + characteristic value collection).
The final result: student class template {data clues (education, schoolboy, like, cherish, schoolgirl, net purchase) + processing rules (education: schoolboy: like; cherish: schoolgirl: net purchase) + characteristic value collection (education: schoolboy: like ≈ word frequency ratio v: (n+n2): v ≈ 2000: (130+150): 10,000; cherish: schoolgirl: net purchase ≈ word frequency ratio v: (n+n2): v ≈ 1900: (190+200): 13,000)}
In the same way, further class templates can be made from the other templates c, d, e, ... of the (first main body) data resource, which together constitute the class template (set) of the (first main body) data resource.
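The three-part class template of step S502 maps naturally onto a small record type. The sketch below is an assumption about representation only: the names `ClassTemplate` and `build_class_template` are hypothetical, fragments are reduced to (left verb, same-class noun, right verb) triples, and the noun frequency (n+n2) is stored as its sum.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Fragment = Tuple[str, str, str]  # (left verb, same-class noun, right verb)
Ratio = Tuple[int, int, int]     # word frequencies in the same order


@dataclass
class ClassTemplate:
    name: str
    clues: List[str]        # data clues (first)
    rules: List[Fragment]   # processing rules (second)
    features: List[Ratio]   # characteristic value collection (third)


def build_class_template(name: str, fragments: Sequence[Fragment],
                         ratios: Sequence[Ratio]) -> ClassTemplate:
    """Derive the clue set from the fragments and bundle the three parts."""
    clues: List[str] = []
    for fragment in fragments:
        for phrase in fragment:
            if phrase not in clues:  # keep first-seen order, drop repeats
                clues.append(phrase)
    return ClassTemplate(name, clues, list(fragments), list(ratios))


student = build_class_template(
    "student",
    [("education", "schoolboy", "like"), ("cherish", "schoolgirl", "net purchase")],
    [(2000, 280, 10000), (1900, 390, 13000)],
)
```

Here `student.clues` reproduces the clue set listed in the embodiment (education, schoolboy, like, cherish, schoolgirl, net purchase).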
Other classification data, such as geographical location information and time information, can also be used together with the classification rules of the verb-noun (value) chain data correlation templates to form the class template rules. For example, with geographical location information and time information included:
Class template {data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...; A<north latitude N1", east longitude E1">, B<north latitude N2", east longitude E2">, C<north latitude N3", east longitude E3">, D<north latitude N4", east longitude E4">) + processing rules: phrase chain fragments (...: verb 1: same-class noun 1: verb 1: ...; ...: verb 2: same-class noun 2: verb 2: ...; place order rule A: D: C: B) + characteristic value collection (...: verb 1: same-class noun 1: verb 1: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ...: verb 2: same-class noun 2: verb 2: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ... phrase chain fragment word frequency ratios (the word frequencies v, n, n2 are positive integers); place time values A: time1, B: time2, C: time3, D: time4)}
Embodiment data:
Beijing Polytechnical University Tongzhou branch student class template {data clues (education, schoolboy, like, cherish, schoolgirl, net purchase; A<north latitude 39.8N1", east longitude 116.6E1">, B<north latitude 39.8N2", east longitude 116.6E2">, C<north latitude 39.8N3", east longitude 116.6E3">, D<north latitude 39.8N4", east longitude 116.6E4">) + processing rules (education: schoolboy: like; cherish: schoolgirl: net purchase; place order rule A: B: C: D) + characteristic value collection (education: schoolboy: like ≈ word frequency ratio v: (n+n2): v ≈ 2000: (130+150): 10,000; cherish: schoolgirl: net purchase ≈ word frequency ratio v: (n+n2): v ≈ 1900: (190+200): 13,000; place time values A: am8:00-pm16:30, B: pm16:30-18:30, C: pm18:30-17:30, D: pm22:00-am8:00)}
Here, A, B, C and D are the latitude and longitude of the geographical positions of the teaching building, sports ground, campus leisure area and dormitory of the Tongzhou branch of Beijing Polytechnical University, respectively; the place order rule A: B: C: D is the flow order rule of the students' movement between places; and the place time values (A: am8:00-pm16:30, B: pm16:30-18:30, C: pm18:30-17:30, D: pm22:00-am8:00) are statistics of the time spent at the respective places.
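One possible reading of the place order rule is that a student's observed sequence of places, with consecutive repeats collapsed, must equal the rule. The sketch below illustrates only that interpretation, which is an assumption; the function names and the sample visit logs are hypothetical, and the residence-time values are not modelled.

```python
from typing import List, Sequence


def collapse(visits: Sequence[str]) -> List[str]:
    """Drop consecutive repeats, e.g. A, A, B -> A, B."""
    collapsed: List[str] = []
    for place in visits:
        if not collapsed or collapsed[-1] != place:
            collapsed.append(place)
    return collapsed


def follows_place_order(visits: Sequence[str], order_rule: Sequence[str]) -> bool:
    """True if the collapsed visit sequence equals the place order rule."""
    return collapse(visits) == list(order_rule)


ORDER_RULE = ["A", "B", "C", "D"]  # teaching building, sports ground, leisure area, dormitory
in_order = follows_place_order(["A", "A", "B", "C", "C", "D"], ORDER_RULE)
out_of_order = follows_place_order(["A", "C", "B", "D"], ORDER_RULE)
```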
The class template of a unit (department) can be made from the verb-noun (value) chain data correlation templates of the unit (department) data resource;
The class template of a personal group can be extracted and made from the verb-noun (value) chain data correlation templates of the personal group data resource;
The class template of the office complex can be extracted and made from the verb-noun (value) chain data correlation templates of the office complex sample data resource;
Further, the class template comparison method shown in Fig. 6 is illustrated by the following steps:
Before the comparison starts:
Student class template of the (first main body) data resource {(education, schoolboy, like, cherish, schoolgirl, net purchase) + (education: schoolboy: like; cherish: schoolgirl: net purchase) + (education: schoolboy: like ≈ word frequency ratio v: (n+n2): v ≈ 2000: (130+150): 10,000; cherish: schoolgirl: net purchase ≈ word frequency ratio v: (n+n2): v ≈ 1900: (190+200): 13,000)}
The content of the student class template of the comparison object (second main body) data resource is unknown: student class template (data clues + processing rules + characteristic value collection).
Start:
S601 step 1: following the same rules as the original class template, extract from the verb-noun (value) chain data correlation templates (set) of the comparison object (second main body) data resource the phrases with the same names as those in the data clues of the original class template, i.e. the phrases (education, schoolboy, like, cherish, schoolgirl, net purchase);
If the identical phrases cannot be extracted, return to the start;
If all of the above same-name phrases are successfully extracted, the student class template of the comparison object (second main body) data resource becomes {data clues = (education, schoolboy, like, cherish, schoolgirl, net purchase) + processing rules + characteristic value collection}, and the method proceeds to the next step;
S602 step 2: classify according to the same classification/clustering rules as the original class template to obtain phrase chain fragments of the same class:
In the verb-noun (value) chain data correlation templates (set) of the comparison object (second main body) data resource, all noun phrases are classified/clustered. Following the principle of retaining the same number of phrases on each of the left and right sides of the template position of a same-class noun, the same portion of verb-noun (value) chain data correlation template fragments is selected and grouped into sets of fragments of the same class, forming higher-level classification/cluster templates; that is, each class template may include a set of verb-noun (value) chain data correlation templates (fragments) with shorter chains. The class is then named:
Comparison object (second main body) data resource:
Object class name A rules {verb 11: same-class noun 11: verb 11; verb 22: same-class noun 22: verb 22}
These are compared with the original (first main body) data resource:
Student class template of the (first main body) data resource {(education, schoolboy, like, cherish, schoolgirl, net purchase) + (education: schoolboy: like; cherish: schoolgirl: net purchase) + (education: schoolboy: like ≈ word frequency ratio v: (n+n2): v ≈ 2000: (130+150): 10,000; cherish: schoolgirl: net purchase ≈ word frequency ratio v: (n+n2): v ≈ 1900: (190+200): 13,000)}
with respect to the order of the data correlation template (phrase chain) fragments in its rules;
If the data correlation template (phrase chain) fragment verb 11: same-class noun 11: verb 11 is consistent with education: schoolboy: like, and verb 22: same-class noun 22: verb 22 is consistent with cherish: schoolgirl: net purchase, the processing rules are consistent;
The student class template of the comparison object (second main body) data resource becomes {data clues = (education, schoolboy, like, cherish, schoolgirl, net purchase) + processing rules = (class name: student; template (phrase chain) fragment order: verb 11: same-class noun 11: verb 11 = education: schoolboy: like; verb 22: same-class noun 22: verb 22 = cherish: schoolgirl: net purchase) + characteristic value collection}
and the method proceeds to the next step;
If any phrase does not match, return to the start;
S603 step 3: compare whether the extracted word frequency ratios are equal.
According to the same processing rules as the original class template, the word frequency ratios of the identically named, identically ordered phrase chain fragments of the same class in the two class templates are compared:
The characteristic value collection of the comparison object (second main body) class name A {verb 11: same-class noun 11: verb 11 ≈ word frequency ratio v: (n+n2): v; verb 22: same-class noun 22: verb 22 ≈ word frequency ratio v: (n+n2): v}
is compared with
the student class template of the original (first main body) data resource {(education, schoolboy, like, cherish, schoolgirl, net purchase) + (education: schoolboy: like; cherish: schoolgirl: net purchase) + (education: schoolboy: like ≈ word frequency ratio v: (n+n2): v ≈ 2000: (130+150): 10,000; cherish: schoolgirl: net purchase ≈ word frequency ratio v: (n+n2): v ≈ 1900: (190+200): 13,000)}.
If
...: verb 11: same-class noun 11: verb 11: ... ≈ word frequency ratio ...: v: (n+n2): v: ... ≈ education: schoolboy: like ≈ word frequency ratio v: (n+n2): v ≈ 2000: (130+150): 10,000;
...: verb 22: same-class noun 22: verb 22: ... ≈ word frequency ratio ...: v: (n+n2): v: ... ≈ cherish: schoolgirl: net purchase ≈ word frequency ratio v: (n+n2): v ≈ 1900: (190+200): 13,000
that is, if the word frequency ratios are identical or approximately equal, the class templates are successfully matched.
In brief:
Comparison object (second main body) data resource
student class template {data clues = (education, schoolboy, like, cherish, schoolgirl, net purchase) + processing rules = (template (phrase chain) fragment order: verb 11: same-class noun 11: verb 11 = education: schoolboy: like; verb 22: same-class noun 22: verb 22 = cherish: schoolgirl: net purchase) + characteristic value collection = (verb 11: same-class noun 11: verb 11 ≈ word frequency ratio v: (n+n2): v ≈ verb 1: same-class noun 1: verb 1 ≈ education: schoolboy: like ≈ word frequency ratio v: (n+n2): v ≈ 2000: (130+150): 10,000;
verb 22: same-class noun 22: verb 22 ≈ word frequency ratio v: (n+n2): v ≈ verb 2: same-class noun 2: verb 2 ≈ cherish: schoolgirl: net purchase ≈ word frequency ratio v: (n+n2): v ≈ 1900: (190+200): 13,000)}
The comparison of the two student class templates (first and second main bodies) is then a successful match;
Otherwise, if the word frequency ratios are not equal, the two class templates fail to match, and the method returns to the start;
If other classification rules are set, the success of the match is also determined by those rules: the whole template match succeeds only if every part matches, and the failure of any partial match causes the whole template match to fail.
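The three comparison stages S601 to S603 can be sketched as a single all-or-nothing check. The source does not define "approximately equal", so the sketch assumes each ratio is normalised to proportions and compared under an absolute tolerance `tol`; that tolerance, the function names and the candidate figures are all hypothetical.

```python
from typing import Sequence, Tuple

Fragment = Tuple[str, str, str]
Ratio = Tuple[float, float, float]


def normalise(ratio: Ratio) -> Tuple[float, ...]:
    """Scale a frequency triple so that its parts sum to 1."""
    total = sum(ratio)
    return tuple(x / total for x in ratio)


def ratios_close(a: Ratio, b: Ratio, tol: float) -> bool:
    return all(abs(x - y) <= tol for x, y in zip(normalise(a), normalise(b)))


def match_class_templates(orig_clues: Sequence[str], orig_rules: Sequence[Fragment],
                          orig_ratios: Sequence[Ratio], cand_phrases: Sequence[str],
                          cand_rules: Sequence[Fragment], cand_ratios: Sequence[Ratio],
                          tol: float = 0.05) -> bool:
    # S601: every clue phrase of the original must occur in the candidate data.
    if not set(orig_clues) <= set(cand_phrases):
        return False
    # S602: the fragments must agree in content and in order.
    if list(orig_rules) != list(cand_rules):
        return False
    # S603: the word frequency ratios must be identical or approximately equal.
    if len(orig_ratios) != len(cand_ratios):
        return False
    return all(ratios_close(a, b, tol) for a, b in zip(orig_ratios, cand_ratios))


clues = ["education", "schoolboy", "like", "cherish", "schoolgirl", "net purchase"]
rules = [("education", "schoolboy", "like"), ("cherish", "schoolgirl", "net purchase")]
ratios = [(2000, 280, 10000), (1900, 390, 13000)]

same_shape = match_class_templates(clues, rules, ratios, clues, rules,
                                   [(1000, 140, 5000), (950, 195, 6500)])
different = match_class_templates(clues, rules, ratios, clues, rules,
                                  [(2000, 280, 2000), (1900, 390, 13000)])
```

A failure at any stage returns `False`, mirroring the rule that any partial mismatch fails the whole template match.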
According to a Chinese-language specific embodiment of the method of making custom templates from verb-noun (value) chain data correlation templates, the following steps are described with reference to Fig. 7:
Using the verb-noun (value) chain data correlation templates of a unit (department) data resource as the subject (first) data resource, and the verb-noun (value) chain data correlation templates of a personal group data resource as the target (second) data resource, the unit supply-personal group demand custom template is made:
(d) S701 step 1: in the verb-noun (value) chain data correlation template set of the unit (department) data resource, select a list of nouns with high word frequency;
The instance data is: schoolboy, young man, ...
(e) S702 step 2: match and compare the noun "schoolboy" from the high-frequency noun list against the nouns in the verb-noun (value) chain data correlation template set of the personal group data resource of example Table 24;
Table 24
(f) S703 step 3: at the position of the successfully matched same-name noun in the verb-noun (value) chain data correlation template of the personal group data resource (in the embodiment, schoolboy in the template ...: training in rotation: teacher: education: same-name noun schoolboy: like: time: ... ≈ word frequency ratio ...: 3000: (120+300): 2000: (130+150): 10,000: (100+200): ...), select one verb and one noun on the alternating verb/noun phrase chain to the left, to the right, or on both sides:
teacher: education ≈ word frequency ratio (120+300): 2000
like: time ≈ word frequency ratio 10,000: (100+200)
The alternating verb-noun phrase association chains selected to the left, to the right, or on both sides of that position (not including the same-name noun schoolboy), namely
teacher: education ≈ word frequency ratio (120+300): 2000
like: time ≈ word frequency ratio 10,000: (100+200)
become the data correlation custom template (set) between the unit (department) data resource and the personal group data resource.
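Steps S701 to S703 can be sketched as follows. The chain reproduces the embodiment figures, with each noun frequency stored as its sum (an assumption); the unit-side noun frequencies in `unit_noun_freq` and the function names are hypothetical and serve only to make the example runnable.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, int]  # (phrase, word frequency)


def top_nouns(noun_freq: Dict[str, int], k: int) -> List[str]:
    """S701: the k nouns with the highest word frequency."""
    return sorted(noun_freq, key=lambda noun: noun_freq[noun], reverse=True)[:k]


def custom_template(chain: List[Token], noun: str) -> List[List[Token]]:
    """S702/S703: find `noun` in the chain and take one verb and one noun
    on each side of it, excluding the matched noun itself."""
    names = [phrase for phrase, _ in chain]
    if noun not in names:
        return []  # S702 failed: no same-name noun in the personal group chain
    i = names.index(noun)
    pairs = []
    if i >= 2:
        pairs.append(chain[i - 2:i])      # e.g. teacher: education
    if i + 2 < len(chain):
        pairs.append(chain[i + 1:i + 3])  # e.g. like: time
    return pairs


unit_noun_freq = {"schoolboy": 500, "young man": 400, "desk": 3}  # hypothetical counts
group_chain = [("training in rotation", 3000), ("teacher", 420), ("education", 2000),
               ("schoolboy", 280), ("like", 10000), ("time", 300)]

high_frequency = top_nouns(unit_noun_freq, 2)
template = custom_template(group_chain, high_frequency[0])
```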
According to a Chinese-language specific embodiment of an intelligent system for data mining based on personal mobile devices, the intelligent system is described with reference to the schematic diagram shown in Fig. 8:
The intelligent system includes a corpus, an ontology library and the like, and is characterised in that the personal mobile device further includes a personal mobile device output/input synchronization module and a template characteristic extraction module.
(1) The personal mobile device output/input synchronization module specifically includes the input method software data synchronization module 1-1 and the geographical location information synchronization module 1-3 of smart mobile phone 1. It synchronously copies and collects the personal data resource of text input on smart mobile phone 1 and output data such as the geographical location information from the smart mobile phone navigation API, and provides them to the template characteristic extraction and making module 2-1;
A pre-processing module 1-2 can be used to desensitize the synchronized data and to filter out useless and duplicate data;
(2) The template characteristic extraction and making module 2-1 uses natural language processing techniques to perform data mining and characteristic value extraction on the personal data resource synchronized by the input method software data synchronization module 1-1 on smart mobile phone 1, and makes personal data patterns or templates;
The data patterns or templates of the personal data resource can be made on smart mobile phone 1;
Verb-noun (value) chain data correlation templates and class templates can be made:
The steps shown in Fig. 2 and Fig. 3 are performed to make the verb-noun (value) chain data correlation templates; these are described in detail on pages 28 to 33 and are not repeated here.
The steps shown in Fig. 5 are then performed to make the class template (data clues + processing rules + characteristic value collection); these are described in detail on pages 40 to 44 and are not repeated here.
Verb phrases that are identical or similar (synonyms and near-synonyms) can also be grouped into class templates of similar verbs;
Other homogeneous characteristic data such as geographical position and time can also be extracted and, together with the classification data of the verb-noun (value) chain data correlation templates, form more complex class templates of the following form:
Class template {data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...; A<north latitude N1", east longitude E1">, B<north latitude N2", east longitude E2">, C<north latitude N3", east longitude E3">, D<north latitude N4", east longitude E4">) + processing rules: phrase chain fragments (...: verb 1: same-class noun 1: verb 1: ...; ...: verb 2: same-class noun 2: verb 2: ...; place order rule A: D: C: B) + characteristic value collection (...: verb 1: same-class noun 1: verb 1: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ...: verb 2: same-class noun 2: verb 2: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ... phrase chain fragment word frequency ratios (the word frequencies v, n, n2 are positive integers); place time values A: time1, B: time2, C: time3, D: time4)}
Embodiment data:
Beijing Polytechnical University Tongzhou branch student class template {data clues (education, schoolboy, like, cherish, schoolgirl, net purchase; A<north latitude 39.8N1", east longitude 116.6E1">, B<north latitude 39.8N2", east longitude 116.6E2">, C<north latitude 39.8N3", east longitude 116.6E3">, D<north latitude 39.8N4", east longitude 116.6E4">) + processing rules (education: schoolboy: like; cherish: schoolgirl: net purchase; place order rule A: B: C: D) + characteristic value collection (education: schoolboy: like ≈ word frequency ratio v: (n+n2): v ≈ 2000: (130+150): 10,000; cherish: schoolgirl: net purchase ≈ word frequency ratio v: (n+n2): v ≈ 1900: (190+200): 13,000; place time values A: am8:00-pm16:30, B: pm16:30-18:30, C: pm18:30-17:30, D: pm22:00-am8:00)}
Here, A, B, C and D are the teaching building, sports ground, campus leisure area and dormitory of the Tongzhou branch of Beijing Polytechnical University, respectively; the place order rule A: B: C: D is the flow rule of the students' movement between places; and the place time values (A: am8:00-pm16:30, B: pm16:30-18:30, C: pm18:30-17:30, D: pm22:00-am8:00) are statistics of the time spent at the respective places;
From the personal sample data resource synchronized by a personal mobile device, the personal class template (data clues + processing rules + characteristic value collection) composed of verb-noun (value) chain data correlation templates is obtained; since the data is in the first person, the subject noun is omitted by default;
From the personal group sample data resource aggregated from personal mobile devices, the personal group class template (data clues + processing rules + characteristic value collection) composed of numerous series of verb-noun (value) chain data correlation templates is obtained.
According to a Chinese-language specific embodiment of an intelligent system for data mining based on personal mobile devices, the intelligent system is described with reference to the schematic diagram shown in Fig. 9:
The server or PC 3 of a unit includes the public data mining sharing partner module 3-1; the data mining public platform server 2 includes: template characteristic extraction module 2-1, template matching module 2-2, template library 2-3, and matching result feedback and message communication module 2-4, wherein:
(1) The output/input personal data synchronization module, specifically the input method software data synchronization module 1-1, synchronously or asynchronously copies and collects the personal data resource entered through the input method and the like on the personal mobile device (i.e. smart mobile phone 1) and transfers it (optionally after the pre-processing module 1-2 has desensitized the synchronized data) to the data mining public platform server 2;
(2) The public data mining sharing partner module 3-1 aggregates, manually or automatically, the unit (department) electronic document data resources, in particular text data, on the server or PC 3 of the unit (department) and shares them to the data mining public platform server 2; the data resources of numerous units are aggregated to form the office complex sample data resource;
Before sharing, the pre-processing module 3-2 pre-processes the data, excluding electronic documents whose content is identical or similar and whose time is identical;
(3) The template characteristic extraction and making module 2-1 performs data mining on data resources and makes data patterns or templates;
The personal data resources synchronously downloaded from numerous personal mobile devices are pooled to form personal group data, on which data mining and characteristic value extraction are performed to make personal group data models, patterns or templates; from the personal group sample data resource aggregated from personal mobile devices, the personal group class template (data clues + processing rules + characteristic value collection) composed of numerous series of verb-noun (value) chain data correlation templates is obtained;
Data mining is performed on all data resources to make data patterns or templates, including: unit (department) data resources; the office complex sample data resource aggregated from numerous units; personal data resources synchronized from mobile devices; personal group sample data resources; the mixed data resource of the personal group sample data resource aggregated from mobile devices and unit (department) data resources (used for matching unit supply templates with personal demand templates); and the mixed data resource of personal data resources and unit data resources (used for matching personal speciality templates with unit post templates);
Verb-noun (value) chain data correlation templates and class templates can be made and obtained;
The steps shown in Fig. 10 are performed to make the unit supply-personal group demand class template. The comparison object is the personal group data student class template {data clues (education, schoolboy, like, cherish, schoolgirl, net purchase) + processing rules (education: schoolboy: like; cherish: schoolgirl: net purchase) + characteristic value collection (education: schoolboy: like ≈ word frequency ratio v: (n+n2): v ≈ 2000: (130+150): 10,000; cherish: schoolgirl: net purchase ≈ word frequency ratio v: (n+n2): v ≈ 1900: (190+200): 13,000)}
(f) S1001 step 1: comparison of class template class names:
The class name "student" in the unit (department) class template (fragment) set of the unit (department) data resource is compared with the class names in the personal group class template (fragment) set of the personal group sample data resource;
(g) S1002 step 2: if the class names are identical, the class template content of the comparison object is quoted directly to obtain the unit supply-personal group demand class template:
The class name "student" is matched successfully, yielding the same-name class "student" belonging to two sets: the unit (department) class template (fragment) set and the personal group class template (fragment) set;
The same-name class name "student" and the personal group class template (fragment) set of the comparison object to which it belongs are quoted directly to form the unit supply-personal group demand class template. The result is: unit supply-personal group demand class template = same-name unit class name + same-name personal group class template (data clues + processing rules + characteristic value collection) set; "student" class template {data clues (education, schoolboy, like, cherish, schoolgirl, net purchase) + processing rules (education: schoolboy: like; cherish: schoolgirl: net purchase) + characteristic value collection (education: schoolboy: like ≈ word frequency ratio v: (n+n2): v ≈ 2000: (130+150): 10,000; cherish: schoolgirl: net purchase ≈ word frequency ratio v: (n+n2): v ≈ 1900: (190+200): 13,000)}
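Steps S1001 and S1002 amount to a lookup by class name: where the unit side and the personal group side share a class name, the personal group template is quoted unchanged. A minimal sketch, assuming both template sets are dictionaries keyed by class name (the dictionary layout, the "logistics" entry and the function name are hypothetical):

```python
from typing import Dict

Template = Dict[str, object]


def supply_demand_class_templates(unit: Dict[str, Template],
                                  group: Dict[str, Template]) -> Dict[str, Template]:
    """S1001: compare class names. S1002: for each shared name, quote the
    personal group class template as the supply-demand class template."""
    return {name: group[name] for name in unit if name in group}


unit_templates = {"student": {}, "logistics": {}}  # unit side: only names are used here
group_templates = {
    "student": {
        "clues": ["education", "schoolboy", "like", "cherish", "schoolgirl", "net purchase"],
        "rules": [("education", "schoolboy", "like"),
                  ("cherish", "schoolgirl", "net purchase")],
        "features": [(2000, 280, 10000), (1900, 390, 13000)],
    }
}

result = supply_demand_class_templates(unit_templates, group_templates)
```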
(4) The template matching module 2-2 compares the characteristic values of the data templates of different main bodies' data resources;
It performs the characteristic value comparison of verb-noun (value) chain data correlation templates and class templates, and can be used for mutual matching and comparison of characteristic values among: the verb-noun (value) chain data correlation templates and class templates of a unit (department); those of the office complex sample; those of an individual; those of a personal group; and the holographic verb-noun (value) chain data correlation templates and class templates;
The matching of verb-noun (value) chain data correlation templates performs the steps shown in Fig. 4, which are described in detail from page 34, line 1 to page 40, line 7 and are not repeated here.
Depending on the type of main body of the data resource, subject words can be counted by natural language processing and used as subject nouns, or subject nouns can be selected semi-manually;
For the verb-noun (value) chain data correlation templates made from personal data resources, certain noun phrases (for example, interest, hobby, speciality) can be selected and compared with one another, and matching yields individuals with approximately matching verb-noun value chain associations. Alternatively, for a given noun, the templates made from a personal data resource can be compared with the templates made from the personal group sample data resource, which gives the individual's position and situation within the group data resource as a whole in terms of the difference in verb word frequency in the (given) noun-verb action (value) chain link (interest, hobby, speciality, etc.);
From the verb-noun (value) chain data correlation templates made from a unit (department) data resource, noun phrases with high word frequency are selected and compared with the templates made from the office complex sample data resource aggregated from numerous units, or from all data resources; this gives the position and situation of the unit (department), within the office complex sample data resource as a whole or all data resources as a whole, in terms of the difference in verb word frequency in the (high-frequency) noun-verb action (value) chain link;
From the verb-noun (value) chain data correlation templates made from a unit (department) data resource, noun phrases with high word frequency are selected and compared with the templates from the personal group sample data resource; the templates containing the successfully matched nouns can serve as verb-noun (value) chain data correlation templates of the matching relationship between the supply themed on the (high-frequency) nouns of the unit (department) and the personal group demand;
The comparison of class templates (data clues + processing rules + characteristic value collection) performs the steps shown in Fig. 6, which are described in detail on pages 44 to 47 and are not repeated here.
A template matching module can also be included on the personal mobile device, for point-to-point template comparison between personal mobile devices without going through the server;
(5) The template library 2-3 stores the templates of the data resources of the various main bodies;
It stores template sets including: the verb-noun (value) chain data correlation templates and class templates of units (departments); those of the office complex sample; those of individuals; those of personal groups; the holographic verb-noun (value) chain data correlation templates and class templates; the verb-noun (value) chain data correlation templates of the mixed data resource of units (departments) and personal groups; and the verb-noun (value) chain data correlation templates of the mixed data resource of individuals and units (department posts);
(6) The matching result feedback and message communication module 2-4 feeds back the message data of each successful template match to the device of the corresponding data resource main body, and handles interactive message communication between them;
It also feeds back the message data of successful matches from the comparison module for verb-noun (value) chain data correlation templates or class templates (data clues + processing rules + characteristic value collection) to the device of each corresponding data resource main body, in particular through interactive communication with the corresponding matching result feedback and message communication module 1-4 of smart mobile phone 1.
According to a Chinese-language specific embodiment of an intelligent system based on personal mobile devices that applies custom templates of verb-noun (value) chain data correlation templates or class templates (data clues + processing rules + characteristic value collection), the intelligent system is described with reference to the schematic diagram shown in Fig. 11:
It is characterised in that the personal mobile device, smart mobile phone 1, includes: an output/input data synchronization module, i.e. the input method software data synchronization module 1-1; template clue filtering module 1-2; template matching comparison module 1-3; personal clue library 1-6; template library 1-5; and a module 1-5 for outputting and displaying the service content of the corresponding main body; wherein,
(1) input method software data synchronization module 1-1, for synchronously copying and collecting the data entered through the input method on smartphone 1;
(2) template clue filtering module 1-2, for filtering the data collected by input method software data synchronization module 1-1, one item at a time, against the clue data in template library 1-5, namely all verb and noun phrases of the custom templates made from verb-noun (value) chain data correlation templates, or the clue data of the class templates (data clues + processing rule + characteristic value set); the successfully matched result data is recorded to form personal clue library 1-6, together with the accumulated number of matches;
(3) template matching comparison module 1-3, for comparing the templates in template library 1-5 with the templates extracted from personal clue library 1-6;
Wherein, the embodiment data of the custom templates made from verb-noun (value) chain data correlation templates in template library 1-5 is: noun third : verb first ≈ word frequency ratio n : v, i.e. teacher : education ≈ word frequency ratio (120+300) : 2000; and verb fourth : noun fourth ≈ word frequency ratio v : n2, i.e. like : time ≈ word frequency ratio 10,000 : (100+200). These are compared, respectively, with the same-named verb-noun (value) chain data correlation templates extracted and made directly from personal clue library 1-6: noun third : verb first, i.e. teacher : education ≈ word frequency ratio n : v; and verb fourth : noun fourth, i.e. like : time ≈ word frequency ratio v : n2;
If the word frequency ratios of the same-named phrases are identical or approximately equal, the unit supply-personal group demand custom template is matched successfully; otherwise the match fails;
Wherein, the method of making custom templates from verb-noun (value) chain data correlation templates uses the steps shown in Fig. 7 for making custom templates from verb-noun (value) chain data correlation templates, which have already been described in detail.
The comparison of class templates (data clues + processing rule + characteristic value set) performs the steps shown in Fig. 6, which are described in detail from page 44, line 18 to page 47, line 9; they can be referred to directly and are not repeated here.
Wherein, given that a unit supply-personal group demand class template is the combination of a unit class template and a personal-group class template (data clues + processing rule + characteristic value set) that share the same category name, the unit supply-personal group demand class template in template library 1-5 is compared with the same-named class template (data clues + processing rule + characteristic value set) extracted and made from personal clue library 1-6.
The same-named class template (data clues + processing rule + characteristic value set) is extracted and made from personal clue library 1-6 by the steps shown in Fig. 5, which are described in detail from page 40, line 8 to page 44, line 17; they can be referred to directly and are not repeated here.
(4) output display corresponding data service content module 1-4, for displaying on smartphone 1, after a template comparison in template matching comparison module 1-3 succeeds, the corresponding preset data service content, which is generally the supply data of the unit;
(5) template library 1-5, for storing the set of unit supply-personal group demand custom templates made from verb-noun (value) chain data correlation templates and the set of unit supply-personal group demand class templates;
Wherein, with permission given through smartphone 1, templates such as updated unit supply-personal group demand custom templates and individual-group learning association custom templates can be downloaded to template library 1-5;
(6) personal clue library 1-6, formed from the data obtained when the data synchronized by the output/input data synchronization module is filtered against the clue data, namely all verb and noun phrases of the custom templates made from verb-noun (value) chain data correlation templates and the clue data of the class templates.
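The clue filtering of module 1-2 and the word frequency comparison of module 1-3 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of the invention: the function names, the pre-split word list, and the 10% relative tolerance used for "approximately equal" are all introduced here, since the text does not quantify the approximation.

```python
from collections import Counter


def filter_clues(tokens, template_words):
    """Count how often each template word occurs in the synchronized input.

    The resulting counter plays the role of the personal clue library 1-6.
    """
    return Counter(t for t in tokens if t in template_words)


def ratios_match(ratio_a, ratio_b, tolerance=0.10):
    """Return True if two word frequency ratios are equal or approximately equal.

    Each ratio is a (left, right) pair of positive integers. The tolerance is
    an assumed threshold; the source text gives no numeric value.
    """
    a = ratio_a[0] / ratio_a[1]
    b = ratio_b[0] / ratio_b[1]
    return abs(a - b) <= tolerance * max(a, b)


# Custom template from the template library: teacher : education ≈ (120+300) : 2000
library_template = {("teacher", "education"): (120 + 300, 2000)}

# Hypothetical input stream, already split into words.
tokens = ["teacher"] * 21 + ["education"] * 100 + ["weather"] * 5
clues = filter_clues(tokens, {"teacher", "education"})

personal_ratio = (clues["teacher"], clues["education"])
matched = ratios_match(library_template[("teacher", "education")], personal_ratio)
```

Words outside the template vocabulary (here "weather") never enter the clue library, so only the counts needed for the ratio comparison are retained.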
A Chinese-language embodiment of the invented intelligence system, which performs data mining based on a personal mobile device and uses verb-noun (value) chain data correlation templates, is described below with reference to the schematic diagram shown in Figure 12:
The system is characterized in that the personal mobile device, smartphone 1, includes: input method software data synchronization module 1-1; verb filtering and template generation module 1-2; template sending management and matching result feedback interactive communication module 1-3; personalized template library 1-4; and verb library 1-5;
Data mining common platform server 2 includes: template receiving management and matching result feedback interactive communication module 2-1; template matching module 2-2; and template library 2-3;
Wherein, the personal mobile device, smartphone 1, includes:
(1) input method software data synchronization module 1-1, for synchronously copying and collecting the data entered through the input method on smartphone 1;
(2) verb filtering and template generation module 1-2, for filtering the data collected by input method software data synchronization module 1-1 against the verbs in verb library 1-5 in turn, and for generating personalized templates by the following steps, described in detail with reference to Figure 13:
(e) S1301, step 1: filter the synchronized data with the verb library
The text data collected by input method software data synchronization module 1-1 is filtered in turn against the set of all common verbs in verb library 1-5, including "like";
(f) S1302, step 2: grammatically annotate the sentence containing the matched verb "like" and identify the predicate verb
Part-of-speech tagging is performed on each sentence in which the filtering verb "like" was matched, marking out the nouns in the sentence;
The grammar of the sentence is also analyzed, marking out (as far as possible) its subject, predicate, and object;
It is then judged whether the matched verb "like" is the predicate verb, i.e. whether the filtering verb is labeled as the predicate;
Wherein, punctuation may be added automatically to text data that the input method collected at separate times, and reference resolution is performed on subject nouns or object nouns;
(g) S1303, step 3: if the matched verb "like" is the predicate verb, extract from the sentence the phrase labeled as both predicate and verb, the phrase labeled as both subject and noun, and the phrase labeled as both object and noun, together with the one-to-one subject-predicate / predicate-object association they have in their sentence: subject noun : predicate verb / predicate verb : object noun; see Table 25 below
Table 25
(h) S1304, step 4: the noun : verb / verb : noun phrase combinations extracted in step 3, together with the fixed (one-to-one) subject-predicate / predicate-object association between them, are saved to personalized template library 1-4;
The data collected by input method software data synchronization module 1-1 is then filtered in turn against the verb : noun / noun : verb phrase combinations in personalized template library 1-4; the matching word frequency of each phrase is recorded and annotated as the index that measures the weight of two phrases in a one-to-one chain association: verb word frequency v : noun word frequency n2 / noun word frequency n : verb word frequency v (the word frequencies v, n, n2 are positive integers),
This yields the verb-noun (value) chain data correlation template (set): verb : noun ≈ word frequency ratio v : n2 / noun : verb ≈ word frequency ratio n : v;
According to the instance data of Table 25:
schoolboy : like ≈ word frequency ratio 130 : 10,000 / like : time ≈ word frequency ratio 10,000 : 200
(3) template sending management and matching result feedback interactive communication module 1-3, with which the user of smartphone 1 manages and displays personalized template library 1-4 and sends templates to data mining common platform server 2, where, through template receiving management and matching result feedback interactive communication module 2-1, they are compared with the templates of the specified main body data resource; the module also manages interactive communication between the user and the corresponding main body;
(4) personalized template library 1-4, for storing the set of personalized templates generated by verb filtering and template generation module 1-2;
(5) verb library 1-5, for storing common verbs;
Wherein, common Chinese verbs include, but are not limited to, the following:
Denoting actions and behavior: say, see, walk, listen, laugh, take, fly, run, eat, sing, drink, knock, sit, shout, stare, kick, smell, touch, criticize, publicize, safeguard, learn, study, carry out, start, stop, forbid
Denoting existence, change, or disappearance: die, have, equal, occur, evolve, develop, grow, perish, exist, eliminate
Denoting mental activity: think, love, hate, fear, miss, intend, like, hope, dread, worry, dislike, feel, ponder
Denoting judgment: be (the copular verbs)
Denoting possibility, willingness, or necessity (auxiliary verbs): can, be able to, may, be permitted to, wish, be willing, consent, dare, must, ought to, should, deserve, be worth, would rather
Denoting direction (directional verbs): up, down, in, out, back, open, over, come up, come down, come in, come out, come back, come, go, go up, go down, go in, go out, go back, go past
Denoting development: e.g. grow, wither, germinate, bear fruit, spawn;
For plan, system, scheme, file etc.:
Work out, work out, draft, draft, authorize, audit, examine, transmit, deliver, submit, report, assign, put on record, deposit
Shelves, present one's view
For information, data:
Investigate, study, collect, arrange, analyze, conclude, analyze, summarize, provide, report, feed back, pass on, notify, send out
Cloth, maintenance management
On a certain work (higher level):
Preside over, organize, instructing, arranging, coordinating, indicating, supervising, managing, distributing, controlling, take the lead it is responsible, examination & approval, authorization,
Sign and issue, ratify, assess
Thinking behavior:
Research, analysis, assess, development, suggest, proposal, participate in, recommend, plan
Direct action:
Organize, carry out, performing, instructing, leading, controlling, supervising, use, production, participate in, illustrate, explaining, providing, assisting
Higher level's behavior:
License, ratify, define, determining, instructing, establishing, planning, supervising, determining
Administration behaviour:
Reach, assess, control, coordinate, ensure, identify, keep, supervise
Expert's behavior:
Analyze, assist, promote, get in touch with, suggest, recommend, support, assess, evaluate
Subordinate's behavior:
Check, check, collect, obtain, submit, make
Other:
Maintain, keep, establish, exploitation, prepare, processing, perform, reception, arrange, monitoring, report, manage, confirm, concept
Change, cooperate, cooperate, obtain, check, check, get in touch with, design, test, build, change, write, draft, guide, transmit, turn over
Translate, operate, ensure, prevent, solve, introduce, pay, calculate, revise, undertake, negotiate, confer, interview, refuse, veto, supervise
Depending on, predict, compare, delete, use
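Steps S1301 to S1303 of verb filtering and template generation module 1-2 can be sketched as follows. The sketch assumes that each sentence has already been annotated with its subject, predicate, and object by some part-of-speech tagger and parser that is not shown; the `Sentence` type, the function name, the sample verb library, and the sample sentences are hypothetical and serve only to illustrate the flow.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sentence:
    """A sentence with grammar annotations produced by an external parser."""
    text: str
    subject: Optional[str]    # noun labeled as subject, if any
    predicate: Optional[str]  # verb labeled as predicate, if any
    obj: Optional[str]        # noun labeled as object, if any


def extract_pairs(sentences, verb_library):
    """Return (subject:predicate pairs, predicate:object pairs).

    Step 1: keep sentences containing a verb from the verb library.
    Step 2: check that the matched verb is the predicate verb.
    Step 3: extract the one-to-one subject-predicate / predicate-object pairs.
    """
    subject_pairs, object_pairs = [], []
    for s in sentences:
        words = s.text.split()
        hits = [v for v in verb_library if v in words]
        if not hits:
            continue  # step 1: no library verb in this sentence
        if s.predicate not in hits:
            continue  # step 2: the matched verb is not the predicate
        if s.subject:
            subject_pairs.append((s.subject, s.predicate))
        if s.obj:
            object_pairs.append((s.predicate, s.obj))
    return subject_pairs, object_pairs


verb_library = {"like", "learn"}
sentences = [
    Sentence("schoolboys like time", "schoolboys", "like", "time"),
    Sentence("it looks like rain", "it", "looks", None),  # "like" is not the predicate
    Sentence("the sky is blue", "sky", "is", None),       # no library verb
]
subject_pairs, object_pairs = extract_pairs(sentences, verb_library)
```

Only the first sentence survives both filters, which mirrors the requirement that the matched verb must also be labeled as the predicate before any pair is extracted.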
Wherein, data mining common platform server 2 includes:
(1) template receiving management and matching result feedback interactive communication module 2-1, for receiving the personalized templates sent by template sending management and matching result feedback interactive communication module 1-3 on smartphone 1 and forwarding them to the template matching module for comparison with the specified templates in the template library;
It also feeds the matching result data back to smartphone 1 and carries out interactive communication between smartphone 1 and the device of the main body whose template was compared;
(2) template matching module 2-2, for comparing the received personalized templates with the templates of the specified main body data resource from the template library, and for feeding the matching result back to smartphone 1 through template receiving management and matching result feedback interactive communication module 2-1;
(3) template library 2-3, for storing the templates of various main body data resources;
Wherein, the templates include, but are not limited to, template sets such as: the verb-noun (value) chain data correlation templates and class templates of units (departments), of comprehensive office samples, of individuals, and of personal groups, and the holographic verb-noun (value) chain data correlation templates and class templates.
Although several embodiments of the present invention have been described with reference to the accompanying drawings, those of ordinary skill in the art can make various variations or modifications within the scope of the appended claims. For example, the personal mobile device can be replaced by other devices, including smartphones, navigation equipment, Internet-of-Vehicles devices, Internet-of-Things mobile devices, etc.
The personal mobile device, smartphone, navigation equipment, Internet-of-Vehicles device, Internet-of-Things mobile device, data mining common platform server, unit server and PC, cloud server, corpus and ontology knowledge base server, etc., are all conventional computer systems, microcontroller systems, or embedded systems comprising a system bus, a CPU, memory, and input/output interfaces, as shown in Figure 14.
What is described in this specification is merely embodiments of the invention, and the various illustrations do not limit the substance of the invention; after reading the specification, a person of ordinary skill in the art can modify or vary the specific embodiments described above without departing from the spirit and scope of the invention.
The general principle of the present invention has been described above in connection with specific embodiments. It should be noted, however, that those skilled in the art will understand that all or any of the steps or components of the methods and apparatus of the present invention can be implemented in hardware, firmware, software, or a combination thereof, in any computing device (including processors, storage media, etc.) or network of computing devices; those skilled in the art can achieve this with basic circuit design knowledge or basic programming skills after reading the description of the invention. The invention also provides a program product storing machine-readable instruction code. When read and executed by a machine, the instruction code can perform the above methods according to embodiments of the present invention. Accordingly, a storage medium carrying the program product that stores the machine-readable instruction code is also included in the disclosure of the present invention. The storage medium includes, but is not limited to, floppy disks, optical discs, magneto-optical disks, memory cards, memory sticks, etc. Where the present invention is implemented in software or firmware, a program constituting the software is installed from a storage medium or a network onto a computer with a dedicated hardware structure (such as the general-purpose computer shown in Figure 14), and the computer, with the various programs installed, can perform the various functions.
Claims (10)
1. A method, based on natural language processing technology, of extracting and making data correlation characteristic value patterns or templates, characterized in that making a verb-noun (value) chain data correlation pattern or template performs the following steps:
(a) step 1: pre-process the text data of the (main body) data resource by identifying its language, then perform part-of-speech tagging, marking out the nouns and verbs of each sentence;
Also perform syntactic analysis on the text data, marking out the subject, predicate, and object of each sentence;
Wherein, the subject of a passive-voice sentence may be labeled as the object;
Wherein, reference resolution is performed on subject nouns or object nouns;
(b) step 2: from the set of sentences, extract the phrases labeled as both subject and noun, the phrases labeled as both predicate and verb, and the phrases labeled as both object and noun, obtaining respectively the set of nouns serving as subjects, the set of verbs serving as predicates, and the set of nouns serving as objects, together with the subject-predicate / predicate-object association they have in their sentences: subject noun set : predicate verb set / predicate verb set : object noun set; that is, the noun : verb / verb : noun phrase combinations and the (one-to-one) subject-predicate / predicate-object association between them;
(c) step 3: count the accumulated word frequency of the subject nouns, predicate verbs, and object nouns respectively, and annotate it as the characteristic value that measures the weight of the phrases in the (one-to-one) association subject noun set : predicate verb set / predicate verb set : object noun set, i.e.
subject noun word frequency n : predicate verb word frequency v /
predicate verb word frequency v : object noun word frequency n2 (the word frequencies v, n, n2 are positive integers),
This yields, for the (main body) data resource, the weighted associations:
the set of noun : verb ≈ word frequency ratio n : v, and
the set of verb : noun ≈ word frequency ratio v : n2,
From these sets, the phrases with high word frequency are chosen to form the verb-noun (value) chain data correlation pattern or template (phrase set).
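As an illustration only, step 3 of the method above (word frequency counting and selection of high-frequency phrases) might be sketched as follows. The sketch assumes the sentences have already been reduced to (subject noun, predicate verb, object noun) triples by steps 1 and 2; the function name and the `min_freq` threshold that stands in for "high word frequency" are assumptions, since the claim does not fix a selection rule.

```python
from collections import Counter


def build_templates(triples, min_freq=2):
    """Build noun:verb ≈ n:v and verb:noun ≈ v:n2 templates from parsed sentences.

    triples: one (subject noun, predicate verb, object noun) per sentence;
    a missing subject or object is None.
    """
    n = Counter(s for s, _, _ in triples if s)    # subject noun frequency n
    v = Counter(p for _, p, _ in triples if p)    # predicate verb frequency v
    n2 = Counter(o for _, _, o in triples if o)   # object noun frequency n2

    subject_templates = {(s, p): (n[s], v[p]) for s, p, _ in triples if s and p}
    object_templates = {(p, o): (v[p], n2[o]) for _, p, o in triples if p and o}

    def keep(templates):
        # Assumed selection rule: both words must reach min_freq.
        return {k: r for k, r in templates.items() if min(r) >= min_freq}

    return keep(subject_templates), keep(object_templates)


# Hypothetical parsed sentences.
triples = [
    ("teacher", "educate", "student"),
    ("teacher", "educate", "student"),
    ("parent", "educate", "child"),
    ("student", "like", "time"),
    ("student", "like", "time"),
]
subject_templates, object_templates = build_templates(triples)
```

Counting the subject and object roles separately keeps n and n2 distinct for a noun such as "student" that appears in both roles.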
2. The method, based on natural language processing technology, of making verb-noun (value) chain data correlation patterns or templates according to claim 1, characterized in that:
Wherein, step 4: merge the same-named nouns, and their word frequencies, that repeat between the (one-to-one) noun : verb association phrases and the (one-to-one) verb : noun association phrases:
...
... : verb : identical noun
identical noun : verb : ...
...
The two occurrences of the same-named noun are joined to form a multi-link association phrase chain, giving the multi-link phrase chain ... : verb : (merged identical) noun : verb : (merged identical) noun : verb : ...; in this way some of the one-to-one phrase chains (sets) are connected in series, with alternating verb/noun phrases as the linking nodes, into a multi-link, multi-dimensional alternating verb/noun phrase chain ... : verb : noun : verb : noun : ..., and may even form a closed-loop multi-link association phrase chain of alternating verb/noun phrases whose head and tail are linked;
The word frequencies of the subject noun and the object noun are merged as n+n2, which gives the link association phrase weights of the alternating cycle ... : verb : noun : ..., i.e. the verb-noun (value) chain data correlation pattern or template ... : verb : noun : ... ≈ word frequency ratio ... : v : (n+n2) : ... (phrase set; the word frequencies v, n, n2 are positive integers),
That is, a verb-noun (value) chain data correlation pattern or template (phrase set) can take two forms: a one-to-one phrase chain (noun : verb ≈ word frequency ratio n : v, or verb : noun ≈ word frequency ratio v : n2), or a multi-link phrase chain (... : verb : noun : verb ... ≈ word frequency ratio ... : v : (n+n2) : v : ...).
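The merging in step 4 can be illustrated with a short sketch that joins a verb : noun pair to a noun : verb pair sharing the same noun and adds the two noun frequencies. It builds only single three-node links (verb : noun : verb); longer or closed-loop chains would repeat the same join. The function name, the dictionary layout, and the verb "train" with its frequency of 500 are hypothetical.

```python
def link_chains(object_templates, subject_templates):
    """Join verb:noun and noun:verb pairs that share the same noun.

    object_templates:  {(verb, noun): (v, n2)}
    subject_templates: {(noun, verb): (n, v)}
    Returns a list of (phrases, ratio) where phrases = [verb, noun, verb] and
    ratio = [v, n + n2, v], i.e. the two noun frequencies are merged.
    """
    chains = []
    for (verb_in, noun), (v_in, n2) in object_templates.items():
        for (subj, verb_out), (n, v_out) in subject_templates.items():
            if subj == noun:
                chains.append(([verb_in, noun, verb_out], [v_in, n + n2, v_out]))
    return chains


# "teacher" occurs 300 times as an object and 120 times as a subject.
object_templates = {("train", "teacher"): (500, 300)}
subject_templates = {("teacher", "education"): (120, 2000)}
chains = link_chains(object_templates, subject_templates)
```

With these inputs the merged link is train : teacher : education ≈ 500 : (120+300) : 2000, which has the same form as the (120+300) : 2000 example used in the description.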
3. The method, based on natural language processing technology, of making verb-noun (value) chain data correlation patterns or templates according to claim 1 or 2, characterized in that:
Wherein, natural language processing aids such as corpora, digital dictionaries, and ontology libraries can be used for comprehensive analysis to annotate the adverbs that modify the predicate verb of each sentence; when the adverbs are extracted, their word frequency is also accumulated, giving:
the weighted association adverb : verb : noun, i.e. the verb-noun (value) chain data correlation pattern or template adverb : verb : noun ≈ word frequency ratio a : v : n2, or noun : adverb : verb ≈ word frequency ratio n : a : v (the word frequencies v, n, n2 are positive integers; a is a natural number and can be 0);
Or giving the multi-link phrase chain:
the association weights of the alternating cycle ... : adverb : verb : noun : adverb : ..., i.e. the verb-noun (value) chain data correlation pattern or template ... : adverb : verb : noun : ... ≈ word frequency ratio ... : a : v : (n+n2) : a : ... (the word frequencies v, n, n2 are positive integers; a is a natural number and can be 0); wherein the adverb can be empty.
4. A method of comparing verb-noun (value) chain data correlation patterns or templates (phrase sets), characterized in that the following steps are performed:
(a) step 1: the verb-noun (value) chain data correlation patterns or templates (sets of phrase chains) made from two different (main body) data resources are compared with each other phrase by phrase,
(b) step 2: if the comparison finds:
an identical verb,
an identical noun,
an identical verb and identical noun,
an identical noun and identical verb,
or an identical multi-link phrase chain of alternating verbs and nouns,
Wherein, if adverbs are present, identical adverbs can also be included in the comparison, i.e.
an identical noun, identical adverb, and identical verb,
an identical adverb, identical verb, and identical noun,
then proceed to the next step;
(c) step 3: compare the word frequency ratios of the identical phrases;
(d) step 4: output the result:
One, results where the word frequency ratios are equal:
Identical verb and identical noun: the word frequency ratios v : n2 are equal, so the templates match successfully;
Identical noun and identical verb: the word frequency ratios n : v are equal, so the templates match successfully;
Identical multi-link phrase of alternating verbs and nouns: word frequency ratio ... : v : (n+n2) : v : ...,
1. the components (n, n2) are each equal;
2. the totals (n+n2) are equal;
so the templates match successfully;
Two, results where the word frequency ratios are not equal:
Identical verb:
display the ranking of the associated high-frequency nouns;
Identical noun:
display the ranking of the associated high-frequency verbs;
Identical verb and identical noun:
the share of the noun word frequency relative to the verb word frequency in each of the two templates is inversely proportional to their word frequency ratio; display the difference between the word frequency ratios;
Identical noun and identical verb:
the share of the noun word frequency relative to the verb word frequency in each of the two templates is directly proportional to their word frequency ratio; display the difference between the word frequency ratios;
Identical multi-link phrase of alternating verbs and nouns:
display the difference between the noun and verb word frequency ratios;
Wherein, the noun phrases (set) with high word frequency can be selected from the verb-noun (value) chain data correlation templates made from a unit (department) data resource and compared with the verb-noun (value) chain data correlation templates (set) made from a personal-group sample data resource; the verb-noun (value) chain data correlation templates containing the successfully matched identical nouns can serve as verb-noun (value) chain data correlation templates of the matching relationship between unit supply and personal group demand, with the (high-frequency) nouns of the unit (department) as their theme;
Wherein, the noun phrases with high word frequency are selected from the verb-noun (value) chain data correlation templates made from a unit (department) data resource and compared with the verb-noun (value) chain data correlation templates made from the comprehensive office sample data resource of a large set of units, or from all data resources; this gives the difference values, relative position, and situation of the unit (department) in terms of verb word frequency within the (high-frequency) noun-verb action (value) chain links of the comprehensive office sample data resource as a whole, or of all data resources as a whole.
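The one-to-one part of the comparison method above can be sketched as follows. This is an illustration under assumptions rather than the claimed procedure itself: the function names and dictionary layout are introduced here, exact rational equality stands in for "equal" word frequency ratios, noun : verb and verb : noun pairs are stored in one dictionary for brevity, and the "music" entry with frequency 800 is hypothetical.

```python
from fractions import Fraction


def compare_pair_templates(first, second):
    """Compare two {(left, right): (freq_left, freq_right)} template sets.

    For every phrase pair present in both sets, report "match" when the word
    frequency ratios are exactly equal, otherwise report the ratio difference.
    """
    results = {}
    for pair in first.keys() & second.keys():
        ratio_a = Fraction(*first[pair])
        ratio_b = Fraction(*second[pair])
        if ratio_a == ratio_b:
            results[pair] = ("match", Fraction(0))
        else:
            results[pair] = ("differs", ratio_a - ratio_b)
    return results


def ranked_nouns(templates, verb):
    """List the nouns paired with `verb` in verb:noun templates, highest frequency first."""
    nouns = [(noun, freqs[1]) for (v, noun), freqs in templates.items() if v == verb]
    return [noun for noun, _ in sorted(nouns, key=lambda item: item[1], reverse=True)]


unit = {
    ("teacher", "education"): (420, 2000),
    ("like", "time"): (10000, 300),
    ("like", "music"): (10000, 800),
}
person = {
    ("teacher", "education"): (21, 100),
    ("like", "time"): (10000, 200),
}
results = compare_pair_templates(unit, person)
noun_ranking = ranked_nouns(unit, "like")
```

Here teacher : education matches because 420 : 2000 and 21 : 100 reduce to the same ratio, while like : time is reported with its ratio difference; `ranked_nouns` corresponds to displaying the ranking of high-frequency nouns associated with an identical verb.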
5. A method of making class templates from verb-noun (value) chain data correlation templates, and of comparing class templates,
Characterized in that,
One, extracting and making a class template (data clues + processing rule + characteristic value set) performs the following steps:
(a) step 1: using natural language processing tools such as corpora and ontology libraries, classify/cluster all noun phrases in the verb-noun (value) chain data correlation templates (set) of a (main body) data resource; following the principle of keeping at least one adjacent phrase (i.e. the adjacent verb) on each side of a same-class noun's position in its template, select the part of each verb-noun (value) chain data correlation template that contains the same-class noun as a segment, and group the segments of the same class into verb-noun (value) chain segments (sets) that form the classification/cluster template; each class template may thus contain verb-noun (value) chain segments (sets) made of shorter phrase chains; the class is then named:
{category name a: ... : verb 1 : same-class noun 1 : verb 1 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... : verb 2 : same-class noun 2 : verb 2 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... verb-noun (value) chain segments (the word frequencies v, n, n2 are positive integers)}
Wherein, by the same method, identical, similar, synonymous, or near-synonymous verb phrases can also be used to form class templates of same-class verbs;
(b) step 2: for the verb-noun (value) chain segment set of the same class, the original phrases (all verbs and nouns of the segments) are taken as one group of clues and combined into the clue set (first),
The verb-noun (value) chain segment ordering rules, obtained under the principle of keeping at least one (identical) verb phrase on each side of the same-class noun, are taken as one group of rules and combined into the rule set (second),
The word frequencies of the verb-noun (value) chain segments of the same-class nouns are taken as one group of statistical characteristic values and combined into the characteristic value set (third),
In simplified notation and in expanded form, respectively:
{class template (data clues + processing rule + characteristic value set) = class template data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...) + processing rule: phrase chain segment ordering rule (... : verb 1 : same-class noun 1 : verb 1 : ...; ... : verb 2 : same-class noun 2 : verb 2 : ...) + characteristic value set (... : verb 1 : same-class noun 1 : verb 1 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... : verb 2 : same-class noun 2 : verb 2 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... phrase chain segment word frequency ratios (the word frequencies v, n, n2 are positive integers))}
Wherein, other classification rules can also be set, which together with the verb-noun (value) chain data correlation template classification rules form the class template rules;
Wherein, unit (department) class templates can be made from the verb-noun (value) chain data correlation templates of a unit (department) data resource;
Wherein, personal-group class templates can be extracted and made from the verb-noun (value) chain data correlation templates of a personal-group data resource;
Wherein, comprehensive office class templates can be extracted and made from the verb-noun (value) chain data correlation templates of a comprehensive office sample data resource;
Two, the comparison of class template (data clues+processing rule+characteristic value collection) performs following steps:
Former class template data clues (... verb 1, with class noun 1, verb 1, verb 2, with class noun 2, verb 2...)+place
Reason rule:Phrase chain segment (...: verb 1: with class noun 1: verb 1: ...;...: verb 2: with class noun 2: verb 2
: ...)+characteristic value collection (...: verb 1: with class noun 1: verb 1: ... ≈ word frequency ratio ...: v: (n+n2): v: ...;...∶
Verb 2: with class noun 2: verb 2: ... ≈ word frequency ratio ...: v: (n+n2): v: ...;... phrase chain segment word frequency ratio (word frequency
V, n, n2 are positive integers)) }
The object type template (data clues+processing rule+characteristic value collection) of comparison;
(a) step 1:According to former class template identical rule, from the object type template to be compared (data clues+processing rule
Then+characteristic value collection) data clues in, extract the same noun phrase in former class template data clues;
If not extracting identical same noun phrase, return starts;
If extracting the same noun phrase of above-mentioned whole, data clues (... verb 1, with class noun 1, verb 1, verb 2, same to class name
Word 2, verb 2...) the match is successful for data clues, into next step;
(b) step 2:Equally classify according to former class template/clustering rule classified, obtain same category phrase chain segment
Ordering rule:
To in verb-noun (value) the chain data correlation template (set) of the object to be compared (main body) data resource, owning
Noun phrase classify/cluster, and respectively retains identical individual phrase principle, choosing to adjacent with template position where class noun or so
Identical a part of verb-noun (value) chain segment is taken, marks off generic verb-noun (value) chain segment set, group
Constituent class/cluster template, classification is named, obtained:
Object type title A phrase sequence processing regular (phrase chain segment) ...: verb 11: with class noun 11: verb 11
∶...;...: verb 22: with class noun 22: verb 22: ...;... phrase chain segment }
Compare
Former item name a phrase segment sequence processing rule (...: verb 1: with class noun 1: verb 1: ...;...: verb 2
: with class noun 2: verb 2: ...;... data correlation template phrase chain segment)
In phrase chain segment sequence;
If the phrase chain segment sequence ... of processing rule: verb 11: with class noun 11: verb 11: ... with ...: verb 1:
With class noun 1: verb 1: ... matching is consistent;...: verb 22: with class noun 22: verb 22: ... with ...: verb 2: similar
Noun 2: verb 2: ... matching is consistent;... handling regular phrase chain segment sequence by that analogy, all unanimously, entrance is next for matching
Step;
If phrase chain segment sequence processing rule mismatches, return starts;
(c) step 3: According to the same processing rule as the original class template, compare the word frequency ratios of the same-class phrase chain segment sequences that have the same name in the two class templates:
The characteristic value set of the object class template to be compared {...: verb 11: same-class noun 11: verb 11: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ...: verb 22: same-class noun 22: verb 22: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ... word frequency ratio (word frequencies v, n, n2 are positive integers)}
is compared with
the characteristic value set of the original class template {...: verb 1: same-class noun 1: verb 1: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ...: verb 2: same-class noun 2: verb 2: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ... word frequency ratio (word frequencies v, n, n2 are positive integers)};
If:
the word frequency ratio ...: v: (n+n2): v: ... of ...: verb 11: same-class noun 11: verb 11: ... ≈ that of ...: verb 1: same-class noun 1: verb 1: ...;
the word frequency ratio ...: v: (n+n2): v: ... of ...: verb 22: same-class noun 22: verb 22: ... ≈ that of ...: verb 2: same-class noun 2: verb 2: ...;
... and so on, so that all characteristic value word frequency ratios are identical or approximately equal, then the characteristic value sets match successfully;
When, in the class template (data clues + processing rule + characteristic value set), the data clues match successfully, the phrase chain segment sequences of the processing rules are consistent, and the word frequency ratios of the characteristic value sets are equal, the result is that the two class templates match successfully;
If the characteristic value word frequency ratios are not equal, the class template matching fails;
Wherein, if other classification rules are also used, the success or failure of the matching comparison is also determined according to those rules; the whole template matching succeeds only when every part matches, and the failure of any partial match causes the whole template matching to fail.
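The three-stage comparison described above (data clues, then processing rule, then characteristic value set) can be sketched as follows. This is only an illustration under stated assumptions, not the claimed implementation: the `ClassTemplate` structure, the function names, the assumption that words have already been mapped to class labels, and the tolerance `tol` standing in for "approximately equal" are all hypothetical and are not specified in the text.

```python
from dataclasses import dataclass
from math import isclose


@dataclass(frozen=True)
class ClassTemplate:
    """Hypothetical container: data clues + processing rule + characteristic values."""
    name: str
    clues: frozenset   # verbs and same-class nouns, assumed already mapped to class labels
    rule: tuple        # ordered phrase chain segments, each a tuple of labels
    features: tuple    # one tuple of word frequencies (v, n+n2, v, ...) per segment


def ratios_close(a, b, tol=0.05):
    """Compare two word frequency ratios after normalising each by its total.

    The tolerance is an assumption; the text only says the ratios must be
    identical or approximately equal.
    """
    if len(a) != len(b):
        return False
    total_a, total_b = sum(a), sum(b)
    return all(isclose(x / total_a, y / total_b, abs_tol=tol)
               for x, y in zip(a, b))


def match_class_templates(original, candidate, tol=0.05):
    """Return True only if all three partial matches succeed."""
    # Step 1: every clue of the original template must appear in the candidate.
    if not original.clues <= candidate.clues:
        return False
    # Step 2: the ordered phrase chain segments (processing rule) must agree.
    if original.rule != candidate.rule:
        return False
    # Step 3: the word frequency ratios of corresponding segments must agree.
    if len(original.features) != len(candidate.features):
        return False
    return all(ratios_close(a, b, tol)
               for a, b in zip(original.features, candidate.features))
```

Returning as soon as any stage fails mirrors the rule that the failure of any partial match fails the whole template match.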
6. A method for making a customized template using verb-noun (value) chain data correlation templates, characterized in that the following steps are performed:
(a) step 1: In the verb-noun (value) chain data correlation template set of the theme (first-party) data resource, select a list of high-word-frequency nouns (for example, noun first, noun second, ...);
(b) step 2: Compare the nouns of the high-frequency noun list with the nouns in the verb-noun (value) chain data correlation template set of the target (second-party) data resource;
(for example, noun first is compared with noun third, noun first, noun third in ...: verb third: noun third: verb first: noun first: verb fourth: noun fourth: ... ≈ word frequency ratio ...: v: (n+n2): v: (n+n2): v: (n+n2): ...;)
(c) step 3: At the position of the successfully matched same-name noun in the verb-noun (value) chain data correlation template of the target (second-party) data resource (for example, the position of same-name noun first in the template ...: noun third: verb first: noun first (same-name position): verb fourth: noun fourth: ... ≈ word frequency ratio ...: (n+n2): v: (n+n2): v: (n+n2): ...), select at least one verb and one noun on the alternating verb/noun phrase chain to the left, to the right, or on both sides;
The alternating verb-noun phrase association chain selected to the left, to the right, or on both sides of that position (not including the same-name noun itself) (for example, noun third: verb first ≈ word frequency ratio (n+n2): v and verb fourth: noun fourth ≈ word frequency ratio v: (n+n2)) becomes the data correlation customized template (set) of the theme (first-party) data resource and the target (second-party) data resource;
Wherein, the verb-noun (value) chain data correlation template of a unit (department) data resource may serve as that of the theme (first-party) data resource, and the verb-noun (value) chain data correlation template of a personal group data resource may serve as that of the target (second-party) data resource, to make a unit supply-personal group demand customized template;
Wherein, the verb-noun (value) chain data correlation template of a unit (department) data resource may serve as that of the theme (first-party) data resource, and the verb-noun (value) chain data correlation template of a unit group sample data resource may serve as that of the target (second-party) data resource, to make a unit supply-unit group value chain (supply chain) customized template;
Wherein, the verb-noun (value) chain data correlation template of a personal data resource may serve as that of the theme (first-party) data resource, and the verb-noun (value) chain data correlation template of a personal group data resource may serve as that of the target (second-party) data resource, to make a personal-group learning and exchange customized template.
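The three steps of claim 6 can be sketched as follows. This is a minimal illustration only: the function names, the `(kind, word, frequency)` tuple layout, and the `span` parameter (how many neighbouring phrases to keep on each side) are assumptions introduced here, and a noun that occurs more than once in the target chain simply keeps its last occurrence.

```python
from collections import Counter


def high_frequency_nouns(noun_frequencies, k=1):
    """Step 1: pick the k most frequent nouns of the theme (first-party) template set."""
    return [noun for noun, _ in Counter(noun_frequencies).most_common(k)]


def customized_template(nouns, target_chain, span=2):
    """Steps 2-3: locate each noun in the target chain and keep its neighbours.

    target_chain is a list of (kind, word, frequency) tuples alternating
    between "V" (verb) and "N" (noun). The matched noun itself is excluded.
    """
    wanted = set(nouns)
    template = {}
    for i, (kind, word, _) in enumerate(target_chain):
        if kind == "N" and word in wanted:
            template[word] = {
                "left": target_chain[max(0, i - span):i],
                "right": target_chain[i + 1:i + 1 + span],
            }
    return template
```

With `span=2`, a match on a noun in the middle of the chain keeps one noun-verb pair on its left and one verb-noun pair on its right, which corresponds to the example pairs given in step 3.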
7. An intelligent system for data mining based on a personal mobile device, including a corpus, an ontology knowledge base, etc., characterized in that the personal mobile device includes a personal mobile device output/input synchronization module and a template feature extraction and making module, wherein:
(1) The personal mobile device output/input synchronization module is used to synchronously or asynchronously copy and collect the output/input personal data resources on the personal mobile device, such as the input method, camera, shared memory, cache, temporary file cache, application (APP) records saved to local temporary files, open network interfaces, and navigation API, and to provide them to the template feature extraction and making module;
Wherein, the synchronized data may be pre-processed by data desensitization and bleaching or by image characteristic value extraction;
(2) The template feature extraction and making module performs data mining and characteristic value extraction on the personal data resources synchronized on the personal mobile device, and makes personal data models, patterns or templates;
Wherein, the data model, pattern or template of the personal data resource may be made on the personal mobile device;
Wherein, verb-noun (value) chain data correlation templates and class templates may be made;
The following steps are performed to make the verb-noun (value) chain data correlation template:
(a) step 1: Pre-process the text data by language identification and part-of-speech tagging, marking the nouns and verbs of each sentence;
and perform syntactic analysis, marking the subject, predicate and object of each sentence;
Wherein, the subject of a passive-voice sentence may be marked as an object;
Wherein, according to the type of main body of the data resource, coreference resolution is performed on subject nouns or object nouns;
(b) step 2: Extract the phrases marked as both subject and noun, the phrases marked as both predicate and verb, and the phrases marked as both object and noun, together with their subject-predicate and predicate-object association relations in their respective sentences:
This yields the set of nouns serving as subjects, the set of verbs serving as predicates, and the set of nouns serving as objects, that is, subject noun set: predicate verb set / predicate verb set: object noun set, i.e. the phrase combinations
noun: verb /
verb: noun
and the subject-predicate / predicate-object (one-to-one) association relations between them;
(c) step 3: Count (during extraction) and mark the word frequency of each phrase, as an index measuring the weight of the phrases in the one-to-one chain association relation,
noun n: verb v and
verb v: noun n2 (word frequencies v, n, n2 are positive integers);
(d) step 4: Merge the repeated noun phrases, and their word frequencies, that link the noun: verb one-to-one chain association phrases with the verb: noun one-to-one chain association phrases, and connect the two into a multi-link association chain:
...
...: verb: identical noun
identical noun: verb: ...
...
This yields the multi-link phrase chain ...: verb: (merged identical) noun: verb: ..., in which verbs and nouns alternate as link nodes, i.e. the multi-link, multi-dimensional alternating verb/noun phrase chain ...: verb: noun: verb: noun: .... The verb: noun and noun: verb association phrases are thus linked in series, and may even form a closed loop in which head and tail are linked, i.e. a multi-link closed-loop alternating verb/noun association phrase chain,
obtaining the alternating, cyclic chain ...: verb: noun: ... with its link association phrase weight index ...: verb: noun: ... ≈ word frequency ratio ...: v: (n+n2): ..., namely the verb-noun (value) chain data correlation template (word frequencies v, n, n2 are positive integers);
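Steps 2 to 4 above can be sketched as follows, assuming the tagging and parsing of step 1 have already produced (subject noun, predicate verb, object noun) triples. The function names, the `("N", word)` / `("V", word)` node encoding, and the greedy "follow the heaviest link" walk are illustrative assumptions; the text does not prescribe how a chain is traversed.

```python
from collections import Counter


def count_links(triples):
    """Count word frequencies and noun->verb / verb->noun links.

    triples: iterable of (subject_noun, predicate_verb, object_noun);
    either noun may be None when the sentence has no such element.
    """
    subject_freq, verb_freq, object_freq = Counter(), Counter(), Counter()
    links = Counter()
    for subject, verb, obj in triples:
        verb_freq[verb] += 1                               # v
        if subject is not None:
            subject_freq[subject] += 1                     # n
            links[(("N", subject), ("V", verb))] += 1
        if obj is not None:
            object_freq[obj] += 1                          # n2
            links[(("V", verb), ("N", obj))] += 1
    # Step 4: a noun seen both as object and as subject is merged, weight n + n2.
    noun_freq = subject_freq + object_freq
    return noun_freq, verb_freq, links


def walk_chain(start, links, noun_freq, verb_freq, max_len=8):
    """Follow the most frequent outgoing link from each node (an assumed heuristic)."""
    node, seen, chain = start, {start}, [start]
    while len(chain) < max_len:
        candidates = [(count, dst) for (src, dst), count in links.items()
                      if src == node and dst not in seen]
        if not candidates:
            break
        _, node = max(candidates)
        seen.add(node)
        chain.append(node)
    return [(word, (verb_freq if kind == "V" else noun_freq)[word])
            for kind, word in chain]
```

The returned list pairs each word of the alternating chain with its word frequency, which corresponds to the ratio ...: v: (n+n2): v: ... attached to the chain.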
Wherein, the following steps are further performed to make the class template (data clues + processing rule + characteristic value set):
(a) step 1: Using natural language processing tools such as a corpus and an ontology knowledge base, classify/cluster all noun phrases in the verb-noun (value) chain data correlation template (set) of the personal (group) data resource. Following the principle of retaining at least one adjacent phrase (i.e. the adjacent verb) on each of the left and the right of the template position of each same-class noun, select a portion (including the same-class noun) to form a verb-noun (value) chain data correlation template segment; divide the segments into same-class verb-noun (value) chain segments (sets), form classification/cluster templates, and name each class:
{class name a, ...: verb 1: same-class noun 1: verb 1: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ...: verb 2: same-class noun 2: verb 2: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ... verb-noun (value) chain segments (word frequencies v, n, n2 are positive integers)}
Wherein, in the same way, verb phrases that are identical, similar, synonymous or near-synonymous may also be used to divide class templates of same-class verbs;
(b) step 2: For a verb-noun (value) chain segment set of the same class, take the original phrases (all verbs and nouns of the segments) as the clue part, forming the clue (first) set;
take the principle of retaining (the same number of) at least one adjacent verb phrase on each of the left and the right of the template position of each same-class noun, together with the resulting ordering rule of the verb-noun (value) chain segments, as the rule part, forming the rule (second) set;
take the word frequencies of the verb-noun (value) chain segments of the same-class nouns as the statistical characteristic value part, forming the characteristic value (third) set;
The simplified expression and the detailed expanded expression are respectively:
class template (data clues + processing rule + characteristic value set) = {class template data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...) + processing rule: phrase chain segment ordering rule (...: verb 1: same-class noun 1: verb 1: ...; ...: verb 2: same-class noun 2: verb 2: ...) + characteristic value set (...: verb 1: same-class noun 1: verb 1: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ...: verb 2: same-class noun 2: verb 2: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ... phrase chain segment word frequency ratio (word frequencies v, n, n2 are positive integers))}
Wherein, other same-class characteristic data such as geographical location and time may also be extracted and classified together with the verb-noun (value) chain data correlation template, forming a more complex class template of the following form:
{class template data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...; A<north latitude N1", east longitude E1">, B<north latitude N2", east longitude E2">, C<north latitude N3", east longitude E3">, D<north latitude N4", east longitude E4"> ...) + processing rule: phrase chain segments (...: verb 1: same-class noun 1: verb 1: ...; ...: verb 2: same-class noun 2: verb 2: ...; place ordering rule A: B: C: D ...) + characteristic value set (...: verb 1: same-class noun 1: verb 1: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ...: verb 2: same-class noun 2: verb 2: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ... phrase chain segment word frequency ratio (word frequencies v, n, n2 are positive integers); place time characteristic values A: time1, B: time2, C: time3, D: time4 ...)};
Wherein, verb phrases that are identical or similar (synonyms, near-synonyms) may also be used to divide class templates of same-class verbs;
From the personal sample data resource synchronized by the personal mobile device, the personal class template (data clues + processing rule + characteristic value set) composed of verb-noun (value) chain data correlation templates is obtained; wherein, since the data is in the first person, the subject noun is set by default.
8. The intelligent system for data mining based on a personal mobile device according to claim 7, characterized in that the common data mining platform server includes: a template feature extraction module, a template matching module, a template library, and a matching result feedback and message communication module; wherein different weights may be set for the extracted word frequencies according to the differing degrees of correlation between the individual main body and the output/input personal data resources such as the input method, camera, shared memory, cache, temporary file cache, application (APP) records saved to local temporary files, open network interfaces, and navigation API; wherein the server or PC of a unit includes a public data mining sharing companion module:
(1) The public data mining sharing companion module is used to gather and share, manually or automatically, the electronic document data resources of the unit (department), particularly text data, from the server or PC of the unit (department) to the common data mining platform server, the data resources of numerous units being gathered to form the unit group sample data resource;
Wherein, data pre-processing is performed before sharing, excluding electronic documents with identical content, or with similar content and the same time; the synchronized data may also be pre-processed by data desensitization;
(2) The template feature extraction and making module on the common data mining platform server is used to perform data mining on data resources and make data models, patterns or templates;
Wherein, the personal data resources synchronized from numerous personal mobile devices are pooled to form personal group data, on which data mining and characteristic value extraction are performed to make personal group data models, patterns or templates; from the personal group sample data resource gathered from personal mobile devices, the personal group class templates (data clues + processing rule + characteristic value set) composed of numerous series of verb-noun (value) chain data correlation templates are obtained;
Wherein, data mining is performed, and data patterns or templates are made, for: the unit (department) data resource; the unit group sample data resource gathered from numerous units; the personal data resource synchronized by a mobile device; the personal group sample data resource gathered from mobile devices; the mixed data resource of the personal group sample data resource and the unit (department) data resource (used for matching unit supply templates with personal demand templates); the mixed data resource of the personal data resource and the unit data resource (used for matching personal specialty templates with unit post templates); and the total data resource;
Wherein, verb-noun (value) chain data correlation templates and class templates may be made and obtained;
Wherein, the following steps may be performed to make the unit supply-personal group demand class template:
(d) step 1: Compare the class names of the unit (department) class template (segment) set of the unit (department) data resource with the class names in the personal group class template (segment) set of the personal group sample data resource serving as the comparison object;
(e) step 2: If a class name matches, obtain the two sets, belonging respectively to the unit (department) class templates (segments) and the personal group class templates (segments), that share the class name;
Select and directly quote the shared class name and the personal group class template (segment) set of the comparison object to which it belongs, forming the unit supply-personal group demand class template;
The result is: unit supply-personal group demand class template = set of {unit same-name class name + personal group same-name class template (data clues + processing rule + characteristic value set)};
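The two steps above amount to joining two template sets on their class names and quoting the personal group side. A minimal sketch, assuming each set is held as a mapping from class name to class template (the mapping layout and the function name are hypothetical):

```python
def supply_demand_class_templates(unit_templates, group_templates):
    """Keep the personal group class template for every class name that
    also appears among the unit (department) class templates.

    Both arguments map class name -> class template (any object).
    """
    return {name: group_templates[name]
            for name in unit_templates
            if name in group_templates}
```

Class names present on only one side are dropped, which corresponds to the case where the class name comparison of step 1 does not match.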
(3) The template matching module is used to compare the characteristic values of the data templates of the data resources of different main bodies;
Wherein, the characteristic value comparison of verb-noun (value) chain data correlation templates and class templates may be used for mutually corresponding matching comparison of characteristic values among: the verb-noun (value) chain data correlation template and class template of a unit (department); those of the unit group sample; those of an individual; those of a personal group; and the holographic verb-noun (value) chain data correlation template and class template;
Wherein, the comparison of verb-noun (value) chain data correlation templates performs the following steps:
(a) step 1: The phrases of the verb-noun (value) chain data correlation patterns or templates (phrase sets) made from two different (main body) data resources are compared with each other;
(b) step 2: The comparison yields the identical phrases: identical verbs, identical nouns, identical verb and identical noun, identical noun and identical verb, or identical alternating verb/noun multi-link phrase chains; wherein, if adverbs are present, identical adverbs may also be included in the comparison, i.e. identical noun, identical adverb and identical verb, or identical adverb, identical verb and identical noun;
(c) step 3: The word frequency ratios of the identical phrases are compared;
(d) step 4: Output the result:
First, the result when the word frequency ratios of identical phrases are equal:
Identical verb and identical noun: the word frequency ratios v: n2 are equal;
Identical noun and identical verb: the word frequency ratios n: v are equal;
Identical alternating verb/noun multi-link phrases: the word frequency ratios ...: v: (n+n2): v: ... are equal, with
1. the components (n, n2) all equal;
2. the totals (n+n2) equal;
The comparison of the verb-noun (value) chain data correlation templates matches successfully;
Second, the result when the word frequency ratios of identical phrases are not equal:
Identical verb:
display the ordering of the associated high-frequency nouns;
Identical noun:
display the ordering of the associated high-frequency verbs;
Identical verb and identical noun:
the proportion of the verb word frequency is inversely proportional to the noun word frequency in the word frequency ratio; display the difference value of the word frequency ratios;
Identical noun and identical verb:
the proportion of the verb word frequency is directly proportional to the noun word frequency in the word frequency ratio; display the difference value of the word frequency ratios;
Identical alternating verb/noun multi-link phrases:
display the difference values of the noun and verb word frequency ratios;
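Steps 1 to 4 of this comparison can be sketched for the simplest case, identical verb: noun pairs with word frequencies (v, n2). The sketch below is an assumption-laden illustration: the function name and the dictionary layout are hypothetical, the ratios are compared for exact equality using rational arithmetic, and the "difference value" is taken to be the plain difference of the two ratios, which the text does not define.

```python
from fractions import Fraction


def compare_word_frequency_ratios(template_a, template_b):
    """Compare the word frequency ratios of phrases found in both templates.

    Each template maps an identical phrase pair, e.g. (verb, noun), to its
    word frequencies, e.g. (v, n2). Returns the list of pairs whose ratios
    are equal and a dict {pair: ratio difference} for the others.
    """
    equal, differing = [], {}
    # Steps 1-2: only phrases present in both templates are compared.
    for pair in sorted(template_a.keys() & template_b.keys()):
        # Step 3: exact ratio comparison (2:4 equals 3:6).
        ratio_a = Fraction(*template_a[pair])
        ratio_b = Fraction(*template_b[pair])
        # Step 4: report equality, or the difference value.
        if ratio_a == ratio_b:
            equal.append(pair)
        else:
            differing[pair] = ratio_a - ratio_b
    return equal, differing
```

Using `Fraction` avoids floating-point rounding, so proportional frequencies such as 2:4 and 3:6 are reported as equal.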
Wherein, in the verb-noun (value) chain data correlation templates made from personal data resources, nouns of a certain topic (for example, interest, hobby, specialty, etc.) may be selected and compared with each other, the matching yielding approximate verb-noun value chain association relations for the matched personal topic;
Or, according to a given noun, the verb-noun (value) chain data correlation template made from a personal data resource may be compared with the verb-noun (value) chain data correlation template made from the personal group sample data resource, obtaining the difference value, position and standing of the individual's verb word frequency degree in the (given) noun (interest, hobby, specialty, etc.)-verb action (value) chain link relative to the group data resource as a whole;
Wherein, from the verb-noun (value) chain data correlation template made from a unit (department) data resource, noun phrases with high word frequency are selected and compared with the verb-noun (value) chain data correlation template made from the unit group sample data resource gathered from numerous units, or from the total data resource, so as to obtain the difference value, position and standing of the unit (department)'s verb word frequency degree in the (high-frequency) noun-verb action (value) chain link relative to the unit group sample data resource as a whole or the total data resource as a whole;
Wherein, from the verb-noun (value) chain data correlation template made from a unit (department) data resource, noun phrases with high word frequency are selected and compared with the verb-noun (value) chain data correlation template of the personal group sample data resource; the verb-noun (value) chain data correlation template in which a successfully matched noun is located may serve as the verb-noun (value) chain data correlation template of the matching relation between the supply, whose topic is the unit (department)'s (high-frequency) noun, and the personal group demand;
Wherein,
the original class template {class template data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...) + processing rule: phrase chain segments (...: verb 1: same-class noun 1: verb 1: ...; ...: verb 2: same-class noun 2: verb 2: ...) + characteristic value set (...: verb 1: same-class noun 1: verb 1: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ...: verb 2: same-class noun 2: verb 2: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ... phrase chain segment word frequency ratio (word frequencies v, n, n2 are positive integers))}
is compared with the object class template (data clues + processing rule + characteristic value set);
The class template (data clues + processing rule + characteristic value set) comparison performs the following steps:
(a) step 1: According to the same rule as the original class template, extract from the data clues of the object class template (data clues + processing rule + characteristic value set) of the object (main body) to be compared the phrases having the same names as those in the data clues of the original class template;
If no same-name phrase is extracted, return to the start;
If all of the above same-name phrases are extracted, the data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...) match successfully; proceed to the next step;
(b) step 2: Classify according to the same classification/clustering rule as the original class template, obtaining the phrase chain segment sequences of the same class:
For the verb-noun (value) chain data correlation template (set) of the data resource of the object (main body) to be compared, all noun phrases are classified/clustered. Following the principle of retaining the same number of adjacent phrases on the left and on the right of the template position of each same-class noun, the same portion of each verb-noun (value) chain is selected as a segment. The segments are divided into same-class verb-noun (value) chain segment sets, which form classification/cluster templates, and each class is named, obtaining:
{object class name A, processing rule (phrase chain segments) ...: verb 11: same-class noun 11: verb 11: ...; ...: verb 22: same-class noun 22: verb 22: ...; ... phrase chain segments}
which is compared with the phrase chain segment sequence processing rule in
{original class name a, processing rule phrase segments (...: verb 1: same-class noun 1: verb 1: ...; ...: verb 2: same-class noun 2: verb 2: ...; ... data correlation template phrase chain segments)};
If, in the phrase chain segment sequence processing rules, ...: verb 11: same-class noun 11: verb 11: ... matches ...: verb 1: same-class noun 1: verb 1: ...; ...: verb 22: same-class noun 22: verb 22: ... matches ...: verb 2: same-class noun 2: verb 2: ...; and so on, so that all phrase chain segment sequence processing rules match, proceed to the next step;
If any phrase chain segment of the processing rule does not match, return to the start;
(c) step 3: According to the same ordering processing rule as the original class template, compare the word frequency ratios of the same-class phrase chain segment sequences that have the same name in the two class templates:
The characteristic value set of the object class template to be compared {...: verb 11: same-class noun 11: verb 11: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ...: verb 22: same-class noun 22: verb 22: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ... word frequency ratio (word frequencies v, n, n2 are positive integers)}
is compared with
the characteristic value set of the original class template {...: verb 1: same-class noun 1: verb 1: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ...: verb 2: same-class noun 2: verb 2: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ... word frequency ratio (word frequencies v, n, n2 are positive integers)};
If:
the word frequency ratio ...: v: (n+n2): v: ... of ...: verb 11: same-class noun 11: verb 11: ... ≈ that of ...: verb 1: same-class noun 1: verb 1: ...;
the word frequency ratio ...: v: (n+n2): v: ... of ...: verb 22: same-class noun 22: verb 22: ... ≈ that of ...: verb 2: same-class noun 2: verb 2: ...;
... and so on, so that all characteristic value word frequency ratios are identical or approximately equal, then the characteristic value sets match successfully;
When, in the class template (data clues + processing rule + characteristic value set), the data clues match successfully, the phrase chain segment sequence processing rules are consistent, and the word frequency ratios of the characteristic value sets are equal, the final result is that the two class templates match successfully;
If the characteristic value word frequency ratios are not equal, the class template matching fails;
Wherein, if other classification rules are set to extract other characteristic data, the success or failure of the matching comparison is also determined according to those rules; the whole template matching succeeds only when every part matches, and the failure of any partial match causes the whole template matching to fail;
Wherein, a template matching module may also be included on the personal mobile device;
(4) The template library is used to store the templates of the data resources of the various main bodies,
Wherein, it stores template sets such as: the verb-noun (value) chain data correlation template and class template of a unit (department); those of the unit group sample; those of an individual; those of a personal group; the holographic verb-noun (value) chain data correlation template and class template; the verb-noun (value) chain data correlation template of the unit (department)-personal group mixed data resource; the verb-noun (value) chain data correlation template of the personal-unit (department post) mixed data resource; etc.;
(5) The matching result feedback and message communication module is used to feed back the message data of each successful template comparison to the corresponding data resource main body devices, and for interactive message communication between them; wherein it is also used to feed back the message data of successful matches in the comparison module for verb-noun (value) chain data correlation templates or class templates (data clues + processing rule + characteristic value set) to the corresponding data resource main body devices;
Wherein, the personal mobile device output/input synchronization module and the template feature extraction and making module may be run on a separate (security chip) processor on the personal mobile device.
9. An intelligent system based on a personal mobile device that uses the customized template of verb-noun (value) chain data correlation templates or the class template (data clues + processing rule + characteristic value set), characterized in that the personal mobile device includes: an output/input data synchronization module, a template clue filtering module, a template matching comparison module, a personal clue library, a template library, and a module for outputting and displaying the corresponding agent service content; wherein,
(1) The output/input data synchronization module is used to synchronously copy and collect the output/input data on the personal mobile device, such as the input method, camera, shared memory, cache, temporary file cache, application (APP) records saved to local temporary files, open network interfaces, and navigation API;
Wherein, the synchronized data may be pre-processed by data desensitization and bleaching or by image characteristic value extraction;
(2) The template clue filtering module is used to filter, one by one in sequence, the data collected by the above output/input data synchronization module against the phrases (all verbs, nouns, etc.) in the customized templates made from verb-noun (value) chain data correlation templates in the template library, or against the clue data in the class templates (data clues + processing rule + characteristic value set); the successfully matched result data is recorded in the personal clue library, and the accumulated number of matches is recorded;
Wherein, different weights may be set for accumulating or recording the word frequencies of the filtered data, according to the differing degrees of correlation between the individual main body and the output/input personal data resources such as the input method, camera, shared memory, cache, temporary file cache, application (APP) records saved to local temporary files, open network interfaces, and navigation API;
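The clue filtering and per-source weighting described for this module can be sketched as follows. The function name, the `(source, token)` event layout, the source labels and the weight values are all hypothetical; the text only states that matches are recorded with an accumulated count and that different sources may carry different weights.

```python
from collections import Counter


def filter_clues(events, template_words, source_weights, default_weight=1.0):
    """Filter collected tokens against the template vocabulary.

    events: iterable of (source, token), e.g. ("input_method", "book").
    template_words: set of verbs/nouns taken from the templates in the library.
    source_weights: assumed mapping source -> weight for that data source.
    Returns (match counts, weighted word frequencies) for the personal clue library.
    """
    match_counts = Counter()
    weighted_frequency = Counter()
    for source, token in events:
        if token in template_words:
            match_counts[token] += 1
            weighted_frequency[token] += source_weights.get(source, default_weight)
    return match_counts, weighted_frequency
```

For example, a word typed through the input method could be given a higher weight than the same word recognised in a camera image, so that the two sources contribute differently to the accumulated word frequency.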
(3) The template matching comparison module is used to compare the templates in the template library with the templates extracted from the personal clue library;
Wherein, the template library includes, but is not limited to, the verb-noun (value) chain data correlation customized template of the unit supply-personal group demand customized template (for example, noun third: verb first ≈ word frequency ratio n: v and verb fourth: noun fourth ≈ word frequency ratio v: n2), which is compared with the same-name verb-noun (value) chain data correlation template extracted and made from the personal clue library (for example, noun third: verb first ≈ word frequency ratio n: v' and verb fourth: noun fourth ≈ word frequency ratio v: n2); if the word frequency ratios of the same-name phrases are identical or approximately equal, the unit supply-personal group demand customized template matches successfully; otherwise the matching fails;
Wherein, given that the unit supply-personal group demand class template = set of {unit same-name class name + personal group same-name class template (data clues + processing rule + characteristic value set)}, the unit supply-personal group demand class template in the template library is compared with the same-name class template (data clues + processing rule + characteristic value set) extracted and made from the personal clue library:
the unit supply-personal group demand class template {data clues (... verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2 ...) + processing rule: phrase chain segment sequence (...: verb 1: same-class noun 1: verb 1: ...; ...: verb 2: same-class noun 2: verb 2: ...) + characteristic value set (...: verb 1: same-class noun 1: verb 1: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ...: verb 2: same-class noun 2: verb 2: ... ≈ word frequency ratio ...: v: (n+n2): v: ...; ... phrase chain segment word frequency ratio (word frequencies v, n, n2 are positive integers))}
is compared with the same-name class template (data clues + processing rule + characteristic value set) extracted and made from the personal clue library of the comparison object;
The following class template comparison steps are performed:
(a) Step 1: Following the same rules as the unit supply-personal group need class template, extract from the data clues of the object class template (data clues + processing rules + feature value set) of the object (subject) to be compared the same-name phrases that appear in the data clues of the unit supply-personal group need class template;
If no same-name phrases are extracted, return to the start;
If all of the above same-name phrases are extracted, the data clues (..., verb 1, same-class noun 1, verb 1, verb 2, same-class noun 2, verb 2, ...) are matched successfully, and the process proceeds to the next step;
(b) Step 2: Classify according to the same classification/clustering rules as the unit supply-personal group need class template to obtain phrase chain segments of the same category:
In the verb-noun (value) chain data correlation template (set) of the data resource of the object (subject) to be compared, classify/cluster all noun phrases. Following the principle of retaining the same number of adjacent phrases to the left and right of each same-class noun's position in the template, select the identical portion of the verb-noun (value) chain segments, group them into a set of same-category verb-noun (value) chain segments, assemble the classification/clustering template, and name the category, obtaining:
Object category name A processing rule (phrase chain segments) {... : verb 11 : same-class noun 11 : verb 11 : ...; ... : verb 22 : same-class noun 22 : verb 22 : ...; ... phrase chain segments}
This is compared with
the phrase chain segment ordering in the processing rule of the unit supply-personal group need class template a (... : verb 1 : same-class noun 1 : verb 1 : ...; ... : verb 2 : same-class noun 2 : verb 2 : ...; ... data correlation template phrase chain segments);
If the phrase chain segment sequence ... : verb 11 : same-class noun 11 : verb 11 : ... of the processing rule matches ... : verb 1 : same-class noun 1 : verb 1 : ..., and ... : verb 22 : same-class noun 22 : verb 22 : ... matches ... : verb 2 : same-class noun 2 : verb 2 : ..., and so on, so that all phrase chain segment sequences of the processing rules match, proceed to the next step;
If any phrase chain segment sequence of the processing rules does not match, return to the start;
(c) Step 3: Following the same processing rules as the unit supply-personal group need class template, compare the word frequency ratios of the identically named phrase chain segment sequences contained in the two class templates:
The feature value set of the object class template {... : verb 11 : same-class noun 11 : verb 11 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... : verb 22 : same-class noun 22 : verb 22 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... word frequency ratios (the word frequencies v, n, n2 are positive integers)}
is compared with
the feature value set of the unit supply-personal group need class template {... : verb 1 : same-class noun 1 : verb 1 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... : verb 2 : same-class noun 2 : verb 2 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ...; ... word frequency ratios (the word frequencies v, n, n2 are positive integers)};
If:
... : verb 11 : same-class noun 11 : verb 11 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ... ≈ ... : verb 1 : same-class noun 1 : verb 1 : ...;
... : verb 22 : same-class noun 22 : verb 22 : ... ≈ word frequency ratio ... : v : (n+n2) : v : ... ≈ ... : verb 2 : same-class noun 2 : verb 2 : ...
and so on, so that all feature value word frequency ratios are identical or approximately equal, the feature value sets are matched successfully.
For the same-name class template (data clues + processing rules + feature value set) extracted from the personal cue library, the data clues are then matched successfully, the phrase chain segment ordering of the processing rules is consistent, and the word frequency ratios of the feature value sets are equal; the final result is that the two class templates are matched successfully;
If the feature value word frequency ratios are not equal, the class template match fails;
A successful match of either the unit supply-personal group need custom template or the same-name personal group class template constitutes a successful template matching comparison;
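The three class template comparison steps (data clues, phrase chain segment ordering, word frequency ratios) can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the `ClassTemplate` structure, the function names, the returned status strings and the `tolerance` value are all hypothetical, and segments are assumed to carry identical names in both templates.

```python
from dataclasses import dataclass


@dataclass
class ClassTemplate:
    data_clues: list    # verbs and same-class nouns that must be present
    segments: list      # processing rule: ordered phrase chain segments (tuples of words)
    frequencies: dict   # feature value set: segment -> tuple of positive word frequencies


def normalise(freqs):
    """Turn raw frequencies such as (v, n + n2, v) into proportions that sum to 1."""
    total = sum(freqs)
    return [f / total for f in freqs]


def ratios_close(a, b, tolerance):
    """True when two frequency tuples describe approximately the same ratio."""
    if a is None or b is None or len(a) != len(b):
        return False
    return all(abs(x - y) <= tolerance for x, y in zip(normalise(a), normalise(b)))


def compare_class_templates(unit, obj, tolerance=0.05):
    # Step 1: every data clue of the unit template must occur in the object template.
    if not set(unit.data_clues) <= set(obj.data_clues):
        return "data clues mismatch"
    # Step 2: the shared segments must appear in the same order in both templates.
    shared = [s for s in obj.segments if s in unit.segments]
    if shared != unit.segments:
        return "segment order mismatch"
    # Step 3: the word frequency ratio of each segment must be approximately equal.
    for segment in unit.segments:
        if not ratios_close(unit.frequencies.get(segment),
                            obj.frequencies.get(segment), tolerance):
            return "frequency ratio mismatch"
    return "match"


seg1 = ("verb 1", "noun 1", "verb 1")
seg2 = ("verb 2", "noun 2", "verb 2")
clues = ["verb 1", "noun 1", "verb 2", "noun 2"]
unit = ClassTemplate(clues, [seg1, seg2], {seg1: (2, 3, 2), seg2: (4, 6, 4)})
obj = ClassTemplate(clues + ["verb 3"], [seg1, seg2], {seg1: (4, 6, 4), seg2: (2, 3, 2)})
print(compare_class_templates(unit, obj))  # match
```

Each step returns early on failure, which corresponds to the "return to the start" branches in the text.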
(4) an output and display module for corresponding data service content: after the template comparison module reports a successful template match, the preset corresponding data service content is output and displayed on the personal mobile device;
(5) a template library, for storing the set of unit supply-personal group need custom templates and the set of unit supply-personal group need class templates made from verb-noun (value) chain data correlation templates;
Wherein, the unit supply-personal group need custom templates may be updated through the personal mobile device, and templates such as individual-group learning association custom templates can be downloaded to the template library;
(6) a personal cue library: the data synchronized by the input data synchronization module is filtered using all verbs, nouns and other phrases in the custom templates and class templates of the verb-noun (value) chain data correlation templates, and the cue data obtained after filtering make up the personal cue library.
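As a rough illustration of item (6), the personal cue library can be thought of as the subset of synchronized input that mentions any word used by the templates. The sketch below assumes the input has already been split into token lists and that templates are stored as word pairs; both the representation and the function name are hypothetical.

```python
def build_cue_library(synced_sentences, templates):
    """Keep only the sentences that contain at least one template word.

    synced_sentences: list of token lists (tokenisation is assumed to be done elsewhere).
    templates: iterable of templates, each an iterable of (left_word, right_word) pairs.
    """
    vocabulary = set()
    for template in templates:
        for pair in template:
            vocabulary.update(pair)
    return [tokens for tokens in synced_sentences if vocabulary & set(tokens)]


templates = [{("manager", "approves"), ("approves", "plan")}]
sentences = [["manager", "approves", "budget"], ["weather", "is", "nice"]]
print(build_cue_library(sentences, templates))  # [['manager', 'approves', 'budget']]
```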
10. An intelligent system for data mining based on a personal mobile device using verb-noun (value) chain data correlation templates, characterized in that the personal mobile device includes: an input method software data synchronization module, a verb filtering and template generation module, a template sending management and matching result feedback interactive communication module, a personalized template library, and a verb library;
The data mining common platform server includes: a template receiving management and matching result feedback interactive communication module, a template comparison module, and a template library;
Wherein, the personal mobile device includes:
(1) an input method software data synchronization module, for synchronously copying and collecting the data entered through the input method on the personal mobile device;
(2) a verb filtering and template generation module, which compares the data collected by the input method software data synchronization module against the verbs in the verb library one by one and performs the following steps to generate personalized templates:
(a) Step 1: Filter the text data collected by the input method software data synchronization module with each verb in the set of common verbs in the verb library in turn;
(b) Step 2: Perform part-of-speech tagging on each sentence containing a successfully matched filter verb, marking the nouns in the sentence;
Also parse the grammar of each such sentence and mark, as far as possible, its subject, predicate and object;
Determine whether the filtered verb is a predicate verb;
Wherein, optionally, punctuation marks are automatically added to the time-separated text data collected from the input method, and reference (anaphora) resolution is performed for subject nouns or object nouns;
(c) Step 3: If the filtered verb is a predicate verb, extract from the sentence the phrase tagged as both predicate and verb, the phrase tagged as both subject and noun, and the phrase tagged as both object and noun, together with their one-to-one subject-predicate / predicate-object association within the sentence: subject noun : predicate verb / predicate verb : object noun;
(d) Step 4: Save the noun : verb / verb : noun phrase combinations extracted in step 3, together with the (one-to-one) subject-predicate / predicate-object associations between them, to the personalized template library;
The data collected by the input method software data synchronization module is then compared one by one against the verb : noun / noun : verb phrase combinations in the personalized template library. The matching word frequency of each phrase is recorded and annotated as an index of the weight of the two phrases in each one-to-one chain association, namely verb word frequency v : noun word frequency n2 / noun word frequency n : verb word frequency v (the word frequencies v, n, n2 are positive integers), yielding the verb-noun (value) chain data correlation template (set) of the form verb : noun ≈ word frequency ratio v : n2 / noun : verb ≈ word frequency ratio n : v;
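Steps 1 to 4 and the subsequent word frequency counting can be sketched as below. The sketch assumes that part-of-speech tagging and grammatical parsing have already been done by some external tool, so each sentence arrives as a dictionary with `tokens`, `subject`, `predicate` and `object` fields; that input format and the function name are assumptions for illustration, not something the text specifies.

```python
from collections import Counter


def generate_template(parsed_sentences, verb_library):
    """Build a verb-noun chain template: (left_word, right_word) -> (left_freq, right_freq)."""
    pairs = set()
    word_counts = Counter()
    for sentence in parsed_sentences:
        # Count every token so word frequencies cover all collected text.
        word_counts.update(sentence["tokens"])
        predicate = sentence.get("predicate")
        # Steps 1-2: keep only sentences whose predicate is a verb from the verb library.
        if predicate not in verb_library:
            continue
        # Steps 3-4: record the subject-predicate and predicate-object associations.
        if sentence.get("subject"):
            pairs.add((sentence["subject"], predicate))
        if sentence.get("object"):
            pairs.add((predicate, sentence["object"]))
    # Attach the word frequencies n : v (noun : verb) or v : n2 (verb : noun) to each pair.
    return {(left, right): (word_counts[left], word_counts[right]) for left, right in pairs}


parsed = [
    {"tokens": ["manager", "approves", "plan"],
     "subject": "manager", "predicate": "approves", "object": "plan"},
    {"tokens": ["manager", "approves", "budget"],
     "subject": "manager", "predicate": "approves", "object": "budget"},
    {"tokens": ["plan", "works"],
     "subject": "plan", "predicate": "works", "object": None},
]
template = generate_template(parsed, verb_library={"approves"})
print(template[("approves", "budget")])  # (2, 1)
```

Here "works" is not in the hypothetical verb library, so the third sentence contributes only to the word counts and not to the set of pairs.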
(3) a template sending management and matching result feedback interactive communication module, with which the user of the personal mobile device manages and displays the personalized template library, manages sending templates to the data mining common platform server for comparison with the templates of a specified subject data resource, and manages the user's interactive communication with the corresponding party;
(4) a personalized template library, for storing the set of personalized templates generated by the verb filtering and template generation module;
(5) a verb library, for storing common verbs;
Wherein, common Chinese verbs include, but are not limited to, the following:
Indicating actions and behavior: say, look, walk, listen, laugh, take, fly, run, eat, sing, drink, knock, sit, shout, stare, kick, smell, touch, criticize, publicize, safeguard, learn, study, carry out, start, stop, forbid
Indicating existence, change or disappearance: be at, die, have, equal, occur, evolve, develop, grow, perish, exist, eliminate
Indicating psychological activity: think, love, hate, fear, miss, intend, like, hope, be afraid, worry, dislike, feel, ponder
Indicating judgment: is, be
Indicating possibility, willingness or necessity (auxiliary verbs): can, be able to, may, be willing, agree to, dare, should, ought to, deserve, be worth, must, would rather
Indicating direction (directional verbs): up, down, in, out, return, open, cross, rise, come, come up, come down, come in, come out, come back, come open, come over, get up, go, go up, go down, go in, go out, go back, go open, go past
Indicating development: grow, wither, germinate, bear fruit, spawn;
For plans, systems, schemes, documents, etc.:
work out, work out, draft, draft, authorize, audit, examine, transmit, deliver, submit, report, assign, put on record, archive, put forward opinions
For information and data:
investigate, study, collect, arrange, analyze, conclude, analyze, summarize, provide, report, feed back, pass on, notify, issue, maintain, manage
For a given piece of work (superior):
preside over, organize, instruct, arrange, coordinate, indicate, supervise, manage, distribute, control, take the lead, be responsible for, examine and approve, authorize, sign and issue, approve, assess
Thinking behavior:
research, analyze, assess, develop, suggest, propose, participate, recommend, plan
Direct action:
organize, carry out, perform, instruct, lead, control, supervise, use, produce, participate, illustrate, explain, provide, assist
Superior's behavior:
license, ratify, define, determine, instruct, establish, plan, supervise, decide
Administrative behavior:
reach, assess, control, coordinate, ensure, identify, keep, supervise
Expert's behavior:
analyze, assist, promote, get in touch, suggest, recommend, support, assess, evaluate
Subordinate's behavior:
check, check, collect, obtain, submit, make
Other:
maintain, keep, establish, develop, prepare, process, execute, receive, arrange, monitor, report, run, confirm, conceptualize, cooperate, cooperate, obtain, check, check, get in touch, design, test, build, change, write, draft, guide, transmit, translate, operate, ensure, prevent, solve, introduce, pay, calculate, revise, undertake, negotiate, confer, interview, refuse, veto, monitor, predict, compare, delete, use
Wherein, the data mining common platform server includes:
(1) a template receiving management and matching result feedback interactive communication module, for receiving the personalized templates sent from the personal mobile device and forwarding them to the template comparison module for comparison with the specified templates of the template library;
It feeds the matching result data back to the personal device, and enables interactive communication between the personal device and the device of the subject that owns the compared template;
(2) a template comparison module, for receiving personalized templates and comparing them with the templates of the specified subject data resource from the template library, and for feeding the matching result back to the personal device through the template receiving management and matching result feedback interactive communication module;
(3) a template library, for storing the templates of various subject data resources;
Wherein, the templates include, but are not limited to, template sets such as: the verb-noun (value) chain data correlation templates and class templates of a unit (department), of office complex samples, of an individual, and of a personal group, as well as holographic verb-noun (value) chain data correlation templates and class templates.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710303290.0A CN107341171B (en) | 2017-05-03 | 2017-05-03 | Method for extracting data feature template and method and system for applying template |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710303290.0A CN107341171B (en) | 2017-05-03 | 2017-05-03 | Method for extracting data feature template and method and system for applying template |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107341171A (en) | 2017-11-10 |
CN107341171B (en) | 2021-07-27 |
Family
ID=60220082
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710303290.0A Active CN107341171B (en) | 2017-05-03 | 2017-05-03 | Method for extracting data feature template and method and system for applying template |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107341171B (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109657013A (en) * | 2018-11-30 | 2019-04-19 | 杭州数澜科技有限公司 | A kind of systematization generates the method and system of label |
CN110471597A (en) * | 2019-07-25 | 2019-11-19 | 北京明略软件系统有限公司 | A kind of data mask method and device, computer readable storage medium |
CN110738033A (en) * | 2018-07-03 | 2020-01-31 | 百度在线网络技术(北京)有限公司 | Report template generation method, device and storage medium |
CN111428508A (en) * | 2018-12-24 | 2020-07-17 | 微软技术许可有限责任公司 | Style customizable text generation |
CN111859858A (en) * | 2020-07-22 | 2020-10-30 | 智者四海(北京)技术有限公司 | Method and device for extracting relationship from text |
US20210019569A1 (en) * | 2019-07-16 | 2021-01-21 | Ancestry.Com Operations Inc. | Extraction of genealogy data from obituaries |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030055625A1 (en) * | 2001-05-31 | 2003-03-20 | Tatiana Korelsky | Linguistic assistant for domain analysis methodology |
CN101814067A (en) * | 2009-01-07 | 2010-08-25 | 张光盛 | System and methods for quantitative assessment of information in natural language contents |
CN103186633A (en) * | 2011-12-31 | 2013-07-03 | 北京百度网讯科技有限公司 | Method for extracting structured information as well as method and device for searching structured information |
CN106104524A (en) * | 2013-12-20 | 2016-11-09 | 国立研究开发法人情报通信研究机构 | Complex predicate template collection device and be used for its computer program |
CN106484675A (en) * | 2016-09-29 | 2017-03-08 | 北京理工大学 | Fusion distributed semantic and the character relation abstracting method of sentence justice feature |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110738033A (en) * | 2018-07-03 | 2020-01-31 | 百度在线网络技术(北京)有限公司 | Report template generation method, device and storage medium |
CN110738033B (en) * | 2018-07-03 | 2023-09-19 | 百度在线网络技术(北京)有限公司 | Report template generation method, device and storage medium |
CN109657013A (en) * | 2018-11-30 | 2019-04-19 | 杭州数澜科技有限公司 | A kind of systematization generates the method and system of label |
CN111428508A (en) * | 2018-12-24 | 2020-07-17 | 微软技术许可有限责任公司 | Style customizable text generation |
US20210019569A1 (en) * | 2019-07-16 | 2021-01-21 | Ancestry.Com Operations Inc. | Extraction of genealogy data from obituaries |
US11537816B2 (en) * | 2019-07-16 | 2022-12-27 | Ancestry.Com Operations Inc. | Extraction of genealogy data from obituaries |
US20230109073A1 (en) * | 2019-07-16 | 2023-04-06 | Ancestry.Com Operations Inc. | Extraction of genealogy data from obituaries |
US11797774B2 (en) * | 2019-07-16 | 2023-10-24 | Ancestry.Com Operations Inc. | Extraction of genealogy data from obituaries |
CN110471597A (en) * | 2019-07-25 | 2019-11-19 | 北京明略软件系统有限公司 | A kind of data mask method and device, computer readable storage medium |
CN111859858A (en) * | 2020-07-22 | 2020-10-30 | 智者四海(北京)技术有限公司 | Method and device for extracting relationship from text |
CN111859858B (en) * | 2020-07-22 | 2024-03-01 | 智者四海(北京)技术有限公司 | Method and device for extracting relation from text |
Also Published As
Publication number | Publication date |
---|---|
CN107341171B (en) | 2021-07-27 |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
CB02 | Change of applicant information | Address after: 101100 room 1901, unit 1, Beijing ONE3 building, Zhongshan Avenue, Tongzhou District, Beijing. Applicant after: Liu Hongli. Address before: 101100 Beijing Tongzhou District Xinhua Street East End Gucheng Road West Garden District 1 District 7 Building 342 room. Applicant before: Liu Hongli |
DD01 | Delivery of document by public notice | Addressee: Liu Hongli. Document name: Notification of Passing Examination on Formalities |
GR01 | Patent grant | |