CN104537060A - Observed object system mixed organization model oriented to space-time datum - Google Patents
- Publication number
- CN104537060A
- Authority
- CN
- China
- Prior art keywords
- observation
- data
- feature
- name
- word
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9537—Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
Abstract
The invention discloses an observed object system mixed organization model oriented to a space-time datum. The method comprises: establishing an observed object system by collecting the observed objects that require attention, an observed object being an entity or target of interest; extracting features of the observed objects, including surface features, boundary features, famous-name features, transliterated-name symbol features and part-of-speech features; identifying the observed objects with a statistical machine learning method, with parameters estimated by the GIS (Generalized Iterative Scaling) algorithm; identifying person names and weapon names with rules; and establishing the association between data and observed objects from the recognition results of the third and fourth steps. The model achieves rapid organization of data and facilitates subsequent in-depth data mining.
Description
Technical field
The invention belongs to the technical field of automatic data association and relates to an observed object system mixed organization model oriented to a space-time datum.
Background art
In systems for military applications, all kinds of massive spatial information must be integrated under a unified data model and organizational framework in order to ease the storage and management of data, support the retrieval, extraction and analysis of information, and serve information mining and intelligence analysis. Traditional data organization models generally organize data by data type, space and time. Such a system can describe the types of the data and the spatio-temporal relationships among them, but it cannot establish content-based links between data. More importantly, typical military applications tend to focus on observed objects of particular kinds (for example a certain military or political figure, or a certain combat unit) and provide a range of applications by statistically processing how those objects appear in the data.
For example, in a military-specific data organization model, the observed object system model that makes up the data application layer sits at the top of the model hierarchy and directly serves data applications. Building the observed object system is therefore the foundation of military applications.
Summary of the invention
The object of the invention is to provide an observed object system mixed organization model oriented to a space-time datum. Its beneficial effect is that it solves the problem of automatically organizing and associating massive data.
The technical solution adopted by the invention is a method of establishing the model that comprises the following steps:
Step 1: Establish the observed object system: collect the observed objects that require attention and build the observed object system from them; an observed object is an entity or target of interest;
Step 2: Extract the features of the observed objects, including surface features, boundary features, famous-name features, transliterated-name symbol features and part-of-speech features;
Step 3: Identify the observed objects with a statistical machine learning method, using the GIS (Generalized Iterative Scaling) algorithm for parameter estimation:
Calculate [formula image not preserved],
where P(x) is the empirical distribution of x in the training sample, $P_j(y|x)$ is the probability that the observed word sequence produces the label, and $f_i(x, y)$ are the individual features;
Calculate [formula image not preserved], where [formula image not preserved] and c is the size of the training sample;
Recalculate [formula image not preserved];
Repeat the calculation until convergence. Through the above computation a label y is automatically assigned to data x; this is the prediction process, and the label denotes the class of the data;
Step 4: Identify person names and weapon names with rules;
Step 5: Establish the association between data and observed objects from the recognition results of Step 3 and Step 4.
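For orientation, the textbook form of Generalized Iterative Scaling for a maximum entropy model is given below. It is not a recovery of the formulas in Step 3, whose images are missing from this text; the symbols $\lambda_i$ (feature weights), $Z_j(x)$ (normalizer), $\tilde{P}(x, y)$ (joint empirical distribution) and $C$ are assumptions introduced here. In the textbook algorithm $C$ is the largest value of $\sum_i f_i(x, y)$ over all pairs, whereas the source describes c as the size of the training sample, so the two may play different roles.

```latex
% Model expectation of feature f_i under the current model P_j
E_{P_j}[f_i] = \sum_{x, y} P(x)\, P_j(y \mid x)\, f_i(x, y)

% Weight update, with the empirical expectation of f_i in the numerator
\lambda_i^{(j+1)} = \lambda_i^{(j)} + \frac{1}{C} \ln \frac{E_{\tilde{P}}[f_i]}{E_{P_j}[f_i]},
\qquad
E_{\tilde{P}}[f_i] = \sum_{x, y} \tilde{P}(x, y)\, f_i(x, y)

% Recomputed conditional model
P_{j+1}(y \mid x) = \frac{1}{Z_{j+1}(x)} \exp\Big( \sum_i \lambda_i^{(j+1)} f_i(x, y) \Big),
\qquad
Z_{j+1}(x) = \sum_{y'} \exp\Big( \sum_i \lambda_i^{(j+1)} f_i(x, y') \Big)
```

The three expressions correspond in order to the "calculate", "calculate" and "recalculate" operations, which are repeated until the weights stop changing.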
Further, the features of the observed objects are extracted in Step 2 as follows:
Select a feature window size of 2, and let the center word of the candidate target together with the two words before and after it be $w_{-2}\,w_{-1}\,w_0\,w_1\,w_2$, where $w_0$ is the current word, $w_1$ is the word after the current word, $w_{-1}$ is the word before the current word, and $w_2$ and $w_{-2}$ follow by analogy:
Surface features:
$$f_i(x, y) = \begin{cases} 1, & \text{if } w_1 = \text{"delivering" and } y = \text{person} \\ 0, & \text{otherwise} \end{cases}$$
where x denotes $w_{-2}\,w_{-1}\,w_0\,w_1\,w_2$, y denotes the annotation label and i is the index of the feature. If this combination of data and label occurs, the feature function is satisfied and takes the value 1; otherwise it takes the value 0. Here the condition is satisfied, and the value is 1, when $w_1$ = "delivering" and y = person;
Boundary features:
Famous-name features:
$w_0$ matches a dictionary entry completely, partially, or as a fragment;
Transliterated-name symbol features: sentences containing the special characters " ", " " or "-";
Part-of-speech features:
The beneficial effect of the invention is that it achieves rapid organization of data and facilitates subsequent in-depth data mining.
Brief description of the drawings
Fig. 1 shows the observed object system mixed organization model oriented to a space-time datum according to the invention.
Embodiment
The invention is described in detail below with reference to an embodiment.
Fig. 1 shows the observed object system mixed organization model oriented to a space-time datum according to the invention. The specific steps are:
Step 1: Establish the observed object system. Collect the observed objects that require attention and build the observed object system from them. Specifically, an observed object is an entity or target that we care about; it is generally expressed by a noun and refers to a concrete thing. For example, Jobs and Apple Computers are both entity objects. The observed objects can be collected from knowledge bases (Wikipedia, Baidu Baike, etc.), and only the observed objects relevant to the requirements need to be considered. Classifying these observed objects then yields the observed object system.
Step 2: Extract the features of the observed objects, including surface features, boundary features, famous-name features, transliterated-name symbol features and part-of-speech features.
The feature window size is chosen as 2, that is, the center word of the candidate target together with the two words before and after it ($w_{-2}\,w_{-1}\,w_0\,w_1\,w_2$), where $w_0$ is the current word, $w_1$ is the word after the current word, $w_{-1}$ is the word before the current word, and $w_2$ and $w_{-2}$ follow by analogy. The following features are used:
Surface features:
$$f_i(x, y) = \begin{cases} 1, & \text{if } w_1 = \text{"delivering" and } y = \text{person} \\ 0, & \text{otherwise} \end{cases}$$
where x denotes $w_{-2}\,w_{-1}\,w_0\,w_1\,w_2$ and y denotes the annotation label. A combination that satisfies this condition in the corpus is regarded as a valid feature only if its frequency is greater than a threshold (in the invention the threshold is 2);
In the above formula, i is the index of the feature. If this combination of data and label occurs, the feature function is satisfied and takes the value 1; otherwise it takes the value 0. In the above formula the condition is satisfied, and the value is 1, when $w_1$ = "delivering" and y = person.
Boundary features:
Famous-name features are shown in Table 1:
Table 1
Transliterated-name symbol features: in a sentence containing the special characters " ", " " or "-", the text near the special character may be a person name.
Part-of-speech features:
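As an illustration of Step 2 as a whole (not of the part-of-speech features specifically), the sketch below builds the size-2 window, counts (context, label) combinations in a labeled corpus, keeps those whose frequency is greater than the threshold of 2, and evaluates the resulting binary feature function. The names `PAD`, `window`, `count_candidates`, `valid_features` and `feature_value` are hypothetical, as is the choice to describe a feature by a single window position; the source does not specify the representation.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

PAD = "<PAD>"  # hypothetical token for window positions outside the sentence

Feature = Tuple[int, str, str]  # (window offset k, word at w_k, label y)


def window(words: List[str], i: int, size: int = 2) -> Dict[int, str]:
    """Map each offset k in [-size, size] to the word w_k around position i."""
    return {
        k: words[i + k] if 0 <= i + k < len(words) else PAD
        for k in range(-size, size + 1)
    }


def count_candidates(
    corpus: List[Tuple[List[str], List[str]]], size: int = 2
) -> Counter:
    """Count every (offset, word, label) combination seen in a labeled corpus."""
    counts: Counter = Counter()
    for words, labels in corpus:
        for i, y in enumerate(labels):
            for k, w in window(words, i, size).items():
                counts[(k, w, y)] += 1
    return counts


def valid_features(counts: Counter, threshold: int = 2) -> Set[Feature]:
    """Keep only combinations whose frequency is greater than the threshold."""
    return {feat for feat, n in counts.items() if n > threshold}


def feature_value(feat: Feature, ctx: Dict[int, str], y: str) -> int:
    """Binary feature function: 1 if the (context, label) pair matches, else 0."""
    k, w, label = feat
    return int(ctx[k] == w and y == label)
```

With a corpus in which "delivering" follows a word labeled person more than twice, the combination `(1, "delivering", "person")` survives the threshold, and `feature_value` returns 1 for a person-labeled word whose $w_1$ is "delivering".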
Step 3: Identify the observed objects with a statistical machine learning method, using the GIS (Generalized Iterative Scaling) algorithm for parameter estimation.
Calculate [formula image not preserved],
where P(x) is the empirical distribution of x in the training sample, $P_j(y|x)$ is the probability that the observed word sequence produces the label, and $f_i(x, y)$ are the individual features.
Calculate [formula image not preserved], where [formula image not preserved] and c is the size of the training sample.
Recalculate [formula image not preserved].
Repeat the calculation until convergence. Through the above computation a label y can be automatically assigned to data x; this is the prediction process, and the label denotes the class of the data.
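The iteration in Step 3 can be sketched with a small textbook implementation of Generalized Iterative Scaling for a maximum entropy classifier. This is a minimal sketch under stated assumptions, not the patent's own formulas, which are missing from this text: `train_gis` and `predict` are hypothetical names, the constant `big_c` is the textbook GIS constant (the largest number of active features on any pair), a fixed iteration count stands in for a convergence test, and the slack feature that strict GIS uses to make feature sums constant is omitted.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

FeatureFn = Callable[[object, str], int]  # binary feature f_i(x, y)


def predict(
    x: object,
    labels: Sequence[str],
    features: Sequence[FeatureFn],
    weights: Sequence[float],
) -> Dict[str, float]:
    """Conditional model P(y|x) proportional to exp(sum_i weight_i * f_i(x, y))."""
    scores = {
        y: math.exp(sum(w * f(x, y) for w, f in zip(weights, features)))
        for y in labels
    }
    z = sum(scores.values())
    return {y: s / z for y, s in scores.items()}


def train_gis(
    samples: Sequence[Tuple[object, str]],
    labels: Sequence[str],
    features: Sequence[FeatureFn],
    iterations: int = 100,
) -> List[float]:
    """Estimate feature weights by repeating the GIS update."""
    n = len(samples)
    # Textbook GIS constant: most features active on any (x, y) pair.
    big_c = max(
        sum(f(x, y) for f in features) for x, _ in samples for y in labels
    ) or 1
    # Empirical expectation of each feature over the training sample.
    empirical = [sum(f(x, y) for x, y in samples) / n for f in features]
    weights = [0.0] * len(features)
    for _ in range(iterations):
        # Expectation of each feature under the current model.
        expected = [0.0] * len(features)
        for x, _ in samples:
            probs = predict(x, labels, features, weights)
            for i, f in enumerate(features):
                expected[i] += sum(probs[y] * f(x, y) for y in labels) / n
        # Move each weight toward matching the empirical expectation.
        for i in range(len(features)):
            if empirical[i] > 0 and expected[i] > 0:
                weights[i] += math.log(empirical[i] / expected[i]) / big_c
    return weights
```

After training, the label with the highest probability returned by `predict` is the one assigned to the data, which corresponds to the prediction process described above.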
Step 4: Identify person names and weapon names with rules.
For example, if "president" is found and an expression such as "said:" is found within the 10 words after it, the words in between are taken to form a person name; if a place name is found, the words immediately before and after it are not person names. Step 3 identifies observed objects with a statistical machine learning method, whose performance depends on the training data, whereas Step 4 uses a rule-based method; combining the two can greatly improve recognition performance.
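The two example rules above can be sketched as follows. This is an illustrative reading only: the sets `TITLES` and `SPEECH_CUES`, the function names, and the token-level matching are assumptions, since the source gives just the "president" and "said:" examples and does not list its full rule set.

```python
from typing import List, Optional, Set

TITLES: Set[str] = {"president"}   # title words; only "president" is from the source
SPEECH_CUES: Set[str] = {"said:"}  # cue expressions; only "said:" is from the source
MAX_GAP = 10                       # the cue must appear within 10 words after the title


def find_name_by_rule(words: List[str]) -> Optional[List[str]]:
    """Return the words between a title and a following speech cue, if any."""
    for i, w in enumerate(words):
        if w.lower() not in TITLES:
            continue
        # Scan at most MAX_GAP words after the title for a speech cue.
        for j in range(i + 1, min(i + 1 + MAX_GAP, len(words))):
            if words[j].lower() in SPEECH_CUES:
                return words[i + 1:j] or None
    return None


def excluded_positions(words: List[str], place_names: Set[str]) -> Set[int]:
    """Indices directly before or after a place name, which are not person names."""
    excluded: Set[int] = set()
    for i, w in enumerate(words):
        if w in place_names:
            excluded.update(j for j in (i - 1, i + 1) if 0 <= j < len(words))
    return excluded
```

A candidate produced by the statistical model in Step 3 could be dropped when its position falls in `excluded_positions`, and a span returned by `find_name_by_rule` could be added as a person name; how the two results are merged is not specified in the source.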
Step 5: Establish the association between data and observed objects from the recognition results of Step 3 and Step 4. Specifically, the above algorithms automatically identify observed objects in the data; each result is then stored as a record in a database, which preserves the association between the data and the observed object.
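One way to store such records is sketched below with Python's built-in `sqlite3` module. The table name `association`, its columns, and the helper names are hypothetical; the source only states that a record is saved in a database.

```python
import sqlite3
from typing import List


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database and create the association table if it is missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS association (
            data_id      TEXT NOT NULL,  -- identifier of the data item
            object_name  TEXT NOT NULL,  -- recognized observed object
            object_class TEXT NOT NULL,  -- class in the observed object system
            source       TEXT NOT NULL,  -- 'statistical' (Step 3) or 'rule' (Step 4)
            PRIMARY KEY (data_id, object_name)
        )
        """
    )
    return conn


def save_association(
    conn: sqlite3.Connection,
    data_id: str,
    object_name: str,
    object_class: str,
    source: str,
) -> None:
    """Store one data-to-object association, replacing an existing one."""
    conn.execute(
        "INSERT OR REPLACE INTO association VALUES (?, ?, ?, ?)",
        (data_id, object_name, object_class, source),
    )
    conn.commit()


def data_for_object(conn: sqlite3.Connection, object_name: str) -> List[str]:
    """List the data items associated with an observed object."""
    rows = conn.execute(
        "SELECT data_id FROM association WHERE object_name = ? ORDER BY data_id",
        (object_name,),
    )
    return [row[0] for row in rows]
```

Querying by object name then returns every data item in which that observed object was recognized, which is the kind of content-based link the background section says type, space and time indexing cannot provide.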
The invention belongs to the field of automatic data organization and discloses an observed object system mixed organization model oriented to a space-time datum. The method proposes establishing an observed object system, uses a maximum entropy algorithm together with a rule-based method to identify observed objects, and establishes the association between data and observed objects, thereby achieving rapid organization of data and facilitating subsequent in-depth data mining.
The above is only a preferred embodiment of the invention and does not limit the invention in any form. Any simple modification, equivalent variation or adaptation of the above embodiment made according to the technical essence of the invention falls within the scope of the technical solution of the invention.
Claims (2)
1. An observed object system mixed organization model oriented to a space-time datum, characterized in that the model is established by a method comprising the following steps:
Step 1: Establish the observed object system: collect the observed objects that require attention and build the observed object system from them; an observed object is an entity or target of interest;
Step 2: Extract the features of the observed objects, including surface features, boundary features, famous-name features, transliterated-name symbol features and part-of-speech features;
Step 3: Identify the observed objects with a statistical machine learning method, using the GIS (Generalized Iterative Scaling) algorithm for parameter estimation:
Calculate [formula image not preserved],
where P(x) is the empirical distribution of x in the training sample, $P_j(y|x)$ is the probability that the observed word sequence produces the label, and $f_i(x, y)$ are the individual features;
Calculate [formula image not preserved], where [formula image not preserved] and c is the size of the training sample;
Recalculate [formula image not preserved];
Repeat the calculation until convergence. Through the above computation a label y is automatically assigned to data x; this is the prediction process, and the label denotes the class of the data;
Step 4: Identify person names and weapon names with rules;
Step 5: Establish the association between data and observed objects from the recognition results of Step 3 and Step 4.
2. The observed object system mixed organization model oriented to a space-time datum according to claim 1, characterized in that the features of the observed objects are extracted in Step 2 as follows:
Select a feature window size of 2, and let the center word of the candidate target together with the two words before and after it be $w_{-2}\,w_{-1}\,w_0\,w_1\,w_2$, where $w_0$ is the current word, $w_1$ is the word after the current word, $w_{-1}$ is the word before the current word, and $w_2$ and $w_{-2}$ follow by analogy:
Surface features:
$$f_i(x, y) = \begin{cases} 1, & \text{if } w_1 = \text{"delivering" and } y = \text{person} \\ 0, & \text{otherwise} \end{cases}$$
where x denotes $w_{-2}\,w_{-1}\,w_0\,w_1\,w_2$, y denotes the annotation label and i is the index of the feature. If this combination of data and label occurs, the feature function is satisfied and takes the value 1; otherwise it takes the value 0. Here the condition is satisfied, and the value is 1, when $w_1$ = "delivering" and y = person;
Boundary features:
Famous-name features:
$w_0$ matches a dictionary entry completely, partially, or as a fragment;
Transliterated-name symbol features: sentences containing the special characters " ", " " or "-";
Part-of-speech features:
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410836206.8A CN104537060A (en) | 2014-12-26 | 2014-12-26 | Observed object system mixed organization model oriented to space-time datum |
Publications (1)
Publication Number | Publication Date |
---|---|
CN104537060A true CN104537060A (en) | 2015-04-22 |
Family ID: 52852588
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410836206.8A Pending CN104537060A (en) | 2014-12-26 | 2014-12-26 | Observed object system mixed organization model oriented to space-time datum |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104537060A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2008047101A (en) * | 2006-07-10 | 2008-02-28 | Nec (China) Co Ltd | Natural language-based location query system, keyword-based location query system, and natural language-based/keyword-based location query system |
CN101650942A (en) * | 2009-08-26 | 2010-02-17 | 北京邮电大学 | Prosodic structure forming method based on prosodic phrase |
Non-Patent Citations (3)
Title |
---|
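卢朝华: "Research on Chinese phrase recognition methods based on semantic analysis", China Master's Theses Full-text Database, Information Science and Technology Series * |
牛晓妍: "Research on Chinese person name recognition methods based on maximum entropy", Fujian Computer * |
贾宁 et al.: "Chinese name recognition based on a maximum entropy model and rules", Computer Engineering and Applications * |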
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104572958B (en) | A kind of sensitive information monitoring method based on event extraction | |
CN107766324B (en) | Text consistency analysis method based on deep neural network | |
CN104391942B (en) | Short essay eigen extended method based on semantic collection of illustrative plates | |
US20150310096A1 (en) | Comparing document contents using a constructed topic model | |
CN106909643A (en) | The social media big data motif discovery method of knowledge based collection of illustrative plates | |
CN104834747A (en) | Short text classification method based on convolution neutral network | |
CN104199972A (en) | Named entity relation extraction and construction method based on deep learning | |
CN105335349A (en) | Time window based LDA microblog topic trend detection method and apparatus | |
CN109800310A (en) | A kind of electric power O&M text analyzing method based on structuring expression | |
Finarelli et al. | Potential pitfalls of reconstructing deep time evolutionary history with only extant data, a case study using the Canidae (Mammalia, Carnivora) | |
CN103473380B (en) | A kind of computer version sensibility classification method | |
CN104598535A (en) | Event extraction method based on maximum entropy | |
CN102298632B (en) | Character string similarity computing method and device and material classification method and device | |
CN107609055B (en) | Text image multi-modal retrieval method based on deep layer topic model | |
Salas‐Eljatib et al. | Evaluation of modeling strategies for assessing self‐thinning behavior and carrying capacity | |
CN104077417A (en) | Figure tag recommendation method and system in social network | |
CN106202030A (en) | A kind of rapid serial mask method based on isomery labeled data and device | |
CN107045532A (en) | The visual analysis method of space-time geographical space | |
CN102880631A (en) | Chinese author identification method based on double-layer classification model, and device for realizing Chinese author identification method | |
CN104408161A (en) | Mould CAD drawing query based on similarity query and management method | |
CN110516210A (en) | The calculation method and device of text similarity | |
CN104834718A (en) | Recognition method and system for event argument based on maximum entropy model | |
CN112363996B (en) | Method, system and medium for establishing physical model of power grid knowledge graph | |
CN110990451B (en) | Sentence embedding-based data mining method, device, equipment and storage device | |
Kordopatis-Zilos et al. | Placing Images with Refined Language Models and Similarity Search with PCA-reduced VGG Features. |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20150422 |