Method for reducing OGC geographic information service description vocabulary based on a discernibility matrix
Technical field
The present invention relates to a method for reducing the description vocabulary of OGC geographic information services based on a discernibility matrix. It addresses the description, vocabulary reduction, and semantic information retrieval of OGC geographic information services, and belongs to the technical field of geographic information processing.
Background art
With the rise of the Web service concept and its software architecture, geographic information services have matured steadily and become the main means of sharing geographic information and making it interoperable. To share and apply geographic information resources as widely as possible and to realize their full value, government agencies and other organizations have funded a series of projects. These projects have produced a large body of research results, mainly in the form of geographic information service resources. With the development of information network technology, sensor networks in particular, the number of services related to geographic information will keep growing. OGC (Open Geospatial Consortium) is a non-profit organization that aims to promote new technical and commercial approaches to improving the interoperability of geographic information processing. OGC geographic information services are geographic information services built on OGC standards that share data in a defined form, and they are currently the most widely published kind of geographic information service.
The description documents of OGC geographic information services contain a large number of description terms, including GIS description terms that are richer and more complex than those of a general-purpose database. However, OGC geographic information services are weak at organizing and expressing geospatial knowledge effectively, and they lack a semantic description of service information. Data is abundant but knowledge is scarce, which limits how well existing data supports knowledge representation and retrieval. To make full use of the large amount of existing data, the data must be refined so that its useful knowledge is retained and described semantically. An effective method is therefore urgently needed for discovering and extracting description vocabulary from large volumes of data and for building semantic knowledge.
Traditional methods for obtaining the optimal description vocabulary of OGC geographic information services are probabilistic-statistical or empirical. They presuppose a large amount of data from which prior statistical regularities are derived, and such data is generally difficult to obtain. In addition, the real world contains many fuzzy geographic objects, so conventional data processing introduces error or uncertainty. As a result, the OGC geographic information service knowledge base can be unreliable or even wrong, which ultimately leads to erroneous or failed decisions. Rough set (RS) theory is a mathematical tool for characterizing incompleteness and uncertainty. It uses only the information contained in the data itself and requires no prior knowledge. It can effectively analyze and process imprecise, inconsistent, and incomplete information, discover implicit knowledge, and reveal latent rules. Applying rough set theory to the reduction of OGC geographic information service description vocabulary both advances semantic sharing of these services and better supports their intelligent reasoning, which is of considerable significance for research on, and the application of, intelligent semantic reasoning over OGC geographic information services.
Summary of the invention
The object of the invention is to propose a new description vocabulary reduction model for OGC geographic information services that addresses their description, vocabulary reduction, and semantic information retrieval. To this end the invention provides a method for reducing OGC geographic information service description vocabulary based on a discernibility matrix, which is accurate, efficient, and reliable.
The technical solution of the invention is as follows:
In the method for reducing OGC geographic information service description vocabulary based on a discernibility matrix, all OGC geographic information service data is parsed, and the description vocabulary content of each service is extracted to form an instance description vocabulary set database for OGC geographic information services;
A description vocabulary reduction model for OGC geographic information services is established and used to reduce the instance description vocabulary set database, yielding the optimal description vocabulary set of each OGC geographic information service.
Preferably, the capabilities document description information to be parsed is determined and then parsed with an application-oriented, object-based parsing approach: the JAXB data binding framework is used to parse the capabilities description document of each OGC geographic information service, and the parsed description vocabulary data is inserted into a database to form the instance description vocabulary set database.
Preferably, the description vocabulary reduction model for OGC geographic information services is established in the following steps:
Step 1: parse the OGC geographic information service sample data to form the instance database of description vocabulary sets; the description vocabulary set data of each service forms one research object and, combined with domain knowledge, is put into a data form suitable for analysis;
At the same time, following the definition of an information system, the description vocabulary set data of all OGC geographic information services is taken as the universe of discourse and the description vocabulary set as the attribute set, forming an information system S for OGC geographic information service description vocabulary;
Step 2: from the information system S and the definition of the discernibility matrix, form the discernibility matrix for OGC geographic information service description vocabulary;
Step 3: obtain the core of S. Following the definition of the system core, search the discernibility matrix for all single description terms, that is, for every element $\alpha(x_i, x_j)$ that contains exactly one description term ($|\alpha(x_i, x_j)| = 1$); assign these terms to CORE(A), and let B = CORE(A);
Step 4: in the discernibility matrix, set to empty every set whose intersection with the system core set is not empty: if $\alpha(x_i, x_j) \cap \mathrm{CORE}(A) \neq \emptyset$, let $\alpha(x_i, x_j) = \emptyset$;
Step 5: determine whether all sets in the discernibility matrix are empty: if $\alpha(x_i, x_j) = \emptyset$ for all $i, j$, go to step 7; otherwise go to step 6;
Step 6: count the number of occurrences of each description term in the discernibility matrix, choose the description term $a$ with the largest number of occurrences, and add it to the core attribute set CORE(A), so that $B = B \cup \{a\}$; go to step 4;
Step 7: output CORE(A); the description terms contained in CORE(A) are the optimal description vocabulary of each OGC geographic information service.
Preferably, the information system S is defined as $S = \langle U, A, V, f \rangle$, where $U$ is the description vocabulary set data of all OGC geographic information services; $A$ is the description vocabulary set; $V = \bigcup_{a \in A} V_a$, where $V_a$ denotes the concrete data content of description term $a$; and $f: U \times A \to V$.
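Preferably, $S = (U, A, V, f)$ is an information system describing OGC geographic information services, where $U = \{x_1, x_2, \ldots, x_n\}$ and $n$ is the number of OGC geographic information service samples. Define
$$M_{n \times n} = \bigl(\alpha(x_i, x_j)\bigr)_{n \times n}, \qquad \alpha(x_i, x_j) = \{a \in A : f(x_i, a) \neq f(x_j, a)\}, \qquad i, j = 1, 2, \ldots, n,$$
where $M_{n \times n}$ is the discernibility matrix of the OGC geographic information service description vocabulary system, and the matrix element $\alpha(x_i, x_j)$ is exactly the set of all single description terms that can distinguish objects $x_i$ and $x_j$.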
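Preferably, the core of the system equals the set formed by the single description terms of OGC geographic information services in the discernibility matrix of the information system, that is, $\mathrm{CORE}(A) = \{a \in A : \alpha(x_i, x_j) = \{a\} \text{ for some } x_i, x_j \in U\}$.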
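As a small worked illustration of the discernibility matrix and the core, consider three services described by three terms. The objects, terms, and values below are hypothetical and are not taken from the sample data; they only show how the matrix elements and CORE(A) follow from the definitions.

```latex
% Hypothetical information system: U = {x_1, x_2, x_3}, A = {title, type, version}
\[
\begin{array}{c|ccc}
      & \text{title}  & \text{type} & \text{version} \\ \hline
x_1   & \text{Roads}  & \text{WMS}  & 1.1.1 \\
x_2   & \text{Roads}  & \text{WFS}  & 1.1.1 \\
x_3   & \text{Rivers} & \text{WMS}  & 1.3.0
\end{array}
\]
% Each matrix element collects the terms on which the two objects differ.
\[
\alpha(x_1, x_2) = \{\text{type}\}, \quad
\alpha(x_1, x_3) = \{\text{title}, \text{version}\}, \quad
\alpha(x_2, x_3) = \{\text{title}, \text{type}, \text{version}\}
\]
% Only alpha(x_1, x_2) is a single-term element, so it alone contributes to the core.
\[
\mathrm{CORE}(A) = \{\text{type}\}
\]
```

Here the term "type" is the only one that, on its own, separates some pair of services, which is why it is the only member of the core.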
Preferably, the capabilities document description information to be parsed is determined as: service title, service link, layer name, service abstract, layer title, service type, service version information, map projection, minimum bounding box X coordinate of the map, minimum bounding box Y coordinate of the map, maximum bounding box X coordinate of the map, maximum bounding box Y coordinate of the map, and map output format.
The beneficial effects of the invention are as follows. The invention is a method for reducing OGC geographic information service description vocabulary based on a discernibility matrix. It requires no prior knowledge, is highly reliable, and works efficiently. It largely overcomes the drawbacks of traditional methods, namely their need for large amounts of data and the poor reliability of their results, and it supports the semantic description of OGC geographic information services.
The method can reduce the knowledge in massive OGC geographic information service data without any prior knowledge and with only a small amount of known information. Compared with traditional methods for obtaining the optimal description vocabulary set, it is more efficient, more reliable, and able to handle larger volumes of data.
The method expresses OGC geographic information services with the most compact description vocabulary and solves the problem of describing discretized OGC geographic information service data. By determining the optimal description vocabulary of OGC geographic information services, it provides an optimal description vocabulary set for reasoning at the semantic level of these services.
Brief description of the drawings
Fig. 1 is a block diagram of an embodiment of the invention;
Fig. 2 is a flowchart of the JAXB data binding framework in the embodiment of the invention;
Fig. 3 is a schematic diagram of the two groups of operations in the JAXB data binding framework in the embodiment of the invention;
Fig. 4 is a flowchart of the reduction of OGC geographic information services in the embodiment of the invention.
Detailed description of the embodiments
Preferred embodiments of the invention are described in detail below with reference to the accompanying drawings.
As shown in Fig. 1, this embodiment takes as its research object 300 OGC geographic information services obtained at random from the Internet. Based on the capabilities description documents and description vocabulary of these services, the sample data is parsed and all feature description vocabulary content of each service is extracted to form the instance description vocabulary set database. Using the reduction method based on the discernibility matrix, a description vocabulary reduction model for OGC geographic information services is established and applied to the instance description vocabulary set database to obtain the optimal description vocabulary set of each OGC geographic information service.
One. Parsing OGC geographic information services based on description fields
Based on the international standard ISO 19119:2005 for geographic information services and the relevant OGC implementation standards, the description fields to be parsed are determined as: service title, service link, layer name, service abstract, layer title, service type, service version information, map projection, minimum bounding box X coordinate of the map, minimum bounding box Y coordinate of the map, maximum bounding box X coordinate of the map, maximum bounding box Y coordinate of the map, and map output format. Every description field is parsed, and no structural relationship among the description terms needs to be determined.
As shown in Fig. 2, an application-oriented, object-based parsing approach is adopted: the JAXB (Java Architecture for XML Binding) data binding framework is used to parse the capabilities description documents of the OGC geographic information services, and the parsed description vocabulary data is inserted into a database to form the instance description vocabulary set database. As shown in Fig. 3, JAXB involves two groups of operations: 1. generate and compile JAXB classes from the source schema, and build the application that uses these classes; 2. run the application, perform validation, and parse the capabilities document with the JAXB binding framework.
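The field extraction described above can be sketched as follows. This is a minimal illustration only: it uses Python's standard `xml.etree.ElementTree` in place of the JAXB data binding framework named in the embodiment, the sample document is a hypothetical, heavily trimmed WMS 1.1.1-style capabilities file, and the function name `parse_capabilities`, the record keys, and the mapping of `Service/Name` to the service type are all assumptions.

```python
import xml.etree.ElementTree as ET

# Key under which ElementTree exposes the xlink:href attribute.
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Hypothetical, heavily trimmed WMS 1.1.1-style capabilities document.
SAMPLE = """<WMT_MS_Capabilities version="1.1.1"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <Service>
    <Name>OGC:WMS</Name>
    <Title>Demo map service</Title>
    <Abstract>Hypothetical sample service</Abstract>
    <OnlineResource xlink:href="http://example.org/wms"/>
  </Service>
  <Capability>
    <Request><GetMap><Format>image/png</Format></GetMap></Request>
    <Layer>
      <Name>roads</Name>
      <Title>Road network</Title>
      <SRS>EPSG:4326</SRS>
      <BoundingBox SRS="EPSG:4326" minx="-10" miny="-5" maxx="10" maxy="5"/>
    </Layer>
  </Capability>
</WMT_MS_Capabilities>"""


def parse_capabilities(xml_text):
    """Return one flat record of description fields (first layer only)."""
    root = ET.fromstring(xml_text)
    layer = root.find("Capability/Layer")
    bbox = layer.find("BoundingBox")
    return {
        "service_title": root.findtext("Service/Title"),
        "service_link": root.find("Service/OnlineResource").get(XLINK_HREF),
        "layer_name": layer.findtext("Name"),
        "service_abstract": root.findtext("Service/Abstract"),
        "layer_title": layer.findtext("Title"),
        "service_type": root.findtext("Service/Name"),  # assumed mapping
        "service_version": root.get("version"),
        "map_projection": layer.findtext("SRS"),
        "min_x": bbox.get("minx"),
        "min_y": bbox.get("miny"),
        "max_x": bbox.get("maxx"),
        "max_y": bbox.get("maxy"),
        "output_format": root.findtext("Capability/Request/GetMap/Format"),
    }


record = parse_capabilities(SAMPLE)
```

Each record produced this way would correspond to one row of the instance description vocabulary set database; a real capabilities document has more layers, formats, and optional elements than this sketch handles.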
Two. Building the description vocabulary reduction model for OGC geographic information services
Definition 1: an information system S can be expressed as $S = \langle U, A, V, f \rangle$, where $U$ is the description vocabulary set data of all OGC geographic information services; $A$ is the description vocabulary set; $V = \bigcup_{a \in A} V_a$, where $V_a$ denotes the concrete data content of description term $a$; and $f: U \times A \to V$.
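One way to hold the four components of Definition 1 in code is sketched below. The class name `InformationSystem`, its method names, and the three sample services are hypothetical; the sketch only mirrors the roles of $U$, $A$, $V_a$, $V$, and $f$.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InformationSystem:
    """S = <U, A, V, f> for service description vocabulary."""
    universe: tuple    # U: one object per parsed service
    attributes: tuple  # A: the description terms
    table: dict        # backing store for f: object -> {term: value}

    def f(self, x, a):
        """Information function f: U x A -> V."""
        return self.table[x][a]

    def domain(self, a):
        """V_a: the values that term a takes over U."""
        return {self.f(x, a) for x in self.universe}

    def values(self):
        """V: the union of V_a over all a in A."""
        return set().union(*(self.domain(a) for a in self.attributes))


# Hypothetical sample data, not taken from the 300 services of the embodiment.
table = {
    "x1": {"title": "Roads", "type": "WMS", "version": "1.1.1"},
    "x2": {"title": "Roads", "type": "WFS", "version": "1.1.1"},
    "x3": {"title": "Rivers", "type": "WMS", "version": "1.3.0"},
}
S = InformationSystem(tuple(table), ("title", "type", "version"), table)
```

With this layout, `S.f("x2", "type")` plays the role of $f(x_2, \text{type})$ and `S.domain("type")` the role of $V_{\text{type}}$.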
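Definition 2: $S = (U, A, V, f)$ is an information system describing OGC geographic information services, where $U = \{x_1, x_2, \ldots, x_n\}$ and $n$ is the number of OGC geographic information service samples. Define
$$M_{n \times n} = \bigl(\alpha(x_i, x_j)\bigr)_{n \times n}, \qquad \alpha(x_i, x_j) = \{a \in A : f(x_i, a) \neq f(x_j, a)\}, \qquad i, j = 1, 2, \ldots, n,$$
where $M_{n \times n}$ is the discernibility matrix of the OGC geographic information service description vocabulary system, and the matrix element $\alpha(x_i, x_j)$ is exactly the set of all single description terms that can distinguish objects $x_i$ and $x_j$.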
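Definition 2 can be sketched in a few lines. The function name `discernibility_matrix` and the sample services are hypothetical. Because the matrix is symmetric and its diagonal is empty, the sketch stores only the pairs with $i < j$, which is a storage choice and not something the definition requires.

```python
from itertools import combinations


def discernibility_matrix(table, attributes):
    """Map each unordered object pair (xi, xj) to alpha(xi, xj)."""
    matrix = {}
    for xi, xj in combinations(table, 2):
        # alpha(xi, xj): every term on which the two services differ.
        matrix[(xi, xj)] = frozenset(
            a for a in attributes if table[xi][a] != table[xj][a]
        )
    return matrix


# Hypothetical sample data.
services = {
    "x1": {"title": "Roads", "type": "WMS", "version": "1.1.1"},
    "x2": {"title": "Roads", "type": "WFS", "version": "1.1.1"},
    "x3": {"title": "Rivers", "type": "WMS", "version": "1.3.0"},
}
matrix = discernibility_matrix(services, ("title", "type", "version"))
```

For $n$ services this produces $n(n-1)/2$ entries, so for the 300 services of the embodiment the matrix would hold 44,850 sets.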
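Definition 3: $\mathrm{CORE}(A) = \{a \in A : \alpha(x_i, x_j) = \{a\} \text{ for some } x_i, x_j \in U\}$; that is, the core of the system equals the set formed by all single description terms of OGC geographic information services in the discernibility matrix of the information system.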
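Definition 3 amounts to collecting the terms that appear alone in some matrix element. A minimal sketch, assuming the matrix is given as a mapping from object pairs to sets of terms (the function name `core` and the sample matrix are hypothetical):

```python
def core(matrix):
    """CORE(A): terms that occur as a single-term element of the matrix."""
    return {next(iter(cell)) for cell in matrix.values() if len(cell) == 1}


# Hypothetical discernibility matrix for three services.
matrix = {
    ("x1", "x2"): {"type"},
    ("x1", "x3"): {"title", "version"},
    ("x2", "x3"): {"title", "type", "version"},
}
core_terms = core(matrix)
```

In this example only the pair (x1, x2) is separated by exactly one term, so the core contains that term alone.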
The process of building the description vocabulary reduction model for OGC geographic information services is shown in Fig. 4. The algorithm is described as follows:
Step 1: parse the OGC geographic information service sample data to form the instance database of description vocabulary sets; the description vocabulary set data of each service forms one research object and, combined with domain knowledge, is put into a data form suitable for analysis;
At the same time, following the definition of an information system, the description vocabulary set data of all OGC geographic information services is taken as the universe of discourse and the description vocabulary set as the attribute set, forming an information system S for OGC geographic information service description vocabulary;
Step 2: from the information system S and the definition of the discernibility matrix, form the discernibility matrix for OGC geographic information service description vocabulary;
Step 3: obtain the core of S. Following the definition of the system core, search the discernibility matrix for all single description terms, that is, for every element $\alpha(x_i, x_j)$ that contains exactly one description term ($|\alpha(x_i, x_j)| = 1$); assign these terms to CORE(A), and let B = CORE(A);
Step 4: in the discernibility matrix, set to empty every set whose intersection with the system core set is not empty: if $\alpha(x_i, x_j) \cap \mathrm{CORE}(A) \neq \emptyset$, let $\alpha(x_i, x_j) = \emptyset$;
Step 5: determine whether all sets in the discernibility matrix are empty: if $\alpha(x_i, x_j) = \emptyset$ for all $i, j$, go to step 7; otherwise go to step 6;
Step 6: count the number of occurrences of each description term in the discernibility matrix, choose the description term $a$ with the largest number of occurrences, and add it to the core attribute set CORE(A), so that $B = B \cup \{a\}$; go to step 4;
Step 7: output CORE(A); the description terms contained in CORE(A) are the optimal description vocabulary of each OGC geographic information service, which completes the construction of the reduction model.
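Steps 2 to 7 above can be sketched as a single function. This is an illustration under stated assumptions, not the implementation used in the embodiment: the function name `reduce_vocabulary` and the sample services are hypothetical, the variable `selected` plays the role of CORE(A) and B as they grow, and ties in step 6 are broken alphabetically because the steps do not say how to break them.

```python
from collections import Counter
from itertools import combinations


def reduce_vocabulary(table, attributes):
    """Return the selected description terms for an information system."""
    # Step 2: discernibility matrix, stored as a list of term sets.
    cells = [
        {a for a in attributes if table[xi][a] != table[xj][a]}
        for xi, xj in combinations(table, 2)
    ]
    # Step 3: the core is made of the single-term elements.
    selected = {next(iter(c)) for c in cells if len(c) == 1}
    while True:
        # Step 4: empty (here: drop) every element that meets the selected set.
        cells = [c for c in cells if c and not (c & selected)]
        # Step 5: stop when every element is empty.
        if not cells:
            return selected  # Step 7
        # Step 6: add the most frequent remaining term (alphabetical tie-break).
        counts = Counter(a for c in cells for a in c)
        selected.add(max(sorted(counts), key=counts.get))


# Hypothetical sample data.
services = {
    "x1": {"title": "Roads", "type": "WMS", "version": "1.1.1"},
    "x2": {"title": "Roads", "type": "WFS", "version": "1.1.1"},
    "x3": {"title": "Rivers", "type": "WMS", "version": "1.3.0"},
}
reduct = reduce_vocabulary(services, ("title", "type", "version"))
```

On this sample the core is {type}; after step 4 only the element {title, version} remains, and the alphabetical tie-break adds "title", so the function returns {"type", "title"}. A different tie-break rule could return {"type", "version"}, which distinguishes the three services equally well.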
Three. Obtaining the optimal description vocabulary set of OGC geographic information services
Based on the description vocabulary reduction model, the optimal description vocabulary set of OGC geographic information services is finally obtained: service title, service link, layer name, service abstract, layer title, service type, minimum bounding box Y coordinate, minimum bounding box X coordinate, maximum bounding box Y coordinate, and maximum bounding box X coordinate. These terms form the optimal instance database of OGC geographic information services, that is, the reduction result database, whose fields are the terms of the optimal description vocabulary set. Comparing this optimal set with the description vocabulary set obtained by the traditional method shows that every term obtained by the reduction algorithm is a term with a relatively large weight in the set obtained by the traditional method, which supports the correctness of the reduction model.
Based on the optimal description vocabulary set obtained with this reduction model, an ontology library was built on the optimal set and a semantic reasoning system for OGC geographic information services was developed. The reasoning system was verified experimentally from several angles. The experiments show that, with the service description vocabulary simplified and an ontology library built on the description vocabulary set, the method improves the recall and precision of OGC geographic information service retrieval and reduces the system's retrieval time. These results further support the correctness, effectiveness, and feasibility of the reduction method.
As shown in Fig. 1, the embodiment comprises: parsing OGC geographic information services based on description vocabulary; building the description vocabulary reduction model for OGC geographic information services; and obtaining the optimal description vocabulary set of OGC geographic information services and judging its correctness. Compared with traditional reduction methods for OGC geographic information services, the embodiment differs in four main respects: (1) the description vocabulary content of OGC geographic information services is obtained automatically by data-binding-based parsing; (2) the reduction method needs no prior knowledge, only a mathematical model; (3) the amount of information required is small, and building the mathematical model does not need the support of a large amount of service data; (4) the description vocabulary reduction is carried out by a program, so the description vocabulary of massive OGC geographic information services can be reduced efficiently. Together these greatly improve the efficiency and reliability of obtaining the optimal description vocabulary set of OGC geographic information services.