CN105488100A - Efficient detection and discovery system for secret-associated geographic data in non secret-associated environment - Google Patents

Info

Publication number
CN105488100A
Authority
CN
China
Prior art keywords
concerning security
security matters
geodata
scanning
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510790728.3A
Other languages
Chinese (zh)
Inventor
许礼林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Geo-Compass (beijing) Geographic Information Technology Co Ltd
Original Assignee
Geo-Compass (beijing) Geographic Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Geo-Compass (beijing) Geographic Information Technology Co Ltd
Priority to CN201510790728.3A
Publication of CN105488100A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29Geographical information databases

Abstract

The invention discloses an efficient detection and discovery system for secret-associated (classified) geographic data in a non secret-associated environment. The system uses a coarse-to-fine strategy and a multi-threaded scanning model. First, the coarse-to-fine search strategy, which balances completeness of search (recall) against accuracy of search (precision), filters out a list of suspicious file types; second, file names or path names are screened against keywords and sensitive words; finally, data files are analyzed in depth according to their file type. The "postman" multi-threaded scanning model runs the screening, analysis, grading, storage, and similar processes in parallel, which increases the scanning speed for data files. To meet the need for quick scanning, judgment, extraction, and processing of secret-associated geographic data on a single machine or a local area network in the non secret-associated environment, the system establishes a secret-associated feature library and a judgment feature library for geographic data, so that many kinds of data can be traversed and scanned in depth at the same time, which markedly reduces time cost and improves efficiency.

Description

An efficient detection and discovery system for classified (secret-associated) geographic data in a non-classified environment
Technical field
The present invention relates to a detection and discovery system, and specifically to an efficient detection and discovery system for classified geographic data in a non-classified environment.
Background
Information security is a major concern of the network and information age. Classified geographic data is an important national strategic information resource, widely used in economic construction, social development, and national defense, particularly in surveying and mapping, geology, mineral resources, forestry, and the military; a leak would seriously endanger economic and national security. At present, the relevant enterprises and institutions in these industries still face security risks in the storage, distribution, and use of classified surveying and mapping results, and a leak can have serious consequences. It is therefore necessary to strengthen the detection and discovery of classified geographic data in non-classified environments and to distinguish quickly between ordinary files and classified geographic data.
Common search techniques based on inverted indexes apply only to textual geographic information (such as place-name text). For geographic data files and geographic databases stored in binary formats, quickly finding the geographic objects they contain and analyzing their content is a major technical difficulty.
Moreover, judging whether geospatial information is classified must take into account basic characteristics such as the format and name of the data file, but above all its content. Identifying geospatial data by file format is already difficult, and judging by content is harder still. According to incomplete statistics, more than a hundred proprietary geographic data formats exist, along with many general-purpose formats that are easily confused with other files. In addition, numerous derived geospatial data formats build on both.
As technology develops, the storage formats of geographic data keep multiplying and data volumes keep growing. Classified geographic data spans vector, raster, place-name and address, and database formats, most of it unstructured or semi-structured, so the demands on geographic data inspection are rising as well.
Current analysis and retrieval of geographic information data rely mainly on geographic information semantic analysis, which includes general semantic, lexical, and syntactic analysis as well as semantic analysis oriented to geographic objects (similarity, relevance), multi-modal geographic information parsing, and automatic classification. Research on geographic information semantic analysis focuses on similarity computation based on natural language analysis and on semantic classification based on ontologies. Results (theories, models, and software) that account for the spatial, temporal, and scale characteristics of geographic information are still very few and cannot effectively meet the needs of large-scale geographic information processing and analysis. On the one hand, a single method identifies geographic information with low accuracy, so several methods must be combined for precise identification. On the other hand, how to further improve the efficiency of geographic information identification remains a problem to be solved.
Common inspection software mostly relies on search based on classified-keyword indexes and applies only to documents such as txt, doc, and pdf. By comparison, the present invention concentrates on the efficient detection and discovery of classified geographic data in non-classified environments. With reference to national surveying and mapping industry standards, and drawing on experience in using and inspecting classified data, it addresses the classified features of vector data and raster data. Because geospatial data has its own specific data body and semantic description, a target library and a classified rule library based on expert knowledge are designed at the semantic level to screen classified geographic data quickly, analyze classified risk, and determine the risk level.
The main deficiencies of the prior art are as follows:
(1) Lack of detection for geographic information data
The prior art mostly checks text data (such as Office, PDF, and TXT files) for classified keywords. Geographic information data, however, is stored mainly in binary formats and rarely as text. Geospatial data in particular has spatial, attribute, and temporal characteristics; comprises vector, raster, and three-dimensional data structures; carries spatial topology, thematic attributes, classification codes, data layering, spatial coordinates, metadata, and spatial indexes; and exists in numerous file formats. The prior art does not support screening and detection across multiple geographic information data formats.
(2) Lack of an effective classified-feature library and decision rule library
Existing classified-inspection software mainly uses feature-based keyword pattern matching. Common pattern matching approaches are based on character comparison, automata, hash lookup, bitwise logical operations, and trie search. A feature library built on expert knowledge and a classified decision rule library for classified geographic data are lacking.
(3) Lack of support for local area network environments
Although conventional stand-alone inspection tools can perform deep detection effectively, they lack support for multi-machine network environments. In particular, within a limited inspection window, security checks of numerous stand-alone terminals are not timely, take a long time, and require much manpower. Fast classified scanning of massive data cannot be completed, let alone effective statistical analysis of the data.
Summary of the invention
The object of the present invention is to provide an efficient detection and discovery system for classified geographic data in a non-classified environment, so as to solve the problems raised in the background above.
To achieve this object, the invention provides the following technical solution:
An efficient detection and discovery system for classified geographic data in a non-classified environment is divided into four steps:
(1) Coarse-to-fine strategy and multi-threaded scan model. A coarse-to-fine strategy is adopted in the geographic data identification process, balancing completeness (recall) and accuracy (precision). The coarse-to-fine search strategy first filters out a list of suspicious file types; then file names or path names are screened against keywords and sensitive words; finally, data files are analyzed in depth according to their file type. Meanwhile, the "postman" multi-threaded scan model runs screening, analysis, grading, saving, and other processes in parallel, increasing the scan speed for data files.
(2) Building the classified-feature library and the decision rule library. To identify classified geographic data, geographic data formats must be collected and organized, the naming rules of basic-scale topographic maps studied, and the classified features and classified keywords of geographic data summarized. A classified-feature library for geographic data is built, comprising a sensitive-word library, a place-name library, and a manual appraisal experience library. The classified-feature library is extensible and supports supplementing and updating classified features. It also supports providing a different data engine for each file type to implement file content analysis. The classified decision rule library builds on the classified-feature library: each specific evaluation indicator is scored and the scores are combined by weighted statistics to establish the rules for evaluating the classified risk level. The rule library includes an instance exclusion library, an archive of sample data from common GIS software, which removes interference from non-classified sample data and improves retrieval efficiency.
(3) Suspicious-file risk rating model. The classified-risk scan and judgment module is the core of classified geographic data discovery and inspection. During geographic data scanning, analytic hierarchy process (AHP) theory is applied: the classified features of each data type are analyzed in depth on the basis of the classified-feature library, and the risk level of the data is decomposed top-down into several target layers according to different features. In accordance with the classified-risk decision rule library, a separate risk rating workflow is established for each suspicious file type, converting semi-qualitative, semi-quantitative problems into quantitative computation. Finally, comparing the importance of the associated classified features layer by layer gives the software a quantitative basis for judging and grading suspicious data.
(4) Distinguishing digital photographs from scanned maps. Raster data is an important component of geographic data, especially imagery and scanned-map data. However, the digital photographs that are ubiquitous on LAN machines interfere greatly with classified inspection. To improve the speed and accuracy of classified-risk scanning, the characteristics of digital photographs and scanned maps must be summarized separately, and two methods, file header analysis and frequency-domain analysis, are used to tell digital photographs from scanned maps. In file header analysis, the software reads the header information of digital photographs, scanned maps, and image maps; differences in the header attribute fields allow the three to be distinguished quickly. For raster files whose header is missing, frequency-domain analysis applies a Fourier transform to the raster data and then compares its frequency characteristics. Compared with scanned maps or computer-generated map data, digital photographs contain random noise, so the presence or absence of random noise is used to distinguish the two.
As a further scheme of the invention, the system also comprises a LAN distributed scan scheduling link module. The module provides communication links between terminal scanning nodes and supports scan-scheduling communication between the server and the terminals. It sends and receives commands such as scan and report and supports distributed scanning within the LAN, covering control of terminal scans, monitoring of scan progress, collection of scan results, and statistical output.
As a still further scheme of the invention, the classified-feature library comprises a sensitive-word library, a place-name library, an instance exclusion library, and a rule library.
Compared with the prior art, the beneficial effects of the invention are as follows. The invention addresses the need for fast scanning, discrimination, extraction, and processing of classified geographic data on a single machine or a LAN in a non-classified environment. Geographic data formats are collected and organized, the naming rules of basic-scale topographic maps are studied, and the classified features and classified keywords of geographic data are summarized, in order to build a classified-feature library and a judgment feature library for geographic data. A multi-pattern matching algorithm combined with the coarse-to-fine strategy and the multi-threaded scan model allows many kinds of data to be traversed and scanned in depth at the same time, which markedly reduces time cost and improves efficiency. In addition, the LAN distributed scan scheduling link extends the advantages of single-machine scanning to the LAN environment, addressing the efficiency and accuracy of classified inspection of terminals on a LAN.
Brief description of the drawings
Fig. 1 is the distributed scan scheduling link diagram of the efficient detection and discovery system for classified geographic data in a non-classified environment;
Fig. 2 is the flowchart of the scanning technical route in the system;
Fig. 3 is the multi-threaded scan model diagram of the system;
Fig. 4 is the classified-feature decision model diagram of the system;
Fig. 5 is the suspicious-file risk screening diagram of the system.
Detailed description of the embodiments
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some of the embodiments of the invention, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the invention.
Referring to Figs. 1 to 5, in an embodiment of the invention, an efficient detection and discovery system for classified geographic data in a non-classified environment is divided into four steps:
(1) Coarse-to-fine strategy and multi-threaded scan model. A coarse-to-fine strategy is adopted in the geographic data identification process, balancing completeness (recall) and accuracy (precision). The coarse-to-fine search strategy first filters out a list of suspicious file types; then file names or path names are screened against keywords and sensitive words; finally, data files are analyzed in depth according to their file type. Meanwhile, the "postman" multi-threaded scan model runs screening, analysis, grading, saving, and other processes in parallel, increasing the scan speed for data files.
(2) Building the classified-feature library and the decision rule library. To identify classified geographic data, geographic data formats must be collected and organized, the naming rules of basic-scale topographic maps studied, and the classified features and classified keywords of geographic data summarized. A classified-feature library for geographic data is built, comprising a sensitive-word library, a place-name library, and a manual appraisal experience library. The classified-feature library is extensible and supports supplementing and updating classified features. It also supports providing a different data engine for each file type to implement file content analysis. The classified decision rule library builds on the classified-feature library: each specific evaluation indicator is scored and the scores are combined by weighted statistics to establish the rules for evaluating the classified risk level. The rule library includes an instance exclusion library, an archive of sample data from common GIS software, which removes interference from non-classified sample data and improves retrieval efficiency.
(3) Suspicious-file risk rating model. The classified-risk scan and judgment module is the core of classified geographic data discovery and inspection. During geographic data scanning, analytic hierarchy process (AHP) theory is applied: the classified features of each data type are analyzed in depth on the basis of the classified-feature library, and the risk level of the data is decomposed top-down into several target layers according to different features. In accordance with the classified-risk decision rule library, a separate risk rating workflow is established for each suspicious file type, converting semi-qualitative, semi-quantitative problems into quantitative computation. Finally, comparing the importance of the associated classified features layer by layer gives the software a quantitative basis for judging and grading suspicious data.
(4) Distinguishing digital photographs from scanned maps. Raster data is an important component of geographic data, especially imagery and scanned-map data. However, the digital photographs that are ubiquitous on LAN machines interfere greatly with classified inspection. To improve the speed and accuracy of classified-risk scanning, the characteristics of digital photographs and scanned maps must be summarized separately, and two methods, file header analysis and frequency-domain analysis, are used to tell digital photographs from scanned maps. In file header analysis, the software reads the header information of digital photographs, scanned maps, and image maps; differences in the header attribute fields allow the three to be distinguished quickly. For raster files whose header is missing, frequency-domain analysis applies a Fourier transform to the raster data and then compares its frequency characteristics. Compared with scanned maps or computer-generated map data, digital photographs contain random noise, so the presence or absence of random noise is used to distinguish the two.
The system also comprises a LAN distributed scan scheduling link module. The module provides communication links between terminal scanning nodes and supports scan-scheduling communication between the server and the terminals. It sends and receives commands such as scan and report and supports distributed scanning within the LAN, covering control of terminal scans, monitoring of scan progress, collection of scan results, and statistical output.
The classified-feature library comprises a sensitive-word library, a place-name library, an instance exclusion library, and a rule library.
The working principle of the invention is as follows. Referring to Fig. 1, the scanning workflow for classified geographic data on a single terminal node has four steps: fast geographic data identification, risk analysis and grading, manual appraisal, and result export. To discover geographic data, data formats must be collected and organized, the naming rules of basic-scale topographic maps studied, and SQLite database access implemented; on this basis, the organization rules of the geographic database are established and a geographic data discovery plug-in is implemented. Next, to analyze the risk of suspicious files, a classified-feature library for geographic data is built, comprising a sensitive-word library, a place-name library, and a manual appraisal experience library. Data engines for each file type are also required: file reading and writing is implemented first, and file content analysis is built on top of it. Finally, the interactive appraisal tools and result export are implemented; the interactive appraisal tools cover data browsing, file attribute inspection, and element attribute inspection.
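The result export step and the SQLite access mentioned above can be illustrated with a minimal sketch. The table name `scan_result`, its columns, and the integer risk levels are assumptions made for illustration; the patent does not describe a schema.

```python
import sqlite3

def open_store(db_path=":memory:"):
    """Open (or create) a result store. The schema is an assumption."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scan_result ("
        "path TEXT PRIMARY KEY, file_type TEXT, risk_level INTEGER)"
    )
    return conn

def save_result(conn, path, file_type, risk_level):
    """Insert or overwrite the rating of one suspicious file."""
    conn.execute(
        "INSERT OR REPLACE INTO scan_result VALUES (?, ?, ?)",
        (path, file_type, risk_level),
    )
    conn.commit()

def export_results(conn, min_level=0):
    """Return rows at or above min_level, highest risk first."""
    cur = conn.execute(
        "SELECT path, file_type, risk_level FROM scan_result "
        "WHERE risk_level >= ? ORDER BY risk_level DESC, path",
        (min_level,),
    )
    return cur.fetchall()
```

Keying the table on the file path means that re-rating a file after manual appraisal replaces the earlier automatic rating instead of adding a second row.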
(1) Coarse-to-fine strategy and multi-threaded scan model
A coarse-to-fine strategy is adopted in the geographic data identification process, balancing completeness (recall) and accuracy (precision). The coarse-to-fine search strategy first filters out a list of suspicious file types; then file names or path names are screened against keywords and sensitive words; finally, data files are analyzed in depth according to their file type, for example by sensitive-word retrieval, file header analysis, and frequency-domain analysis. Meanwhile, the well-known "postman" multi-threaded scan model runs screening, analysis, grading, saving, and other processes in parallel, increasing the scan speed for data files.
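The first two stages of the coarse-to-fine strategy can be sketched as two cheap filters applied in order, so that the expensive deep analysis runs only on what survives. The extension list and keywords below are hypothetical placeholders, and the source does not say whether the keyword stage discards files or only prioritizes them; this sketch assumes it discards.

```python
import os

# Hypothetical lists; a real feature library would supply these.
SUSPECT_EXTENSIONS = {".shp", ".tif", ".img", ".dwg", ".mdb"}
NAME_KEYWORDS = ("dem", "dlg", "topo")

def coarse_filter(paths):
    """Stage 1 (coarse): keep files whose extension is on the suspect list."""
    return [p for p in paths
            if os.path.splitext(p)[1].lower() in SUSPECT_EXTENSIONS]

def name_filter(paths):
    """Stage 2: keep files whose path or name contains a keyword."""
    return [p for p in paths
            if any(k in p.lower() for k in NAME_KEYWORDS)]

def coarse_to_fine(paths):
    """Return the candidates that would go on to deep, per-type analysis."""
    return name_filter(coarse_filter(paths))
```

Ordering the stages from cheapest to most expensive is what keeps the scan fast: the extension check touches only the file name, while deep analysis has to open and parse the file.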
(2) Building the classified-feature library and the decision rule library
To identify classified geographic data, geographic data formats must be collected and organized, the naming rules of basic-scale topographic maps studied, and the classified features and classified keywords of geographic data summarized. A classified-feature library for geographic data is built, comprising a sensitive-word library, a place-name library, and a manual appraisal experience library. For example, the feature library contains the standard 1:50,000 map sheet naming rules and satellite image map naming rules. The classified-feature library is extensible and supports supplementing and updating classified features. It also supports providing a different data engine for each file type to implement file content analysis. The classified decision rule library builds on the classified-feature library: each specific evaluation indicator is scored and the scores are combined by weighted statistics to establish the rules for evaluating the classified risk level. The rule library includes an instance exclusion library, an archive of sample data from common GIS software, which removes interference from non-classified sample data and improves retrieval efficiency.
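The scoring and weighted statistics described for the decision rule library can be sketched as a weighted sum mapped onto grades. The indicator names, weights, and thresholds are hypothetical; the patent gives no concrete values.

```python
# Hypothetical indicator weights and grade thresholds (not from the patent).
WEIGHTS = {"file_type": 0.2, "file_name": 0.3, "metadata": 0.2, "content": 0.3}
GRADES = [(0.7, "high"), (0.4, "medium"), (0.0, "low")]

def risk_score(scores):
    """Weighted sum of per-indicator scores, each expected in [0, 1]."""
    return sum(WEIGHTS[name] * scores.get(name, 0.0) for name in WEIGHTS)

def risk_grade(scores):
    """Map the weighted score onto the first grade whose threshold it meets."""
    total = risk_score(scores)
    for threshold, label in GRADES:
        if total >= threshold:
            return label
    return "low"
```

Keeping the weights and thresholds in data, not in code, matches the stated goal of an extensible library: a rule update changes a table, not the scanner.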
(3) Suspicious-file risk rating model
The classified-risk scan and judgment module is the core of classified geographic data discovery and inspection. During geographic data scanning, analytic hierarchy process (AHP) theory is applied: the classified features of each data type are analyzed in depth on the basis of the classified-feature library, and the risk level of the data is decomposed top-down into several target layers according to different features. In accordance with the classified-risk decision rule library, a separate risk rating workflow is established for each suspicious file type, converting semi-qualitative, semi-quantitative problems into quantitative computation. Finally, comparing the importance of the associated classified features layer by layer gives the software a quantitative basis for judging and grading suspicious data.
(4) Distinguishing digital photographs from scanned maps
Raster data is an important component of geographic data, especially imagery and scanned-map data. However, the digital photographs that are ubiquitous on LAN machines interfere greatly with classified inspection. To improve the speed and accuracy of classified-risk scanning, the characteristics of digital photographs and scanned maps must be summarized separately, and two methods, file header analysis and frequency-domain analysis, are used to tell digital photographs from scanned maps. File header analysis reads the header information of digital photographs, scanned maps, and image maps; differences in the header attribute fields allow the three to be distinguished quickly. For raster files whose header is missing, frequency-domain analysis applies a Fourier transform to the raster data and then compares its frequency characteristics. Compared with scanned maps or computer-generated map data, digital photographs contain random noise, so the presence or absence of random noise can be used to distinguish the two.
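The frequency-domain idea can be illustrated with a small, dependency-free sketch: random noise spreads energy across the whole spectrum, so a noisy tile has a larger share of high-frequency energy than a clean tile with flat regions and sharp edges. The function names, the cutoff, and the naive transform are assumptions for illustration only; a practical scanner would use an FFT library and a threshold calibrated on real photographs and scanned maps.

```python
import cmath
import random

def _dft1(seq):
    """Naive 1-D discrete Fourier transform (O(n^2), small inputs only)."""
    n = len(seq)
    return [sum(seq[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def dft2(img):
    """2-D DFT of a list-of-rows grayscale tile: rows first, then columns."""
    h, w = len(img), len(img[0])
    rows = [_dft1(r) for r in img]
    cols = [_dft1([rows[y][x] for y in range(h)]) for x in range(w)]
    return [[cols[x][y] for x in range(w)] for y in range(h)]

def high_freq_ratio(img, cutoff=0.25):
    """Share of non-DC spectral energy at or above `cutoff` cycles per pixel."""
    h, w = len(img), len(img[0])
    spec = dft2(img)
    high = total = 0.0
    for v in range(h):
        for u in range(w):
            if u == 0 and v == 0:
                continue  # skip the DC term (mean brightness)
            fu = min(u, w - u) / w  # folded frequency in [0, 0.5]
            fv = min(v, h - v) / h
            energy = abs(spec[v][u]) ** 2
            total += energy
            if max(fu, fv) >= cutoff:
                high += energy
    return high / total if total else 0.0

def add_noise(img, sigma, seed=0):
    """Add Gaussian noise to a tile; used here only to build a synthetic test."""
    rng = random.Random(seed)
    return [[v + rng.gauss(0, sigma) for v in row] for row in img]
```

A decision rule would compare `high_freq_ratio` against a calibrated threshold. Real scanned maps also carry some sensor noise, so the ratio is better treated as one indicator among several than as a verdict.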
1) LAN distributed scan scheduling link
The LAN distributed scan scheduling link module provides communication links between terminal scanning nodes and supports scan-scheduling communication between the server and the terminals. It sends and receives commands such as scan and report and supports distributed scanning within the LAN, with functions such as controlling terminal scans, monitoring scan progress, collecting scan results, and producing statistical output.
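The exchange of scan and report commands can be sketched as message encoding and decoding, leaving the network transport out. The JSON wire format, the field names, and the `grade` key are assumptions; the patent does not specify a protocol.

```python
import json
from collections import Counter

def make_command(action, terminal_id, **params):
    """Server side: encode a command as one JSON line (assumed wire format)."""
    if action not in ("scan", "report"):
        raise ValueError(f"unknown action: {action}")
    return json.dumps({"action": action, "terminal": terminal_id, "params": params})

def handle_command(line, scan_fn):
    """Terminal side: decode a command, run it, and return a JSON reply."""
    cmd = json.loads(line)
    if cmd["action"] == "scan":
        results = scan_fn(cmd["params"].get("root", "/"))
        reply = {"terminal": cmd["terminal"], "status": "done", "results": results}
    else:
        reply = {"terminal": cmd["terminal"], "status": "idle", "results": []}
    return json.dumps(reply)

def summarize(replies):
    """Server side: count reported files per risk grade across all terminals."""
    counts = Counter()
    for line in replies:
        for item in json.loads(line)["results"]:
            counts[item["grade"]] += 1
    return dict(counts)
```

In a deployment these lines would travel over sockets or a message queue. Separating the message format from the transport lets the scheduling logic be tested without a network.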
2) Classified-feature library and decision rule library for geospatial data
Identifying multiple types of geographic data, screening data for classification, and grading file risk all rest on a reasonably comprehensive classified-feature library of surveying and mapping results. The classified-feature library is designed in four parts: a sensitive-word library, a place-name library, an instance exclusion library, and a rule library. The sensitive-word library is intended to use the sensitive-word configuration file from the internal standard of the National Administration for the Protection of State Secrets, organized with parsing rules such as word segmentation. The place-name library stores common place names at the province, city, and county levels and is used mainly to check the paths of geographic data. The instance exclusion library is an archive of sample data from common GIS software, which removes interference from non-classified sample data and improves retrieval efficiency. The rule library incorporates rules such as the standard 1:50,000 map sheet naming rules and satellite image map naming rules.
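The four-part layout of the feature library can be sketched as one container type. Matching excluded samples by content hash is an assumption, as is the regular expression, which is only a placeholder for a sheet-naming rule; the actual naming standard should be consulted before any real use.

```python
import hashlib
import re
from dataclasses import dataclass, field

@dataclass
class FeatureLibrary:
    sensitive_words: set = field(default_factory=set)   # sensitive-word library
    place_names: set = field(default_factory=set)       # place-name library
    excluded_hashes: set = field(default_factory=set)   # instance exclusion library
    name_rules: list = field(default_factory=list)      # rule library (compiled regexes)

    def is_excluded(self, content: bytes) -> bool:
        """True if the file matches a known GIS sample dataset (by SHA-256)."""
        return hashlib.sha256(content).hexdigest() in self.excluded_hashes

    def path_hits(self, path: str) -> list:
        """Place names and naming-rule patterns found in a file path."""
        hits = [n for n in sorted(self.place_names) if n in path]
        hits += [r.pattern for r in self.name_rules if r.search(path)]
        return hits
```

Checking `is_excluded` before any deep analysis lets well-known sample datasets drop out of the scan early, which is the efficiency gain the exclusion library is meant to provide.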
3) Quantitative assessment model of classified risk for geographic data
In the geographic data discovery and inspection process, the data characteristics of common geographic data and its derived products are collected, organized, and analyzed. The file types, file naming rules, file attributes, metadata, and file content of geospatial data files are analyzed in depth to build a feature library of geospatial data files. A workflow scan model is established around the processing stages of suspicious-file discovery, file risk analysis, and file risk evaluation. Analytic hierarchy process (AHP) theory is applied: according to the risk rule model, the risk level of suspicious data is decomposed top-down into several indicator coefficients, and different risk rating workflows are applied accordingly, converting semi-qualitative, semi-quantitative problems into quantitative computation. The feature weights and the total weight obtained from this layer-by-layer analysis give a quantitative basis for grading the risk of suspicious data.
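How AHP turns pairwise importance judgments into indicator weights can be shown with a small worked sketch. It uses the row geometric-mean method, a common approximation of the principal-eigenvector weights; the patent names AHP but gives no comparison matrix, so the three indicators and the values below are hypothetical.

```python
import math

def ahp_weights(matrix):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(matrix)
    means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(means)
    return [m / total for m in means]

# Hypothetical pairwise comparisons for three indicators, in the order
# content, file name, file type. Entry [i][j] says how much more important
# indicator i is than indicator j; entry [j][i] is its reciprocal.
PAIRWISE = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
```

For this perfectly consistent matrix the weights come out in the ratio 4 : 2 : 1, that is 4/7, 2/7, and 1/7. With real expert judgments the matrix is rarely consistent, and AHP practice adds a consistency check before the weights are accepted.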
4) Multithreaded scanning engine
A multithreaded approach is used, based on the "postman" multithreaded scan model. A parallel file search algorithm suited to automatic inspection is designed so that the screening, analysis, grading, and saving processes run concurrently, with parallel operation both between and within the processing stages. This enables fast automatic identification of the relevant geospatial data files in a computer file system and raises the speed at which data files are scanned and judged.
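The text does not define the "postman" model, so the following is only a generic staged producer-consumer pipeline that shows how screening, analysis, grading, and saving can overlap in time. The function names `run_pipeline`, `screen`, `analyze`, and `grade`, and the toy stage logic, are assumptions for illustration, not the patented design.

```python
import queue
import threading

_STOP = object()  # sentinel marking the end of the stream


def _stage(func, inbox, outbox):
    """Apply func to every item from inbox and forward non-None results."""
    while True:
        item = inbox.get()
        if item is _STOP:
            outbox.put(_STOP)
            return
        result = func(item)
        if result is not None:
            outbox.put(result)


def run_pipeline(paths, screen, analyze, grade):
    """Run screen -> analyze -> grade in separate threads and save the results."""
    stages = [screen, analyze, grade]
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=_stage, args=(func, queues[i], queues[i + 1]))
        for i, func in enumerate(stages)
    ]
    for thread in threads:
        thread.start()
    for path in paths:
        queues[0].put(path)
    queues[0].put(_STOP)
    saved = []  # the "saving" step: collect graded results in the caller
    while True:
        item = queues[-1].get()
        if item is _STOP:
            break
        saved.append(item)
    for thread in threads:
        thread.join()
    return saved


# Toy stage functions (hypothetical rules).
def screen(path):
    return path if path.lower().endswith((".tif", ".shp")) else None


def analyze(path):
    return (path, "secret" in path.lower())


def grade(item):
    path, hit = item
    return (path, "high" if hit else "low")
```

One thread per stage keeps results in input order; parallelism within a stage could be added by starting several workers on the same queue, at the cost of ordering. The sketch has no error handling, so a stage function that raises would stall the pipeline.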
5) Raster data determination technique based on image features
In identifying raster geographic data and screening it for confidentiality, two methods, header analysis and frequency-domain analysis, are used to distinguish digital photographs from scanned maps. The header analysis method reads the file header information of digital photographs, scanned maps, and image maps, which allows the three to be distinguished quickly. For raster data files whose header is missing, the frequency-domain analysis method applies a Fourier transform to the raster data and then compares its frequency characteristics.
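Both checks can be sketched in a few lines of Python. The header check relies only on well-known file signatures (the JPEG start marker, the Exif identifier, and the TIFF byte-order marks); treating Exif presence as a hint of a camera photograph is an assumed heuristic, since the patent does not name the header fields it compares. The frequency check uses a naive 2D discrete Fourier transform on a small grayscale grid and reports how much of the non-DC spectral energy lies at high frequencies; the `cutoff` value and the function names are hypothetical.

```python
import cmath
import math


def header_hint(data: bytes) -> str:
    """Give a coarse file-type hint from leading bytes (heuristic)."""
    if data[:2] == b"\xff\xd8":  # JPEG start-of-image marker
        return "jpeg-exif" if b"Exif\x00\x00" in data[:65536] else "jpeg"
    if data[:4] in (b"II*\x00", b"MM\x00*"):  # TIFF byte-order marks
        return "tiff"
    return "unknown"


def high_freq_ratio(gray, cutoff=0.25):
    """Fraction of non-DC spectral energy at or above the cutoff frequency.

    gray is a small 2D list of pixel intensities; the naive DFT is O(N^4),
    so this is only suitable for tiny tiles.
    """
    rows, cols = len(gray), len(gray[0])
    total = high = dc = 0.0
    for u in range(rows):
        for v in range(cols):
            coeff = sum(
                gray[y][x] * cmath.exp(-2j * math.pi * (u * y / rows + v * x / cols))
                for y in range(rows)
                for x in range(cols)
            )
            energy = abs(coeff) ** 2
            if u == 0 and v == 0:
                dc = energy  # mean brightness, ignored in the ratio
                continue
            total += energy
            # Fold frequencies above Nyquist back to their true magnitude.
            fu = min(u, rows - u) / rows
            fv = min(v, cols - v) / cols
            if max(fu, fv) >= cutoff:
                high += energy
    # A flat tile has only rounding noise outside DC; report 0 for it.
    if total <= 1e-12 * max(dc, 1.0):
        return 0.0
    return high / total
```

A tile dominated by smooth, low-frequency variation yields a ratio near 0, while noise-like fine texture pushes the ratio toward 1. A real classifier would compare this value against thresholds calibrated on known photographs and scanned maps.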
To those skilled in the art, it is evident that the invention is not limited to the details of the exemplary embodiments above, and that it can be realized in other specific forms without departing from its spirit or essential characteristics. The embodiments should therefore be regarded in every respect as exemplary and non-restrictive. The scope of the invention is defined by the claims rather than by the description above, and all changes falling within the meaning and range of equivalency of the claims are intended to be included in the invention. No reference sign in the claims should be construed as limiting the claim concerned.
In addition, although this specification is described in terms of embodiments, not every embodiment contains only one independent technical solution. The specification is written this way only for clarity. Those skilled in the art should treat the specification as a whole, and the technical solutions of the embodiments may be suitably combined to form other embodiments that those skilled in the art can understand.

Claims (3)

1. An efficient detection and discovery system for secret-associated geographic data in a non secret-associated environment, characterized in that it is divided into four steps:
(1) A "coarse first, then fine" strategy and a multithreaded scan model. The "coarse first, then fine" strategy is used in identifying geographic data and balances completeness of the search against its precision: the search first filters out a list of suspicious file types, then screens file names or path names against keywords and sensitive words, and finally analyzes the data files in depth according to file type. At the same time, the "postman" multithreaded scan model is adopted so that the screening, analysis, grading, and saving processes run concurrently, raising the scanning speed for data files.
(2) Establishing a secret-associated feature library and a decision rule library. To identify secret-associated geographic data, geographic data formats are collected and organized, the naming rules of basic-scale topographic maps are studied, and the secret-associated features and keywords of geographic data are summarized. The resulting geographic data secret-associated feature library comprises a sensitive-word library, a place-name library, and a manual appraisal experience library. The feature library is extensible: it supports supplementing and updating secret-associated features, and supports providing different data engines for different file types, realizing a file content analysis function. The secret-associated decision rule library builds on the feature library: each specific evaluation index is scored and the scores are combined by weighted statistics to establish the rules for evaluating secret-associated risk levels. The rule library includes an example exclusion library holding the sample data files of common GIS software, so that non secret-associated sample data is excluded and retrieval efficiency is improved.
(3) A suspicious-file risk rating model. The secret-associated risk scanning and determination module is the core of discovering and inspecting secret-associated geographic data. During geographic data scanning, the Analytic Hierarchy Process is applied: the secret-associated features of each data type are analyzed in depth on the basis of the feature library, the risk level of the data is decomposed from the top down into several indicator layers according to its different features, and, according to the secret-associated risk decision rule library, a different risk rating process is established for each suspicious file type. Semi-qualitative, semi-quantitative problems are thereby converted into quantitative calculations, and comparing the importance of the associated secret-associated features layer by layer gives the software a quantitative basis for judging and classifying suspicious data.
(4) Distinguishing digital photographs from scanned maps. Raster data is an important component of geographic data, in particular image data and scanned-map data. However, the digital photographs that are ubiquitous on local area network machines interfere greatly with secret-associated inspection. To improve the speed and accuracy of secret-associated risk scanning, the features of digital photographs and of scanned maps are summarized separately, and two methods, header analysis and frequency-domain analysis, are used to tell digital photographs and scanned maps apart. In the header analysis method, software reads the file header information of digital photographs, scanned maps, and image maps, and can quickly distinguish them by the differences in their header attribute fields. For raster data files whose header is missing, the frequency-domain analysis method applies a Fourier transform to the raster data and then compares its frequency characteristics: compared with a scanned map or computer-output map data, a digital photograph contains random noise, and the presence or absence of this random noise is the feature used to distinguish the two.
2. The efficient detection and discovery system for secret-associated geographic data in a non secret-associated environment according to claim 1, characterized in that the system further comprises a local area network distributed scan scheduling link module. The module provides the communication link between terminal scanning nodes and supports scan-scheduling communication between the server and the terminals, realizing the sending and receiving of commands such as scan and report. It supports distributed scanning within the local area network, including controlling terminal scans, monitoring scan progress, collecting scan results, and producing statistical output.
3. The efficient detection and discovery system for secret-associated geographic data in a non secret-associated environment according to claim 1, characterized in that the secret-associated feature library comprises a sensitive-word library, a place-name library, an example exclusion library, and a rule library.
CN201510790728.3A 2015-11-18 2015-11-18 Efficient detection and discovery system for secret-associated geographic data in non secret-associated environment Pending CN105488100A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510790728.3A CN105488100A (en) 2015-11-18 2015-11-18 Efficient detection and discovery system for secret-associated geographic data in non secret-associated environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510790728.3A CN105488100A (en) 2015-11-18 2015-11-18 Efficient detection and discovery system for secret-associated geographic data in non secret-associated environment

Publications (1)

Publication Number Publication Date
CN105488100A (en) 2016-04-13

Family

ID=55675075

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510790728.3A Pending CN105488100A (en) 2015-11-18 2015-11-18 Efficient detection and discovery system for secret-associated geographic data in non secret-associated environment

Country Status (1)

Country Link
CN (1) CN105488100A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150244731A1 (en) * 2012-11-05 2015-08-27 Tencent Technology (Shenzhen) Company Limited Method And Device For Identifying Abnormal Application
CN103164515A (en) * 2013-03-01 2013-06-19 傅如毅 Computer system confidential file knowledge base searching method
CN104008169A (en) * 2014-05-30 2014-08-27 中国测绘科学研究院 Semanteme based geographical label content safe checking method and device
CN104410799A (en) * 2014-12-24 2015-03-11 北京中科大洋信息技术有限公司 Distributed technical review method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
吴赛松: "涉密矢量数字地图敏感信息量测度方法研究", 《中国优秀硕士学位论文全文数据库基础科学辑》 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108920710A (en) * 2018-07-20 2018-11-30 北京开普云信息科技有限公司 A kind of pair of internet information carries out concerning security matters and relates to quick information monitoring method and system
CN109508557A (en) * 2018-10-22 2019-03-22 中国科学院信息工程研究所 A kind of file path keyword recognition method of association user privacy
CN110795397A (en) * 2019-10-30 2020-02-14 河南省有色金属地质矿产局第七地质大队 Automatic identification method for catalogue and file type of geological data packet
CN110795397B (en) * 2019-10-30 2022-02-01 河南省有色金属地质矿产局第七地质大队 Automatic identification method for catalogue and file type of geological data packet
CN111082970A (en) * 2019-11-22 2020-04-28 博智安全科技股份有限公司 Network-based terminal checking and analyzing system
CN111967052A (en) * 2020-09-21 2020-11-20 北京市测绘设计研究院 Method and system for realizing topographic map distribution
CN112580092A (en) * 2020-12-07 2021-03-30 北京明朝万达科技股份有限公司 Sensitive file identification method and device
CN114611125A (en) * 2022-03-15 2022-06-10 南京师范大学 Basic geographic data attribute confidentiality processing method and system
CN117240850A (en) * 2023-11-10 2023-12-15 中印云端(深圳)科技有限公司 Intelligent monitoring system for network information technology
CN117240850B (en) * 2023-11-10 2024-02-09 中印云端(深圳)科技有限公司 Intelligent monitoring system for network information technology

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160413
