CN115033728A - Data crawling and normalizing method and system for global satellite image search engine - Google Patents

Info

Publication number
CN115033728A
Authority
CN
China
Prior art keywords
satellite image
image data
remote sensing
satellite
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210506311.XA
Other languages
Chinese (zh)
Inventor
雷帆
谢玲琳
曹里
杨凯钧
魏继德
吴烨
曾海波
张哲�
熊伟
师俊峰
蒋琦
杨亮亮
贾庆仁
王强
胡芳
谢祥安
张泽旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan Second Surveying And Mapping Institute
National University of Defense Technology
Original Assignee
Hunan Second Surveying And Mapping Institute
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan Second Surveying And Mapping Institute and National University of Defense Technology

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/51 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/55 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/5866 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, manually generated location and time information
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The embodiment of the invention relates to the technical field of satellite images, and particularly discloses a global satellite image search engine data crawling and normalizing method and system. The method acquires newly released remote sensing satellite image data from official satellite platforms worldwide in real time and pre-calculates the data and products related to satellite orbit prediction, which speeds up the display of satellite images when a user accesses them in real time. The acquired satellite image big data can be classified efficiently and accurately, and a unified standard specification is constructed to integrate the attribute information of massive satellite images consistently. Deflection angle correction produces a standard quick view, and the user's area of interest is displayed visually. The coverage area under a remote sensing satellite's orbit can be predicted and displayed, and adding a satellite cloud image allows the effective coverage rate of the images to be calculated more effectively and accurately and the effective coverage area of the satellite images to be displayed. Because the side-swing capability parameters of the satellites are taken into account, user-defined image screening is also supported.

Description

Data crawling and normalizing method and system for global satellite image search engine
Technical Field
The invention belongs to the technical field of satellite images, and particularly relates to a global satellite image search engine data crawling and normalizing method and system.
Background
Satellite remote sensing imagery is important geospatial data. In recent years, with the rapid development of satellite remote sensing technology, countries around the world have launched a series of remote sensing satellites carrying different sensors and have collected remote sensing images from multiple platforms and sensors, in multiple wave bands and at multiple spatial and temporal resolutions. The data volume has grown explosively, and the data types and structures have become increasingly complex. Meanwhile, satellite remote sensing images are widely applied in fields such as agriculture, forestry, water conservancy, transportation, land and resources, environmental protection, and housing and urban construction, and the demand of these industries for diverse satellite remote sensing images grows by the day. Therefore, acquiring global multi-source massive satellite image data and recording it in a database under a unified standard specification is of great significance for industry applications and related scientific research.
Existing software provides image data from only a few satellites or a few satellite series and does not effectively integrate massive global multi-source satellite images. It rarely considers the heterogeneous structure of images from different data sources and does not define a unified standard specification for storing the images in a database, which greatly reduces the readability of the aggregated image results. Most traditional methods for determining the effective coverage rate of satellite images search for the corresponding remote sensing images on the relevant platforms and check the overall coverage through image quick views; the operation is cumbersome, and an accurate effective coverage rate value cannot be obtained. In addition, existing software does not calculate the effective coverage rate of satellite images, so a user has to check the images one by one through quick views, which significantly increases the time cost. Current software does not consider the influence of the satellite side-swing angle on imaging quality and requires the user to select images at the required angles one by one, which greatly reduces working efficiency. Moreover, existing software provides only a single data retrieval function, cannot meet the individual requirements of each user, and segments images of the area of interest inaccurately.
Disclosure of Invention
The embodiment of the invention aims to provide a global satellite image search engine data crawling and normalizing method and system, and aims to solve the problems in the background technology.
In order to achieve the above purpose, the embodiments of the present invention provide the following technical solutions:
the global satellite image search engine data crawling and normalizing method specifically comprises the following steps:
acquiring newly released remote sensing satellite image data from all global satellite official platforms in real time by using a big data real-time acquisition program, and primarily classifying and filing the remote sensing satellite image data according to data sources and storing the remote sensing satellite image data in a local database;
carrying out fine classification processing, during an idle period and using the computing power of a supercomputing center, on the remote sensing satellite image data which is classified, filed and stored in the local database;
constructing a unified standard specification, and performing attribute information consistency integration on the remote sensing satellite image data subjected to fine classification processing;
identifying and calculating a view deflection angle of each satellite image in the remote sensing satellite image data, performing standard angle correction on each satellite image according to the view deflection angle, and uploading the remote sensing satellite image data subjected to fine classification processing, attribute information consistency integration and standard angle correction to a server;
receiving query information of a user, determining an interest area of the user according to the query information, and calling remote sensing satellite image data corresponding to the interest area to be displayed to the user in a visualized manner;
the satellite cloud picture is introduced to assist the satellite image area to accurately identify the single cloud, the condition of cloud discrete distribution in the image coverage area is considered, the effective coverage rate is obtained through the ratio of the coverage area of each cloud to the image coverage area, and the satellite side swing angle is obtained from the satellite image attribute information by utilizing the effective coverage rate.
As a further limitation of the technical solution of the embodiment of the present invention, the step of using the big data real-time acquisition program to acquire the latest released remote sensing satellite image data from each global satellite official platform in real time, and primarily classifying, filing and storing the remote sensing satellite image data in the local database according to the data source specifically comprises the following steps:
checking whether the target platform provides an API (application program interface), if so, directly calling remote sensing satellite image data, and if not, acquiring the remote sensing satellite image data by using a big data real-time acquisition program;
carrying out data structure analysis and data storage on the remote sensing satellite image data;
carrying out data flow analysis on the remote sensing satellite image data;
and carrying out data arrangement on the remote sensing satellite image data, and storing the remote sensing satellite image data into a local database.
As a further limitation of the technical solution of the embodiment of the present invention, the fine classification processing of the remote sensing satellite image data archived in classification and stored in the local database specifically includes the following steps:
performing efficient and accurate classification on the obtained remote sensing satellite image data by adopting a hierarchy-based text clustering algorithm;
and carrying out accuracy evaluation on the classified images, and, after determining the complete and accurate categories, directly classifying and extracting the acquired remote sensing satellite image data according to the result.
As a further limitation of the technical solution of the embodiment of the present invention, the constructing a unified standard specification, and the performing attribute information consistency integration on the remote sensing satellite image data after the fine classification processing specifically includes the following steps:
designing a unified satellite image data attribute table by utilizing a remote sensing image data integrated storage strategy considering the spatial characteristics and a satellite image data segmentation and partition storage strategy considering the multisource heterogeneous characteristics;
attribute naming and modification of satellite image attribute information standard fields are carried out by adopting an attribute field batch unified standard naming algorithm;
and dynamically expanding satellite image attribute information according to the satellite image data attribute table, and performing attribute information consistency integration on the remote sensing satellite image data.
As a further limitation of the technical solution of the embodiment of the present invention, the satellite image attribute information standard field includes: image name, product ID, satellite type, sensor type, image acquisition date, image acquisition time, image cloud cover, scene number, data classification, and wkid value.
As a further limitation of the technical solution of the embodiment of the present invention, a calculation formula of the view deflection angle of the satellite image is as follows:
[Formula images BDA0003636304060000041 and BDA0003636304060000042: not reproduced in this text]
The inclination angle $D_{BD}$ is calculated according to the formula from the corner point coordinates of the standard quick view, and then the inclination angle $\theta$ of the non-standard quick view is calculated; the difference between the two inclination angles is the view deflection angle of the satellite image.
The global satellite image search engine data crawling and normalizing system comprises a satellite image data acquisition unit, a satellite image data classification unit, an attribute information consistency integration unit, a satellite image angle correction unit, an interested area image display unit and a satellite yaw angle acquisition unit, wherein:
the satellite image data acquisition unit is used for acquiring newly released remote sensing satellite image data from all global satellite official platforms in real time by using a big data real-time acquisition program, and preliminarily classifying and filing the remote sensing satellite image data according to data sources and storing the remote sensing satellite image data in a local database;
the satellite image data classification unit is used for carrying out fine classification processing on the remote sensing satellite image data which are classified, filed and stored in the local database in an idle period by utilizing the computing power of the super-computation center;
the attribute information consistency integration unit is used for constructing a unified standard specification and performing attribute information consistency integration on the remote sensing satellite image data after the fine classification processing;
the satellite image angle correction unit is used for identifying and calculating view deflection angles of all satellite images in the remote sensing satellite image data, carrying out standard angle correction on all satellite images according to the view deflection angles, and uploading the remote sensing satellite image data subjected to fine classification processing, attribute information consistency integration and standard angle correction to a server;
the interest area image display unit is used for receiving query information of a user, determining an interest area of the user according to the query information, and calling remote sensing satellite image data corresponding to the interest area to be displayed to the user in a visualized manner;
the satellite side-sway angle acquisition unit is used for introducing a satellite cloud picture to assist a satellite image area to accurately identify a single cloud, considering the condition of cloud discrete distribution in an image coverage area, obtaining effective coverage rate through the ratio of the coverage area of each cloud to the image coverage area, and obtaining a satellite side-sway angle from satellite image attribute information by utilizing the effective coverage rate.
Compared with the prior art, the invention has the beneficial effects that:
The method acquires newly released remote sensing satellite image data from official satellite platforms worldwide in real time and pre-calculates the data and products related to satellite orbit prediction, which speeds up the display of satellite images when a user accesses them in real time. The acquired satellite image big data can be classified efficiently and accurately, and a unified standard specification is constructed to integrate the attribute information of massive satellite images consistently. Deflection angle correction produces a standard quick view, and the user's area of interest is displayed visually. The coverage area under a remote sensing satellite's orbit can be predicted and displayed, and adding a satellite cloud image allows the effective coverage rate of the images to be calculated more effectively and accurately and the effective coverage area of the satellite images to be displayed. Because the side-swing capability parameters of the satellites are taken into account, user-defined image screening is also supported.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention.
Fig. 1 shows a flow chart of a method provided by an embodiment of the invention.
Fig. 2 is a schematic diagram illustrating a satellite image standard angle calibration in the method according to the embodiment of the invention.
Fig. 3 is a schematic diagram illustrating a principle of calculating effective coverage in the method according to the embodiment of the present invention.
Fig. 4 shows an application architecture diagram of a system provided by an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
It can be understood that existing software provides image data from only a few satellites or a few satellite series and does not effectively integrate massive global multi-source satellite images. It pays little attention to the heterogeneity of images from different data sources and does not define a unified standard specification for storing the images in a database, which greatly reduces the readability of the aggregated image results. Most traditional methods for determining the effective coverage rate of satellite images search for the corresponding remote sensing images on the relevant platforms and check the overall coverage through image quick views; the operation is cumbersome, and an accurate effective coverage rate value cannot be obtained. In addition, existing software does not calculate the effective coverage rate of satellite images, so a user has to check the images one by one through quick views, which significantly increases the time cost. Current software does not consider the influence of the satellite side-swing angle on imaging quality and requires the user to select images at the required angles one by one, which greatly reduces working efficiency. Moreover, existing software provides only a single data retrieval function, cannot meet the individual requirements of each user, and segments images of the area of interest inaccurately.
In order to solve the above problems, embodiments of the present invention provide a global satellite image search engine data crawling and normalizing method and system.
Fig. 1 shows a flow chart of a method provided by an embodiment of the invention.
Specifically, the global satellite image search engine data crawling and normalizing method comprises the following steps:
Step S101: a big data real-time acquisition program is used to acquire newly released remote sensing satellite image data from all global satellite official platforms in real time, and the remote sensing satellite image data is preliminarily classified and filed in a local database according to data source.
In the embodiment of the invention, newly released remote sensing satellite image data is acquired every day, in real time, from all global satellite official platforms by a self-developed big data real-time acquisition program, and the acquired massive image data is first classified and filed in a local database according to data source. Specifically, the flow of the big data real-time acquisition program is as follows. The first step checks whether the target platform provides an API (application program interface); if so, the API is called directly, and if not, the real-time acquisition program is used. The second step is data structure analysis and data storage: the required fields are determined, the tables to be built and the relationships between them are determined, and a database is selected for storage. The third step is data flow analysis, which mainly comprises determining the acquisition range and the entry point, navigating between multi-level web page structures, subdividing the range, analyzing the access mode, and analyzing URLs and parameters. The fourth step is data acquisition, in which a scratch module, the Beautiful Soup parsing tool and Pandas methods are used to organize the data, which is finally written into the database.
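The four-step flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the acquisition program of the invention: the page markup in `SAMPLE_HTML`, the table name `scenes`, the field names and the `api_client` callable are all hypothetical, and the Python standard library (`html.parser`, `sqlite3`) stands in for the Beautiful Soup and Pandas tools named in the text so that the sketch is self-contained.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical listing page; a real platform's markup will differ.
SAMPLE_HTML = """
<table>
  <tr><td class="name">SCENE_A</td><td class="date">2022-05-01</td><td class="cloud">12.5</td></tr>
  <tr><td class="name">SCENE_B</td><td class="date">2022-05-02</td><td class="cloud">40.0</td></tr>
</table>
"""


class SceneTableParser(HTMLParser):
    """Collects one dict per <tr>, keyed by the class attribute of each <td>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._field = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = {}
        elif tag == "td" and self._row is not None:
            self._field = dict(attrs).get("class")

    def handle_data(self, data):
        if self._field and data.strip():
            self._row[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "td":
            self._field = None
        elif tag == "tr" and self._row:
            self.rows.append(self._row)
            self._row = None


def fetch_records(source, api_client=None, html=None):
    """Step 1: call the platform API if one exists, otherwise parse the page."""
    if api_client is not None:
        records = api_client()  # hypothetical callable returning a list of dicts
    else:
        parser = SceneTableParser()
        parser.feed(html)
        records = parser.rows
    # Tag each record with its data source (the preliminary classification).
    return [dict(record, source=source) for record in records]


def store(conn, records):
    """Steps 2 and 4: fixed table structure, then write the organized data."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scenes "
        "(source TEXT, name TEXT, date TEXT, cloud REAL, "
        "PRIMARY KEY (source, name))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO scenes VALUES (:source, :name, :date, :cloud)",
        records,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
store(conn, fetch_records("platform_x", html=SAMPLE_HTML))
stored = conn.execute(
    "SELECT source, name, cloud FROM scenes ORDER BY name"
).fetchall()
```

Keying the table on the pair (source, name) means that re-running the acquisition on the same listing overwrites rows instead of duplicating them, which suits a program that polls the platforms daily.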
Specifically, in a preferred embodiment provided by the present invention, the step of using the big data real-time obtaining program to obtain the latest distributed remote sensing satellite image data from the official platforms of all satellites in the world in real time, and preliminarily classifying and filing the remote sensing satellite image data according to data sources and storing the remote sensing satellite image data in the local database specifically includes the following steps:
checking whether the target platform provides an API (application program interface), if so, directly calling remote sensing satellite image data, and if not, acquiring the remote sensing satellite image data by using a big data real-time acquisition program;
carrying out data structure analysis and data storage on the remote sensing satellite image data;
carrying out data flow analysis on the remote sensing satellite image data;
and carrying out data arrangement on the remote sensing satellite image data, and storing the remote sensing satellite image data into a local database.
Further, the global satellite image search engine data crawling and normalizing method further comprises the following steps:
Step S102: the remote sensing satellite image data which is classified, filed and stored in the local database is finely classified during an idle period using the computing power of the supercomputing center.
In the embodiment of the invention, after the preliminarily classified massive image data is stored on the local server, the acquired image data is further finely classified during a configured idle period using the strong computing power of the supercomputing center. A hierarchy-based text clustering algorithm is adopted to classify the acquired satellite image big data efficiently and accurately. For the hierarchy, a multi-level structure of satellite model, sensor type and resolution is established; for the text clustering, accurate classification is achieved from the semantic association between the hierarchy and the satellite image information together with the clustering algorithm. The classified images then undergo accuracy evaluation, and once the complete and accurate categories are determined, the satellite image data is classified and extracted directly according to the result.
Specifically, the image data is classified by hierarchical division: the first level is the sensor type, the second level distinguishes optical, radar, hyperspectral and other sensors, the third level divides optical sensor images by resolution (specifically 0.5 m, 1 m, 2 m, 3 m, and medium-to-low resolution), and the last level holds the satellite models corresponding to each class. On the basis of this hierarchy, clustering is performed with the agglomerative clustering algorithm AGNES. The logic of the AGNES algorithm is as follows: if the distance between an object in cluster C1 and an object in cluster C2 is the smallest Euclidean distance among all pairs of objects belonging to different clusters, C1 and C2 may be merged; here the similarity of the image description texts is used to represent the Euclidean distance. This is a single-linkage method, in which each cluster is represented by all the objects in it, and the similarity between two clusters is determined by the similarity of the closest pair of data points in the two clusters. The algorithm flow is as follows:
Input: a database containing n objects; the termination condition is the number of clusters k.
Output: k clusters.
(1) Treat each object as an initial cluster;
(2) Repeat:
(3) find the two closest clusters, judged by the closest pair of data points in the two clusters;
(4) merge the two clusters to generate a new set of clusters;
(5) Until the defined number of clusters is reached.
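The flow above can be sketched in a few lines. The text does not specify how the similarity of description texts is measured, so the Jaccard distance over whitespace-separated tokens used here is an assumption, as are the function names and the sample descriptions; the merge rule itself is the single-linkage rule described above.

```python
from itertools import combinations


def jaccard_distance(text_a, text_b):
    """Assumed text distance: 1 - |A & B| / |A | B| over whitespace tokens."""
    a, b = set(text_a.split()), set(text_b.split())
    return 1.0 - len(a & b) / len(a | b)


def agnes_single_linkage(items, k, dist):
    """Merge the two closest clusters until only k clusters remain."""
    if not 1 <= k <= len(items):
        raise ValueError("k must be between 1 and the number of items")
    # (1) every object starts as its own cluster (stored as index lists)
    clusters = [[i] for i in range(len(items))]
    # (2)-(5) repeat until the defined number of clusters is reached
    while len(clusters) > k:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # single linkage: cluster distance = closest pair of members
            d = min(dist(items[i], items[j])
                    for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # (4) merge b into a
        del clusters[b]                          # a < b, so index a stays valid
    return [sorted(cluster) for cluster in clusters]


# Hypothetical image description texts
descriptions = [
    "optical 0.5m",
    "optical 0.5m panchromatic",
    "radar sar c-band",
    "radar sar x-band",
    "hyperspectral 30m",
]
result = agnes_single_linkage(descriptions, 3, jaccard_distance)
```

With these sample texts the two optical descriptions merge first, then the two radar descriptions, leaving the hyperspectral description as its own cluster. The pairwise search is cubic in the number of objects, which is acceptable for a sketch; a large archive would call for a precomputed distance matrix or a library implementation.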
Specifically, in a preferred embodiment provided by the present invention, the fine classification processing of the remote sensing satellite image data archived in classification and stored in the local database specifically includes the following steps:
performing efficient and accurate classification on the obtained remote sensing satellite image data by adopting a hierarchy-based text clustering algorithm;
and carrying out accuracy evaluation on the classified images, and, after determining the complete and accurate categories, directly classifying and extracting the acquired remote sensing satellite image data according to the result.
Further, the global satellite image search engine data crawling and normalizing method further comprises the following steps:
Step S103: a unified standard specification is constructed, and attribute information consistency integration is performed on the remote sensing satellite image data after the fine classification processing.
In the embodiment of the invention, an integrated storage strategy for remote sensing image data that takes spatial characteristics into account and a segmented, partitioned storage strategy for satellite image data that takes multi-source heterogeneous characteristics into account are used first. A unified satellite image data attribute table is then designed, and an attribute field batch unified standard naming algorithm replaces the manual naming and modifying process. Finally, without modifying the standard specification in the program, the satellite image attribute information is dynamically expanded, so that attribute normalization of new image data is supported quickly.
Specifically, the satellite image attribute information standard field includes: image name, product ID, satellite type, sensor type, image acquisition date, image acquisition time, image cloud cover, scene number, data classification, and wkid value.
Specifically, in a preferred embodiment provided by the present invention, the constructing a unified standard specification, and the performing attribute information consistency integration on the remote sensing satellite image data after the fine classification processing specifically includes the following steps:
designing a unified satellite image data attribute table by utilizing a remote sensing image data integrated storage strategy considering the spatial characteristics and a satellite image data segmentation and partition storage strategy considering the multisource heterogeneous characteristics;
attribute naming and modification of satellite image attribute information standard fields are carried out by adopting an attribute field batch unified standard naming algorithm;
and dynamically expanding satellite image attribute information according to the satellite image data attribute table, and performing attribute information consistency integration on the remote sensing satellite image data.
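As an illustration of the batch standard naming and dynamic expansion described above, the sketch below maps source-specific field names onto the standard fields listed in the text. The Python identifiers chosen for the standard fields, the alias table `ALIASES` and the sample record are all hypothetical; the text does not disclose the naming algorithm itself, so this is only one plausible reading of it.

```python
# Standard fields from the text, written as hypothetical Python identifiers.
STANDARD_FIELDS = (
    "image_name", "product_id", "satellite_type", "sensor_type",
    "acquisition_date", "acquisition_time", "cloud_cover",
    "scene_number", "data_classification", "wkid",
)

# Hypothetical source-specific names and the standard field each maps to.
ALIASES = {
    "sceneName": "image_name",
    "title": "image_name",
    "productId": "product_id",
    "platform": "satellite_type",
    "instrument": "sensor_type",
    "acqDate": "acquisition_date",
    "acqTime": "acquisition_time",
    "cloudCover": "cloud_cover",
    "sceneNo": "scene_number",
    "category": "data_classification",
    "srid": "wkid",
}


def normalize(record, aliases=ALIASES):
    """Rename known fields to the standard names and keep unknown ones aside."""
    out = dict.fromkeys(STANDARD_FIELDS)  # every standard field, default None
    extra = {}
    for key, value in record.items():
        if key in aliases:
            out[aliases[key]] = value
        elif key in STANDARD_FIELDS:
            out[key] = value
        else:
            extra[key] = value  # dynamic expansion: nothing is discarded
    out["extra"] = extra
    return out


def normalize_batch(records, aliases=ALIASES):
    return [normalize(record, aliases) for record in records]


sample = {"title": "SCENE_A", "platform": "SAT-1", "cloudCover": 12.5,
          "orbitDirection": "descending"}
normalized = normalize(sample)
```

Because new sources are handled by adding entries to the alias table rather than by changing code, a newly crawled platform can be brought under the unified attribute table without modifying the normalization routine.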
Further, the global satellite image search engine data crawling and normalizing method further comprises the following steps of:
Step S104: the view deflection angle of each satellite image in the remote sensing satellite image data is identified and calculated, standard angle correction is performed on each satellite image according to the view deflection angle, and the remote sensing satellite image data subjected to fine classification processing, attribute information consistency integration and standard angle correction is uploaded to a server.
In the embodiment of the invention, the quick views provided by the satellite platforms differ in form, and their deflection angles vary considerably. The deflection angle of each satellite image's quick view is identified and calculated quickly, and quick views with an abnormal deflection angle are corrected to the standard angle to generate standard quick views that are convenient for the user to check. Finally, the images, classifications and attribute information integrated in the preceding steps are uploaded to the server at regular time intervals.
Specifically, fig. 2 shows a schematic diagram of a standard angle calibration of a satellite image in the method provided by the embodiment of the present invention, and a calculation formula of a view deflection angle of the satellite image is as follows:
[Formula images BDA0003636304060000101 and BDA0003636304060000102: not reproduced in this text]
The inclination angle $D_{BD}$ is calculated according to the formula from the corner point coordinates of the standard quick view, and then the inclination angle $\theta$ of the non-standard quick view is calculated; the difference between the two inclination angles is the view deflection angle of the satellite image.
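Since the formulas themselves survive only as image references, the following sketch is an assumption about their general shape, not a reproduction of them: it takes the inclination angle of a segment between two corner points as the arctangent of its slope, and the deflection angle as the difference between the two inclinations, as the text states. The function names and sample coordinates are hypothetical.

```python
import math


def inclination_deg(p, q):
    """Inclination of segment p->q relative to the x axis, in degrees."""
    return math.degrees(math.atan2(q[1] - p[1], q[0] - p[0]))


def deflection_deg(standard_segment, quickview_segment):
    """Difference of the two inclination angles, wrapped to [-180, 180)."""
    diff = inclination_deg(*quickview_segment) - inclination_deg(*standard_segment)
    return (diff + 180.0) % 360.0 - 180.0


def rotate(points, angle_deg, center=(0.0, 0.0)):
    """Rotate points counter-clockwise by angle_deg around center."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    cx, cy = center
    return [(cx + (x - cx) * c - (y - cy) * s,
             cy + (x - cx) * s + (y - cy) * c) for x, y in points]


# Hypothetical corner points: the standard segment is horizontal,
# the segment measured on the quick view is tilted by 45 degrees.
standard = ((0.0, 0.0), (1.0, 0.0))
tilted = ((0.0, 0.0), (1.0, 1.0))
angle = deflection_deg(standard, tilted)
corrected = rotate(list(tilted), -angle)
```

Rotating the quick view's corner points by the negative of the deflection angle brings the measured segment back onto the standard direction, which is the correction step the text describes.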
S105: receiving query information from a user, determining the user's interest area according to the query information, and retrieving the remote sensing satellite image data corresponding to the interest area for visual display to the user.
In the embodiment of the invention, the user side receives the query information entered by the user, such as satellite, time, sensor type and resolution, and the corresponding image data is retrieved using the image classification and attribute information and displayed to the user visually. To meet users' individual requirements for interest areas, all satellite images falling within the range of a vector polygon are matched by a big-data-based vector index method, where the polygon is given by an administrative area, a polygon drawn by the user, or a vector area file uploaded by the user. For selection by administrative boundary, the satellite images are divided in advance according to provincial boundaries, prefecture-level city/prefecture/league boundaries, district/county-level city/county boundaries and township boundaries.
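The query step can be illustrated with a small sketch. The record layout, the field names and the bounding-box test below are assumptions made for illustration only; the patent describes a vector index over polygons, for which a bounding-box overlap test is merely a coarse stand-in.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class ImageRecord:
    # Hypothetical attribute record; the field names are illustrative.
    name: str
    satellite: str
    sensor: str
    acquired: date
    resolution_m: float
    bbox: tuple  # footprint as (min_x, min_y, max_x, max_y)

def bbox_intersects(a, b):
    """True if two axis-aligned boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def query_images(records, aoi_bbox, satellite=None, sensor=None,
                 start=None, end=None, max_resolution_m=None):
    """Filter records by the optional attribute criteria, then keep
    those whose footprint overlaps the interest-area bounding box."""
    hits = []
    for r in records:
        if satellite is not None and r.satellite != satellite:
            continue
        if sensor is not None and r.sensor != sensor:
            continue
        if start is not None and r.acquired < start:
            continue
        if end is not None and r.acquired > end:
            continue
        if max_resolution_m is not None and r.resolution_m > max_resolution_m:
            continue
        if bbox_intersects(r.bbox, aoi_bbox):
            hits.append(r)
    return hits

records = [
    ImageRecord("a", "SAT-A", "PAN", date(2022, 1, 5), 2.0, (0, 0, 10, 10)),
    ImageRecord("b", "SAT-A", "PAN", date(2022, 3, 1), 2.0, (50, 50, 60, 60)),
    ImageRecord("c", "SAT-B", "MSS", date(2022, 1, 6), 30.0, (5, 5, 15, 15)),
]
hits = query_images(records, (8, 8, 12, 12), satellite="SAT-A")
```

In a fuller implementation the surviving candidates would then be tested against the exact interest-area polygon rather than only its bounding box.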
S106: introducing a satellite cloud picture to assist in accurately identifying single clouds within the satellite image area, taking into account the discrete distribution of clouds within the image coverage area, obtaining the effective coverage rate from the ratio of the coverage area of each cloud to the image coverage area, and, using the effective coverage rate, obtaining the satellite side-sway angle from the satellite image attribute information.
In the embodiment of the invention, a satellite cloud picture is introduced to assist in accurately identifying single clouds within the satellite image area, so that the discrete distribution of clouds within the image area is taken into account, and an accurate effective coverage rate is obtained from the ratio of the coverage area of each cloud to the coverage area of the image. Finally, using the calculated effective coverage rate, the satellite side-sway angle is obtained from the satellite image attribute information, which meets the user's need to screen the corresponding satellite images by custom criteria.
Specifically, fig. 3 shows a schematic diagram of the calculation principle of effective coverage in the method provided by the embodiment of the present invention. Assume that the middle rectangle in fig. 3 is the interest area given by the user and that the four surrounding rectangles are the satellite images falling in the interest area. The single clouds identified in each image are intersected with the interest area, and the ratio of the area of the blue cloud portion in the figure to the total area of the middle rectangle is obtained.
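As a worked illustration of the ratio described for fig. 3, the sketch below treats the interest area and each identified cloud as axis-aligned rectangles and assumes the clouds do not overlap one another; both are simplifying assumptions, since real interest areas and clouds are irregular polygons. The function names are hypothetical. The text does not state whether the effective coverage rate is this ratio or its complement, so the sketch returns only the cloud ratio.

```python
def rect_area(r):
    """Area of a rectangle given as (min_x, min_y, max_x, max_y)."""
    return max(0.0, r[2] - r[0]) * max(0.0, r[3] - r[1])

def rect_intersection(a, b):
    """Overlap of two rectangles, or None if they do not overlap."""
    r = (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
    return r if r[0] < r[2] and r[1] < r[3] else None

def cloud_ratio(aoi, clouds):
    """Share of the interest area covered by the given clouds
    (clouds are assumed not to overlap each other)."""
    covered = 0.0
    for cloud in clouds:
        inter = rect_intersection(aoi, cloud)
        if inter is not None:
            covered += rect_area(inter)
    return covered / rect_area(aoi)

aoi = (0, 0, 10, 10)
clouds = [(-2, -2, 2, 2), (8, 8, 12, 12), (20, 20, 30, 30)]
ratio = cloud_ratio(aoi, clouds)  # two 2x2 overlaps over an area of 100
```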
The method for intersecting the polygon given by the user with the identified single clouds is as follows. The interest area given by the user is generally an irregular polygon, and the shape of a single cloud is complex, so a complex vector polygon intersection algorithm is adopted. The algorithm flow is as follows:
Step 1: input the layers and extract the minimum bounding rectangle (MBR) of each polygon;
Step 2: build a Hilbert grid based on the superimposed polygon layer, and in the data distribution stage fill the layer's set of minimum bounding rectangles into the grid partitions to form a grid-partition minimum bounding rectangle set;
Step 3: construct an R-tree index within each grid partition for the superimposed layer to form a grid partition index;
Step 4: call the zipPartition operator to perform a join operation between the grid partition index of the superimposed layer and the superimposed layer's grid-partition minimum bounding rectangle set, and remove duplicate minimum bounding rectangles that span partitions based on a cross-partition data intersection point positioning strategy;
Step 5: each grid partition reads the polygon data from the distributed cache, performs the intersection calculation, and outputs the result.
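The candidate-generation part of this flow (steps 1 to 4) can be sketched in a single process. The sketch swaps in simpler techniques and says so plainly: a uniform grid replaces the Hilbert grid, a dictionary of cell lists replaces the per-partition R-tree, and nested loops replace the distributed zipPartition join. Duplicate removal uses the reference-point idea, reporting a pair only in the cell that contains the lower-left corner of the intersection of the two bounding rectangles. All names are hypothetical.

```python
from collections import defaultdict

def mbr(polygon):
    """Minimum bounding rectangle of a polygon given as (x, y) vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (min(xs), min(ys), max(xs), max(ys))

def cells_for(box, cell):
    """All uniform-grid cells (col, row) touched by a bounding rectangle."""
    c0, r0 = int(box[0] // cell), int(box[1] // cell)
    c1, r1 = int(box[2] // cell), int(box[3] // cell)
    return [(c, r) for c in range(c0, c1 + 1) for r in range(r0, r1 + 1)]

def candidate_pairs(layer_a, layer_b, cell=10.0):
    """Index pairs (i, j) whose bounding rectangles overlap, each once."""
    boxes_a = [mbr(p) for p in layer_a]
    boxes_b = [mbr(p) for p in layer_b]
    grid_b = defaultdict(list)  # cell -> indices of layer_b rectangles
    for j, box in enumerate(boxes_b):
        for c in cells_for(box, cell):
            grid_b[c].append(j)
    pairs = []
    for i, a in enumerate(boxes_a):
        for c in cells_for(a, cell):
            for j in grid_b.get(c, []):
                b = boxes_b[j]
                ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
                ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
                if ix0 > ix1 or iy0 > iy1:
                    continue  # rectangles share a cell but do not overlap
                # Cross-cell duplicate removal: keep the pair only in the
                # cell owning the lower-left corner of the overlap.
                if (int(ix0 // cell), int(iy0 // cell)) == c:
                    pairs.append((i, j))
    return pairs

square = lambda x0, y0, x1, y1: [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
layer_a = [square(0, 0, 25, 25)]
layer_b = [square(5, 5, 22, 22), square(40, 40, 45, 45)]
pairs = candidate_pairs(layer_a, layer_b)
```

Each surviving pair would then go through the exact polygon intersection of step 5, which is omitted here.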
Further, fig. 4 is a diagram illustrating an application architecture of the system according to the embodiment of the present invention.
In another preferred embodiment, the global satellite imagery search engine data crawling and normalizing system includes:
The satellite image data acquisition unit 101 is used for acquiring newly released remote sensing satellite image data from all global satellite official platforms in real time by using a big data real-time acquisition program, and for preliminarily classifying and archiving the remote sensing satellite image data into a local database according to data source.
The satellite image data classification unit 102 is used for performing, in idle periods and using the computing power of a supercomputing center, fine classification processing on the remote sensing satellite image data that has been classified, archived and stored in the local database.
The attribute information consistency integration unit 103 is used for constructing a unified standard specification and performing attribute information consistency integration on the remote sensing satellite image data that has undergone fine classification processing.
The satellite image angle correction unit 104 is used for identifying and calculating the view deflection angle of each satellite image in the remote sensing satellite image data, performing standard angle correction on each satellite image according to the view deflection angle, and uploading the remote sensing satellite image data that has undergone fine classification processing, attribute information consistency integration and standard angle correction to a server.
The interest area image display unit 105 is used for receiving query information from a user, determining the user's interest area according to the query information, and retrieving the remote sensing satellite image data corresponding to the interest area for visual display to the user.
The satellite side-sway angle acquisition unit 106 is used for introducing a satellite cloud picture to assist in accurately identifying single clouds within the satellite image area, taking into account the discrete distribution of clouds within the image coverage area, obtaining the effective coverage rate from the ratio of the coverage area of each cloud to the image coverage area, and, using the effective coverage rate, obtaining the satellite side-sway angle from the satellite image attribute information.
In conclusion, the embodiment of the invention can acquire newly released remote sensing satellite image data from all satellite official platforms in real time and pre-calculate the data and products related to satellite orbit prediction, which improves the speed at which satellite images are displayed when a user accesses them in real time. The acquired satellite image big data can be classified efficiently and accurately, and a unified standard specification can be constructed to integrate the attribute information of massive satellite images consistently. Deflection angles can be corrected so that standard fast views are displayed, and the user's interest areas can be displayed visually. In addition, the satellite cloud picture is used when predicting and displaying the coverage area under the remote sensing satellite's orbit, so that the effective coverage rate of the images is calculated more effectively and accurately and the effective coverage area of the satellite images is displayed. By taking the satellite's side-sway capability parameters into account, the user's custom image screening function is supported.
It should be understood that, although the steps in the flowcharts of the embodiments of the present invention are shown in the sequence indicated by the arrows, they are not necessarily performed in that sequence. Unless explicitly stated otherwise, the steps are not strictly limited to the order shown and described and may be performed in other orders. Moreover, at least some of the steps in the various embodiments may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a non-volatile computer-readable storage medium and which, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDRSDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM).
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present invention, and the description thereof is more specific and detailed, but not construed as limiting the scope of the present invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the inventive concept, which falls within the scope of the present invention. Therefore, the protection scope of the present patent shall be subject to the appended claims.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (7)

1. The global satellite image search engine data crawling and normalizing method is characterized by comprising the following steps:
using a big data real-time acquisition program to acquire newly released remote sensing satellite image data from all global satellite official platforms in real time, and preliminarily classifying and filing the remote sensing satellite image data according to data sources and storing the remote sensing satellite image data into a local database;
carrying out fine classification processing on the remote sensing satellite image data which is classified, filed and stored in a local database in an idle period by using the computing power of a super-computation center;
constructing a unified standard specification, and performing attribute information consistency integration on the remote sensing satellite image data subjected to fine classification processing;
identifying and calculating a view deflection angle of each satellite image in the remote sensing satellite image data, performing standard angle correction on each satellite image according to the view deflection angle, and uploading the remote sensing satellite image data subjected to fine classification processing, attribute information consistency integration and standard angle correction to a server;
receiving query information of a user, determining an interest area of the user according to the query information, and calling remote sensing satellite image data corresponding to the interest area to be displayed to the user in a visualized manner;
introducing a satellite cloud picture to assist in accurately identifying single clouds within the satellite image area, taking into account the discrete distribution of clouds within the image coverage area, obtaining the effective coverage rate from the ratio of the coverage area of each cloud to the image coverage area, and, using the effective coverage rate, obtaining the satellite side-sway angle from the satellite image attribute information.
2. The global satellite image search engine data crawling and normalizing method according to claim 1, wherein the step of using a big data real-time acquisition program to acquire newly released remote sensing satellite image data from all global satellite official platforms in real time and to archive the remote sensing satellite image data in a local database according to the preliminary classification of data sources specifically comprises the steps of:
checking whether the target platform provides an API (application program interface), if so, directly calling remote sensing satellite image data, and if not, acquiring the remote sensing satellite image data by using a big data real-time acquisition program;
carrying out data structure analysis and data storage on the remote sensing satellite image data;
carrying out data flow analysis on the remote sensing satellite image data;
and carrying out data arrangement on the remote sensing satellite image data, and storing the remote sensing satellite image data into a local database.
3. The global satellite image search engine data crawling and normalizing method according to claim 1, wherein the fine classification of the remote sensing satellite image data which is classified, filed and stored in a local database specifically comprises the following steps:
performing efficient and accurate classification on the obtained remote sensing satellite image data by adopting a hierarchy-based text clustering algorithm;
and carrying out accuracy evaluation on the classified images and, after the complete and accurate categories are determined, directly classifying and extracting the acquired remote sensing satellite image data according to the result.
4. The global satellite image search engine data crawling and normalizing method according to claim 1, wherein the step of constructing a unified standard specification and performing attribute information consistency integration on the remote sensing satellite image data subjected to the fine classification specifically comprises the following steps:
designing a unified satellite image data attribute table by utilizing a remote sensing image data integrated storage strategy considering the spatial characteristics and a satellite image data segmentation and partition storage strategy considering the multisource heterogeneous characteristics;
attribute naming and modification of satellite image attribute information standard fields are carried out by adopting an attribute field batch unified standard naming algorithm;
and dynamically expanding satellite image attribute information according to the satellite image data attribute table, and performing attribute information consistency integration on the remote sensing satellite image data.
5. The global satellite imagery search engine data crawling and normalizing method of claim 4, wherein the satellite imagery attribute information criteria field comprises: image name, product ID, satellite type, sensor type, image acquisition date, image acquisition time, image cloud cover, scene number, data classification, and wkid value.
6. The global satellite image search engine data crawling and normalizing method according to claim 4, wherein a calculation formula of a view deflection angle of the satellite image is as follows:
Figure FDA0003636304050000031
Figure FDA0003636304050000032
the inclination angle D_BD is calculated from the corner point coordinates of the standard fast view according to the above formulas, and the inclination angle θ of the non-standard fast view is then calculated, wherein the difference between the two inclination angles is the view deflection angle of the satellite image.
7. Global satellite image search engine data crawling and normalizing system, characterized in that, the system includes satellite image data acquisition unit, satellite image data classification unit, attribute information consistency integration unit, satellite image angle correction unit, interest area image display unit and satellite side-sway angle acquisition unit, wherein:
the satellite image data acquisition unit is used for acquiring newly released remote sensing satellite image data from all global satellite official platforms in real time by using a big data real-time acquisition program, and preliminarily classifying and filing the remote sensing satellite image data according to data sources and storing the remote sensing satellite image data in a local database;
the satellite image data classification unit is used for carrying out fine classification processing on the remote sensing satellite image data which are classified, filed and stored in the local database in an idle period by utilizing the computing power of the super-computation center;
the attribute information consistency integration unit is used for constructing a unified standard specification and performing attribute information consistency integration on the remote sensing satellite image data after the fine classification processing;
the satellite image angle correction unit is used for identifying and calculating view deflection angles of all satellite images in the remote sensing satellite image data, carrying out standard angle correction on all satellite images according to the view deflection angles, and uploading the remote sensing satellite image data subjected to fine classification processing, attribute information consistency integration and standard angle correction to a server;
the interest area image display unit is used for receiving query information of a user, determining an interest area of the user according to the query information, and retrieving the remote sensing satellite image data corresponding to the interest area for visual display to the user;
the satellite side-sway angle acquisition unit is used for introducing a satellite cloud picture to assist a satellite image area to accurately identify a single cloud, considering the condition of cloud discrete distribution in an image coverage area, obtaining effective coverage rate through the ratio of the coverage area of each cloud to the image coverage area, and obtaining a satellite side-sway angle from satellite image attribute information by utilizing the effective coverage rate.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210506311.XA CN115033728A (en) 2022-05-10 2022-05-10 Data crawling and normalizing method and system for global satellite image search engine

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210506311.XA CN115033728A (en) 2022-05-10 2022-05-10 Data crawling and normalizing method and system for global satellite image search engine

Publications (1)

Publication Number Publication Date
CN115033728A true CN115033728A (en) 2022-09-09

Family

ID=83121880

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210506311.XA Pending CN115033728A (en) 2022-05-10 2022-05-10 Data crawling and normalizing method and system for global satellite image search engine

Country Status (1)

Country Link
CN (1) CN115033728A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115934759A (en) * 2022-11-30 2023-04-07 二十一世纪空间技术应用股份有限公司 Accelerated computing method for massive multi-source heterogeneous satellite data query
CN115934759B (en) * 2022-11-30 2023-12-22 二十一世纪空间技术应用股份有限公司 Acceleration calculation method for massive multi-source heterogeneous satellite data query

Similar Documents

Publication Publication Date Title
CN109829399B (en) Vehicle-mounted road scene point cloud automatic classification method based on deep learning
Li et al. An efficient measure of compactness for two-dimensional shapes and its application in regionalization problems
Puissant et al. The utility of texture analysis to improve per‐pixel classification for high to very high spatial resolution imagery
CN103337052B (en) Automatic geometric correcting method towards wide cut remote sensing image
Yamashkin et al. Improving the efficiency of deep learning methods in remote sensing data analysis: geosystem approach
Alvioli et al. Topography-driven satellite imagery analysis for landslide mapping
US8855427B2 (en) Systems and methods for efficiently and accurately detecting changes in spatial feature data
Peng et al. Object-based change detection from satellite imagery by segmentation optimization and multi-features fusion
US20150235325A1 (en) Management of Tax Information Based on Topographical Information
CN110163294A (en) Remote Sensing Imagery Change method for detecting area based on dimensionality reduction operation and convolutional network
CN114677695A (en) Table analysis method and device, computer equipment and storage medium
CN115033728A (en) Data crawling and normalizing method and system for global satellite image search engine
CN110826454B (en) Remote sensing image change detection method and device
Liu et al. MS-CNN: multiscale recognition of building rooftops from high spatial resolution remote sensing imagery
CN105740901B (en) Mutative scale object-oriented Classification in Remote Sensing Image antidote based on ontology
CN113704276A (en) Map updating method and device, electronic equipment and computer readable storage medium
CN110188682B (en) Optical remote sensing image target detection method based on geometric structure double-path convolution network
Keyvanfar et al. Performance comparison analysis of 3D reconstruction modeling software in construction site visualization and mapping
CN114238541A (en) Sensitive target information acquisition method and device and computer equipment
Bao et al. An automatic extraction method for individual tree crowns based on self-adaptive mutual information and tile computing
Xydas et al. Buildings Extraction from Historical Topographic Maps via a Deep Convolution Neural Network.
Liu et al. A color balancing method for wide range remote sensing imagery based on regionalization
Moore et al. The impact of seasonality on multi-scale feature extraction techniques
Lu et al. An efficient annotation method for big data sets of high-resolution earth observation images
CN113313101B (en) Building contour automatic aggregation method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination