CN114372185A

CN114372185A - Rapid search system and method for remote sensing big data

Info

Publication number: CN114372185A
Application number: CN202210051029.7A
Authority: CN
Inventors: 黄祥志; 刘向东; 郝梦非; 周丽玲; 顾冬冬; 吴志钦; 邓冬; 戴希凡
Original assignee: Jiangsu Tianhui Spatial Information Research Institute Co ltd
Current assignee: Jiangsu Tianhui Spatial Information Research Institute Co ltd
Priority date: 2022-01-17
Filing date: 2022-01-17
Publication date: 2022-04-19
Anticipated expiration: 2042-01-17
Also published as: CN114372185B

Abstract

The invention discloses a rapid search method for remote sensing big data, comprising the following steps. Step S100: acquire and parse a retrieval request input by a user to generate a combined retrieval tag, and crawl data to obtain a plurality of combined data information fragments that satisfy the retrieval request. Step S200: decompose the combined data information fragments into a plurality of mapping data pairs, then sort the fragments by priority and make a primary adjustment to that order. Step S300: traverse all mapping data pairs, capture the difference data pairs, and gather them into a difference mapping pair set in order to calculate the activity and predict the credibility of the different source data centers. Step S400: readjust the priority order of the combined data information fragments based on the activity calculation result and the credibility prediction result; the user then retrieves and acquires data from the adjusted transit data center.

Description

Rapid search system and method for remote sensing big data

Technical Field

The invention relates to the technical field of remote sensing data search, in particular to a system and a method for quickly searching remote sensing big data.

Background

After more than half a century of development, remote sensing technology and its applications across many fields have entered a new stage. Remote sensing is increasingly tied to the national economy, ecological protection and national defense security, for example in land resource surveys, ecological environment monitoring, agricultural monitoring and crop assessment, disaster forecasting and assessment, and marine environment surveys. It also plays an important role in activities closely related to daily life, such as weather forecasting, air quality monitoring, electronic maps and navigation. In the 21st century, remote sensing technology has shown the new characteristics of high spatial, spectral and temporal resolution and has opened up further application fields, and with them a need to find existing spatial information resources on the Internet efficiently and conveniently.

Remote sensing data with multiple spatial, spectral and temporal resolutions, acquired by different remote sensing platforms, can support users in different application fields. At the same time, a large volume of spatial information resources has been opened on the Internet, including data sets released by public institutions under national policy, result sets released by scientific research departments under national requirements, open data sets shared by open source communities and public welfare organizations, and information services offered by commercial enterprises.

One bottleneck in applying high-resolution remote sensing data is the lack of a reliable data source. Satellite remote sensing data resources are scattered across data centers, and the resources of any single data center are very limited. Before a project starts, each data center has to be searched one by one, and only by combining the satellite remote sensing data coverage found in each can one judge whether the project is feasible. As the number of data providers grows and resources become less concentrated, this work becomes very tedious. A convenient indexing and search service is therefore needed so that a user can quickly learn whether satellite remote sensing data meeting the business requirements exist and how to acquire them.

Disclosure of Invention

The invention aims to provide a system and a method for quickly searching remote sensing big data, which aim to solve the problems in the background technology.

In order to solve the technical problems, the invention provides the following technical scheme: a quick search method for remote sensing big data comprises the following steps:

step S100: acquiring a retrieval request input by a user and parsing it to generate a combined retrieval tag corresponding to the retrieval request; the combined retrieval tag has the form {main tag; secondary tag; auxiliary tag}; the rapid search system generates a corresponding retrieval instruction based on the main tag in the combined retrieval tag; based on the retrieval instruction, the rapid search system crawls the data centers of the remote sensing data websites on a large scale to obtain a plurality of combined data information fragments that satisfy the retrieval request;

step S200: decomposing the combined data information fragments that satisfy the retrieval request into a plurality of mapping data pairs between source data centers and fragment data; sorting the combined data information fragments by priority to obtain an initial combined data information fragment sequence; performing a primary adjustment on the initial sequence; and sending the adjusted sequence to a transit data center;

step S300: traversing all mapping data pairs in the combined data information fragment sequence, capturing the difference data pairs, that is, mapping data pairs that have the same fragment data but are mapped to different source data centers, and gathering the captured difference data pairs into a difference mapping pair set; calculating the activity and predicting the credibility of the different source data centers in the difference mapping pair set;

step S400: readjusting the priority order of the combined data information fragments based on the activity calculation result and the credibility prediction result; the user retrieves and acquires data from the adjusted transit data center; after using the data, the user can generate user data feedback; based on that feedback, the rapid search system either keeps the priority order of the combined data information fragments or adjusts it.
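The capture of difference data pairs in step S300 can be illustrated with a minimal Python sketch. It assumes each mapping data pair is a plain (source data center, fragment data) tuple; the function name `difference_mapping_pairs` and the sample identifiers are hypothetical and are not taken from the patent.

```python
from collections import defaultdict


def difference_mapping_pairs(pairs):
    """Return the pairs whose fragment data is mapped to more than one source center.

    pairs: iterable of (source_center, fragment_data) tuples (assumed layout).
    """
    centers_by_fragment = defaultdict(set)
    for center, fragment in pairs:
        centers_by_fragment[fragment].add(center)
    # Keep only fragment data served by at least two different source centers.
    return sorted(
        (center, fragment)
        for fragment, centers in centers_by_fragment.items()
        if len(centers) > 1
        for center in centers
    )


all_pairs = [("A1", "B1"), ("A2", "B1"), ("A3", "B2")]
difference_set = difference_mapping_pairs(all_pairs)
print(difference_set)  # [('A1', 'B1'), ('A2', 'B1')]
```

Here B1 is offered by both A1 and A2, so those two pairs form the difference mapping pair set, while B2 has a single source and is left out.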

Further, step S100 includes:

step S101: decomposing the retrieval request input by the user into its retrieval condition parameters, which comprise a region, a time range, cloud cover, resolution, a technical source and a field of application; the technical source refers to the different remote sensors used to obtain the various remote sensing data;

step S102: taking the technical source and the field of application as the auxiliary tags of the retrieval request; taking the region, the time range and the cloud cover as the main tags; taking the resolution as the secondary tag; combining the main, secondary and auxiliary tags into a combined retrieval tag of the form {main tag; secondary tag; auxiliary tag};

step S103: sending the retrieval condition parameters in JSON format, and receiving them at the server interface using @RequestBody; the server parses the retrieval condition parameters in the main tag, which includes converting the region coordinate information into a corresponding spatial data type through a GIS algorithm and converting the time range received as character strings into time information in date format;

step S104: crawling the data centers of the remote sensing data websites on a large scale based on the region information in the main tag of the retrieval request, and taking the crawled data as the initial data range; then further screening the data in the initial data range based on the time range and cloud cover information in the main tag;

the @RequestBody annotation is commonly used to handle request content whose Content-Type is not the default application/x-www-form-urlencoded, such as application/json or application/xml, and most commonly application/json; @RequestBody accepts a JSON-formatted character string, and through it the JSON character string in the request body can be bound to a corresponding bean or to a corresponding character string; with this arrangement, the screened data better match the retrieval request and retrieval efficiency is improved.
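Steps S101 to S103 can be sketched as follows. This is an illustration in Python rather than the Spring-based server the text refers to; the JSON field names, the bounding-box representation of the region and the class `CombinedRetrievalTag` are all assumptions made for the example.

```python
import json
from dataclasses import dataclass
from datetime import date


@dataclass
class CombinedRetrievalTag:
    main: dict       # region, time range, cloud cover
    secondary: dict  # resolution
    auxiliary: dict  # technical source, field of application


def parse_retrieval_request(body: str) -> CombinedRetrievalTag:
    """Split a JSON retrieval request into {main; secondary; auxiliary} tags."""
    params = json.loads(body)
    # Step S103: time range strings are converted to date objects.
    start, end = (date.fromisoformat(s) for s in params["time_range"])
    main = {
        # Assumption: the region is a bounding box (min_lon, min_lat, max_lon,
        # max_lat); a real system would build a spatial type with a GIS library.
        "region": tuple(params["region"]),
        "time_range": (start, end),
        "cloud_cover": float(params["cloud_cover"]),
    }
    secondary = {"resolution": float(params["resolution"])}
    auxiliary = {
        "technical_source": params["technical_source"],
        "application_field": params["application_field"],
    }
    return CombinedRetrievalTag(main, secondary, auxiliary)


request_body = json.dumps({
    "region": [118.0, 31.0, 119.5, 32.5],
    "time_range": ["2021-01-01", "2021-06-30"],
    "cloud_cover": 10,
    "resolution": 2.0,
    "technical_source": "optical",
    "application_field": "agriculture",
})
tag = parse_retrieval_request(request_body)
print(tag.main["time_range"])
```

Only the main tag would then drive the crawl of step S104; the secondary and auxiliary tags are kept for the later ordering adjustments.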

Further, step S200 includes:

step S201: tracing the data centers of each combined data information fragment that satisfies the retrieval request to obtain the traceability set {A₁, A₂, …, A_i, …, A_n} corresponding to that fragment, where A_i represents the i-th source data center and n represents the total number of source data centers; splitting each combined data information fragment into fragment data based on its traceability set to obtain the data fragment set {B₁, B₂, …, B_k, …, B_m}, where B_k represents the k-th fragment data and m represents the total number of fragment data; and n = m;

step S202: for each combined data information fragment that satisfies the retrieval request, establishing a one-to-one mapping between its traceability set and its data fragment set to obtain a plurality of mapping data pairs {A_i, B_k}, where i = k; counting the mapping data pairs in every combined data information fragment that satisfies the retrieval request; sorting all combined data information fragments in ascending order of their number of mapping data pairs to obtain the initial combined data information fragment sequence;

step S203: extracting combined retrieval tag information of each combined data information fragment in the initial combined data information fragment sequence, and performing primary adjustment on the initial combined data information fragment sequence based on secondary tags and auxiliary tags in the combined retrieval tags of each combined data information fragment;

The mapping data pairs obtained in these steps make it easier to trace each combined data information fragment back to its source data centers later on, and the mapping data pairs of a fragment reflect where each part of its data comes from. Different combined data information fragments contain different mapping data pairs, and each mapping data pair contains a source data center. A fragment with few mapping data pairs therefore has a simpler search origin: the remote sensing data that satisfy the retrieval request were crawled from few sources, so the credibility risk arising from the source data centers is low. The fragments are sorted on this basis. The secondary tag and the auxiliary tags are then used to adjust the order of the obtained remote sensing data according to quality requirements, so that the user sees high-quality remote sensing data first.
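Steps S201 and S202 amount to pairing each source data center with its fragment data and then ordering the combined fragments by how many pairs they contain. A minimal sketch, assuming string identifiers and hypothetical helper names (`MappingPair`, `to_mapping_pairs`, `initial_sequence`):

```python
from typing import NamedTuple


class MappingPair(NamedTuple):
    source_center: str  # A_i
    fragment_data: str  # B_k


def to_mapping_pairs(sources, fragments):
    """Pair the i-th source data center with the i-th fragment data (i = k)."""
    if len(sources) != len(fragments):
        raise ValueError("traceability set and data fragment set must have equal size (n = m)")
    return [MappingPair(a, b) for a, b in zip(sources, fragments)]


def initial_sequence(pairs_by_fragment):
    """Order combined fragments by ascending number of mapping data pairs."""
    return sorted(pairs_by_fragment, key=lambda name: len(pairs_by_fragment[name]))


pairs_by_fragment = {
    "F1": to_mapping_pairs(["A1", "A2", "A3"], ["B1", "B2", "B3"]),
    "F2": to_mapping_pairs(["A1"], ["B4"]),
    "F3": to_mapping_pairs(["A2", "A4"], ["B5", "B6"]),
}
sequence = initial_sequence(pairs_by_fragment)
print(sequence)  # ['F2', 'F3', 'F1']
```

Because Python's `sorted` is stable, fragments with the same number of pairs keep their original relative order, which leaves room for the tag-based primary adjustment of step S203.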

Further, step S300 includes a method for calculating liveness for different source data centers within the set of distinct mapping pairs:

step S301: acquiring the historical browsing information, historical data download information and historical data fragment copy information of each source data center in the difference data pairs; capturing the browsing pattern of each browsing record in the historical browsing information, calculating the browsing frequency, and setting a standard browsing frequency fluctuation interval; accumulating the number of browses Q₁ that fall within the standard browsing frequency fluctuation interval and the number of browses Q₂ that fall outside it;

step S302: associating the historical browsing information with the historical data download information and with the historical data fragment copy information; accumulating the number of historical data downloads L₁ associated with Q₁ and the number L₂ associated with Q₂; accumulating the number of historical data fragment copies H₁ associated with Q₁ and the number H₂ associated with Q₂; calculating the activity A of each source data center:

A = a×Q₁ + b×L₁ + c×H₁

wherein a = Q₁/Q₂; b = L₁/L₂; c = H₁/H₂;

The activity of each source data center is calculated as groundwork for the subsequent credibility prediction; when the data for credibility analysis are missing or insufficient, the activity serves as supplementary reference data, which safeguards the scientific soundness of the data sources used in rapid search;

in the above steps, a, b and c are weight values. Taking Q₁/Q₂ as the weight a makes the weight of Q₁ scale with the ratio between Q₁ and Q₂, so that the calculated activity reflects both the meaning of Q₁ itself and the relationship between Q₁ and Q₂: the larger Q₁/Q₂ is, the larger the contribution of Q₁ and the larger the activity A; the smaller Q₁/Q₂ is, for example when Q₂ is far greater than Q₁, the smaller the contribution of Q₁ and the smaller the activity A. In the same way, L₁/L₂ is taken as the weight b of L₁, so that the activity reflects both L₁ and the relationship between L₁ and L₂: the larger L₁/L₂, the larger A; the smaller L₁/L₂, the smaller the contribution of L₁ and the smaller A. Likewise, H₁/H₂ is taken as the weight c of H₁, so that the activity reflects both H₁ and the relationship between H₁ and H₂: the larger H₁/H₂, the larger A; the smaller H₁/H₂, the smaller the contribution of H₁ and the smaller A.
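As a worked illustration of the activity formula A = a×Q₁ + b×L₁ + c×H₁ with a = Q₁/Q₂, b = L₁/L₂ and c = H₁/H₂, the following Python sketch computes A from the six counts. The function name and the sample numbers are hypothetical; the text does not say what happens when Q₂, L₂ or H₂ is zero, so rejecting that case is an assumption of this sketch.

```python
def activity(q1, q2, l1, l2, h1, h2):
    """Activity A = a*Q1 + b*L1 + c*H1 with ratio weights a, b, c."""
    if q2 == 0 or l2 == 0 or h2 == 0:
        # Assumption: the source does not define the weights for zero
        # denominators, so this sketch refuses to guess.
        raise ValueError("Q2, L2 and H2 must be non-zero")
    a = q1 / q2  # weight of browses inside the standard frequency interval
    b = l1 / l2  # weight of downloads associated with Q1
    c = h1 / h2  # weight of fragment copies associated with Q1
    return a * q1 + b * l1 + c * h1


# 80 regular vs 20 irregular browses, 30 vs 10 downloads, 10 vs 5 copies.
score = activity(q1=80, q2=20, l1=30, l2=10, h1=10, h2=5)
print(score)  # 4*80 + 3*30 + 2*10 = 430.0
```

Since each weight is itself proportional to the count it multiplies, every term grows quadratically in Q₁, L₁ or H₁, so a source data center dominated by regular browsing behaviour is rewarded strongly.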

Further, step S300 includes a method of predicting trustworthiness of different source data centers within a set of distinct mapping pairs:

step S311: calculating the fragment data information coverage rate of the different source data centers in each mapping data pair, the coverage rate being P₁ = F/G; wherein F represents the number of times a given source data center in the difference mapping pair set appears in the mapping data pairs of a given combined data information fragment, and G represents the total number of difference mapping pairs in the difference mapping pair set;

step S312: capturing the page information of the different source data centers in each mapping data pair; capturing the frequency P₂ with which advertisement pages appear in the page information; capturing the frequency P₃ with which nonstandard technical terms appear in the page information; a technical term is nonstandard when it matches neither the technical terms specific to remote sensing data, generated from a large database, nor their common alternative descriptions;

step S313: predicting the credibility value K of the different source data centers according to a formula in P₁, P₂ and P₃;

in the credibility prediction analysis, the fragment data information coverage rate P₁ of the different source data centers in the mapping data pairs is taken as the basis, and a source data center with a high coverage rate P₁ is by default treated as having a lower credibility risk, because the rate at which its data are referenced reflects the activity of the source data center and how specialized its data coverage is; this reduces the credibility analysis effort spent on non-key source data centers during data retrieval; the frequency P₂ of advertisement pages and the frequency P₃ of nonstandard technical terms are factors that lower the credibility of a source data center: the larger the product of P₂ and P₃, the smaller the credibility value K; the smaller the product of P₂ and P₃, the larger the credibility value K.
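The coverage rate P₁ = F/G of step S311 can be sketched as below. The sketch reads F as the number of mapping data pairs of one combined data information fragment in which a given source data center appears, and G as the size of the difference mapping pair set; that reading, the tuple layout and the name `coverage_rates` are assumptions. The credibility value K is deliberately not computed, because its formula is not given in this text.

```python
from collections import Counter


def coverage_rates(fragment_pairs, difference_set):
    """Return {source_center: P1} with P1 = F / G.

    fragment_pairs: (source_center, fragment_data) pairs of one combined fragment.
    difference_set: (source_center, fragment_data) pairs of the difference set.
    """
    g = len(difference_set)  # G: total number of difference mapping pairs
    if g == 0:
        return {}
    centers = {center for center, _ in difference_set}
    # F: occurrences of each difference-set center in this fragment's pairs.
    f = Counter(center for center, _ in fragment_pairs if center in centers)
    return {center: f[center] / g for center in sorted(centers)}


difference_set = [("A1", "B1"), ("A2", "B1"), ("A1", "B3"), ("A3", "B3")]
fragment_pairs = [("A1", "B1"), ("A1", "B3"), ("A2", "B2"), ("A4", "B5")]
p1 = coverage_rates(fragment_pairs, difference_set)
print(p1)  # {'A1': 0.5, 'A2': 0.25, 'A3': 0.0}
```

A4 does not belong to the difference mapping pair set, so it receives no coverage rate at all, while A3 belongs to the set but does not appear in this fragment and therefore gets 0.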

Further, capturing the browsing pattern of each browsing record in the historical browsing information means capturing the valid browsing records in the historical browsing information; a valid browsing record is determined as follows:

the historical browsing information contains records of the user sliding through pages; when the average sliding duration is greater than a preset average duration threshold, the record is marked as a preliminary valid record;

while capturing a sliding record, the interval dwell time on each page during sliding is also captured; when the interval dwell time is longer than the preset interval dwell time, the record is marked as a valid record; when the interval dwell time is shorter than the preset interval dwell time, the total number of pages x and the total number of words y of the sliding record are obtained; when y:x is smaller than a preset ratio threshold, the record is marked as a valid record; when y:x is greater than the preset ratio threshold, that part of the sliding record is deleted.
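The validity check for browsing records can be sketched as a small decision function. The sketch assumes that a short dwell time is still accepted when the words-per-page ratio y:x is below the threshold (little text to read) and rejected otherwise; the record layout, the threshold values and the handling of boundary cases are likewise assumptions.

```python
from dataclasses import dataclass


@dataclass
class SlideRecord:
    avg_slide_seconds: float  # average duration of page sliding
    dwell_seconds: float      # interval dwell time on a page
    total_pages: int          # x
    total_words: int          # y


def is_valid_record(rec, avg_threshold, dwell_threshold, ratio_threshold):
    """Decide whether a sliding record counts as a valid browsing record."""
    if rec.avg_slide_seconds <= avg_threshold:
        return False  # not even a preliminary valid record
    if rec.dwell_seconds > dwell_threshold:
        return True   # the user stayed long enough on the page
    if rec.total_pages == 0:
        return False  # assumption: no pages means nothing was browsed
    # Short dwell: accept only when there were few words per page (y:x).
    return rec.total_words / rec.total_pages < ratio_threshold


thresholds = dict(avg_threshold=5.0, dwell_threshold=3.0, ratio_threshold=500.0)
long_dwell = SlideRecord(12.0, 8.0, 5, 4000)
short_sparse = SlideRecord(12.0, 1.0, 5, 1000)  # 200 words per page
short_dense = SlideRecord(12.0, 1.0, 5, 4000)   # 800 words per page
print([is_valid_record(r, **thresholds) for r in (long_dwell, short_sparse, short_dense)])
```

Only records that pass this check would feed the browse counts Q₁ and Q₂ used in the activity calculation.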

Further, the step S400 of adjusting the priority ranking of each combined data information fragment again based on the liveness calculation result and the reliability prediction result includes:

step S401: acquiring the activity calculation result and the credibility prediction result of each source data center in the difference mapping pair set; sorting the source data centers in descending order of credibility value; when, during sorting, credibility values are equal or differ by less than a deviation threshold, ordering those source data centers by their activity calculation results;

step S402: traversing all combined data information fragments that satisfy the retrieval request, and locking and labeling those fragments that contain source data centers belonging to the difference mapping pair set; when the ranking interval between two labeled combined data information fragments in the initial combined data information fragment sequence is smaller than an interval threshold, obtaining the rankings of the source data centers in each labeled fragment that belong to the difference mapping pair set and calculating their average ranking value; and readjusting the order of the two labeled fragments based on their average ranking values.
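The ordering rule of step S401 and the average ranking value of step S402 can be sketched as follows. The text does not specify how "within the deviation threshold" is resolved when several credibility values are close, so the sketch groups each run of centers whose credibility lies within the threshold of the best one in the run and orders that run by activity; this grouping, the function names and the sample scores are assumptions.

```python
def rank_centers(scores, deviation):
    """Rank source centers best-first.

    scores: {center: (credibility, activity)}.
    Centers whose credibility is within `deviation` of the first center of
    their group are ordered by activity instead (assumed tie-break rule).
    """
    by_credibility = sorted(scores, key=lambda c: scores[c][0], reverse=True)
    ranked, group = [], []
    for center in by_credibility:
        if group and scores[group[0]][0] - scores[center][0] > deviation:
            ranked += sorted(group, key=lambda c: scores[c][1], reverse=True)
            group = []
        group.append(center)
    ranked += sorted(group, key=lambda c: scores[c][1], reverse=True)
    return ranked


def average_rank(fragment_centers, ranked):
    """Average 1-based rank of a fragment's centers that appear in `ranked`."""
    positions = [ranked.index(c) + 1 for c in fragment_centers if c in ranked]
    if not positions:
        raise ValueError("fragment has no source center in the ranking")
    return sum(positions) / len(positions)


scores = {"A1": (0.90, 10), "A2": (0.88, 50), "A3": (0.60, 99)}
ranked = rank_centers(scores, deviation=0.05)
print(ranked)                                # ['A2', 'A1', 'A3']
print(average_rank(["A1", "A3"], ranked))    # 2.5
```

A1 and A2 differ in credibility by less than the threshold, so the more active A2 moves ahead; a labeled fragment with the smaller average ranking value would then plausibly be placed first when two nearby fragments are reordered.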

To better implement the method, a rapid search system is also provided, comprising: a retrieval request acquisition and analysis module, a data crawling module, a mapping data pair generation module, a combined data information fragment sorting primary adjustment module, a difference data pair capture and analysis module, a transit data center generation module and a combined data information fragment sorting readjustment module;

the retrieval request acquisition and analysis module is used for acquiring a retrieval request input by a user and parsing it to generate a combined retrieval tag corresponding to the retrieval request;

the data crawling module is used for receiving the combined retrieval tag data from the retrieval request acquisition and analysis module; the rapid search system generates a corresponding retrieval instruction based on the main tag in the combined retrieval tag and crawls the data centers of the remote sensing data websites on a large scale to obtain a plurality of combined data information fragments that satisfy the retrieval request;

the mapping data pair generation module is used for receiving the combined data information fragments from the data crawling module and decomposing the fragments that satisfy the retrieval request into a plurality of mapping data pairs between source data centers and fragment data;

the combined data information fragment sorting module is used for receiving the mapping data pairs from the mapping data pair generation module, and for sorting the combined data information fragments by priority and making a primary adjustment based on the mapping data pairs;

the transit data center generation module is used for receiving the data from the combined data information fragment sorting module and storing the data together with the sorting information;

the combined data information fragment sorting primary adjustment module is used for receiving the data from the transit data center generation module and sorting all combined data information fragments in ascending order of their number of mapping data pairs to obtain an initial combined data information fragment sequence, and for performing a primary adjustment on the initial sequence based on the secondary tag and the auxiliary tags in the combined retrieval tag;

the difference data pair capture and analysis module is used for capturing the difference data pairs among all mapping data pairs in the combined data information fragments that satisfy the retrieval request and gathering them into a difference mapping pair set, and for calculating the activity and predicting the credibility of the different source data centers in the difference mapping pair set;

and the combined data information fragment sorting readjustment module is used for receiving the calculation data from the difference data pair capture and analysis module and readjusting the order of the combined data information fragments based on that data.

Further, the mapping data pair generation module includes: a source tracing unit and a data disassembling unit;

the source tracing unit is used for carrying out data center source tracing on each combined data information fragment meeting the retrieval request and collecting to obtain a source tracing set corresponding to each combined data information fragment;

and the data disassembling unit is used for receiving the data in the tracing unit and disassembling the fragment data of each combined data information fragment based on the tracing set to obtain a data fragment set.

Further, the distinguishing data pair capturing and analyzing module comprises: the activity degree calculating unit and the reliability degree predicting and calculating unit;

the activity calculation unit is used for acquiring the historical browsing information, historical data download information and historical data fragment copy information of each source data center in the difference data pairs, and for calculating the activity of each source data center based on that information;

the credibility prediction calculation unit is used for calculating the fragment data information coverage rate of the different source data centers in each mapping data pair and for capturing and analyzing their page information, and for predicting the credibility of the different source data centers based on the coverage rate and the analyzed page information.

Compared with the prior art, the invention has the following beneficial effects: it addresses the limited data resources of any single data center; by establishing a transit data center it addresses the information silos between data centers, so that every search request can be served from a transit data center with a high concentration of resources, where the credibility of the data is also assessed; it removes the need to search each data center one by one before a project starts and improves search efficiency; it provides a convenient indexing and search service that lets a user quickly learn whether satellite remote sensing data meeting the business requirements exist and judge the source and credibility of the data obtained; and it improves the scientific soundness and rigor of data acquisition.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:

FIG. 1 is a schematic flow chart of a method for rapidly searching remote sensing big data according to the invention;

FIG. 2 is a schematic structural diagram of a rapid search system for remote sensing big data.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Referring to fig. 1-2, the present invention provides the following technical solutions:

a quick search method for remote sensing big data comprises the following steps:

wherein, step S100 includes:

wherein, step S200 includes:

step S300: traversing all mapping data pairs in the combined data information fragment sequence, capturing the difference data pairs which have the same fragment data and are mapped with different source data centers, and converging the captured difference data pairs into a difference mapping pair set; calculating and predicting the liveness and credibility of different source data centers in the difference mapping pair set;

wherein step S300 includes a method of activity calculation for different source data centers within the set of distinct mapping pairs:

step S301: acquiring the historical browsing information, historical data download information and historical data fragment copy information of each source data center in the difference data pairs; capturing the browsing pattern of each browsing record in the historical browsing information, calculating the browsing frequency, and setting a standard browsing frequency fluctuation interval; accumulating the number of browses Q₁ that fall within the standard browsing frequency fluctuation interval and the number of browses Q₂ that fall outside it;

The browsing rule capture of each browsing record in the historical browsing information refers to capturing of an effective browsing record in the historical browsing information; the determination of valid browsing records includes:

while capturing a sliding record, the interval dwell time on each page during sliding is also captured; when the interval dwell time is longer than the preset interval dwell time, the record is marked as a valid record; when the interval dwell time is shorter than the preset interval dwell time, the total number of pages x and the total number of words y of the sliding record are obtained; when y:x is smaller than a preset ratio threshold, the record is marked as a valid record; when y:x is greater than the preset ratio threshold, that part of the sliding record is deleted;

A＝a×Q₁+b×L₁+c×H₁

wherein a = Q₁/Q₂; b = L₁/L₂; c = H₁/H₂;

Step S300 includes a method for predicting credibility of different source data centers in a set of distinct mapping pairs:

step S311: calculating the fragment data information coverage rate of the different source data centers in each mapping data pair, the coverage rate being P₁ = F/G; wherein F represents the number of times a given source data center in the difference mapping pair set appears in the mapping data pairs of a given combined data information fragment, and G represents the total number of difference mapping pairs in the difference mapping pair set;

step S313: predicting the credibility value K of the different source data centers according to a formula in P₁, P₂ and P₃;

step S400: the priority sequence of each combined data information fragment is adjusted again based on the activity calculation result and the reliability prediction result; the user retrieves and acquires data in the adjusted transfer data center; the user can generate user data feedback after the data is used; the rapid searching system stores the priority sequence of each combined data information fragment or adjusts the priority sequence of each combined data information fragment based on the user data feedback;

in step S400, readjusting the priority ranking of each combined data information fragment based on the liveness calculation result and the reliability prediction result includes:

step S402: traversing all combined data information fragments that satisfy the retrieval request, and locking and labeling those fragments that contain source data centers belonging to the difference mapping pair set; when the ranking interval between two labeled combined data information fragments in the initial combined data information fragment sequence is smaller than an interval threshold, obtaining the rankings of the source data centers in each labeled fragment that belong to the difference mapping pair set and calculating their average ranking value; and readjusting the order of the two labeled fragments based on their average ranking values;

wherein the mapping data pair generation module comprises: a source tracing unit and a data disassembling unit;

the data disassembling unit is used for receiving the data in the tracing unit and disassembling the fragment data of each combined data information fragment based on the tracing set to obtain a data fragment set;

wherein, the distinguishing data pair capturing and analyzing module comprises: the activity degree calculating unit and the reliability degree predicting and calculating unit;

the credibility prediction calculation unit is used for calculating the fragment data information coverage rate of the different source data centers in each mapping data pair and for capturing and analyzing their page information, and for predicting the credibility of the different source data centers based on the coverage rate and the analyzed page information.

It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.

Finally, it should be noted that: although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that changes may be made in the embodiments and/or equivalents thereof without departing from the spirit and scope of the invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A quick search method for remote sensing big data is characterized by comprising the following steps:

step S100: acquiring a retrieval request input by a user, analyzing the retrieval request and generating a combined retrieval tag corresponding to the retrieval request; the combined retrieval tag takes the form {main tag; secondary tag; auxiliary tag}; the rapid search system generates a corresponding retrieval instruction based on the main tag in the combined retrieval tag; based on the retrieval instruction, the rapid search system performs data crawling across the data centers of the remote sensing data websites over a wide range to obtain a plurality of combined data information fragments meeting the retrieval request;

step S200: decomposing the combined data information fragments meeting the retrieval request into a plurality of mapping data pairs between the source data centers and the fragment data; sequencing the priority of each combined data information fragment to obtain an initial combined data information fragment sequence; and carrying out primary adjustment on the initial combined data information fragment sequence; sending the combined data information fragment sequence after primary adjustment to a transit data center;

step S400: readjusting the sequence of the combined data information fragments based on the liveness calculation result and the reliability prediction result; and the user retrieves and acquires the data in the readjusted transit data center.

2. The method for rapidly searching remote sensing big data according to claim 1, characterized in that: the step S100 includes:

step S101: correspondingly decomposing the retrieval request input by the user into retrieval condition parameters of each part, wherein the retrieval condition parameters of each part comprise a region, a time range, cloud cover, resolution, a technical source and a field to be applied; the technical source refers to different remote sensors for obtaining various remote sensing data;

step S102: taking the two retrieval condition parameters of technical source and field to be applied as the auxiliary tags of the retrieval request; taking the retrieval condition parameters of region, time range and cloud cover as the main tags of the retrieval request; taking the resolution as the secondary tag of the retrieval request; combining the main tags, the secondary tag and the auxiliary tags into a combined retrieval tag of the form {main tag; secondary tag; auxiliary tag};

step S104: performing data crawling in a data center of each remote sensing data website in a large range based on the regional information in the main label; using the crawled data as an initial data range; and further screening data in the initial data range based on the time range and cloud cover information in the main label.
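
The tag construction of steps S101 to S104 can be illustrated with a minimal Python sketch. The field names (`region`, `time_range`, `cloud_cover`, `resolution`, `technical_source`, `application_field`), the record layout, and the reading of cloud cover as an upper bound are all assumptions made for illustration; the claim does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CombinedRetrievalTag:
    main: dict       # region, time range, cloud cover
    secondary: dict  # resolution
    auxiliary: dict  # technical source, field to be applied


def build_combined_tag(request: dict) -> CombinedRetrievalTag:
    """Split a parsed retrieval request into {main; secondary; auxiliary}."""
    main = {k: request[k] for k in ("region", "time_range", "cloud_cover")}
    secondary = {"resolution": request["resolution"]}
    auxiliary = {k: request[k] for k in ("technical_source", "application_field")}
    return CombinedRetrievalTag(main, secondary, auxiliary)


def screen(records: list, tag: CombinedRetrievalTag) -> list:
    """Region first (initial data range), then time range and cloud cover."""
    start, end = tag.main["time_range"]  # ISO date strings compare correctly
    initial = [r for r in records if r["region"] == tag.main["region"]]
    return [
        r for r in initial
        if start <= r["date"] <= end
        # Assumption: the cloud cover parameter is a maximum acceptable value.
        and r["cloud_cover"] <= tag.main["cloud_cover"]
    ]
```

Only the main tag drives this screening; the secondary and auxiliary tags are carried along for the later primary adjustment of the sequence.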

3. The method for rapidly searching remote sensing big data according to claim 1, characterized in that: the step S200 includes:

step S201: respectively performing data center tracing on the plurality of combined data information fragments meeting the retrieval request to obtain a tracing set {A₁, A₂, …, A_i, …, A_n} corresponding to each combined data information fragment; wherein A_i represents the i-th source data center and n represents the total number of source data centers; disassembling the fragment data of each combined data information fragment based on the corresponding tracing set to obtain a data fragment set {B₁, B₂, …, B_k, …, B_m}; wherein B_k represents the k-th fragment data and m represents the total number of fragment data; and n = m;

step S202: respectively establishing a one-to-one mapping between the tracing set and the data fragment set of each combined data information fragment meeting the retrieval request to obtain a plurality of mapping data pairs {A_i, B_k}, where i = k; counting the number of mapping data pairs in every combined data information fragment meeting the retrieval request; sorting all combined data information fragments in ascending order of their respective numbers of mapping data pairs to obtain an initial combined data information fragment sequence;

step S203: and extracting combined retrieval tag information of each combined data information fragment in the initial combined data information fragment sequence, and performing primary adjustment on the initial combined data information fragment sequence based on secondary tags and auxiliary tags in the combined retrieval tags of each combined data information fragment.
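
As a rough illustration of steps S201 and S202, the following Python sketch pairs each source data center with its fragment data and orders the combined data information fragments by their number of mapping data pairs. The function names and the dictionary layout are hypothetical; the primary adjustment of step S203 is not sketched because the claim does not state its rule.

```python
def mapping_pairs(sources: list, fragments: list) -> list:
    """Build the one-to-one pairs {A_i, B_k} with i = k."""
    if len(sources) != len(fragments):
        raise ValueError("tracing set and data fragment set must have n = m")
    return list(zip(sources, fragments))


def initial_sequence(segments: dict) -> list:
    """Order fragment names by ascending number of mapping data pairs.

    `segments` maps a fragment name to (tracing set, data fragment set).
    """
    return sorted(
        segments,
        key=lambda name: len(mapping_pairs(*segments[name])),
    )
```

Because Python's `sorted` is stable, fragments with the same number of mapping data pairs keep their original relative order.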

4. The method for rapidly searching remote sensing big data according to claim 1, characterized in that: the step S300 includes a method of calculating the activity of the different source data centers within the differential mapping pair set:

step S301: respectively acquiring historical browsing information, historical data download information and historical data fragment copy information of each source data center in the differential data pairs; capturing the browsing rule of each browsing record in the historical browsing information, calculating the browsing frequency, and setting a standard browsing frequency fluctuation interval; accumulating the number of browses Q₁ that fall within the standard browsing frequency fluctuation interval and the number of browses Q₂ that fall outside it;

step S302: establishing information associations between the historical browsing information and, respectively, the historical data download information and the historical data fragment copy information; accumulating the number of historical data downloads L₁ associated with Q₁ and the number L₂ associated with Q₂; accumulating the number of historical data fragment copies H₁ associated with Q₁ and the number H₂ associated with Q₂; calculating the activity A of each source data center:

A = a × Q₁ + b × L₁ + c × H₁

wherein a = Q₂/Q₁; b = L₂/L₁; c = H₂/H₁.
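
The activity formula of claim 4 can be evaluated directly. The sketch below is a literal transcription under the assumption that Q₁, L₁ and H₁ are non-zero; the function name and the error handling are illustrative and not part of the claim.

```python
def activity(q1: float, q2: float, l1: float, l2: float,
             h1: float, h2: float) -> float:
    """Compute A = a*Q1 + b*L1 + c*H1 with a = Q2/Q1, b = L2/L1, c = H2/H1."""
    if q1 == 0 or l1 == 0 or h1 == 0:
        # Assumption: the claim does not say how zero counts are handled.
        raise ValueError("Q1, L1 and H1 must be non-zero")
    a = q2 / q1
    b = l2 / l1
    c = h2 / h1
    return a * q1 + b * l1 + c * h1
```

For example, `activity(10, 5, 4, 2, 2, 1)` gives 0.5 × 10 + 0.5 × 4 + 0.5 × 2 = 8.0. Taken literally, each product a × Q₁, b × L₁ and c × H₁ cancels to Q₂, L₂ and H₂ respectively, so the formula as written reduces to Q₂ + L₂ + H₂ whenever the denominators are non-zero.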

5. The method for rapidly searching remote sensing big data according to claim 1, characterized in that: the step S300 includes a method of predicting the credibility of the different source data centers within the differential mapping pair set:

step S311: for each mapping data pair, calculating the fragment data information coverage rate P₁ of the different source data centers, where P₁ = F/G; F represents the number of times a given source data center in the differential mapping pair set occurs in the mapping data pairs of a given combined data information fragment; G represents the total number of differential mapping pairs in the differential mapping pair set;

step S312: capturing the page information of the different source data centers in each mapping data pair; capturing the frequency P₂ with which advertisement pages occur in the page information; capturing the frequency P₃ with which non-standard technical terms occur in the page information; a technical term is non-standard when it matches neither the specialized remote sensing technical terms generated from a large database nor their common alternative descriptions;

step S313: predicting the credibility value of each of the different source data centers according to a formula [formula not reproduced in this text].
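
The coverage rate P₁ = F/G of step S311 can be sketched as follows. Mapping data pairs are modeled as hypothetical `(source, fragment)` tuples, and the function name is illustrative. The credibility formula of step S313 is not reproduced in the text, so it is deliberately left out of this sketch.

```python
from collections import Counter


def coverage_rates(segment_pairs: list, differential_pairs: list) -> dict:
    """Return P1 = F/G for every source data center in the differential set.

    F: occurrences of the source in the mapping data pairs of one
       combined data information fragment (`segment_pairs`).
    G: total number of pairs in the differential mapping pair set.
    """
    g = len(differential_pairs)
    if g == 0:
        return {}  # assumption: no differential pairs means nothing to rate
    diff_sources = {source for source, _ in differential_pairs}
    counts = Counter(
        source for source, _ in segment_pairs if source in diff_sources
    )
    return {source: counts[source] / g for source in diff_sources}
```

A source data center that belongs to the differential mapping pair set but never occurs in the given fragment receives a coverage rate of 0.0.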

6. The method for rapidly searching remote sensing big data according to claim 4, characterized in that: the step of capturing the browsing rule of each browsing record in the historical browsing information refers to capturing an effective browsing record in the historical browsing information; the judging of the effective browsing record comprises:

capturing, while capturing the sliding records, the interval dwell time of each page during sliding; when the interval dwell time is longer than a preset interval dwell time, recording the sliding record as an effective record; when the interval dwell time is shorter than the preset interval dwell time, acquiring the total page number x and the total word number y of the sliding record; when the ratio y:x is smaller than a preset ratio threshold, recording the sliding record as an effective record; when the ratio y:x is not smaller than the preset ratio threshold, deleting that part of the sliding record.
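
The effective-record test of claim 6 can be sketched as a small predicate. The sketch assumes that a record failing the ratio condition is discarded, that a dwell time exactly equal to the preset value is treated like a shorter one, and that a record with no pages is not effective; none of these edge cases is settled by the claim, and all parameter names are illustrative.

```python
def is_effective_record(dwell_time: float, total_pages: int, total_words: int,
                        min_dwell_time: float, ratio_threshold: float) -> bool:
    """Decide whether a sliding (scrolling) record counts as effective."""
    if dwell_time > min_dwell_time:
        return True  # the page was read long enough
    if total_pages <= 0:
        return False  # assumption: no pages means nothing to validate
    # Short dwell time: accept only if there were few words per page (y:x).
    return total_words / total_pages < ratio_threshold
```

The intuition behind the second branch is that a short dwell time is still plausible reading behavior when each page holds little text.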

7. The method for rapidly searching remote sensing big data according to claim 1, characterized in that: the step S400 of adjusting the priority ranking of each combined data information fragment again based on the liveness calculation result and the reliability prediction result includes:

step S402: traversing all combined data information fragments meeting the retrieval request, and locking and labeling those combined data information fragments that contain source data centers belonging to the differential mapping pair set; when the ranking interval between two labeled combined data information fragments in the initial combined data information fragment sequence is smaller than an interval threshold, obtaining the rankings of the source data centers in the labeled combined data information fragments that belong to the differential mapping pair set, and calculating an average ranking value; and readjusting the order of the two labeled combined data information fragments based on the size of the average ranking value.
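
One possible reading of step S402 is sketched below. It assumes that `center_rank` holds a rank for every source data center in the differential mapping pair set (derived from the activity and credibility results), that a lower rank number is better, and that only neighboring labeled fragments are compared in a single pass. These are illustrative choices; the claim does not fix the direction of the adjustment or the pairing strategy.

```python
def average_rank(fragment_sources: list, center_rank: dict):
    """Mean rank of the fragment's sources that are in the differential set."""
    ranks = [center_rank[s] for s in fragment_sources if s in center_rank]
    return sum(ranks) / len(ranks) if ranks else None


def readjust(sequence: list, sources_of: dict, center_rank: dict,
             interval_threshold: int) -> list:
    """Swap close labeled fragments so the better average rank comes first."""
    result = list(sequence)
    avg = {f: average_rank(sources_of[f], center_rank) for f in result}
    # Labeled fragments are those containing a differential-set source.
    labeled = [i for i, f in enumerate(result) if avg[f] is not None]
    for p, q in zip(labeled, labeled[1:]):
        close = (q - p) < interval_threshold
        if close and avg[result[q]] < avg[result[p]]:
            result[p], result[q] = result[q], result[p]
    return result
```

Unlabeled fragments never move, and labeled fragments that are further apart than the interval threshold keep their initial order.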

8. A rapid search system applying the rapid search method for remote sensing big data of any one of claims 1 to 7, characterized in that: the rapid search system includes: a retrieval request acquisition and analysis module, a data crawling module, a mapping data pair generation module, a combined data information fragment sequencing primary adjustment module, a distinguishing data pair capturing and analyzing module, a transit data center generation module and a combined data information fragment sequencing readjustment module;

the data crawling module is used for receiving the combined retrieval tag data from the retrieval request acquisition and analysis module; the rapid search system generates a corresponding retrieval instruction based on the main tag in the combined retrieval tag and performs data crawling across the data centers of the remote sensing data websites over a wide range to obtain a plurality of combined data information fragments meeting the retrieval request;

the mapping data pair generation module is used for receiving the combined data information fragment data in the data crawling module and decomposing the combined data information fragments meeting the retrieval request into a plurality of mapping data pairs between the source data centers and the fragment data;

the transit data center generating module is used for receiving the data in the combined data information fragment sequencing module and storing the data and sequencing information;

the combined data information fragment sequencing primary adjustment module is used for receiving the data in the transit data center generation module and sorting all combined data information fragments in ascending order of their respective numbers of mapping data pairs to obtain an initial combined data information fragment sequence; and performing primary adjustment on the initial combined data information fragment sequence based on the secondary tag and the auxiliary tag in the combined retrieval tag;

the distinguishing data pair capturing and analyzing module is used for capturing the differential data pairs among all mapping data pairs in the combined data information fragments meeting the retrieval request and gathering the captured differential data pairs into a differential mapping pair set; and calculating and predicting the activity and credibility of the different source data centers in the differential mapping pair set;

and the combined data information fragment sequencing readjustment module is used for receiving the calculated data in the distinguishing data pair capture analysis module and readjusting the sequencing of each combined data information fragment based on the calculated data.

9. The system for rapidly searching remote sensing big data according to claim 8, wherein: the mapping data pair generation module comprises: a source tracing unit and a data disassembling unit;

and the data disassembling unit is used for receiving the data in the tracing unit and disassembling the data of each combined data information fragment based on the tracing set to obtain a data fragment set.

10. The system for rapidly searching remote sensing big data according to claim 8, wherein: the distinguishing data pair capturing and analyzing module comprises: an activity calculation unit and a reliability prediction calculation unit;

the reliability prediction calculation unit is used for capturing the page information of the different source data centers in each mapping data pair, and for performing the credibility prediction calculation for the different source data centers based on the fragment data information coverage rate and the captured and analyzed page information.