
CN115329903B - Spatial data integration method and system applied to digital twin city

Info

Publication number: CN115329903B
Application number: CN202211249273.0A
Authority: CN
Inventors: 周春煦; 张建平; 陈梨春; 姜显贵; 李丹; 谢云飞; 施峰; 吉顺莉; 赵苏政; 倪飞; 施小飞; 戴雨; 王兆能; 曹野; 郑玉能; 仲文正; 陆丁炜; 鲍志鹏; 成海峰; 闵钰强
Original assignee: Fujian Meifang Times Technology Co ltd
Current assignee: Fujian Meifang Times Technology Co ltd
Priority date: 2022-10-12
Filing date: 2022-10-12
Publication date: 2023-05-30
Anticipated expiration: 2042-10-12
Also published as: CN115329903A

Abstract

The invention discloses a spatial data integration method and system applied to a digital twin city, relating to the technical field of digital twin cities. The method comprises: receiving data from all data sources, uploading it to a server, and retrieving and analyzing the multi-source data to determine the quantity and format of the data in each library; identifying the topics of the data in each library and marking the classified data with the topic as a label; quantifying several values of each library to obtain a library standard value BZ, by which the library is quantitatively evaluated; associating the conversion difficulty between data formats, expressed as a format difficulty value GsN, with the standard value BZ to obtain a library evaluation value PG; sorting all libraries according to the library evaluation value PG; and acquiring data from the libraries sequentially in the sorted order. Because the data acquisition strategy is determined from a ranking that reflects both GsN and BZ, the factors considered during spatial data acquisition are comprehensive.

Description

Spatial data integration method and system applied to digital twin city

Technical Field

The invention relates to the technical field of digital twin cities, in particular to a spatial data integration method and a system applied to the digital twin cities.

Background

A digital twin makes full use of data such as physical models, sensors and operation histories, integrates multidisciplinary, multi-physical-quantity, multi-scale and multi-probability simulation processes, and completes a mapping in virtual space that reflects the full life cycle of the corresponding physical equipment.

The digital twin city is based on building information models and a three-dimensional urban geographic information system. Using Internet of Things technology, all elements of the physical city, such as people, objects, events, water, electricity and gas, are digitized, and a virtual city fully corresponding to the physical city is then constructed in cyberspace, so that the physical city in the physical dimension and the digital city in the information dimension coexist and the virtual and the real are fused.

Establishing a digital twin city requires acquiring a large amount of spatial data. This data is voluminous, multi-source and multi-format, so the number of libraries serving as data sources grows, the amount of data to be acquired is large, and acquisition is difficult. As a result, effective data is hard to obtain during spatial data integration, and the efficiency of integration is reduced.

Disclosure of Invention

In view of the shortcomings of the prior art, the invention provides a spatial data integration method and system applied to a digital twin city. Data from all data sources is received, and the multi-source data is retrieved and analyzed to determine the quantity and format of the data in each library; the classified data is marked with its topic as a label; a standard value is obtained for each library, and a library evaluation value is obtained by taking into account the conversion difficulty between data formats; all libraries are sorted according to the library evaluation value PG, and data is acquired from them in that order. Determining the data acquisition strategy from this ranking means that a comprehensive set of factors is considered during spatial data acquisition, which solves the problems described in the background.

In order to achieve the above purpose, the invention is realized by the following technical scheme: the spatial data integration method applied to the digital twin city comprises the following steps:

receiving data of all data sources, uploading the data sources to a server, retrieving and analyzing multi-source data, and determining the quantity and format of the data in a library; identifying the subject of the data in the library, and marking the classified data by taking the subject as a label;

quantifying the numerical values of a plurality of libraries to obtain standard values of the libraries, and quantitatively evaluating the libraries with the standard values; according to the conversion difficulty between the data formats, correlating with the standard value of the library to obtain the evaluation value of the library;

obtaining a library evaluation value PG, and sorting all libraries according to the library evaluation value PG; when data acquisition is carried out, the data are sequentially acquired from the library according to the sorting of the library.

Further, the format in the multi-source data is identified, and the total data amount in each library and the format information of the data in the library are determined; classifying the data in the library according to the format information of the data to form different data categories;

sorting the data types according to the data volume under each format; after the total amount of the data of each format in the library is obtained, determining the format with the largest occurrence number in each category, and converting the other formats into the format with the largest occurrence number, so that the formats in the library are unified;

training a data filter and using the trained filter to remove invalid data from the library, thereby reducing the noise produced by invalid or blank data in the library and its interference with normal data.

Further, selecting a plurality of data items from one of the libraries and using them as a topic extraction training set and a topic model test set, respectively; training an LDA topic model with the training set to generate a trained LDA topic model, and testing it with the test set to confirm that the trained LDA topic model is free of errors;

performing topic extraction on a plurality of data in a library by using the LDA model obtained through training to obtain a plurality of data topics;

judging the similarity among different topics by using a similarity model, and classifying a plurality of acquired topics according to the similarity;

generating a theme tag according to the theme name, adding the theme tag into a corresponding data classification category, and characterizing the category by the theme tag.

Further, the number of the topic labels in all libraries is obtained, and the activity of the topic labels is calculated; acquiring the number of the topic labels in all libraries, and calculating the contribution degree of the topic labels; calculating the similarity between the topic labels in the library and the topics of the data acquisition and analysis strategy, and obtaining similarity data;

acquiring contribution GxL, similarity Xs and total activity ZhY, carrying out normalization processing, and then, associating and summarizing to form a library standard value which is recorded as a library standard value BZ;

the calculation conforms to a formula that was rendered as an image in the original publication and is not reproduced in this text; it combines the normalized contribution GxL, similarity Xs and total activity ZhY using weights;

the specific values of the weights can be adjusted by the user according to practical experience, so as to correct the standard value BZ of the library.

Further, the liveness of the topic labels in the library is expressed in terms of the total number of occurrences, in the data category represented by the topic label, of data topics in the library, the total data volume in the library, and a correction coefficient (the symbols and the expression were rendered as images in the original publication and are not reproduced in this text);

the value of the correction coefficient is set by the user as required, which facilitates correction of the liveness of the library's topic labels;

the number of topic tags in each data category and the corresponding liveness are determined and summarized to form a total liveness, which is denoted as total liveness ZhY.

Further, under the topic label of the data category, all topic numbers are marked as LtS, and the percentage of the data under the topic label in the total data in the database is marked as Zb; the contribution degree of the theme label is GxL;

the contribution GxL is calculated by an expression that was rendered as an image in the original publication and is not reproduced in this text; the expression includes a correction coefficient;

the value of the correction coefficient is set by the user as required, to correct the contribution degree of the library's topic labels.

Further, acquiring a strategy for data acquisition and analysis, and extracting a strategy theme through a trained LDA theme model to acquire the strategy theme; obtaining a topic label from a library, judging the similarity between the topic label in the library and a strategy topic by using a similarity model, and quantifying the value of the similarity;

obtaining quantized similarity values of all the theme labels, and sorting to form sorting information; and obtaining the similarity of all the topic labels in the library, summarizing, obtaining the maximum value of the similarity of the topic labels of the library, and determining the maximum value as the similarity Xs.

Further, the data format in each library and the corresponding data quantity are obtained, and format quantity data are determined; grading the conversion difficulty according to the conversion difficulty among the formats, acquiring the grading average value during conversion among different formats, and marking the formats by the grading average value;

taking the product of the score average value and the data quantity in the library as a format difficulty value GsN of the library, and sorting a plurality of libraries according to the format difficulty value GsN; and converting the data formats of the other libraries into the format of the library with the lowest format difficulty value GsN, and unifying the formats of the libraries.

Further, the format difficulty value GsN and the standard value BZ of the library are obtained, and the format difficulty value GsN and the standard value BZ of the library are associated to determine the library evaluation value PG:

the library evaluation value PG is calculated by a formula that was rendered as an image in the original publication and is not reproduced in this text; the formula combines GsN and BZ using weight coefficients and a correction coefficient that corrects the library evaluation value PG, and the values of these coefficients are set by the user.

A spatial data integration system applied to a digital twin city, comprising:

the data analysis module is used for receiving the data of all the data sources, searching and analyzing the multi-source data and determining the quantity and the format of the data in the library;

the theme marking module is used for identifying the theme of the data in the library and marking the classified data by taking the theme as a label;

the data quantization module quantizes the numerical values of a plurality of libraries to obtain the standard value of each library, and the libraries are quantized and evaluated by the standard value;

the data association module is used for associating the data format conversion difficulty with the standard value of the library to obtain the evaluation value of the library;

the sorting module is used for obtaining a library evaluation value PG and sorting all libraries according to the library evaluation value PG; when data acquisition is carried out, the data are sequentially acquired from the library according to the sorting of the library.

The invention provides a spatial data integration method and a spatial data integration system applied to a digital twin city. The beneficial effects are as follows:

the data format can be converted through the distribution condition of the data format of a single library, the workload of format conversion can be reduced, and the strategy of format conversion is formulated through judging the difficulty of format conversion, so that the efficiency and the speed of format conversion can be improved when the data of each library is converted, and the difficulty of multi-format multi-source data identification is reduced when the data is acquired and identified.

By identifying and sequencing the topics of the library, data with high similarity can be preferentially acquired based on the similarity when multi-source data are acquired, so that the requirement of data acquisition can be rapidly met when the multi-source data are acquired, and the efficiency of data acquisition is improved.

The data acquisition strategy is determined by acquiring the format difficulty value GsN and the standard value BZ of the library and acquiring the library evaluation value PG, and sorting a plurality of libraries according to the library evaluation value PG, so that the comprehensive degree is high, and the data acquisition efficiency is improved when the spatial data acquisition is performed.

Drawings

FIG. 1 is a schematic diagram of the structure of the evaluation values of the library of the present invention;

FIG. 2 is a schematic flow chart of the spatial data integration method of the present invention;

FIG. 3 is a schematic diagram of a spatial data integration system according to the present invention.

In the figure:

10. a data analysis module; 20. a theme marking module; 30. a data quantization module; 40. a data association module; 50. a sorting module.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Example 1

Referring to fig. 1-3, the present invention provides a spatial data integration method applied to a digital twin city, comprising the following steps:

step 1, receiving data of all data sources, uploading the data sources to a server, searching and analyzing multi-source data, and determining the quantity and format of the data in a library; the method specifically comprises the following steps:

step 101, identifying formats in multi-source data, and determining total data amount in each library and format information of the data in the library;

step 102, classifying the data in the library according to the format information of the data to form different data categories;

step 103, sorting the data types according to the data volume under each format;

step 104, after the total amount of data of each format in the library is obtained, determining the format that occurs most often in each category, and converting the other formats into that format, so as to unify the formats in the library;

step 105, training a data filter and using the trained filter to remove invalid data from the library, thereby reducing the noise produced by invalid or blank data in the library and its interference with normal data.

In use, steps 101 to 104 determine the most frequently occurring format in the library and convert the other data in the library to that format. This fixes the approach to format conversion and reduces its workload; once all data in the library share one format, they are easier to identify and analyze.
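The format unification of steps 101 to 104 can be illustrated with a minimal sketch. It assumes each record in a library is a hypothetical `(record_id, format)` pair and works per library rather than per data category; the names `dominant_format` and `conversion_plan` are illustrative and do not come from the patent. The actual conversion routines are not shown.

```python
from collections import Counter

def dominant_format(records):
    """Return the format that occurs most often among (record_id, format) pairs."""
    counts = Counter(fmt for _, fmt in records)
    # most_common(1) yields [(format, count)]; ties go to the format seen first.
    return counts.most_common(1)[0][0]

def conversion_plan(records):
    """List (record_id, source_format, target_format) for records that need converting."""
    target = dominant_format(records)
    return [(rid, fmt, target) for rid, fmt in records if fmt != target]

# Hypothetical library contents, for illustration only.
library = [("r1", "shp"), ("r2", "geojson"), ("r3", "shp"), ("r4", "kml")]
plan = conversion_plan(library)
```

Only the records that are not already in the dominant format appear in the plan, which is where the reduction in conversion workload comes from.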

Step 2, identifying the subject of the data in the library, and marking the classified data by taking the subject as a label; the method specifically comprises the following steps:

step 201, selecting a plurality of data items from one of the libraries and using them as a topic extraction training set and a topic model test set, respectively;

step 202, training an LDA topic model with the training set to generate a trained LDA topic model, and testing it with the test set to confirm that the trained LDA topic model is free of errors;

step 203, performing topic extraction on a plurality of data in a library by using the LDA model obtained through training to obtain a plurality of data topics;

step 204, judging the similarity among different topics by using a similarity model, and classifying the acquired topics according to the similarity;

step 205, generating a theme label according to the theme name, adding the theme label into the corresponding data classification category, and characterizing the category by the theme label.

In step 2, topics are extracted from the data in a single library after its multiple formats have been unified; the topics are classified by their mutual similarity to obtain several topic categories, which completes the classification of the data in the library; finally, the topics are used as category labels to mark the data categories.
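The grouping of topics by similarity in step 204 can be sketched as follows. The sketch assumes the trained LDA model has already produced each topic as a hypothetical word-to-weight mapping; cosine similarity, the greedy grouping and the threshold of 0.5 are stand-ins chosen for illustration, since the patent does not specify the similarity model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse word-weight dictionaries."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def group_topics(topics, threshold=0.5):
    """Greedily put each topic into the first group whose first member is similar enough."""
    groups = []
    for name, vector in topics.items():
        for group in groups:
            if cosine(vector, topics[group[0]]) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])  # no similar group found: start a new one
    return groups

# Hypothetical topics, for illustration only.
topics = {
    "roads": {"road": 0.6, "traffic": 0.4},
    "transit": {"traffic": 0.5, "road": 0.3, "bus": 0.2},
    "pipes": {"water": 0.7, "pipe": 0.3},
}
groups = group_topics(topics)
```

Each resulting group would then receive one topic label, as in step 205.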

Step 3, quantifying the numerical values of a plurality of libraries to obtain the standard value of each library, and quantitatively evaluating the libraries by using the standard value; the method specifically comprises the following steps:

step T1, obtaining the number of the topic labels in all libraries, and calculating the activity of the topic labels; the method comprises the following steps:

the liveness of the theme tag is defined as: the frequency of occurrence of tags in each library in the data class represented by the subject tag;

the liveness of the subject tags in the library is given by an expression that was rendered as an image in the original publication and is not reproduced in this text;

determining the number of topic labels in each data category and the corresponding liveness, and summarizing to form total liveness, namely, total liveness ZhY;

in this step, the activity level of the topic label is quantified according to the frequency of occurrence of the topic label, so that the activity level of the topic label can be simply determined, and the number of topic labels in the library is quantified so as to facilitate further evaluation.
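Because the liveness expression is not reproduced in this text, the following is only a sketch under stated assumptions: liveness is taken to be the occurrence count of a label divided by the total data volume of the library, scaled by a user-set correction coefficient `k`, and the total liveness ZhY is taken to be the sum over all labels. Both choices are assumptions, not the patent's formula.

```python
def tag_liveness(occurrences, total_records, k=1.0):
    """Assumed liveness: corrected frequency of a topic label within a library."""
    if total_records <= 0:
        raise ValueError("total_records must be positive")
    return k * occurrences / total_records

def total_liveness(tag_occurrences, total_records, k=1.0):
    """Assumed total liveness ZhY: sum of the liveness of every topic label."""
    return sum(tag_liveness(n, total_records, k) for n in tag_occurrences.values())

# Hypothetical counts, for illustration only.
zhy = total_liveness({"roads": 30, "pipes": 10}, total_records=100)
```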

Step T2, obtaining the number of the topic labels in all libraries, and calculating the contribution degree of the topic labels; the method comprises the following steps:

the contribution degree of the theme label is defined as: the ratio of the total amount of data represented by the topics under the topic label that has the largest number of topics in the library to the total amount of data in the library, taken relative to the number of topics under that label; under the topic label of a data category, the total number of topics is denoted LtS, and the percentage of the total data in the library that falls under the topic label is denoted Zb; the contribution degree of the theme label is GxL;

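the contribution GxL is given by an expression that was rendered as an image in the original publication and is not reproduced in this text; the expression includes a correction coefficient for the contribution degree, whose value is set by the user as required, so that the contribution degree of the library's theme labels can conveniently be corrected.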

In this step, once the topic labels in the library have been determined, their contribution degrees are used to characterize the library. The contribution degree of a topic label indicates how important that label is within the library: the label with the highest contribution degree is the most important one in the library and characterizes the library best.

Step T3, calculating the similarity between the topic labels in the library and the topics of the data acquisition and analysis strategy, and obtaining similarity data; wherein, the step T3 includes the following:

step T301, acquiring a strategy for data acquisition and analysis, and extracting a strategy theme through a trained LDA theme model;

step T302, obtaining a theme label from a library, judging the similarity between the theme label in the library and a strategy theme by using a similarity model, and quantifying the value of the similarity;

obtaining quantized similarity values of all the theme labels, and sorting to form sorting information;

step T303, obtaining the similarity of all the topic labels in the library, summarizing, obtaining the maximum value of the similarity of the topic labels in the library, and determining the maximum value as the similarity Xs;

in this step, the highest similarity between any topic label in the library and the strategy topic is determined by judging the similarity between topics; since the order of data acquisition is determined by the topics, this highest similarity value can be used to characterize the corresponding library.
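Steps T301 to T303 can be sketched as follows. The patent does not name the similarity model, so Jaccard similarity over hypothetical keyword lists is used here purely as a stand-in; what follows the text is the selection of the maximum as Xs.

```python
def jaccard(a, b):
    """Jaccard similarity between two keyword collections (a stand-in similarity model)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def library_similarity(tag_keywords, strategy_keywords):
    """Return (Xs, ranking): the best similarity in the library and all tags sorted by similarity."""
    scores = {tag: jaccard(words, strategy_keywords) for tag, words in tag_keywords.items()}
    ranking = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    xs = ranking[0][1] if ranking else 0.0
    return xs, ranking

# Hypothetical topic labels and strategy topic, for illustration only.
tags = {"roads": ["road", "traffic", "lane"], "pipes": ["water", "pipe"]}
xs, ranking = library_similarity(tags, ["traffic", "road", "signal"])
```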

Step T4, taking the topic labels in the library as the basis, and carrying out quantitative evaluation on the library to form a standard value; the method specifically comprises the following steps:

acquiring contribution GxL, similarity Xs and total activity ZhY obtained in the steps T1 to T3, carrying out normalization processing, and then, associating and summarizing to form a library standard value which is recorded as a library standard value BZ;

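the calculation conforms to a formula that was rendered as an image in the original publication and is not reproduced in this text; it combines the normalized contribution GxL, similarity Xs and total activity ZhY using weights, whose values can be adjusted to correct the standard value BZ of the library.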

In the step, the contribution GxL, the similarity Xs and the total activity ZhY of the theme labels in the library are acquired, summarized and correlated to form a standard value BZ of the library, and the library can be quantized on the basis of the theme labels through the standard value BZ of the library so as to acquire the quantized standard value BZ of the library, and the library is evaluated; also based on this, it is possible to sort several libraries according to their standard values BZ when needed, and to collect multi-source, multi-format data according to this sort when needed.

Meanwhile, the sorting also takes into account the similarity to the collection strategy. Collecting data in the sorted order therefore yields the data best suited to the requirement quickly, reducing both the difficulty and the time of data collection. Because the libraries ranked first have the highest relevance to the requirement, the data collected from the first few libraries may already satisfy it.

During data acquisition, data can either be collected from several source libraries simultaneously or be acquired sequentially in the sorted order. Even when several libraries are accessed simultaneously, the similarity of the topic labels has already been determined for each library, so the speed and efficiency of acquisition are maintained.
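The standard value BZ of step T4 can be sketched under stated assumptions. The formula itself is not reproduced in this text, so the sketch assumes min-max normalization across libraries and a weighted sum whose weights add up to 1; the function names and the example weights are hypothetical.

```python
def min_max(values):
    """Scale values to [0, 1]; all-equal inputs map to 0.0 to avoid division by zero."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]

def standard_values(libs, weights=(0.4, 0.3, 0.3)):
    """Assumed BZ: weighted sum of normalized GxL, Xs and ZhY for each library."""
    w_gxl, w_xs, w_zhy = weights
    names = list(libs)
    gxl = min_max([libs[n]["GxL"] for n in names])
    xs = min_max([libs[n]["Xs"] for n in names])
    zhy = min_max([libs[n]["ZhY"] for n in names])
    return {n: w_gxl * g + w_xs * x + w_zhy * z
            for n, g, x, z in zip(names, gxl, xs, zhy)}

# Hypothetical per-library values, for illustration only.
libs = {
    "A": {"GxL": 0.8, "Xs": 0.9, "ZhY": 0.4},
    "B": {"GxL": 0.2, "Xs": 0.5, "ZhY": 0.6},
}
bz = standard_values(libs)
```

Changing the `weights` tuple plays the role of the user-adjustable weights mentioned in the text.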

Step 4, according to the conversion difficulty between the data formats, correlating with the standard value of the library to obtain the evaluation value of the library;

step 401, determining format data according to the acquired data formats in each library and the corresponding data quantity;

step 402, scoring the conversion difficulty according to the conversion difficulty among the formats, obtaining the average value of the scores during conversion among different formats, and marking the formats by the average value of the scores;

step 403, taking the product of the score average value and the data quantity in the library as a format difficulty value GsN of the library, and sorting a plurality of libraries according to the format difficulty value GsN;

step 404, converting the data formats of the other libraries into the format of the library with the lowest format difficulty value GsN, and unifying the formats of the libraries.

In use, step 4 judges the conversion difficulty between data formats and scores the formats. The strategy for converting the data formats of the several libraries can then be determined from the scores, which reduces the difficulty and improves the efficiency of unifying multiple data formats. Finally, all data formats in the libraries are converted into the most universal format, which facilitates data acquisition as well as later data identification.

The method solves the problem of difficult recognition of the multi-source multi-format data by carrying out unified processing on the formats of the data, and improves the efficiency of data recognition analysis.
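Steps 401 to 404 can be sketched as follows. The product of the average conversion score and the data quantity follows step 403; the per-library dictionaries, the score values and the assumption that each library has already been unified to a single format are hypothetical.

```python
from statistics import mean

def format_difficulty(scores, record_count):
    """GsN per step 403: average conversion-difficulty score times the data quantity."""
    return mean(scores) * record_count

# Hypothetical libraries: one unified format each, with conversion scores to other formats.
libs = {
    "A": {"format": "geojson", "scores": [1, 2, 3], "records": 100},
    "B": {"format": "dwg", "scores": [4, 5], "records": 50},
}
gsn = {name: format_difficulty(lib["scores"], lib["records"]) for name, lib in libs.items()}

# Sort libraries by ascending GsN; the library with the lowest value supplies the target format.
ranked = sorted(gsn, key=gsn.get)
target_format = libs[ranked[0]]["format"]
```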

Step 405, obtaining the format difficulty value GsN and the standard value BZ of the library, and associating the two to determine the library evaluation value PG. The library evaluation value PG is calculated by a formula that was rendered as an image in the original publication and is not reproduced in this text; the formula combines GsN and BZ using weight coefficients and a correction coefficient that corrects the library evaluation value PG, and the values of these coefficients are set by the user.

In use, step 405 associates the format difficulty value GsN with the standard value BZ of the library to evaluate and quantify the library. Because the resulting library evaluation value PG reflects both the topics and the formats of the data in the library, a data identification and collection strategy formulated on the basis of PG takes a comprehensive set of factors into account.

Step 5, obtaining a library evaluation value PG, and sequencing all libraries according to the library evaluation value PG; when data acquisition is carried out, the data are sequentially acquired from the library according to the sorting of the library.
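Step 5 can be sketched as a ranking followed by sequential collection. The sketch assumes the PG values have already been computed, that a higher PG means higher priority (the text does not state the direction, hence the `descending` parameter), and that collection may stop once enough records have been gathered; `fetch` is a hypothetical callable that returns the records of a library.

```python
def acquisition_order(pg, descending=True):
    """Order library names by their evaluation value PG."""
    return sorted(pg, key=pg.get, reverse=descending)

def collect(order, fetch, needed):
    """Acquire data library by library in the given order until `needed` records are gathered."""
    collected = []
    for name in order:
        collected.extend(fetch(name))
        if len(collected) >= needed:
            break  # assumed stopping rule: the requirement is already met
    return collected

# Hypothetical PG values and library contents, for illustration only.
pg = {"A": 0.7, "B": 0.9, "C": 0.2}
store = {"A": ["a1", "a2"], "B": ["b1"], "C": ["c1"]}
order = acquisition_order(pg)
data = collect(order, store.get, needed=2)
```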

Example 2

Referring to fig. 1-3, the present invention provides a spatial data integration system applied to a digital twin city, comprising:

the data analysis module 10 receives the data of all the data sources, retrieves and analyzes the multi-source data to determine the quantity and the format of the data in the library;

the theme marking module 20 is used for identifying the theme of the data in the library and marking the classified data by taking the theme as a label;

the data quantization module 30 quantizes the values of the plurality of libraries to obtain standard values of the libraries, and performs quantization evaluation on the libraries by using the standard values;

the data association module 40 is associated with the standard value of the library according to the conversion difficulty between the data formats, and acquires the evaluation value of the library;

the sorting module 50 acquires the library evaluation values PG and sorts all the libraries according to the library evaluation values PG; when data acquisition is carried out, the data are sequentially acquired from the library according to the sorting of the library.

Combining step 1 to step 5, in the present application:

The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, the above embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product comprises one or more computer instructions or computer programs. When the computer instructions or computer programs are loaded or executed on a computer, the processes or functions described in the embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that contains one or more sets of available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium. The semiconductor medium may be a solid state disk.

Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.

It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.

In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division of the units is merely a division by logical function, and other divisions are possible in an actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections via some interfaces, devices or units, and may be electrical, mechanical or of another form.

The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.

The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium and including several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.

The foregoing describes merely specific embodiments of the present application, but the scope of protection of the present application is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed in the present application is intended to be covered by the scope of protection of the present application. Therefore, the scope of protection of the present application shall be subject to the scope of protection of the claims.

Finally: the foregoing describes only preferred embodiments of the invention and is not intended to limit the invention; any modification, equivalent replacement, or improvement made within the spirit and principles of the invention is intended to be covered by it.

Claims

1. A spatial data integration method applied to a digital twin city, characterized by comprising the following steps:

receiving data of all data sources, uploading the data sources to a server, retrieving and analyzing multi-source data, and determining the quantity and format of the data in a library;

identifying the topic of the data in the library, and labeling the classified data with the topic;

obtaining the number of topic labels in all libraries and calculating the activity of the topic labels; obtaining the number of topic labels in all libraries and calculating the contribution degree of the topic labels; calculating the similarity between the topic labels in the library and the topics of the data acquisition and analysis strategy to obtain similarity data;

the calculation conforms to the following formula:

[formula not preserved in the source text]

wherein the coefficients [symbols not preserved in the source text] are weights whose specific values can be adjusted by the user according to practical experience, and the standard value BZ of the library is corrected by changing these weights;

quantifying the numerical values of a plurality of libraries to obtain standard values of the libraries, and quantitatively evaluating the libraries with the standard values;

associating the conversion difficulty between the data formats with the standard value of the library to obtain the evaluation value of the library;
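As an illustration of the ranking step, the following minimal Python sketch sorts libraries by an evaluation value PG derived from the standard value BZ and the format difficulty value GsN. The formula that associates BZ with GsN is not reproduced in this text, so the sketch takes it as a caller-supplied function; the `Library` class, `rank_libraries`, `example_pg`, the sample numbers, and the choice to sort from highest to lowest PG are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Library:
    name: str
    bz: float   # standard value BZ of the library
    gsn: float  # format difficulty value GsN of the library


def rank_libraries(
    libraries: List[Library],
    combine: Callable[[float, float], float],
) -> List[Library]:
    """Sort libraries by PG = combine(BZ, GsN), highest PG first (assumed order).

    `combine` is supplied by the caller because the actual association
    between BZ and GsN is not given in this text.
    """
    return sorted(libraries, key=lambda lib: combine(lib.bz, lib.gsn), reverse=True)


def example_pg(bz: float, gsn: float) -> float:
    """Hypothetical association: reward a high BZ, penalise a high GsN."""
    return bz / (1.0 + gsn)


# Made-up sample libraries, for illustration only.
libs = [
    Library("cadastre", bz=0.8, gsn=3.0),
    Library("traffic", bz=0.6, gsn=0.5),
    Library("terrain", bz=0.9, gsn=1.0),
]
order = [lib.name for lib in rank_libraries(libs, example_pg)]
```

Data would then be acquired from the libraries in the resulting order; swapping in a different `combine` function changes the ranking without touching the sorting logic.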

2. The spatial data integration method applied to a digital twin city according to claim 1, wherein:

identifying formats in the multi-source data, and determining the total data amount in each library and format information of the data in the library; classifying the data in the library according to the format information of the data to form different data categories;
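The format identification and classification in this claim could be sketched as follows. This is a minimal Python illustration that assumes the format of each data item can be read from its file extension, which the text does not state; `classify_by_format` and the sample file names are hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Dict, List


def classify_by_format(paths: List[str]) -> Dict[str, List[str]]:
    """Group item paths by lower-cased file extension (assumed format marker)."""
    categories: Dict[str, List[str]] = defaultdict(list)
    for path in paths:
        # Items without an extension fall into an "unknown" category.
        fmt = PurePosixPath(path).suffix.lower().lstrip(".") or "unknown"
        categories[fmt].append(path)
    return dict(categories)


# Made-up library contents, for illustration only.
library = ["roads.shp", "parcels.GeoJSON", "dem.tif", "rivers.shp", "README"]
categories = classify_by_format(library)
total = len(library)  # total data amount in the library
format_counts = {fmt: len(items) for fmt, items in categories.items()}
```

Here `total` plays the role of the total data amount and `format_counts` the per-format quantities; a real system would more likely inspect file headers or metadata than rely on extensions.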

3. The spatial data integration method applied to a digital twin city according to claim 1, wherein:

selecting a plurality of data items from one of the plurality of libraries to serve as a topic extraction training set and a topic model test set, respectively; training an LDA topic model with the training set to generate a trained LDA topic model, and testing it with the test set to confirm that the trained LDA topic model is free of errors;
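The split into a training set and a test set could look like the sketch below. It covers only the split; the LDA training itself is left out and would typically rely on an external library (for example scikit-learn's `LatentDirichletAllocation`). The function name `split_documents`, the 20% test fraction, and the fixed seed are assumptions, not values from the text.

```python
import random
from typing import List, Sequence, Tuple


def split_documents(
    docs: Sequence[str],
    test_fraction: float = 0.2,  # assumed ratio; the text gives none
    seed: int = 0,               # fixed seed so the split is reproducible
) -> Tuple[List[str], List[str]]:
    """Shuffle the documents and return (training_set, test_set)."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(docs)
    random.Random(seed).shuffle(shuffled)
    # Keep at least one document for testing.
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Ten made-up documents standing in for items drawn from one library.
documents = [f"document {i}" for i in range(10)]
train_set, test_set = split_documents(documents)
```

The two sets are disjoint and together cover every selected document, so the test set measures the model on data it was not trained on.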

4. The spatial data integration method applied to a digital twin city according to claim 1, wherein:

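the most recent activity of the topic labels in the library is represented by [symbol not preserved in the source text]; [the remainder of this claim, including its formula, is not preserved in the source text];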

5. The spatial data integration method applied to a digital twin city according to claim 1, wherein:

under the topic label of the data category, the number of all topics is LtS, and the percentage of the data under the topic label in the total data in the database is Zb; the contribution degree of the topic label is GxL, calculated as:

[formula and the definitions following "wherein" not preserved in the source text];

6. The spatial data integration method applied to a digital twin city according to claim 1, wherein:

acquiring the strategy for data acquisition and analysis, and extracting the strategy topic with the trained LDA topic model; obtaining the topic labels from the library, judging the similarity between the topic labels in the library and the strategy topic with a similarity model, and quantifying the value of the similarity;
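The text does not say which similarity model is used. As one possible stand-in, the sketch below quantifies the similarity between a strategy topic and a library topic label as the cosine similarity of their keyword counts; the function `cosine_similarity` and the keyword lists are hypothetical.

```python
import math
from collections import Counter
from typing import Iterable


def cosine_similarity(a: Iterable[str], b: Iterable[str]) -> float:
    """Cosine similarity of two bags of keywords, in the range [0, 1]."""
    va, vb = Counter(a), Counter(b)
    # Dot product over the keywords the two bags share.
    dot = sum(va[term] * vb[term] for term in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty keyword list is treated as dissimilar
    return dot / (norm_a * norm_b)


# Made-up keywords for a strategy topic and a library topic label.
strategy_topic = ["road", "traffic", "flow"]
library_label = ["road", "traffic", "signal"]
score = cosine_similarity(strategy_topic, library_label)
```

Two of the three keywords overlap, so the score is about 0.67; identical keyword lists score 1 and disjoint lists score 0.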

7. The spatial data integration method applied to a digital twin city according to claim 1, wherein:

obtaining the data formats in each library and the corresponding data quantities to determine format quantity data; scoring the conversion difficulty between formats, obtaining the average score for conversion between different formats, and marking each format with its average score;
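The scoring described in this claim could be sketched as below. Each format is marked with the mean of its pairwise conversion difficulty scores, and a quantity-weighted mean then combines these marks with the format quantity data of one library. The weighted mean is only one hypothetical way to combine the two and is not claimed to be the patent's GsN formula; all names and scores are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Tuple


def mean_difficulty_per_format(
    pair_scores: Dict[Tuple[str, str], float],
) -> Dict[str, float]:
    """Mark each source format with the mean score of converting it to others."""
    totals: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for (src, _dst), score in pair_scores.items():
        totals[src] += score
        counts[src] += 1
    return {fmt: totals[fmt] / counts[fmt] for fmt in totals}


def weighted_library_difficulty(
    format_counts: Dict[str, int],
    format_marks: Dict[str, float],
) -> float:
    """Quantity-weighted mean of the format marks (hypothetical combination)."""
    total = sum(format_counts.values())
    if total == 0:
        return 0.0
    return sum(n * format_marks[fmt] for fmt, n in format_counts.items()) / total


# Made-up difficulty scores for (source format, target format) conversions.
pair_scores = {
    ("shp", "geojson"): 1.0,
    ("shp", "tif"): 4.0,
    ("geojson", "shp"): 2.0,
    ("tif", "shp"): 5.0,
}
marks = mean_difficulty_per_format(pair_scores)
difficulty = weighted_library_difficulty({"shp": 2, "geojson": 1, "tif": 1}, marks)
```

Weighting by quantity means that a library dominated by hard-to-convert formats gets a higher difficulty than one that holds only a few such items.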

8. The spatial data integration method applied to a digital twin city according to claim 1, wherein:

acquiring the format difficulty value GsN and the standard value BZ of the library, and associating them to determine the library evaluation value PG:

the calculation method of the standard value of the library is as follows: