CN115357581A

CN115357581A - Distributed storage method for massive BIM data

Info

Publication number: CN115357581A
Application number: CN202210997017.3A
Authority: CN
Inventors: 赵亮
Original assignee: Zhuzhijian Technology Chongqing Co ltd
Current assignee: Zhuzhijian Technology Chongqing Co ltd
Priority date: 2022-08-19
Filing date: 2022-08-19
Publication date: 2022-11-18
Anticipated expiration: 2042-08-19
Also published as: CN115357581B

Abstract

The invention provides a distributed storage method for massive BIM data, comprising the following steps: S1, obtaining BIM data, classifying it, and dividing the classified data between a common database and a private database; S2, storing the different types of data in their respective databases, dividing the data into data sets, and classifying the divided data sets; and S3, screening the classified data sets, identifying duplicate data through a screening model, and compressing the data.

Description

Distributed storage method for massive BIM data

Technical Field

The invention relates to the technical field of construction engineering informatization, in particular to BIM applications, and specifically to a distributed storage method for massive BIM data.

Background

The construction industry generates the largest data volume and operates at the largest scale of any industry, yet it currently has the lowest level of digitalization and informatization; its capacity for management innovation is weak, and the transformation and upgrading of enterprises and of the industry as a whole are difficult. Data is the core of BIM (Building Information Modeling). As the source code of the construction industry, BIM can handle common project-level data, and its greatest advantage is its ability to carry massive private data. As BIM develops and spreads, it will push the construction industry into the big data era. Every building consists of many components, and each component carries rich information such as geometry and physical properties. As a building project progresses, further information accumulates, including engineering quantities, construction cost, construction progress, and operation and maintenance conditions. Only by building a database that collects this massive information over the whole project life cycle can the overall state of the project be reflected promptly and accurately as the data is adjusted, added to, and modified; correlating the data then speeds up decision-making and improves decision quality, which in turn raises project quality, lowers project cost, and increases project profit. At the same time, storing, managing, and using massive BIM data poses a challenge. Existing distributed storage methods are designed for general-purpose data and cannot meet the processing requirements of BIM data, in particular massive component-level BIM data, so a solution to this technical problem is urgently needed.

Disclosure of Invention

The invention aims to solve at least the technical problems in the prior art, and in particular provides a novel distributed storage method for massive BIM data.

To achieve the above object, the present invention provides a distributed storage method for massive BIM data, characterized by comprising the following steps:

S1, obtaining BIM data, classifying it, and dividing the classified data between a common database and a private database;

S2, storing the different types of data in their respective databases, dividing the data into data sets, and classifying the divided data sets;

S3, screening the classified data sets, identifying duplicate data through a screening model, and compressing the data.

According to the above technical solution, preferably, S1 includes:

S1-1, classifying the data by category, wherein the sources of common data include category data that can be legally obtained from big data websites;

S1-2, creating a master database and a slave database for the common data, wherein the BIM common data is stored on the master database server and every add, delete, and update operation is applied only to the master database; the data is then mirrored or synchronized to the slave database;

S1-3, splitting the private data according to rules: private data is divided by the project it belongs to, and BIM data belonging to the same project is stored on the same database server; each project has a unique number, i.e., an ID, and projects are assigned to different databases according to the range in which the ID falls.

According to the above technical solution, preferably, S2 includes:

S2-1, storing the correspondence between each BIM project and its database server in a database; when a BIM project is created, first querying whether a record of the correspondence already exists in the database; if it does not exist, deriving the database server to which the project belongs from the project ID according to the rule and storing the correspondence in the database for subsequent queries; if it exists, performing no operation, so that repeated creation is avoided;

S2-2, when a project is accessed, querying the corresponding database server information by the project ID and then establishing a connection to that database;

S2-3, uniquely identifying each database table by combining project information with table information, in the form "table name prefix + _ + project ID + _ + table name".

According to the above technical solution, preferably, S3 includes:

S3-1, providing a database middle layer that hides the implementation details of the underlying database partitioning from the application layer;

S3-2, defining data structures according to the database table structures, and adding label information to each data structure to indicate whether the data is common data or private data;

S3-3, requiring that all operations of the application layer on the database go through the database middle layer;

S3-4, when the database middle layer receives an operation request from the application layer, first determining whether the operation targets common data or private data; if it targets common data, first obtaining the information of the server where the common data is located, then establishing a connection to the database, and accessing the database table data directly through the connection;

if it targets private data, the application layer must set the ID of the project to be accessed before using the database middle layer; subsequent database accesses need not supply the project ID again until another project is to be accessed.

According to the above technical solution, preferably, S3 further includes:

S3-5, the database middle layer looks up, by the project ID, the database server holding the private data in the project-to-database-server mapping table, establishes a connection to that server, and then performs all database operations of the application layer through the connection;

S3-6, the database middle layer obtains the table name from the application layer request and generates the complete table name: when the database middle layer receives an operation request from the application layer, it takes out the cached project ID and assembles the complete table name according to mode 1;

the database middle layer automatically generates the final SQL statement from the project ID, the table name, and the operation, executes it through the database connection, and returns the result.

Preferably, according to the above technical solution, the screening method comprises:

S-A, the attribute set A of the prior conditional probability distribution of the common data attributes is,

wherein

The attribute values of a design types corresponding to the ith storage part and the corresponding mth matching model in the common data I of the BIM are obtained;

the attribute set B of the prior conditional probability distribution of private data attributes is,

wherein

B design types corresponding to the jth storage part in the private data J of the BIM and attribute values corresponding to the nth matching model are obtained;

the conditional probabilities Q(A|C) and Q(B|C) are set with the class attribute C in the BIM data as the condition; they are calculated as follows,

wherein Q(A, C) is the joint probability distribution of the sets A and C, and traversing the values of A and C yields the conditional probability distribution Q(A|C); Q(B, C) is the joint probability distribution of the sets B and C, and traversing the values of B and C yields the conditional probability distribution Q(B|C); Q(A) is the attribute condition of the attribute set A, Q(B) is the attribute condition of the attribute set B, and Q(C) is the class attribute C.
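The calculation formula itself is not reproduced in this text. As a minimal sketch only, assuming the standard definition Q(A|C) = Q(A, C) / Q(C) and probabilities estimated as empirical frequencies over (attribute value, class value) pairs, the traversal could look like the following; the function name and the sample values are hypothetical and not taken from the patent:

```python
from collections import Counter
from fractions import Fraction


def conditional_distribution(pairs):
    """Estimate Q(a | c) for every observed (a, c) pair.

    Assumption: Q(a | c) = Q(a, c) / Q(c), with both terms estimated as
    relative frequencies, so the ratio reduces to count(a, c) / count(c).
    """
    pairs = list(pairs)
    joint_counts = Counter(pairs)                 # counts behind Q(A, C)
    class_counts = Counter(c for _, c in pairs)   # counts behind Q(C)
    return {
        (a, c): Fraction(n, class_counts[c])
        for (a, c), n in joint_counts.items()
    }


# Hypothetical sample: attribute value = material, class attribute = component type.
samples = [("steel", "beam"), ("steel", "beam"), ("concrete", "beam"), ("concrete", "wall")]
q = conditional_distribution(samples)
```

Exact fractions are used here so that the conditional probabilities for each class value sum to exactly 1, which makes the sketch easy to check by hand.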

Preferably, according to the above technical solution, the screening method further comprises:

S-B, calculating the conditional function Q(A, B|C) for each attribute node in the attribute set A and each attribute node in the attribute set B as follows;

after the joint probability distributions Q(A, C) and Q(B, C) are calculated, the values of A, B, and C are traversed, the derivative of the probability distribution is taken, and the result is multiplied by the sum of the attribute values of the common data and the private data.

Preferably, according to the above technical solution, the screening method further comprises:

S-C, the class attribute node C performs conditional screening in the BIM data: particle swarms M and N are set according to the construction capacity of the common data and the private data in the BIM data; if the common data particle swarm M satisfies the condition

where s is a random real number between 0 and 1, t is the number of common data particle swarms whose condition is zero, and u is a deviation value,

the weight is a limiting-condition weight used to limit the number of particle swarms; after the common data is screened, the private data of the BIM data is substituted into the private data particle swarm N, wherein

μ is the nonlinear mapping condition coefficient, S_k is the predicted value calculated by the matching model for the k-th private data, U_k is the actual value calculated by the matching model for the k-th private data, and k is a positive integer.

S-D, the screening function of the class attribute C is calculated as $F(A, B, C) = g \cdot \prod Q(A, C) \cdot \prod Q(B, C) + f$, where $\prod Q(A, C)$ is the product of the conditional probabilities of all nodes of attribute A that satisfy the class attribute C, $\prod Q(B, C)$ is the product of the conditional probabilities of all nodes of attribute B that satisfy the class attribute C, g is the screening optimal solution, and f is the evaluation coefficient; useless data to be removed from the private data in the BIM data is extracted through the conditional probabilities.
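To make the screening function of S-D concrete, the following minimal sketch evaluates it for given inputs. It assumes that the garbled "n" in the formula is a product sign, so that the function is g times the product of the Q(A, C) terms times the product of the Q(B, C) terms, plus the evaluation coefficient (written f here). How g and f are chosen is not specified in the text; the function name and the values below are hypothetical:

```python
import math


def screening_score(q_a, q_b, g, f):
    """Evaluate g * prod(Q(A, C)) * prod(Q(B, C)) + f.

    q_a: conditional probabilities of the attribute-A nodes for class C
    q_b: conditional probabilities of the attribute-B nodes for class C
    g:   screening optimal solution (assumed to be a given scalar)
    f:   evaluation coefficient (assumed to be a given scalar)
    """
    return g * math.prod(q_a) * math.prod(q_b) + f


# Hypothetical inputs chosen so the arithmetic is exact in binary floating point.
score = screening_score([0.5, 0.5], [0.5], g=2.0, f=0.5)
```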

In summary, by adopting the above technical solution, the invention has the following beneficial effects:

by effectively partitioning massive BIM data and classifying it through model calculation, the BIM data can be stored efficiently and quickly; distributed management makes reasonable use of storage space, improving storage efficiency and saving storage space.

Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.

Drawings

The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:

FIG. 1 is a general schematic of the present invention;

FIG. 2 is a schematic diagram of an embodiment of the present invention;

FIG. 3 is a schematic diagram of an embodiment of the present invention;

FIG. 4 is a schematic diagram of an embodiment of the present invention.

Detailed Description

Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.

As shown in FIGS. 1-4, the present invention aims to solve the storage problem of massive BIM data and to provide a common-data solution for design, construction, quantity calculation, cost management, operation and maintenance, and the like.

The invention discloses a distributed storage method of massive BIM data, which comprises the following steps:

S1 comprises:

S1-1, only one copy of the common data exists in the whole system; the data is mirrored for access speed, and under heavy concurrent access the common data is distributed to different database instances according to load; the sources of common data include category data that can be legally obtained from big data websites, and this data is classified by category;

S1-2, creating a master database and a slave database for the common data, wherein the BIM common data is stored on the master database server and every add, delete, and update operation is applied only to the master database; the data is then mirrored or synchronized to the slave database;

S2 comprises:

S2-1, storing the correspondence between each project and its database server in a database; when a project is created, first querying whether a record of the correspondence already exists in the database; if it does not exist, deriving the database server to which the project belongs from the project ID according to the rule and storing the correspondence in the database for subsequent queries; if it exists, performing no operation, so that repeated creation is avoided;

S2-3, uniquely identifying each database table by combining project information with table information, in the form "table name prefix + _ + project ID + _ + table name".

S3 comprises:

if the operation targets private data, the application layer must set the ID of the project to be accessed before using the database middle layer; subsequent database accesses need not supply the project ID again until another project is to be accessed;

S3-5, the database middle layer looks up, by the project ID, the database server holding the private data in the project-to-database-server mapping table, establishes a connection to that server, and then performs all database operations of the application layer through the connection;

The screening method comprises the following steps:

calculating statistical information in the BIM data through attribute prior probability distribution in the BIM data;

wherein

the attribute set B of the a priori conditional probability distribution of private data attributes is,

wherein

the conditional probabilities Q(A|C) and Q(B|C) are set with the class attribute C in the BIM data as the condition; they are calculated as follows,

wherein Q(A, C) is the joint probability distribution of the sets A and C, and traversing the values of A and C yields the conditional probability distribution Q(A|C); Q(B, C) is the joint probability distribution of the sets B and C, and traversing the values of B and C yields the conditional probability distribution Q(B|C); Q(A) is the attribute condition of the attribute set A, Q(B) is the attribute condition of the attribute set B, and Q(C) is the class attribute C;

S-B, calculating the conditional function Q(A, B|C) for each attribute node in the attribute set A and each attribute node in the attribute set B as follows;

after the joint probability distributions Q(A, C) and Q(B, C) are calculated, the values of A, B, and C are traversed, the derivative of the probability distribution is taken, and the result is multiplied by the sum of the attribute values of the common data and the private data, so that a screening condition for the BIM data is preliminarily obtained;

S-C, the class attribute node C performs conditional screening in the BIM data: particle swarms M and N are set according to the construction capacity of the common data and the private data in the BIM data; if the common data particle swarm M satisfies the condition

the weight is a limiting-condition weight used to limit the number of particle swarms; after the common data is screened, the private data of the BIM data is substituted into the private data particle swarm N, and conditions are set for matching and screening, yielding a complete network for screening and sorting the BIM data, wherein

μ is the nonlinear mapping condition coefficient, S_k is the predicted value calculated by the matching model for the k-th private data, U_k is the actual value calculated by the matching model for the k-th private data, and k is a positive integer;

S-D, the screening function of the class attribute C is calculated as $F(A, B, C) = g \cdot \prod Q(A, C) \cdot \prod Q(B, C) + f$, where $\prod Q(A, C)$ is the product of the conditional probabilities of all nodes of attribute A that satisfy the class attribute C, and $\prod Q(B, C)$ is the product of the conditional probabilities of all nodes of attribute B that satisfy the class attribute C; useless data to be removed from the private data in the BIM data is extracted through the conditional probabilities, and the screening optimal solution g and the evaluation coefficient f are set, so that the screening function screens the BIM data under the constraints of the screening optimal solution and the evaluation coefficient, and the effective BIM data is compressed and stored in the database.

Referring to fig. 1, it is a schematic flowchart of an embodiment of a distributed storage method for mass BIM data provided by the present invention, and the method includes steps 101 to 105.

In step 101, the BIM data is divided into common data and private data.

In the present embodiment, the common data includes, but is not limited to:

national codes, such as the bill-of-quantities pricing code for construction works and the Pingfa drawing atlas;

the directory of provinces, cities, and districts;

lists of organizations, such as developers, design firms, and construction firms;

a material library, including materials related to construction, installation, machinery, and the like;

a price library, including price information for the various materials in each region and time period.

In step 102, the common data is stored separately. Further, as shown in FIG. 2, a master database and a slave database are created for the common data; the BIM common data is stored on the master database server, and every add, delete, and update operation is applied only to the master database; the data is then mirrored or synchronized to the slave database to improve performance and speed.
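The master/slave rule of step 102 can be illustrated with a small routing sketch: writes always go to the master, while reads may be spread over the mirrored copies. The class name, the server names, and the round-robin read policy are assumptions made for illustration; the text only states that writes target the master and that data is mirrored or synchronized to the slave.

```python
import itertools


class CommonDataRouter:
    """Pick the database that should serve an operation on common data."""

    WRITE_OPERATIONS = {"INSERT", "UPDATE", "DELETE"}

    def __init__(self, master, replicas):
        self.master = master
        # Fall back to the master when no slave database is configured.
        self._read_targets = itertools.cycle(replicas or [master])

    def target_for(self, operation):
        # Add, delete, and update operations are applied only to the master.
        if operation.upper() in self.WRITE_OPERATIONS:
            return self.master
        # Reads rotate over the mirrored copies (assumed round-robin policy).
        return next(self._read_targets)


router = CommonDataRouter("common-master", ["common-slave-1", "common-slave-2"])
```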

In step 103, the private data is split according to rules. As shown in FIG. 2, as an example of this embodiment, private data is divided by the project it belongs to, and BIM data belonging to the same project is stored on the same database server. Each project has a unique number, i.e., an ID, and projects are assigned to different databases according to the range in which the ID falls. For example, IDs 1-1000000 go to database server 1, IDs 1000001-2000000 to database server 2, IDs 2000001-3000000 to database server 3, and so on.
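The ID-range rule in this example reduces to integer division. A minimal sketch, assuming IDs start at 1 and every server covers a block of one million IDs as in the example; the function and parameter names are hypothetical:

```python
def server_for_project(project_id, projects_per_server=1_000_000):
    """Return the 1-based database server number for a project ID."""
    if project_id < 1:
        raise ValueError("project IDs are assumed to start at 1")
    # IDs 1..1000000 map to server 1, 1000001..2000000 to server 2, and so on.
    return (project_id - 1) // projects_per_server + 1
```

Subtracting 1 before dividing keeps the upper boundary of each range (for example ID 1000000) on the same server as the rest of its block.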

Further, the correspondence between each project and its database server is stored in a database.

When a project is created, the database is first queried to see whether a record of the correspondence already exists; if it does not exist, the database server to which the project belongs is derived from the project ID according to the rule, and the correspondence is stored in the database for subsequent queries; if it exists, no operation is performed, so that repeated creation is avoided.

When a project is accessed, the corresponding database server information is queried by the project ID, and a connection to that database is then established.
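The create-if-absent behaviour described above can be sketched with an in-memory SQLite catalog. The table name `project_server`, the server naming pattern, and the use of SQLite are assumptions for illustration only; the text does not name a database product or schema.

```python
import sqlite3


def open_catalog():
    """Create an in-memory catalog holding the project-to-server mapping."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE project_server ("
        "project_id INTEGER PRIMARY KEY, server TEXT NOT NULL)"
    )
    return conn


def register_project(conn, project_id, projects_per_server=1_000_000):
    """Store the mapping for a new project; do nothing if it already exists."""
    row = conn.execute(
        "SELECT server FROM project_server WHERE project_id = ?", (project_id,)
    ).fetchone()
    if row is not None:
        return row[0]  # record exists: no operation, no duplicate creation
    # Derive the server from the project ID range (assumed naming pattern).
    server = f"db-server-{(project_id - 1) // projects_per_server + 1}"
    conn.execute(
        "INSERT INTO project_server (project_id, server) VALUES (?, ?)",
        (project_id, server),
    )
    conn.commit()
    return server


def lookup_server(conn, project_id):
    """Return the server for a project, or None if it was never registered."""
    row = conn.execute(
        "SELECT server FROM project_server WHERE project_id = ?", (project_id,)
    ).fetchone()
    return None if row is None else row[0]
```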

In step 104, as an example of this embodiment, each project contains massive and varied data such as design, engineering quantities, and progress; each kind of data has a different structure and must be stored in a different database table. Meanwhile, each database stores the data of multiple projects, so tables of the same structure belonging to different projects must be given different table names.

Specifically, a database table is uniquely identified by project information plus table information. As an example of the present embodiment, the form "table name prefix + _ + project ID + _ + table name" is adopted. Assuming the project ID is 0000001 and the table storing the information of each component is elementinfo, the final table name in the database is proj_0000001_elementinfo.
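The naming scheme can be written as a one-line formatter. The sketch below assumes the prefix `proj` and a seven-digit zero-padded project ID, both inferred from the single example in the text; the identifier check is an added safeguard, not something the text describes.

```python
def full_table_name(project_id, table, prefix="proj", width=7):
    """Build 'prefix_projectID_table', zero-padding the project ID."""
    # Reject names that are not plain identifiers, since the result is
    # later spliced into SQL text (added safeguard, not from the source).
    if not table.isidentifier() or not prefix.isidentifier():
        raise ValueError("prefix and table must be plain identifiers")
    return f"{prefix}_{project_id:0{width}d}_{table}"
```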

Further, a database middle layer is provided that hides the implementation details of the underlying database partitioning from the application layer.

First, data structures are defined according to the database table structures;

label information is added to each data structure to indicate whether the data is common data or private data;

all operations of the application layer on the database must go through the database middle layer;

when the database middle layer receives an operation request from the application layer, it first determines whether the operation targets common data or private data;

if it targets common data, the middle layer first obtains the information of the server where the common data is located, then establishes a connection to the database, and accesses the database table data directly through the connection;

if it targets private data, the application layer must set the ID of the project to be accessed before using the database middle layer; subsequent database accesses need not supply the project ID again until another project is to be accessed;

the database middle layer looks up, by the project ID, the database server holding the private data in the project-to-database-server mapping table and establishes a connection to it, after which all database operations of the application layer are performed through the connection;

the database middle layer obtains the table name from the application layer request and generates the complete table name; taking access to the table elementinfo of the project with ID 0000001 as an example, the complete table name is proj_0000001_elementinfo;

when the database middle layer receives an operation request from the application layer, it takes out the cached project ID and assembles the complete table name according to mode 1.
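The routing logic of the middle layer can be sketched as a small class that caches the current project ID and returns the target server together with the generated SQL. This is an illustrative outline only: the class and method names, the `is_common` flag standing in for the label information, and the restriction to a SELECT statement are all assumptions, and no real connection is opened.

```python
class DatabaseMiddleLayer:
    """Route requests to the common or the per-project database."""

    def __init__(self, common_server, project_servers):
        self._common_server = common_server
        self._project_servers = project_servers  # project ID -> server name
        self._project_id = None                  # cached current project

    def set_project(self, project_id):
        """Select the project whose private data will be accessed next."""
        if project_id not in self._project_servers:
            raise KeyError(f"no database server recorded for project {project_id}")
        self._project_id = project_id

    def plan_select(self, table, is_common):
        """Return (server, sql) for reading every row of a table."""
        if not table.isidentifier():
            raise ValueError("table must be a plain identifier")
        if is_common:
            # Common data: the table is addressed directly by its own name.
            return self._common_server, f"SELECT * FROM {table}"
        if self._project_id is None:
            raise RuntimeError("call set_project() before accessing private data")
        # Private data: splice prefix, cached project ID, and table name.
        full_name = f"proj_{self._project_id:07d}_{table}"
        return self._project_servers[self._project_id], f"SELECT * FROM {full_name}"


layer = DatabaseMiddleLayer("common-db", {1: "db-server-1"})
```

Because the project ID is cached by `set_project`, the application layer passes only the short table name on each call, which is the behaviour the text describes.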

In the present embodiment, BIM data comes in many types, including but not limited to numeric data, text data, and graphic data; these can be divided into:

simple types, such as numeric and text types, which can be stored in a database directly as strings;

complex types, such as graphics or combinations of several simple types, which must be converted before they can be stored in a database in string format.

In step 105, the database is searched to determine whether the data already exists; if it does, the corresponding storage location is returned directly; otherwise, a new record is created and its storage location is returned. In this embodiment, the storage location of data is mainly represented by an ID or a keyword.
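Step 105 is a get-or-create lookup. A minimal in-memory sketch, in which a dictionary stands in for the database table and sequential integers stand in for the storage location; the class name and sample values are hypothetical:

```python
class ValueStore:
    """Store each distinct string once and hand out its storage ID."""

    def __init__(self):
        self._id_by_value = {}

    def get_or_create(self, value):
        existing = self._id_by_value.get(value)
        if existing is not None:
            return existing              # already stored: return its location
        new_id = len(self._id_by_value) + 1
        self._id_by_value[value] = new_id  # create a new record
        return new_id

    def __len__(self):
        return len(self._id_by_value)


store = ValueStore()
first = store.get_or_create("C30")
second = store.get_or_create("HRB400")
again = store.get_or_create("C30")
```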

As an example of this embodiment, as shown in FIG. 3, simple-type data is defined once in a unified way to further reduce storage space, and the places that use it refer to it by keyword; as shown in FIG. 4, the data is sorted so that the same content is not stored repeatedly merely because its elements appear in a different order.

As an example of this embodiment, complex-type graphic data is classified into the following two types:

basic figures, including straight line segments, arcs, ellipses, Bezier curves, and the like;

complex figures, each composed of several basic figures of different types.

A basic figure can be expressed as a figure type plus numeric parameters. For example, the string ["straight line", ["starting point", "end point"]] fully expresses a straight line segment, and the string ["arc", ["starting point", "end point"]] fully expresses an arc.

A complex figure can be decomposed into several basic figures, so its storage format can be converted into a combination of basic figures.
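The bracketed strings in this passage resemble JSON arrays, so one plausible encoding is sketched below with Python's standard `json` module. The choice of JSON, the type tag `complex`, and the coordinate layout are assumptions; the text fixes only the general shape of figure type plus parameters.

```python
import json


def encode_shape(kind, params):
    """Encode a basic figure as '[kind, params]' in JSON text."""
    return json.dumps([kind, params])


def encode_complex(shape_strings):
    """Encode a complex figure as a list of already-encoded basic figures."""
    return json.dumps(["complex", [json.loads(s) for s in shape_strings]])


def decode(text):
    """Turn a stored string back into nested Python lists."""
    return json.loads(text)


# Hypothetical coordinates: a line from (0, 0) to (3, 4) followed by an arc.
line = encode_shape("line", [[0, 0], [3, 4]])
arc = encode_shape("arc", [[3, 4], [6, 0]])
outline = encode_complex([line, arc])
```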

As an example of this embodiment, each project contains a huge number of components, ranging from tens of thousands to millions; each component carries multiple attributes, from a few to several dozen; meanwhile, components of the same type mostly have the same number of attributes and the same attribute values.

Further, the set of all attributes of a component is treated as one complex-type datum, converted into a string representing a set of simple-type data, and stored in a designated table.

Where the component's attribute set would be stored, a reference to the actual storage location of the attribute set is stored instead.

When an attribute value of a component changes, a new attribute-set string is generated by the above method and compared with the existing records in the database; if a matching record exists, the location of that record is written to the place where the component's attribute-set reference is stored; otherwise, a new record is created and its location is written there.
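The attribute-set sharing described above can be sketched as follows. Each attribute set is serialized to a canonical string (keys sorted, so ordering differences do not create duplicates), stored once, and referenced from the component by ID. The class name, the JSON serialization, and the sample attributes are assumptions for illustration:

```python
import json


class AttributeSetStore:
    """Share identical attribute sets between components."""

    def __init__(self):
        self._id_by_text = {}      # canonical string -> record ID
        self.component_ref = {}    # component ID -> record ID

    def _intern(self, attributes):
        # Sorting the keys makes equal sets serialize to the same string.
        text = json.dumps(attributes, sort_keys=True)
        if text not in self._id_by_text:
            self._id_by_text[text] = len(self._id_by_text) + 1
        return self._id_by_text[text]

    def set_attributes(self, component_id, attributes):
        """Point the component at the (possibly shared) attribute record."""
        record_id = self._intern(attributes)
        self.component_ref[component_id] = record_id
        return record_id

    def record_count(self):
        return len(self._id_by_text)


attrs = AttributeSetStore()
a = attrs.set_attributes("beam-1", {"grade": "C30", "width": 300})
b = attrs.set_attributes("beam-2", {"width": 300, "grade": "C30"})
c = attrs.set_attributes("beam-1", {"grade": "C35", "width": 300})
```

In this sketch, changing one attribute of `beam-1` creates a second record and repoints only that component, while `beam-2` keeps its reference to the original record.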

While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.

Claims

1. A distributed storage method for massive BIM data, characterized by comprising the following steps:

2. The distributed storage method for massive BIM data according to claim 1, wherein S1 comprises:

S1-1, classifying the data by category, wherein the sources of common data include category data that can be legally obtained from big data websites;

3. The distributed storage method for massive BIM data according to claim 1, wherein S2 comprises:

S2-1, storing the correspondence between each BIM project and its database server in a database; when a BIM project is created, first querying whether a record of the correspondence already exists in the database; if it does not exist, deriving the database server to which the project belongs from the project ID according to the rule and storing the correspondence in the database for subsequent queries; if it exists, performing no operation, so that repeated creation is avoided;

4. The distributed storage method for massive BIM data according to claim 1, wherein S3 comprises:

S3-4, when the database middle layer receives an operation request from the application layer, first determining whether the operation targets common data or private data; if it targets common data, first obtaining the information of the server where the common data is located, then establishing a connection to the database, and accessing the database table data directly through the connection;

5. The distributed storage method for massive BIM data according to claim 4, wherein S3 further comprises:

the database middle layer automatically generates the final SQL statement from the project ID, the table name, and the operation, executes it through the database connection, and returns the result.

6. The distributed storage method of massive BIM data according to claim 1, wherein the screening method comprises:

wherein

wherein

the conditional probabilities Q(A|C) and Q(B|C) are set with the class attribute C in the BIM data as the condition; they are calculated as follows,

7. The distributed storage method of massive BIM data as claimed in claim 6, wherein the screening method further comprises:

after the joint probability distributions Q(A, C) and Q(B, C) are calculated, the values of A, B, and C are traversed, the derivative of the probability distribution is taken, and the result is multiplied by the sum of the attribute values of the common data and the private data.

8. The distributed storage method of massive BIM data as claimed in claim 6, wherein said screening method further comprises:

where s is a random real number between 0 and 1, t is the number of common data particle swarms whose condition is zero, and u is a deviation value,

the weight is a limiting-condition weight used to limit the number of particle swarms; after the common data is screened, the private data of the BIM data is substituted into the private data particle swarm N, wherein

μ is the nonlinear mapping condition coefficient, S_k is the predicted value calculated by the matching model for the k-th private data, U_k is the actual value calculated by the matching model for the k-th private data, and k is a positive integer.

9. The distributed storage method of massive BIM data as claimed in claim 6, wherein the screening method further comprises:

S-D, the screening function of the class attribute C is calculated as $F(A, B, C) = g \cdot \prod Q(A, C) \cdot \prod Q(B, C) + f$, where $\prod Q(A, C)$ is the product of the conditional probabilities of all nodes of attribute A that satisfy the class attribute C, $\prod Q(B, C)$ is the product of the conditional probabilities of all nodes of attribute B that satisfy the class attribute C, g is the screening optimal solution, and f is the evaluation coefficient; useless data to be removed from the private data in the BIM data is extracted through the conditional probabilities.