CN115357581B - Distributed storage method for massive BIM data

Info

Publication number: CN115357581B
Application number: CN202210997017.3A
Authority: CN
Inventors: 赵亮
Original assignee: Zhuzhijian Technology Chongqing Co ltd
Current assignee: Zhuzhijian Technology Chongqing Co ltd
Priority date: 2022-08-19
Filing date: 2022-08-19
Publication date: 2023-05-05
Anticipated expiration: 2042-08-19
Also published as: CN115357581A

Abstract

The invention provides a distributed storage method for massive BIM data, comprising the following steps: S1, acquiring BIM data, classifying it, and dividing the classified data between a shared database and a private database; S2, storing the different types of data in their respective databases, dividing the data into data sets, and classifying the divided data sets; and S3, screening the classified data sets, identifying duplicate data with a screening model, and compressing the data.

Description

Distributed storage method for massive BIM data

Technical Field

The invention relates to the technical field of building engineering informatization, in particular to the technical field of BIM application, and specifically relates to a distributed storage method for processing massive BIM data.

Background

The construction industry is a big-data industry with the largest data volume and the largest business scale, yet it currently has the lowest level of digitization and informatization of any industry; its management and innovation capabilities are weak, and the transformation and upgrading of its enterprises and of the industry as a whole are difficult. The core of BIM (Building Information Modeling) is data. As the source code of the building industry, BIM can handle common data at the project level, and its greatest advantage is its ability to carry massive private data. As BIM develops and spreads, it will inevitably push the building industry into the big-data era. Each building is composed of many components, and each component carries rich geometric, physical, and other information; as the project progresses, it also carries information such as engineering quantities, cost, construction progress, and operation and maintenance conditions. Only by establishing a database that collects the massive information produced over the whole life cycle of the project can the overall state of the project be reflected promptly and accurately as the data is adjusted, added to, and modified. The correlations within the data can then speed up decision making and improve decision quality, thereby improving project quality, reducing project cost, and increasing project profit. At the same time, this presents challenges for storing, managing, and using massive amounts of BIM data.
Existing distributed data storage methods are designed for general-purpose data and cannot meet the processing requirements of BIM data, in particular massive component-level BIM data, so those skilled in the art need to solve the corresponding technical problems.

Disclosure of Invention

The invention aims to solve at least the technical problems existing in the prior art, and in particular provides a novel distributed storage method for massive BIM data.

In order to achieve the above object of the present invention, the present invention provides a distributed storage method for massive BIM data, which is characterized by comprising the following steps:

S1, acquiring BIM data, classifying it, and dividing the classified data between a shared database and a private database;

S2, storing the different types of data in their respective databases, dividing the data into data sets, and classifying the divided data sets;

S3, screening the classified data sets, identifying duplicate data with a screening model, and compressing the data.

According to the above technical solution, the step S1 preferably includes:

S1-1, the sources of the common data include categories of data that can be legally obtained from big-data websites, and the data is classified by category;

S1-2, creating a master database and slave databases for the shared data, storing the BIM shared data on the master database server, and performing all add, delete, and update operations only on the master database; the data is then mirrored or synchronized to the slave databases;

S1-3, splitting the private data according to rules: the private data is divided according to the project to which it belongs, and BIM data belonging to the same project is stored on the same database server; each project has a unique number, i.e., an ID, and projects are assigned to different databases by ID range.

According to the above technical solution, the S2 preferably includes:

S2-1, storing the correspondence between BIM projects and database servers in a database; when a BIM project is created, the database is first queried for an existing correspondence record; if none exists, the database server to which the project belongs is determined from the project ID according to the rule, and the correspondence is stored in the database for subsequent queries; if a record already exists, nothing is done, which avoids repeated creation;

S2-2, when a project is accessed, looking up the corresponding database server information by project ID and then establishing a connection to that database;

S2-3, uniquely identifying each database table by combining project information with table information, using the form "table name prefix + _ + project ID + _ + table name".

According to the above technical solution, the step S3 preferably includes:

S3-1, providing a database middle layer that hides the implementation details of the underlying database from the application layer;

S3-2, defining data structures according to the database table structures, and adding tag information to each data structure to indicate whether the data is common data or private data;

S3-3, all operations of the application layer on the database must go through the database middle layer;

S3-4, when the database middle layer receives an operation request from the application layer, it first determines whether the operation targets common data or private data; if it targets common data, the middle layer first obtains the server information for the shared data, then establishes a connection to that database and accesses the table data directly through the connection;

if it targets private data, the application layer must set the ID of the project to be accessed before using the database middle layer; afterwards, database accesses do not need to carry the project ID until a different project is to be accessed.

According to the above technical solution, preferably, the S3 further includes:

S3-5, according to the project ID, the database middle layer looks up the database server holding the private data in the project-to-server mapping table, establishes a connection to that server, and performs all database operations of the application layer through that connection;

S3-6, the database middle layer obtains the table name from the application layer request and generates the complete table name; when the middle layer receives an operation request from the application layer, it takes the cached project ID and assembles the complete table name in the form given in S2-3;

the database middle layer automatically generates the final SQL statement from the project ID, the table name, and the operation, executes it over the database connection, and returns the result.

According to the above technical scheme, preferably, the screening method includes:

S-A, the attribute set A of the prior conditional probability distribution of the shared data attributes is given by

[equation not reproduced in this text]

where each element of A is the attribute value corresponding to design type a of the i-th storage part in the common data I of the BIM and to the m-th matching model;

the attribute set B of the prior conditional probability distribution of the private data attributes is given by

[equation not reproduced in this text]

where each element of B is the attribute value corresponding to design type b of the j-th storage part in the private data J of the BIM and to the n-th matching model;

the conditional probabilities Q(A|C) and Q(B|C) are defined with the class attribute C of the BIM data as the condition, and are calculated as follows,

[equation not reproduced in this text]

where Q(A, C) denotes the joint probability distribution of the sets A and C, and traversing the values of A and C yields the conditional probability distribution Q(A|C); Q(B, C) denotes the joint probability distribution of the sets B and C, and traversing the values of B and C yields the conditional probability distribution Q(B|C); Q(A) is the attribute condition of attribute set A, Q(B) is the attribute condition of attribute set B, and Q(C) corresponds to the class attribute C.
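For orientation, the textbook definition of conditional probability relates the joint and conditional distributions named above as follows. This is the standard identity, stated under the assumption that Q(C) is the marginal probability of the class attribute; it is not necessarily the exact formula of the filing, whose equation does not survive in this text.

```latex
Q(A \mid C) = \frac{Q(A, C)}{Q(C)}, \qquad
Q(B \mid C) = \frac{Q(B, C)}{Q(C)}, \qquad Q(C) > 0
```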

According to the above technical scheme, preferably, the screening method further includes:

S-B, for each attribute node in attribute set A and each attribute node in attribute set B, the conditional function Q(A, B|C) is calculated as follows:

[equation not reproduced in this text]

after the joint probability distributions Q(A, C) and Q(B, C) have been calculated, the values of A, B, and C are traversed, the derivative of the probability distribution is taken, and the result is multiplied by the sum of the attribute values of the common data and the private data.

According to the above technical scheme, preferably, the screening method further comprises:

S-C, when conditional screening is performed on the class attribute node C in the BIM data, particle swarms M and N are set according to the construction capacity of the common data and the private data in the BIM data; the common-data particle swarm M is required to satisfy the condition

[equation not reproduced in this text]

where s is a random real number between 0 and 1, t is the number of common-data particle swarms whose condition is zero, and u is the deviation value, together with a limiting-condition weight that limits the number of particle swarms; after the common data has been screened, the private-data particle swarm N is substituted into the private data of the BIM data, where

[equation not reproduced in this text]

μ is the nonlinear mapping condition coefficient, S_k is the predicted value of the k-th private-data matching model, U_k is the actual calculated value of the k-th private-data matching model, and k is a positive integer.

S-D, the screening function for class attribute C is calculated as $F(A, B, C) = g \cdot \prod Q(A, C) \cdot \prod Q(B, C) + F$, where $\prod Q(A, C)$ is the product of the conditional probabilities, given class attribute C, over all nodes of attribute A, and $\prod Q(B, C)$ is the corresponding product over all nodes of attribute B; useless data that needs to be removed from the private data in the BIM data is extracted by means of the conditional probabilities; g is the screening optimal solution and F is the evaluation coefficient.

In summary, due to the adoption of the technical scheme, the beneficial effects of the invention are as follows:

By effectively splitting the massive BIM data and classifying it through model calculation, the BIM data can be stored efficiently and quickly; distributed management makes reasonable use of the storage space, which improves storage efficiency and effectively saves storage space.

Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.

Drawings

The foregoing and/or additional aspects and advantages of the invention will become apparent and may be better understood from the following description of embodiments taken in conjunction with the accompanying drawings in which:

FIG. 1 is a general schematic of the present invention;

FIG. 2 is a schematic illustration of an embodiment of the present invention;

FIG. 3 is a schematic representation of an embodiment of the present invention;

FIG. 4 is a schematic diagram of an embodiment of the present invention.

Detailed Description

Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the invention.

As shown in figs. 1-4, the present invention aims to solve the problem of storing massive BIM data and provides a common data solution for design, construction, calculation, cost management, operation and maintenance, and the like.

The invention discloses a distributed storage method of massive BIM data, which comprises the following steps:

Step S1 comprises:

S1-1, only one copy of the common data exists in the whole system; for access speed, the data is mirrored, and under heavy concurrent access the requests for shared data are distributed across different database instances according to load; the sources of the common data include categories of data that can be legally obtained from big-data websites, and the data is classified by category;

Step S2 comprises:

S2-1, storing the correspondence between projects and database servers in a database; when a project is created, the database is first queried for an existing correspondence record; if none exists, the database server to which the project belongs is determined from the project ID according to the rule, and the correspondence is stored in the database for subsequent queries; if a record already exists, nothing is done, which avoids repeated creation;

Step S3 comprises:

if the data is private data, the application layer must set the ID of the project to be accessed before using the database middle layer; afterwards, database accesses do not need to carry the project ID until a different project is to be accessed;

The screening method comprises the following steps:

statistical information about the BIM data is calculated from the prior probability distributions of the attributes in the BIM data;

[equations for the attribute sets A and B not reproduced in this text]

where Q(A, C) denotes the joint probability distribution of the sets A and C, and traversing the values of A and C yields the conditional probability distribution Q(A|C); Q(B, C) denotes the joint probability distribution of the sets B and C, and traversing the values of B and C yields the conditional probability distribution Q(B|C); Q(A) is the attribute condition of attribute set A, Q(B) is the attribute condition of attribute set B, and Q(C) corresponds to the class attribute C;

after the joint probability distributions Q(A, C) and Q(B, C) have been calculated, the values of A, B, and C are traversed, the derivative of the probability distribution is taken, and the result is multiplied by the sum of the attribute values of the common data and the private data, which yields preliminary screening conditions for the BIM data;

a limiting-condition weight is used to limit the number of particle swarms; after the common data has been screened, the private-data particle swarm N is substituted into the private data of the BIM data and the conditions for matching and screening are set, yielding a complete network for screening and sorting the BIM data, where

[equation not reproduced in this text]

μ is the nonlinear mapping condition coefficient, S_k is the predicted value of the k-th private-data matching model, U_k is the actual calculated value of the k-th private-data matching model, and k is a positive integer;

S-D, the screening function for class attribute C is calculated as $F(A, B, C) = g \cdot \prod Q(A, C) \cdot \prod Q(B, C) + F$, where $\prod Q(A, C)$ is the product of the conditional probabilities, given class attribute C, over all nodes of attribute A, and $\prod Q(B, C)$ is the corresponding product over all nodes of attribute B; useless data that needs to be removed from the private data in the BIM data is extracted by means of the conditional probabilities; with g as the screening optimal solution and F as the evaluation coefficient, the screening function screens the BIM data under the constraints of the screening optimal solution and the evaluation coefficient, and the valid BIM data is then compressed and stored in the database.
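A minimal Python sketch of how a score of this form could be evaluated for a single candidate record. It assumes the conditional probabilities of the record's attribute nodes given the class attribute are already available as lists of floats. The names `screening_score`, `g`, and `f` (standing in for the evaluation coefficient F) are illustrative, and the source does not say how g and F are chosen or which threshold separates useful from useless data.

```python
from math import prod

def screening_score(q_a_given_c, q_b_given_c, g, f):
    """Evaluate g * prod(Q(a|C)) * prod(Q(b|C)) + f for one record.

    q_a_given_c: conditional probabilities of the common-data attribute nodes
    q_b_given_c: conditional probabilities of the private-data attribute nodes
    g, f: hypothetical stand-ins for the screening optimal solution and the
          evaluation coefficient; their values are assumptions here.
    """
    return g * prod(q_a_given_c) * prod(q_b_given_c) + f
```

A record whose attribute values are unlikely under the class attribute receives a score close to `f`, which is one plausible way such a function could flag data for removal.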

Referring to fig. 1, a flowchart of an embodiment of a method for storing mass BIM data in a distributed manner according to the present invention is shown, where the method includes steps 101 to 105.

In step 101, the BIM data is divided into common data and private data.

In this embodiment, the common data includes, but is not limited to:

National codes and standards, e.g. the code for bill-of-quantities pricing of construction works, the Pingfa (plane representation method) atlases, etc.;

City and district directory;

Organization directory, e.g. developers, design firms, construction contractors, etc.;

Material library, including materials for civil construction, mechanical and electrical installation, and other disciplines;

Price library, including price information for each material in each region and each time period.

In step 102, the common data is stored separately. Further, as shown in fig. 2, a master database and slave databases are created for the shared data; the BIM shared data is stored on the master database server, and all add, delete, and update operations are performed only on the master database; the data is then mirrored or synchronized to the slave databases to improve performance and access speed.
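The master/slave split for shared data can be sketched as a small router that sends every write to the master and spreads reads over the mirrors. This is only an illustration: the server names are hypothetical, and plain round-robin is used here as a stand-in for whatever load-based distribution a real deployment would apply.

```python
import itertools

class SharedDataRouter:
    """Routes writes to the master and spreads reads over the slave copies."""

    WRITE_OPERATIONS = {"insert", "delete", "update"}

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master when no slave database is configured.
        self._read_cycle = itertools.cycle(slaves or [master])

    def server_for(self, operation):
        # Adding, deleting and updating are only performed on the master.
        if operation in self.WRITE_OPERATIONS:
            return self.master
        if operation == "select":
            # Round-robin is an assumption; the text only says "according to load".
            return next(self._read_cycle)
        raise ValueError(f"unsupported operation: {operation}")

router = SharedDataRouter("shared-master", ["shared-slave-1", "shared-slave-2"])
```

Replication from the master to the slaves is left to the database itself in this sketch; the router only decides which server a request goes to.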

In step 103, the private data is split according to rules. As an example of this embodiment, as shown in fig. 2, the private data is divided according to the project to which it belongs, and BIM data belonging to the same project is stored on the same database server. Each project has a unique number, i.e., an ID, and projects are assigned to different databases by ID range: for example, IDs 1-1000000 go to database server 1, IDs 1000001-2000000 to database server 2, IDs 2000001-3000000 to database server 3, and so on.
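The ID-range rule in this example reduces to integer division. The following sketch assumes a fixed section width of 1,000,000 IDs per server, as in the ranges quoted above; the function name `shard_for_project` is illustrative.

```python
SHARD_SIZE = 1_000_000  # IDs per database server, taken from the example ranges

def shard_for_project(project_id: int) -> int:
    """Return the 1-based database server number for a project ID."""
    if project_id < 1:
        raise ValueError("project IDs start at 1")
    # IDs 1..1000000 -> server 1, 1000001..2000000 -> server 2, and so on.
    return (project_id - 1) // SHARD_SIZE + 1
```

Subtracting 1 before dividing keeps the upper boundary of each range (for example ID 1000000) on the lower-numbered server.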

Further, the correspondence between projects and database servers is stored in a database.

When a project is created, the database is first queried for an existing correspondence record; if none exists, the database server to which the project belongs is determined from the project ID according to the rule, and the correspondence is stored in the database for subsequent queries; if a record already exists, nothing is done, which avoids repeated creation.

When a project is accessed, the corresponding database server information is looked up by project ID, and a connection to that database is then established.
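The create-once, look-up-later behaviour of the project-to-server mapping can be sketched with an in-memory SQLite table. The table name `project_server`, the `db-server-N` naming, and the helper functions are assumptions made for this illustration, and the ID-range rule is the one from the example ranges.

```python
import sqlite3

SHARD_SIZE = 1_000_000  # assumed section width, matching the example ID ranges

def open_mapping_db():
    """Create an in-memory database holding the project-to-server mapping."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE project_server ("
        "project_id INTEGER PRIMARY KEY, server TEXT NOT NULL)"
    )
    return conn

def register_project(conn, project_id):
    """Store the mapping when a project is created; do nothing if it exists."""
    row = conn.execute(
        "SELECT server FROM project_server WHERE project_id = ?", (project_id,)
    ).fetchone()
    if row is not None:
        return row[0]  # record already exists: avoid repeated creation
    server = f"db-server-{(project_id - 1) // SHARD_SIZE + 1}"
    conn.execute("INSERT INTO project_server VALUES (?, ?)", (project_id, server))
    return server

def server_for_project(conn, project_id):
    """Look up the server for an existing project when it is accessed."""
    row = conn.execute(
        "SELECT server FROM project_server WHERE project_id = ?", (project_id,)
    ).fetchone()
    if row is None:
        raise KeyError(project_id)
    return row[0]
```

Persisting the mapping, rather than recomputing it from the ID on every access, means the lookup keeps returning the recorded server even if the assignment rule is changed later.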

In step 104, as an example of this embodiment, each project contains massive, rich data such as design, engineering quantities, and progress; each kind of data has a different structure and must be stored in a different database table. At the same time, each database stores the data of multiple projects, so tables with the same structure that belong to different projects must be given different table names.

Specifically, each database table is uniquely identified by combining project information with table information. As an example of this embodiment, the form "table name prefix + _ + project ID + _ + table name" is used. If the project ID is 0000001 and the table that stores the information of each component is elementinfo, the final table name in the database is proj_0000001_elementinfo.
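The naming rule can be written as a one-line helper. The default prefix `proj` is taken from the example table name; the function name and the character check are assumptions added for this sketch.

```python
import re

_SAFE_PART = re.compile(r"[A-Za-z0-9]+")

def full_table_name(project_id: str, table: str, prefix: str = "proj") -> str:
    """Build 'prefix_projectID_table' for a project (item) ID and a base table name."""
    for part in (prefix, project_id, table):
        # Reject anything that could not be used safely as part of an identifier.
        if not _SAFE_PART.fullmatch(part):
            raise ValueError(f"invalid name part: {part!r}")
    return f"{prefix}_{project_id}_{table}"
```

Keeping the ID as a zero-padded string preserves the leading zeros shown in the example.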

Further, a database middle layer is provided that hides the implementation details of the underlying database from the application layer.

First, data structures are defined according to the database table structures;

tag information is added to each data structure to indicate whether the data is common data or private data;

all operations of the application layer on the database must go through the database middle layer;

when the database middle layer receives an operation request from the application layer, it first determines whether the operation targets common data or private data;

if it targets common data, the middle layer first obtains the server information for the shared data, then establishes a connection to that database and accesses the table data directly through the connection;

if it targets private data, the application layer must set the ID of the project to be accessed before using the database middle layer; afterwards, database accesses do not need to carry the project ID until a different project is to be accessed;

according to the project ID, the database middle layer looks up the database server holding the private data in the project-to-server mapping table, establishes a connection to that server, and then performs all database operations of the application layer through that connection;

the database middle layer obtains the table name from the application layer request and generates the complete table name; for example, when the table elementinfo of the project with ID 0000001 is accessed, the complete table name is proj_0000001_elementinfo;

when the database middle layer receives an operation request from the application layer, it takes the cached project ID and assembles the complete table name in the form described above.
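The routing behaviour of the middle layer can be sketched as a small class that caches the current project ID and resolves each request to a server and a full table name. The class, its method names, and the server names are hypothetical; real connections are replaced by server names so the sketch stays self-contained.

```python
class DatabaseMiddleLayer:
    """Resolves application requests to (server, SQL) without exposing the layout."""

    def __init__(self, shared_server, project_servers):
        self.shared_server = shared_server
        self.project_servers = project_servers  # project ID -> database server
        self._project_id = None                 # cached until another project is set

    def set_project(self, project_id):
        if project_id not in self.project_servers:
            raise KeyError(project_id)
        self._project_id = project_id

    def resolve(self, table, is_private):
        """Return (server, full table name) for a request on `table`."""
        if not is_private:
            # Common data: one shared server, table name used as-is.
            return self.shared_server, table
        if self._project_id is None:
            raise RuntimeError("set_project() must be called before private access")
        server = self.project_servers[self._project_id]
        return server, f"proj_{self._project_id}_{table}"

    def select_all_sql(self, table, is_private):
        # Table names are assumed to be trusted identifiers in this sketch.
        server, full_name = self.resolve(table, is_private)
        return server, f"SELECT * FROM {full_name}"
```

The `is_private` flag plays the role of the tag information attached to each data structure, so the application never passes a server name or a prefixed table name itself.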

In this embodiment, the types of BIM data are very rich, including but not limited to numeric data, text data, and graphic data; these data can be divided into:

Simple types, such as numbers and text, which can be stored in the database directly as strings;

Complex types, such as graphics or combinations of simple types, which must be converted to strings before being stored in the database.

In step 105, the database is first searched for the data; if it exists, the corresponding storage location is returned directly; otherwise, a new record is created and its storage location is returned. In this embodiment, the storage location of the data is mainly represented by an ID and a keyword.
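The look-up-or-create rule of step 105 can be sketched with SQLite, using the row ID as the storage location. The table name `value_store` and the function `store` are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE value_store (id INTEGER PRIMARY KEY, content TEXT UNIQUE NOT NULL)"
)

def store(content: str) -> int:
    """Return the ID of `content`, inserting a new record only if it is absent."""
    row = conn.execute(
        "SELECT id FROM value_store WHERE content = ?", (content,)
    ).fetchone()
    if row is not None:
        return row[0]  # already stored: reuse the existing location
    cursor = conn.execute("INSERT INTO value_store (content) VALUES (?)", (content,))
    return cursor.lastrowid
```

Because every caller receives the same ID for identical content, repeated values occupy a single row however many places refer to them.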

As an example of this embodiment, as shown in fig. 3, to further reduce the storage space for simple-type data, the data is first defined in one place, and the places that use it are associated with it by keyword; as shown in fig. 4, the data is sorted, so that identical content is not stored repeatedly merely because its elements appear in a different order.

As an example of this embodiment, complex-type graphic data is classified into the following two kinds:

basic graphics, including straight line segments, arcs, ellipses, Bezier curves, etc.;

complex graphics, composed of multiple basic graphics of different types.

A basic graphic can be expressed by a graphic type plus numerical parameters. The string "["straight line", ["start point", "end point"]]" completely expresses a straight line segment, and the string "["arc", ["origin", "start point", "end point"]]" completely expresses an arc.

A complex graphic can be decomposed into multiple basic graphics, so its storage format can be converted into a combination of basic graphics.
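The string encoding of graphics maps naturally onto JSON arrays. In the sketch below the type tags `line`, `arc`, and `complex` and the helper names are assumptions; points are written as `[x, y]` pairs in place of the symbolic "start point" and "end point".

```python
import json

def line(start, end):
    """A straight line segment: type tag plus its numerical parameters."""
    return ["line", [start, end]]

def arc(origin, start, end):
    """An arc described by its origin, start point and end point."""
    return ["arc", [origin, start, end]]

def complex_shape(*parts):
    """A complex graphic stored as a combination of basic graphics."""
    return ["complex", list(parts)]

def to_string(shape):
    """Convert a graphic to the string that is stored in the database."""
    return json.dumps(shape)
```

For example, `to_string(line([0, 0], [1, 0]))` produces a single string, and `json.loads` restores the nested structure when the graphic is read back.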

As an example of this embodiment, each project contains a huge number of components, ranging from tens of thousands to millions; each component carries multiple attributes, from a few to several tens; meanwhile, components of the same type have the same number of attributes and the same attribute values.

Further, the collection of all attributes of a component is treated as one complex-type datum, converted into a string that represents a collection of simple-type data, and stored in a designated table.

Where the component's attribute set would be stored, a reference to the actual storage location of the attribute set is stored instead.

When an attribute value of a component changes, a new attribute-set string is generated by the above method and compared with the existing records in the database; if a matching record exists, its location is written to the place where the component's attribute set is stored; otherwise, a new record is created and the location of the new record is written there.
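The shared attribute-set scheme can be sketched in memory: each distinct attribute set is stored once as a string, and every component holds only a reference to it. The class and method names are hypothetical, and sorting the keys before serialization is an assumption that makes two sets with the same content produce the same string.

```python
import json

class AttributeSetStore:
    """Stores each distinct attribute set once; components keep a reference."""

    def __init__(self):
        self._ids = {}           # attribute-set string -> record ID
        self._records = []       # record ID -> attribute-set string
        self.component_ref = {}  # component ID -> record ID

    def _intern(self, attrs):
        # Sorted keys give one canonical string per attribute set.
        key = json.dumps(attrs, sort_keys=True, ensure_ascii=False)
        if key not in self._ids:
            self._ids[key] = len(self._records)
            self._records.append(key)
        return self._ids[key]

    def set_attributes(self, component_id, attrs):
        self.component_ref[component_id] = self._intern(attrs)

    def get_attributes(self, component_id):
        return json.loads(self._records[self.component_ref[component_id]])

    def update_attribute(self, component_id, name, value):
        # Build the new set, then point the component at a matching or new record.
        attrs = self.get_attributes(component_id)
        attrs[name] = value
        self.set_attributes(component_id, attrs)

    def record_count(self):
        return len(self._records)
```

An update never modifies a stored record in place, so other components that share the old attribute set are unaffected.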

While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the invention, the scope of which is defined by the claims and their equivalents.

Claims

1. A distributed storage method for massive BIM data, characterized by comprising the following steps:

S1-3, splitting the private data according to rules: the private data is divided according to the project to which it belongs, and BIM data belonging to the same project is stored on the same database server; each project has a unique number, i.e., an ID, and projects are assigned to different databases by ID range;

S2-1, storing the correspondence between BIM projects and database servers in a database; when a BIM project is created, the database is first queried for an existing correspondence record; if none exists, the database server to which the project belongs is determined from the project ID according to the rule of S1-3, and the correspondence is stored in the database for subsequent queries; if a record already exists, nothing is done, which avoids repeated creation;

S2-3, uniquely identifying each database table by combining project information with table information, using the form "table name prefix + _ + project ID + _ + table name";

S3, screening the classified data sets, identifying duplicate data with a screening model, and compressing the data;

2. The method for distributed storage of massive BIM data according to claim 1, wherein the S3 further includes:

S3-6, the database middle layer obtains the table name from the application layer request and generates the complete table name; when the database middle layer receives an operation request from the application layer, it takes the cached project ID and assembles the complete table name from it and the table name obtained from the application layer request;

3. The method for distributed storage of massive BIM data according to claim 1, wherein the screening method includes:

4. A method of distributed storage of massive BIM data according to claim 3, wherein the screening method further includes:

after the joint probability distributions Q(A, C) and Q(B, C) have been calculated, the values of A, B, and C are traversed, the derivative of the probability distribution is taken, and the result is multiplied by the sum of the attribute values of the common data and the private data.

5. A method of distributed storage of massive BIM data according to claim 3, wherein the screening method further includes:

6. A method of distributed storage of massive BIM data according to claim 3, wherein the screening method further includes: