CN115828926B - Construction quality hidden danger data mining model training method and mining system - Google Patents

Construction quality hidden danger data mining model training method and mining system

Info

Publication number
CN115828926B
Authority
CN
China
Prior art keywords
model
quality hidden
hidden danger
construction quality
data mining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211522702.7A
Other languages
Chinese (zh)
Other versions
CN115828926A (en)
Inventor
钟波涛
潘杏
骆汉宾
胡啸威
沈罗昕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a construction quality hidden danger data mining model training method and a mining system. The training method comprises the following steps: training first, second and third deep learning network models locally, wherein the three base models determine word semantic weights based, respectively, on local features; on PMI values between words together with tf-idf values between words and documents; and on keywords; integrating the three base models to obtain a local data mining model; encrypting the parameters of each local mining model, uploading the encrypted parameters to the InterPlanetary File System (IPFS) through a blockchain, and storing their hash values in the blockchain; computing a weighted average of the encrypted model parameters stored in IPFS by means of the federated averaging algorithm, after which each local user downloads the updated model parameters through the blockchain; and repeating the training and parameter-sharing process until the model converges. Integrating three complementary base classifiers yields the data mining model, and the blockchain enables the parameters of multiple local models to be shared, which improves the generalization capability and classification precision of the data mining model.

Description

Construction quality hidden danger data mining model training method and mining system
Technical Field
The invention belongs to the technical field of construction quality diagnosis and control, and particularly relates to a construction quality hidden danger data mining model training method and a mining system.
Background
In recent years, quality and safety accidents in building construction have occurred frequently, causing both casualties and large economic losses. Engineering quality problems are unavoidable during the construction of building products, and engineering quality affects the serviceability of the works, the return on investment of the construction project, and the safety of people's lives and property. On-site inspection is an important part of the supervisor's control of engineering quality. When problems are found during construction, the supervisor usually notifies the construction unit in writing by means of a quality hidden danger correction sheet so that the problems are corrected in time. These correction sheets play a major role in ensuring that construction quality meets the standard, eliminating construction quality hidden dangers and safeguarding the quality of the main building structure, and they remain a necessary tool for lean construction until intelligent and digital construction are realized.
The quality hidden danger correction sheet is a procedural form issued by the supervision unit to the construction unit to correct quality problems in a construction project. It contains rich quality problem information, and acquiring and using this knowledge helps engineering personnel improve the quality control of construction projects. Because the information is unstructured text scattered across many correction sheets, acquiring and analyzing it is time-consuming and labor-intensive. As a result, engineering personnel cannot make effective use of existing knowledge as a reference for quality management, which in turn affects the timeliness and accuracy of quality control and decision-making in building projects. Knowledge modeling and information extraction for the unstructured engineering text that describes quality hidden dangers are therefore significant for improving the efficiency of engineering text management, making better use of implicit knowledge and increasing engineering benefits.
Disclosure of Invention
Aiming at the defects or improvement demands of the prior art, the invention provides a construction quality hidden danger data mining model training method and a mining system, and aims to quickly classify engineering problems in engineering construction quality hidden danger correction reports and improve engineering text management efficiency.
In order to achieve the above object, according to an aspect of the present invention, there is provided a construction quality hidden danger data mining model training method, comprising:
step S1: using local data, respectively training a first deep learning network model, a second deep learning network model and a third deep learning network model, each of which can classify the engineering problems corresponding to the quality hidden danger descriptions in engineering construction quality hidden danger correction reports, wherein the first deep learning network model determines word semantic weights by analyzing local features of the quality hidden danger description, the second deep learning network model determines word semantic weights by analyzing PMI values between words of the quality hidden danger description and tf-idf values between words and documents, and the third deep learning network model determines word semantic weights by extracting keywords of the quality hidden danger description;
step S2: integrating the first deep learning network model, the second deep learning network model and the third deep learning network model trained at each local site to obtain the corresponding local construction quality hidden danger data mining model;
step S3: encrypting the parameters of each local construction quality hidden danger data mining model by a local differential privacy technique, uploading the encrypted model parameters to the InterPlanetary File System (IPFS) through a blockchain, and storing the hash values of the encrypted model parameters in the blockchain;
step S4: computing a weighted average of the plurality of encrypted model parameters stored in IPFS by means of the federated averaging algorithm to form updated model parameters, storing the updated model parameters in IPFS, and storing the hash values of the updated model parameters on the blockchain;
step S5: each local user downloading the updated model parameters from IPFS through the blockchain;
step S6: judging whether all the local construction quality hidden danger data mining models with updated parameters have converged; if not, continuing local training of the non-converged construction quality hidden danger data mining models until they converge, and jumping to step S3; if yes, ending the current training.
In one embodiment, in step S6, if all the local construction quality hidden danger data mining models with updated parameters have converged, the current training is ended and the process jumps to step S7:
step S7: performing model evaluation on the trained model; if the model meets the standard, outputting the final model; if the model does not meet the standard, adding local training samples and jumping to step S3 to continue training.
In one embodiment, the model evaluation includes an evaluation of precision, recall and F1 values.
In one embodiment, prior to model training, the following steps are performed:
collecting a certain number of local engineering construction quality hidden danger correction reports, and labeling each quality hidden danger description with a predefined hidden danger label;
segmenting the quality hidden danger description into words and converting the segmented description into a word vector matrix to obtain training data;
wherein, during model training, the predicted engineering problem classification is made to approach the corresponding hidden danger label.
In one embodiment, segmenting the quality hidden danger description and converting it into a word vector matrix comprises:
segmenting the quality hidden danger description by means of the jieba word segmentation tool;
converting the segmented quality hidden danger description into a word vector matrix by means of Word2vec word vectors.
In one embodiment, the first deep learning network model comprises a convolution layer, a ReLU activation function and a max-pooling layer, and determines word semantic weights by analyzing local features;
the second deep learning network model comprises a structure for calculating tf-idf values between words and documents and PMI values between words, and determines word semantic weights from these tf-idf and PMI values;
the third deep learning network model comprises a bidirectional LSTM, a tanh nonlinear activation function and an attention mechanism module, and determines word semantic weights by extracting keywords.
In one embodiment, in step S2, the weights of the first, second and third deep learning network models are optimized by a sequential quadratic programming algorithm, and the three models are then integrated using a stacking strategy to obtain the construction quality hidden danger data mining model.
In one embodiment, the local data for the current training is stored in the blockchain while the hash value of each local construction quality hazard data mining model parameter is stored in the blockchain.
According to another aspect of the present invention, there is provided a construction quality hazard data mining system comprising a data acquisition module and a construction quality hazard data mining model, wherein,
the data acquisition module is used for collecting quality hidden danger description in the engineering construction quality hidden danger correction report, preprocessing data and inputting the construction quality hidden danger data mining model;
the construction quality hidden danger data mining model is a construction quality hidden danger data mining model trained based on the construction quality hidden danger data mining model training method and is used for classifying engineering problems corresponding to quality hidden danger description in an engineering construction quality hidden danger correction report.
In one embodiment, the system further comprises a data management module for analyzing the spatio-temporal distribution characteristics of quality hidden dangers after classification results have been output for a plurality of different engineering construction quality hidden danger correction reports.
In general, the above technical solutions conceived by the present invention, compared with the prior art, enable the following beneficial effects to be obtained:
(1) Considering that construction quality hidden danger records vary in length and are unstructured, a first, a second and a third deep learning network model are constructed, and each model is trained with local data to classify the engineering problems in engineering construction quality hidden danger correction reports. The three models emphasize different features when learning to classify. The first deep learning network model determines word semantic weights mainly by analyzing local features of the quality hidden danger description. The second deep learning network model determines word semantic weights from PMI values between words of the quality hidden danger description and tf-idf values between words and documents. The third deep learning network model determines word semantic weights mainly by extracting keywords of the quality hidden danger description. The weights in each model are optimized through training until the model converges and can classify. After training, the three deep learning network models are integrated as base classifiers to obtain the final data mining model. Because the three base classifiers are trained beforehand and complement one another, integration enhances the generalization capability of the data mining model and effectively improves its convergence speed and classification precision.
(2) After a plurality of local data mining models have been trained, blockchain technology is used to share the parameters of the local models. The shared parameters are used to further optimize each local model, local training continues, and the process is repeated until all local models converge. By sharing local model parameters through the blockchain, the invention widens the channels of data acquisition, ensures the reliability of data sources, and further improves the generalization capability and classification precision of the data mining model.
(3) To share model parameters, the invention first encrypts the model parameters by a local differential privacy technique, then uploads the encrypted parameters to the InterPlanetary File System (IPFS) through the blockchain and stores the hash values of the encrypted parameters in the blockchain. Differential privacy encryption together with broadcasting the hash values on the blockchain prevents an attacker from querying or tampering with the uploaded parameters, which avoids leakage of users' private information and improves the security and reliability of information throughout the training process.
(4) The trained construction quality hidden danger data mining model can automatically identify and classify the categories in engineering quality text (such as the membership project, the hidden danger problem and the hidden danger solution), which improves the efficiency of engineering text management. This in turn makes it possible to mine quality hidden danger association rules, reveal the spatio-temporal distribution characteristics of quality hidden dangers, and ultimately support decisions on quality diagnosis and control schemes.
Drawings
FIG. 1 shows some of the hidden danger labels in one embodiment;
FIG. 2 is a flow chart of steps of a construction quality hazard data mining model training method in one embodiment;
FIG. 3 is a schematic diagram of the storage and sharing of different local model parameters in one embodiment;
FIG. 4 is a schematic diagram of a construction quality hazard data mining model training method in one embodiment.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. In addition, the technical features of the embodiments of the present invention described below may be combined with each other as long as they do not collide with each other.
Before training, engineering construction quality hidden danger correction reports need to be collected and preprocessed so that they can be used to train the model. The preprocessing comprises labeling and conversion of the hidden danger descriptions, and specifically includes the following steps.
Step S001: collecting a certain number of local engineering construction quality hidden danger correction reports.
Specifically, the quality hidden trouble correction report is issued to the construction unit by the supervision unit to correct the quality problems in the construction project, and the quality hidden trouble correction report contains abundant quality problem information.
Step S002: having an expert mark each hidden danger description in the engineering construction quality hidden danger correction reports with the corresponding hidden danger label.
The hidden danger labels for engineering construction quality are defined first, and each hidden danger description in a report is then marked with the corresponding label. A hidden danger label is the engineering problem classification of the hidden danger, and the classification can be single-level or multi-level. FIG. 1 shows some of the hidden danger labels in an embodiment: there are a plurality of membership projects, each membership project is divided into a plurality of specific hidden danger problems, and each hidden danger problem corresponds to a hidden danger solution. The objective of training the construction quality hidden danger data mining model is that, when a hidden danger description from a construction quality hidden danger correction report is input, the model automatically determines which hidden danger problem of which membership project the description belongs to and gives the corresponding hidden danger solution. For example, when the input quality hidden danger description is "the waterproof layer of the bathroom is not done, the leakage condition is serious", the model automatically identifies the membership project class "waterproof engineering" and the hidden danger problem class "waterproof layer bulge and crack".
Specifically, the hidden danger labels for engineering construction quality are defined according to documents, or parts of documents, related to the supervision and inspection of engineering construction quality in the field of building engineering. Sources include: the terminology in current construction engineering standards (such as the unified standard for inspection and acceptance of construction quality of construction engineering (GB 50300-2013) and the evaluation standard for construction quality of construction engineering (GB/T 50375-2016)); the term definitions in various normative documents (such as manuals for preventing and controlling common quality defects in construction engineering and construction quality hidden danger correction procedures); summaries of expert experience; and related research literature.
Step S003: segmenting the quality hidden danger description into words and converting it into a word vector matrix to obtain training data.
Specifically, the quality hidden danger description is segmented by means of jieba word segmentation. For example, the description "the bathroom waterproof layer is not done, and the leakage condition is serious" is split into its individual word tokens. The segmented description is then converted into a word vector matrix by means of Word2vec word vectors. In this example, the segmented description is converted into a matrix W of shape [6, 128], that is, six tokens each represented by a 128-dimensional word vector, which gives training data that can be input into a neural network.
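The conversion from tokens to a word vector matrix can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the description is taken as already segmented (jieba is not called), `toy_vector` is a hypothetical deterministic stand-in for a trained Word2vec lookup, the English tokens are placeholders, and the vector size is reduced from 128 to 4 for readability.

```python
import hashlib

EMBED_DIM = 4  # illustrative only; the embodiment uses 128-dimensional vectors


def toy_vector(token: str, dim: int = EMBED_DIM) -> list[float]:
    """Deterministic stand-in for a trained Word2vec lookup (hypothetical)."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    # Map the first `dim` bytes of the digest to floats in [-1, 1].
    return [(b / 127.5) - 1.0 for b in digest[:dim]]


def to_matrix(tokens: list[str]) -> list[list[float]]:
    """Stack one vector per token, giving a [len(tokens), dim] matrix."""
    return [toy_vector(t) for t in tokens]


# Hypothetical English tokens standing in for a segmented description.
tokens = ["bathroom", "waterproof-layer", "not", "done", "leakage", "serious"]
matrix = to_matrix(tokens)
print(len(matrix), len(matrix[0]))  # rows = tokens, columns = vector size
```

With six tokens and real 128-dimensional vectors, the same stacking would give the [6, 128] shape mentioned in the example.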
After the training data is obtained, it can be divided into a training set and a test set, for example in a ratio of 0.7:0.3 or 0.8:0.2. The model is then trained so that its predicted engineering problem classification approaches the corresponding hidden danger label.
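A minimal sketch of such a split, assuming a simple shuffled split with a fixed seed; the function name `split_dataset`, the seed and the sample records are hypothetical and not part of the source.

```python
import random


def split_dataset(samples, train_ratio=0.7, seed=0):
    """Shuffle a copy of `samples` and split it into (train, test)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_ratio))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical (description, label) pairs.
samples = [(f"description {i}", f"label {i % 3}") for i in range(10)]
train_set, test_set = split_dataset(samples, train_ratio=0.7)
print(len(train_set), len(test_set))
```

Passing `train_ratio=0.8` gives the 0.8:0.2 variant.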
Fig. 2 is a flowchart showing steps of a training method for a construction quality hidden danger data mining model in an embodiment, which mainly includes the following steps.
Step S100: using local data, respectively training a first deep learning network model, a second deep learning network model and a third deep learning network model, each of which can classify the engineering problems corresponding to the quality hidden danger descriptions in the engineering construction quality hidden danger correction reports.
The first deep learning network (TextCNN) model extracts local features of the engineering construction quality hidden danger correction report and determines the word semantic weights of the quality hidden danger description by analyzing these local features in order to classify. The second deep learning network (TextGCN) model extracts global features of the report and determines word semantic weights by calculating PMI values between words of the quality hidden danger description and tf-idf values between words and documents in order to classify. The third deep learning network (TextRNN+Attention) model extracts key features of the report and determines word semantic weights by analyzing the keywords of the quality hidden danger description in order to classify.
Specifically, the first deep learning network model includes a convolutional layer, a ReLU activation function and a max-pooling layer. After a quality hidden danger description is input, the model analyzes its local features, that is, calculates the word semantic weights of the description, and the weights are optimized through training so that the classification result converges towards the expected label.
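The convolution, ReLU and max-pooling pipeline can be illustrated with a single filter in plain Python. This is only a sketch of the general TextCNN-style operation: the input matrix, the filter values and the helper name `conv_relu_maxpool` are hypothetical, and a real model would use many learned filters of several widths.

```python
def relu(x: float) -> float:
    return x if x > 0.0 else 0.0


def conv_relu_maxpool(matrix, kernel):
    """Slide `kernel` (k rows x dim cols) over the token axis of `matrix`,
    apply ReLU to each window response, then max-pool over all windows."""
    k = len(kernel)
    responses = []
    for start in range(len(matrix) - k + 1):
        window = matrix[start:start + k]
        total = sum(
            w * x
            for k_row, m_row in zip(kernel, window)
            for w, x in zip(k_row, m_row)
        )
        responses.append(relu(total))
    return max(responses)


# Hypothetical 4-token x 2-dimension input and one 2-token-wide filter.
matrix = [[1.0, 0.0], [0.0, 2.0], [-1.0, 1.0], [0.5, 0.5]]
kernel = [[1.0, -1.0], [0.5, 0.5]]
feature = conv_relu_maxpool(matrix, kernel)
print(feature)
```

Each filter yields one pooled feature; the features from all filters would be concatenated and passed to a classification layer.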
The second deep learning network model has a structure for calculating tf-idf and PMI values. After a quality hidden danger description is input, the model analyzes its global features, that is, calculates tf-idf values between words and documents and PMI (Point-wise Mutual Information) values between words, and determines word semantic weights based on these values; the weights are optimized through training so that the classification result converges towards the expected label.
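The two statistics can be sketched on a toy corpus as follows. The formulas used here are common textbook definitions chosen as an assumption (raw term frequency, idf = log(N / df), and PMI computed with each document treated as one co-occurrence window); the source does not specify the exact variants, and the corpus and function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def tf_idf(docs):
    """Return {(doc_index, word): tf-idf} using raw term frequency and
    idf = log(N / document_frequency)."""
    n_docs = len(docs)
    df = Counter(word for doc in docs for word in set(doc))
    weights = {}
    for i, doc in enumerate(docs):
        counts = Counter(doc)
        for word, count in counts.items():
            tf = count / len(doc)
            weights[(i, word)] = tf * math.log(n_docs / df[word])
    return weights


def pmi(docs):
    """Return {(word_a, word_b): PMI}, treating each document as one
    co-occurrence window (a simplification of sliding windows)."""
    n = len(docs)
    single = Counter(word for doc in docs for word in set(doc))
    pair = Counter(
        tuple(sorted(p)) for doc in docs for p in combinations(set(doc), 2)
    )
    return {
        (a, b): math.log((c / n) / ((single[a] / n) * (single[b] / n)))
        for (a, b), c in pair.items()
    }


# Hypothetical segmented descriptions.
docs = [
    ["waterproof", "layer", "leak"],
    ["waterproof", "layer", "crack"],
    ["rebar", "spacing", "crack"],
]
weights = tf_idf(docs)
pmi_values = pmi(docs)
print(weights[(0, "leak")], pmi_values[("layer", "waterproof")])
```

In graph-based text classifiers of this kind, tf-idf values typically weight word-document edges and positive PMI values weight word-word edges.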
The third deep learning network model comprises a bidirectional LSTM, a tanh nonlinear activation function and an attention mechanism module. It extracts the keywords in the quality hidden danger description and determines word semantic weights, and the weights are optimized through training so that the classification result converges towards the expected label.
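The attention step can be sketched in isolation, assuming the bidirectional LSTM has already produced one hidden state per token. The scoring form (a dot product of an attention vector with tanh of each hidden state, followed by softmax) is one common formulation chosen for illustration; the hidden states, the vector `w` and the function name are hypothetical.

```python
import math


def attention_pool(hidden_states, w):
    """Score each hidden state with w . tanh(h_t), softmax the scores, and
    return (weights, weighted sum of hidden states)."""
    scores = [
        sum(wi * math.tanh(hi) for wi, hi in zip(w, h)) for h in hidden_states
    ]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(hidden_states[0])
    context = [
        sum(a * h[d] for a, h in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return weights, context


# Hypothetical hidden states for three tokens (2 dimensions each).
hidden_states = [[0.1, 0.0], [2.0, 1.5], [0.2, -0.3]]
w = [1.0, 1.0]  # hypothetical learned attention vector
weights, context = attention_pool(hidden_states, w)
print(weights)
```

Tokens with larger attention weights contribute more to the pooled representation, which is how keywords in a description come to dominate the classification.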
Step S200: integrating the first deep learning network model, the second deep learning network model and the third deep learning network model trained at each local site to obtain the corresponding local construction quality hidden danger data mining model.
The three trained deep learning network models are integrated as base classifiers. Specifically, the weights among the first, second and third deep learning network models are optimized by a Sequential Quadratic Programming (SQP) algorithm, and a stacking strategy is used to combine the local features of the first deep learning network (TextCNN) model, the global features of the second deep learning network (TextGCN) model and the key features of the third deep learning network (TextRNN) model to obtain the construction quality hidden danger data mining model. Because it combines the advantages of the three base classifiers, the integrated model has stronger generalization capability, higher classification precision and faster convergence in subsequent training.
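The idea of weighting three base classifiers can be sketched as below. Note the substitution: a coarse grid search over non-negative weights that sum to 1 is swapped in for SQP to keep the example dependency-free, and a plain weighted sum of class probabilities stands in for the stacking strategy. The validation samples and all function names are hypothetical.

```python
from itertools import product


def combine(probs_per_model, weights):
    """Weighted sum of per-class probabilities from the base classifiers."""
    n_classes = len(probs_per_model[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, probs_per_model))
        for c in range(n_classes)
    ]


def accuracy(samples, weights):
    """Fraction of samples whose highest combined score is the true label."""
    hits = 0
    for probs_per_model, label in samples:
        combined = combine(probs_per_model, weights)
        hits += combined.index(max(combined)) == label
    return hits / len(samples)


def search_weights(samples, steps=10):
    """Grid search over non-negative weights summing to 1 (a stand-in for
    the SQP optimisation named in the text)."""
    best_weights, best_acc = None, -1.0
    for i, j in product(range(steps + 1), repeat=2):
        if i + j > steps:
            continue
        weights = (i / steps, j / steps, (steps - i - j) / steps)
        acc = accuracy(samples, weights)
        if acc > best_acc:
            best_weights, best_acc = weights, acc
    return best_weights, best_acc


# Hypothetical validation samples: ([cnn, gcn, rnn] class probabilities, label).
samples = [
    ([[0.9, 0.1], [0.4, 0.6], [0.6, 0.4]], 0),
    ([[0.3, 0.7], [0.2, 0.8], [0.6, 0.4]], 1),
    ([[0.6, 0.4], [0.1, 0.9], [0.3, 0.7]], 1),
]
best_weights, best_acc = search_weights(samples)
print(best_weights, best_acc)
```

In this toy data no single base classifier labels all three samples correctly, while a suitable weighting does, which illustrates why complementary base classifiers are combined.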
Step S300: encrypting the parameters of each local construction quality hidden danger data mining model by a local differential privacy technique, uploading the encrypted model parameters to the InterPlanetary File System (IPFS) through a blockchain, and storing the hash values of the encrypted model parameters in the blockchain.
Specifically, this step can be divided into the following sub-steps:
Step S310: encrypting the trained local model parameters by means of a local differential privacy technique.
Step S320: constructing a blockchain network, namely a consortium chain network built on the Hyperledger platform.
Step S330: calling the locally trained model parameters using the Go language, uploading the encrypted local model parameters to IPFS through the blockchain network, and at the same time storing the hash values of the model parameters in the constructed blockchain network. Encrypted sharing of quality hidden danger data among different local sites is thereby realized through the blockchain-based P2P network.
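The sub-steps above can be sketched locally without any blockchain or IPFS calls. The source does not say which local differential privacy mechanism is used, so the Laplace mechanism here is an assumption, as are the sensitivity and epsilon values; the SHA-256 digest is a stand-in for the content hash that would be recorded on the chain. Python is used for illustration although the embodiment mentions Go.

```python
import hashlib
import json
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb(params, sensitivity, epsilon, rng):
    """Add Laplace noise with scale sensitivity / epsilon to each value."""
    scale = sensitivity / epsilon
    return [p + laplace_noise(scale, rng) for p in params]


def content_hash(params) -> str:
    """SHA-256 of a canonical JSON encoding; a stand-in for the content
    hash that would be stored on the blockchain."""
    payload = json.dumps(params, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


rng = random.Random(42)
local_params = [0.12, -0.53, 0.98]  # hypothetical model parameters
protected = perturb(local_params, sensitivity=1.0, epsilon=2.0, rng=rng)
digest = content_hash(protected)
print(digest)
```

Because the digest depends only on the uploaded content, any participant who later downloads the parameters can recompute it and compare it with the value stored on the chain to detect tampering.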
Step S400: computing a weighted average of the plurality of encrypted model parameters stored in IPFS by means of the federated averaging algorithm to form updated model parameters, storing the updated model parameters in IPFS, and storing the hash values of the updated model parameters on the blockchain.
The local model parameters stored in IPFS are retrieved through the blockchain and weighted and averaged, the global model parameters are updated by means of the federated averaging algorithm, the optimized model parameters are stored in the IPFS database, and their hash values are stored in the blockchain network and broadcast. In one embodiment, the federated average is a weighted average based on the number of training samples of each local model. For example, if the total number of samples of all engineering projects in the current federated iteration is $n$ and the number of training samples of the $k$-th engineering project in the current federated iteration is $n_k$, then the weight of the local model parameters of the $k$-th engineering project in the current federated average is $n_k/n$, and the weighted model parameters are aggregated to form the updated model parameters.
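The sample-count weighting can be written out as a short sketch. It operates on plain lists of numbers and omits the encryption, IPFS and blockchain steps; the parameter vectors and sample counts are hypothetical.

```python
def federated_average(local_params, sample_counts):
    """Weight the k-th parameter vector by n_k / n and sum element-wise."""
    n = sum(sample_counts)
    dim = len(local_params[0])
    return [
        sum((n_k / n) * params[d]
            for params, n_k in zip(local_params, sample_counts))
        for d in range(dim)
    ]


# Hypothetical parameters from three projects with 100, 300 and 600 samples.
local_params = [[1.0, 0.0], [2.0, 1.0], [3.0, 2.0]]
sample_counts = [100, 300, 600]
global_params = federated_average(local_params, sample_counts)
print(global_params)
```

Projects with more training samples pull the global parameters further towards their local values, which is the effect of the $n_k/n$ weight.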
Step S500: each local user downloads the updated model parameters from IPFS through the blockchain.
Step S600: judging whether all the local construction quality hidden danger data mining models with updated parameters have converged; if not, continuing local training of the non-converged construction quality hidden danger data mining models until they converge, and jumping to step S300; if yes, ending the current training.
As shown in fig. 3, a constructor, a manager, an owner and a government are the stakeholders of a building engineering project. A blockchain network is constructed with building engineering projects 1, 2, 3 and 4 as its nodes. Each project trains a data mining model on its local database (local databases 1, 2, 3 and 4); the trained local model parameters (CNN classification model parameters 1, 2, 3 and 4) are then encrypted by means of a local differential privacy technique and uploaded to an IPFS database, while the hash values of the parameters are stored in the constructed blockchain network. Encrypted sharing of quality hidden danger data among different projects is thus realized through the blockchain-based P2P network.
In one embodiment, after all models converge, the current training is ended and the process jumps to step S700.
Step S700: performing model evaluation on the trained model; if the model meets the standard, outputting the final model; if the model does not meet the standard, adding local training samples and jumping to step S300 to continue training.
Specifically, the model can be measured by the Precision, Recall and F1 indexes, which are calculated as follows:

$$\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN}, \qquad F1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$

where True Positive (TP) is the number of samples that are actually positive and predicted to be positive, False Positive (FP) is the number of samples that are actually negative but predicted to be positive, and False Negative (FN) is the number of samples that are actually positive but predicted to be negative.
The output Precision, Recall and F1 values are analyzed. If the values are within a preset range, the result obtained through deep learning processing is transmitted to the blockchain network; otherwise, newly collected samples are added and training continues until the values are within the preset range.
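The evaluation and threshold check can be sketched for a single class treated as positive. The labels, predictions and the 0.8 threshold are hypothetical; the source only states that the values must fall within a preset range.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Compute precision, recall and F1 for one class treated as positive."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


def meets_standard(scores, threshold=0.8):
    """True when every score reaches the (hypothetical) preset threshold."""
    return all(s >= threshold for s in scores)


y_true = ["waterproof", "waterproof", "concrete", "waterproof", "concrete"]
y_pred = ["waterproof", "concrete", "concrete", "waterproof", "waterproof"]
scores = precision_recall_f1(y_true, y_pred, positive="waterproof")
print(scores, meets_standard(scores))
```

For a multi-class label set, the same function can be applied to each class in turn and the results averaged.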
In one embodiment, the local data for the current training is stored in the blockchain while the hash value of each local construction quality hazard data mining model parameter is stored in the blockchain.
In an embodiment, the modules can be trained by using new building engineering quality hidden danger data periodically or aperiodically according to actual requirements, and internal parameters of each module are optimized.
As shown in fig. 3 and fig. 4, a plurality of local model parameters are integrated through the blockchain, so that engineering quality hidden danger data are stored safely and shared effectively. The local model parameters are encrypted by means of a local differential privacy technology, the encrypted local model parameters are uploaded to the IPFS database, and the hash values of the parameters, together with the engineering quality hidden danger data, are stored in the blockchain network. The plurality of local model parameters are weighted and averaged by the federated averaging algorithm to update the global model parameters. The optimized ensemble-learning construction hidden danger classification model is then evaluated by comparing the quality hidden danger labels it assigns automatically with the manually assigned labels, using evaluation indexes such as precision, recall and F1 value. If the result is not satisfactory, the above steps are repeated until the requirements are met, after which the optimized classification model parameters are stored in the IPFS, and the parameter hash values and engineering quality hidden danger data are stored in the blockchain network and broadcast. Meanwhile, local users use the finally trained engineering construction quality hidden danger data mining model to automatically identify and classify the categories in engineering quality text (subordinate project, hidden danger problem and solution), so that every participant can query quality hidden danger data and trace responsibility for quality hidden dangers.
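The weighted averaging of local parameters can be sketched as follows. The text does not say how the weights are chosen; this sketch assumes the common federated averaging convention of weighting each node by its number of local training samples, and it treats each node's parameters as a flat list of floats. The function name and the example values are hypothetical.

```python
def federated_average(local_params, sample_counts):
    """Weighted average of per-node parameter vectors.

    local_params: one list of floats per node, all of equal length.
    sample_counts: local training sample count per node (assumed weights).
    """
    if not local_params or len(local_params) != len(sample_counts):
        raise ValueError("need one sample count per parameter vector")
    total = sum(sample_counts)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    size = len(local_params[0])
    if any(len(p) != size for p in local_params):
        raise ValueError("all parameter vectors must have the same length")
    return [
        sum(p[i] * n for p, n in zip(local_params, sample_counts)) / total
        for i in range(size)
    ]


# Two hypothetical project nodes with 1 and 3 local samples.
global_params = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_params)  # [2.5, 3.5]
```

Each node would then download the averaged vector and continue local training from it, which corresponds to the update of the global model parameters described above.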
Correspondingly, the application also relates to a construction quality hidden danger data mining system which comprises a data acquisition module and a construction quality hidden danger data mining model obtained through training by the method.
The data acquisition module is used for collecting quality hidden danger descriptions from engineering construction quality hidden danger correction reports. A quality hidden danger correction report is a procedural report that the supervision unit issues to the construction unit, requiring it to correct quality problems in the construction project, and it contains rich quality hidden danger information. In an embodiment, the data acquisition module is further configured to segment the quality hidden danger description into words and convert it into a word vector matrix, which is then input into the construction quality hidden danger data mining model; the model automatically classifies the input description and outputs the subordinate project, the hidden danger problem and the solution. In an embodiment, the system further comprises a data management module, which can capture the space-time distribution characteristics of quality hidden dangers from the quality hidden danger association rules and visual charts, and support decisions on quality diagnosis and control schemes.
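The conversion of a segmented description into a word vector matrix can be sketched as below. The sketch assumes the description has already been split into tokens by a word segmenter and that a pretrained embedding table is available as a plain dictionary; the tiny table, the fixed length `max_len` and the zero vector for unknown words and padding are all illustrative assumptions.

```python
def to_word_vector_matrix(tokens, embeddings, dim, max_len):
    """Map tokens to a max_len x dim matrix, truncating or zero-padding as needed."""
    zero = [0.0] * dim
    rows = [list(embeddings.get(tok, zero)) for tok in tokens[:max_len]]
    # Pad short descriptions so every input has the same shape.
    rows.extend(list(zero) for _ in range(max_len - len(rows)))
    return rows


# Hypothetical 2-dimensional embeddings for two segmented words.
embeddings = {"wall": [0.1, 0.2], "crack": [0.3, 0.4]}
matrix = to_word_vector_matrix(["wall", "crack"], embeddings, dim=2, max_len=3)
print(matrix)  # [[0.1, 0.2], [0.3, 0.4], [0.0, 0.0]]
```

A fixed matrix shape is chosen here because convolutional and recurrent classifiers usually expect inputs of uniform size; the actual padding strategy used by the system is not stated in the text.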
The training method and the mining system for the construction quality hidden danger data mining model provided by the invention have the following effects:
(1) In view of the redundancy of the information recorded in engineering quality hidden danger correction texts, the invention constructs a construction quality hidden danger data mining model that automatically identifies and classifies the categories in engineering quality text (subordinate project, hidden danger problem, hidden danger solution and the like). This allows quality hidden danger association rules to be mined, reveals the space-time distribution characteristics of quality hidden dangers, and ultimately supports decisions on quality diagnosis and control schemes.
(2) The invention takes into account that construction quality hidden danger records vary in length and are unstructured. It integrates local text features of quality hidden dangers extracted by TextCNN, global text features extracted by the graph neural network TextGCN, and key-information text features extracted by TextRNN+Attention, finally forming a construction quality hidden danger data mining model with strong generalization capability and high precision.
(3) The invention integrates blockchain technology with a deep learning algorithm for the task of automatically classifying quality hidden danger text, which improves the speed and precision of structuring the text through classification. The invention also addresses a pain point of applying this integration to engineering quality hidden danger data: the shared training model parameters (for example, the weight values of a trained deep neural network) can still leak users' private information, because an attacker can recover part of the participating users' private information by analysing differences between queries on the parameters uploaded after client training, and attacks initiated by internal entities (for example, malicious users or untrusted servers) are a particular threat. The local model parameters are therefore encrypted by means of a local differential privacy technology before being uploaded. Meanwhile, the local model parameters stored in the IPFS are weighted and averaged by the federated averaging algorithm to update the global model parameters and optimize the model parameters. The local classification model parameters and the updated global model parameters are stored in the IPFS and only their hash values are stored in the blockchain, which reduces redundancy in the blockchain network.
(4) The invention widens the channels of data acquisition by means of blockchain technology and ensures the reliability of data sources. Existing "Internet + quality supervision" platforms are mostly centralized systems led by governments or trusted third parties. They carry a risk of data tampering, which undermines the credibility of quality data; their traceability is poor and responsibility is hard to trace, which reduces the participants' willingness to trust one another and share quality information. The blockchain, through distributed storage, tamper resistance, easy traceability and automatically executed smart contracts, converts contract trust into machine trust and code trust, provides a further basis of trust for "Internet + quality supervision", and is expected to change the mode and flow of quality management.
In general, the invention can automatically mine and store the valuable information in construction quality hidden danger records and, combined with visual, graphical structured representation, facilitates the learning and retrieval of construction safety information.
It will be readily appreciated by those skilled in the art that the foregoing is merely a preferred embodiment of the invention and is not intended to limit the invention, but any modifications, equivalents, improvements or alternatives falling within the spirit and principles of the invention are intended to be included within the scope of the invention.

Claims (10)

1. The construction quality hidden danger data mining model training method is characterized by comprising the following steps of:
step S1: using local data, respectively training a first deep learning network model, a second deep learning network model and a third deep learning network model, each capable of classifying the engineering problems corresponding to quality hidden danger descriptions in engineering construction quality hidden danger correction reports, wherein the first deep learning network model determines word semantic weights by analysing local features of the quality hidden danger description, the second deep learning network model determines word semantic weights by analysing PMI values between words of the quality hidden danger description and tf-idf values between words and documents, and the third deep learning network model determines word semantic weights by extracting keywords of the quality hidden danger description;
step S2: integrating the first deep learning network model, the second deep learning network model and the third deep learning network model trained by each local to obtain a corresponding local construction quality hidden danger data mining model;
step S3: encrypting the parameters of each local construction quality hidden danger data mining model through a local differential privacy technology, uploading the encrypted model parameters to the InterPlanetary File System (IPFS) through a blockchain, and storing hash values of the encrypted model parameters in the blockchain;
step S4: performing a weighted average of the plurality of encrypted model parameters stored in the IPFS through a federated averaging algorithm to form updated model parameters, storing the updated model parameters in the IPFS, and storing hash values of the updated model parameters on the blockchain;
step S5: each local user downloading the updated model parameters from the IPFS through the blockchain;
step S6: judging whether all the local construction quality hidden danger data mining models after updating the parameters are converged, if not, continuing to perform local training on the non-converged construction quality hidden danger data mining models until the non-converged construction quality hidden danger data mining models are converged, and jumping to the step S3; if yes, ending the current training.
2. The construction quality hidden danger data mining model training method according to claim 1, wherein in step S6, if all the local construction quality hidden danger data mining models with updated parameters have converged, the current training is ended and the method jumps to step S7:
step S7: performing model evaluation on the trained model; outputting the final model if the model meets the standard; and, if the model does not meet the standard, adding local training samples and jumping to step S3 to continue training.
3. The construction quality hidden danger data mining model training method according to claim 2, wherein the model evaluation includes evaluation of accuracy, recall and F1 value.
4. The construction quality hidden danger data mining model training method according to claim 1, wherein prior to performing the model training, performing:
collecting a certain number of local engineering construction quality hidden trouble correction reports, and labeling defined hidden trouble labels for quality hidden trouble description;
the method comprises the steps of segmenting a quality hidden trouble description, and then converting the segmented quality hidden trouble description into a word vector matrix to obtain training data;
and when model training is carried out, the classification result of the predicted engineering problem is made to approach to the corresponding hidden danger label.
5. The construction quality hidden danger data mining model training method of claim 4, wherein the word segmentation is performed on the quality hidden danger description and then the quality hidden danger description is converted into a word vector matrix, comprising:
the quality hidden trouble description is segmented by means of the jieba segmentation technology;
and converting the segmented quality hidden trouble description into a Word vector matrix by means of Word2vec Word vectors.
6. The construction quality hidden danger data mining model training method according to claim 1, wherein the first deep learning network model comprises a convolution layer, a ReLU activation function and a maximum pooling layer, and word semantic weights are determined by analyzing local features;
the second deep learning network model comprises a calculation structure that combines tf-idf values between words and documents with PMI values between words, wherein word semantic weights are determined by calculating the tf-idf values between words and documents and the PMI values between words;
the third deep learning network model comprises a bidirectional LSTM, a tanh nonlinear activation function and an attention mechanism module, and word semantic weights are determined by extracting keywords.
7. The construction quality hidden danger data mining model training method according to claim 1, wherein in step S2, the weights of the first deep learning network model, the second deep learning network model and the third deep learning network model are optimized by a sequential quadratic programming algorithm, and the three models are then integrated by using a stacking strategy to obtain the construction quality hidden danger data mining model.
8. The construction quality hidden danger data mining model training method according to claim 1, wherein the hash value of each local construction quality hidden danger data mining model parameter is stored in the blockchain, and the local data used for the current training is stored in the blockchain.
9. The construction quality hidden danger data mining system is characterized by comprising a data acquisition module and a construction quality hidden danger data mining model, wherein,
the data acquisition module is used for collecting quality hidden danger description in the engineering construction quality hidden danger correction report, preprocessing data and inputting the construction quality hidden danger data mining model;
the construction quality hidden danger data mining model is a construction quality hidden danger data mining model trained based on the construction quality hidden danger data mining model training method according to any one of claims 1 to 8, and is used for classifying engineering problems corresponding to quality hidden danger description in an engineering construction quality hidden danger correction report.
10. The construction quality hidden danger data mining system of claim 9, further comprising a data management module for analyzing the space-time distribution characteristics of quality hidden dangers after results have been output for a plurality of different engineering construction quality hidden danger correction reports.
CN202211522702.7A 2022-11-30 2022-11-30 Construction quality hidden danger data mining model training method and mining system Active CN115828926B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211522702.7A CN115828926B (en) 2022-11-30 2022-11-30 Construction quality hidden danger data mining model training method and mining system

Publications (2)

Publication Number Publication Date
CN115828926A CN115828926A (en) 2023-03-21
CN115828926B true CN115828926B (en) 2023-08-04

Family

ID=85533217

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211522702.7A Active CN115828926B (en) 2022-11-30 2022-11-30 Construction quality hidden danger data mining model training method and mining system

Country Status (1)

Country Link
CN (1) CN115828926B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109189925A (en) * 2018-08-16 2019-01-11 华南师范大学 Term vector model based on mutual information and based on the file classification method of CNN
CN113254573A (en) * 2020-02-12 2021-08-13 北京嘀嘀无限科技发展有限公司 Text abstract generation method and device, electronic equipment and readable storage medium
CN112651224A (en) * 2020-12-24 2021-04-13 天津大学 Intelligent search method and device for engineering construction safety management document text
CN113536382A (en) * 2021-08-09 2021-10-22 北京理工大学 Block chain-based medical data sharing privacy protection method by using federal learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Identification of accident-injury type and bodypart factors from construction accident reports: A graph-based deep learning framework;Xing Pan et al.;Advanced Engineering Informatics;第1-12页 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant