CN114997278A - Engineering digital information analysis method based on computer algorithm model - Google Patents


Publication number
CN114997278A
CN114997278A CN202210497845.0A CN202210497845A CN114997278A CN 114997278 A CN114997278 A CN 114997278A CN 202210497845 A CN202210497845 A CN 202210497845A CN 114997278 A CN114997278 A CN 114997278A
Authority
CN
China
Prior art keywords
information
engineering
data
digitalized
algorithm model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210497845.0A
Other languages
Chinese (zh)
Other versions
CN114997278B (en
Inventor
杨晨
邓水光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN202210497845.0A
Publication of CN114997278A
Application granted
Publication of CN114997278B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9027 Trees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an engineering digital information analysis method based on a computer algorithm model, which comprises the following steps: (S1) data acquisition: acquiring different types of data information from the engineering digitalized information database; (S2) dividing the different types of data information obtained from the engineering digitalized information database into N data packets; (S3) constructing a bat algorithm model classifier; (S4) setting the root node and child nodes of the engineering digital information, and converting the different engineering digital information into a binary tree data structure; (S5) constructing a similarity matrix function; (S6) searching the classification attributes of each root node or leaf node in the constructed decision tree model through a particle swarm optimization algorithm model. By converting the engineering digitalized information into a computer algorithm model, the invention greatly improves the classification capability for engineering digitalized information.

Description

Engineering digital information analysis method based on computer algorithm model
Technical Field
The invention relates to the technical field of Internet, in particular to an engineering digital information analysis method based on a computer algorithm model.
Background
With the rapid development of the internet, the volume of data in human society is growing quickly. By some statistics, the data humans generate in a single year equals the total generated in all of history before the modern era. Internet services tend to grow explosively: traffic may increase thousands of times within a single month, and the corresponding data may grow from hundreds of GB to hundreds of TB. If a system becomes unstable or inaccessible at the critical moment of such an outbreak, the blow to the business is devastating. Engineering digitization technology is expanding just as explosively, and for engineering digitalized information with very large data volumes, the urgent problem is how to analyze and mine that information in greater depth. In the prior art, analysis is generally manual: the various kinds of data information are analyzed and calculated by hand, which is inefficient and prone to analysis errors.
Disclosure of Invention
To address these technical shortcomings, the invention discloses an engineering digital information analysis method based on a computer algorithm model. It improves the classification capability for engineering digital information by constructing an improved C5.0 decision tree model, improves the classification capability of the C5.0 decision tree model in processing engineering digital information by integrating a bat algorithm model classifier, and improves data analysis capability by applying the computer algorithm model.
In order to achieve the above technical effects, the invention adopts the following technical scheme:
A method for analyzing engineering digital information based on a computer algorithm model comprises the following steps:
(S1) data acquisition, acquiring different types of data information from the engineering digitalized information database;
the data type at least comprises digital management information, digital service information, intelligent tool information, digital engineering data information or financial data information;
(S2) dividing the data information of different types acquired from the engineering digitalized information database into N data packets, and setting sample nodes according to different data types of the digitalized management information, the digitalized service information, the intelligent tool information, the digitalized engineering data information or the financial data information; setting attribute marks on different data sample nodes;
(S3) constructing a bat algorithm model classifier, and searching for the optimal classification data information among the engineering digital information with different data attributes by a heuristic search method, so as to improve the node searching capability for the engineering digital information;
(S4) setting a root node and a child node of the engineering digital information, and converting different engineering digital information into a data structure of a binary tree model in a tree-like manner;
(S5) constructing a similarity matrix function to eliminate useless data information in the binary tree model data structure; constructing a decision tree function for outputting the digitalized information of the engineering;
the judgment standard is shown as formula 1:
[Formula 1: equation image not reproduced]
as shown in formula 1, if the sub-rule elements in the sub-schemes are the same, the similarity is taken to be 1; otherwise the similarity is taken to be 0;
the judgment standard for numerical elements is shown as formula 2:
[Formula 2: equation image not reproduced]
as shown in formula 2, a numerical element is determined mainly by an upper or lower data limit, such as {a > k1}, where k1 is the numerical value of the element. When both elements being compared use ">", a numerical similarity determination is performed, where k1 and k2 are the upper and lower numerical limits and t is the maximum tolerance permitted for the two elements to be considered similar; expressed by the formula:
[Formula 3: equation image not reproduced]
the similarity matrix established for any two decision trees is shown as formula 4:
[Formula 4: equation image not reproduced]
the relationship between engineering digitalized information input and output can be expressed as:
[Formula 5: equation image not reproduced]
where $P(x_i / y_i)$ is used for detecting the engineering digitalized information: for any two decision trees, it relates to the amount of data received, namely the probability that engineering digitalized information $x_i$ is detected while data $y_i$ is received, so the probabilities satisfy:
$\sum P(x_i / y_i) = 1$ (6)
by calculating data information between the data sending terminal and the data receiving terminal, the data receiving and transmitting conditions of the engineering digital information in the analysis process can be obtained;
the uncertainty function $E(x_i)$ of the engineering digitalized information in the process of constructing the decision tree is expressed as:
[Formula 7: equation image not reproduced]
in formula (7), the probability distribution of the engineering digitalized information that has been imported into the decision tree model is $P(X)$; when $Y = y_i$ holds in the received data information, the probability distribution at the engineering digitalized information sending terminal is $P(X / y_i)$, and the following quantitative relationship exists:
[Formula 8: equation image not reproduced]
in equation (8), when there is external interference, the information gain for classifying data may be:
Gains(X,Y)=E(X)-E(X/Y) (9)
the uncertain factors of the engineering digitalized information in the decision tree model can be eliminated through the formula (9), and the classification capability of the engineering digitalized information is improved;
(S6) searching the classification attributes of each root node or leaf node in the constructed decision tree model through a particle swarm optimization algorithm model, and determining whether the constructed decision tree model is reasonable by searching the root nodes or leaf nodes to obtain the locally optimal data solution.
As a further technical scheme of the invention, the bat algorithm model classifier comprises a classifier main control module, a data interface connected with the classifier main control module, an optimal classification node searching module, an iterative computation module, a global optimal value computation module and a data output terminal.
As a further technical scheme of the invention, the method for searching the optimal classification point by the optimal classification node searching module comprises the following steps:
setting the optimal position $X^*$ of the current population of the engineering digitalized information, and treating the different data items in the engineering digitalized information as particles, then:
[Formula 10: equation image not reproduced]
[Formula 11: equation image not reproduced]
$x_{new} = x_{old} + \varepsilon \times A^t$ (12)
where [symbol image not reproduced] represents the i-th item of engineering digitalized information at time t; $A^t$ represents the average loudness of the whole population of engineering digitalized information in the same iteration; $X^*$ represents the current locally optimal solution position of the security data information group of the computer network when updating different positions; $\varepsilon$ is a constant between 0.5 and 1; $r_i$ represents the update frequency of the engineering digitalized information during analysis; and $f_i$ represents the output frequency at which the engineering digitalized information is transmitted at rate $v_i$.
As a further technical scheme of the invention, the method for realizing iterative computation by the iterative computation module is to continuously generate local data information in the engineering digitalized information random number set, wherein the iteration times are more than 50.
As a further technical scheme of the invention, the iterative computation times of the global optimal value computation module are 100 times.
As a further technical scheme, the particle swarm optimization algorithm model comprises a particle swarm optimization algorithm model main control module, and a Logistic chaotic mapping module, a particle speed updating module, a fitness calculating module and a screening module which are connected with the particle swarm optimization algorithm model main control module, wherein the Logistic chaotic mapping module is also provided with a variable interval module.
As a further technical scheme of the invention, the particle swarm optimization algorithm model main control module is a main control module based on a programmable controller.
As a further technical scheme of the invention, the Logistic chaotic mapping module is used for generating initial particles, and the function model is
$P_{i,n} = 4 P_{i-1,n} (1 - P_{i-1,n})$ (13)
where $P_{i,n}$ represents the particle data information continuously generated from the engineering digitalized information; the Logistic chaotic mapping module maps the engineering digitalized information into the interval [0, 1]; $P_{i-1,n}$ represents the data information output by the Logistic chaotic mapping module at the previous moment; and the variable interval of the engineering digitalized information output by the module is $[a_n, b_n]$, where i = 2, 3, ....
As a further technical solution of the present invention, the particle velocity update module is configured to continuously update particle data information, and the update function is:
[Formula 14: equation image not reproduced]
where $x_i$ is the vector representation of the i-th particle in the D-dimensional vector space, which can be expressed as:
$x_i = (x_{i1}, x_{i2}, \ldots, x_{iD})^T$ (15)
where i = 1, 2, ..., m, and the position of the i-th particle in the D-dimensional vector space is denoted by $x_i$.
As a further technical scheme of the invention, the calculation function of the fitness calculation module is as follows:
[Formula 16: equation image not reproduced]
where $f_i$ denotes the fitness of the i-th particle, [symbol image not reproduced] denotes the current average fitness of the particles in the swarm, and $f$ denotes the normalization factor.
The fitness variance adopted by the screening module is expressed by the following formula:
[Formula 17: equation image not reproduced]
When the quantity calculated by formula (17) is larger than the set value $\varepsilon$ ($\varepsilon > 0$), the chaotic interval [0, 1] is remapped into the variable interval $[a_n, b_n]$.
Positive and advantageous effects
The invention discloses a method for analyzing engineering digital information by applying a computer algorithm model, which improves the classification capability of engineering digital information by constructing an improved C5.0 decision tree model, improves the classification capability of the C5.0 decision tree model in processing the engineering digital information by integrating a bat algorithm model classifier, improves the speed of searching the engineering digital information in the classification process of the engineering digital information by a particle swarm optimization algorithm model, and improves the classification capability of the engineering digital information in constructing the C5.0 decision tree model.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without inventive exercise, wherein:
FIG. 1 is a flow chart of a method for analyzing engineering digitized information according to the present invention;
FIG. 2 is a schematic diagram of the structural principle of the engineering digitalized information analysis in the present invention;
FIG. 3 is a schematic diagram of the structural principle of the bat algorithm model classifier in the present invention;
FIG. 4 is a schematic diagram of a principle of a particle swarm optimization model master control module structure in the invention;
FIG. 5 is a schematic structural diagram of an embodiment of a C5.0 decision tree classification model according to the present invention;
FIG. 6 is a flow chart of an embodiment of the bat algorithm model classifier of the present invention;
FIG. 7 is a schematic flow chart of an embodiment of a particle swarm optimization algorithm model according to the present invention.
Detailed Description
The preferred embodiments of the present invention will be described below with reference to the accompanying drawings, and it should be understood that the embodiments described herein are merely for the purpose of illustrating and explaining the present invention and are not intended to limit the present invention.
As shown in fig. 1, a method for analyzing engineering digitalized information based on a computer algorithm model includes the following steps:
(S1) data acquisition, acquiring different types of data information from the engineering digitalized information database;
the data type at least comprises digital management information, digital service information, intelligent tool information, digital engineering data information or financial data information;
(S2) dividing the data information of different types obtained from the engineering digitalized information database into N data packets, and setting sample nodes according to different data types of the digitalized management information, the digitalized service information, the intelligent tool information, the digitalized engineering data information or the financial data information; setting attribute marks on different data sample nodes;
for example, engineering management, financial and personnel data types; the different engineering digitalized information is labeled according to its management attribute so that it is convenient to manage;
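Steps (S1) and (S2) can be sketched as follows. This is a minimal illustration only: the record layout, the `type` and `value` keys, and the round-robin split are assumptions, since the text does not say how the N data packets are formed.

```python
from collections import defaultdict

def split_into_packets(records, n_packets):
    """Distribute records round-robin into n_packets lists (assumed scheme)."""
    packets = [[] for _ in range(n_packets)]
    for idx, rec in enumerate(records):
        packets[idx % n_packets].append(rec)
    return packets

def label_nodes(records):
    """Group record values under their data-type label (the attribute mark)."""
    nodes = defaultdict(list)
    for rec in records:
        nodes[rec["type"]].append(rec["value"])
    return dict(nodes)

# Hypothetical records drawn from an engineering digitalized information database.
records = [
    {"type": "management", "value": "schedule-update"},
    {"type": "financial", "value": 1200.0},
    {"type": "service", "value": "ticket-17"},
    {"type": "financial", "value": 310.5},
]
packets = split_into_packets(records, 2)
nodes = label_nodes(records)
```

Here each data type becomes one sample node, and the type name acts as its attribute mark.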
(S3) constructing a bat algorithm model classifier, and searching for the optimal classification data information among the engineering digital information with different data attributes by a heuristic search method, so as to improve the node searching capability for the engineering digital information;
This technique applies a bionic principle: a bat emits ultrasonic waves from its throat and then determines direction and detects targets from the echoes. Applying the same principle allows the different data branches in the engineering digital information to be retrieved.
(S4) setting a root node and a child node of the engineering digital information, and converting different engineering digital information into a data structure of a binary tree model in a tree-like manner;
In an embodiment, as shown in FIG. 5, the set A of the engineering digitalized information is {"A > X", "Class_1"}, …, and scheme four may be represented by the set D = {"A <= X", "B > Y", "C > Z", "Class_4"}, so that the set of binary tree models in this embodiment is {A, B, C, D}. A similarity matrix of the decision trees is established by comparing the similarity of the bottom-layer elements of these decision tree sets, thereby improving the engineering digitalized information analysis capability.
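The set representation in step (S4) can be illustrated by walking a binary tree and emitting one rule set per root-to-leaf path. The tree below is hypothetical: only the paths ending in Class_1 and Class_4 appear in the text, and the other two branches and the tuple layout are assumptions made so the example is complete.

```python
def tree_to_rule_sets(node, path=()):
    """Return one rule list per root-to-leaf path of a binary decision tree.

    A leaf is a class-label string; an internal node is a tuple
    (attribute, threshold, branch_if_greater, branch_if_less_or_equal).
    """
    if isinstance(node, str):  # leaf: close the path with its class label
        return [list(path) + [node]]
    attr, threshold, gt_branch, le_branch = node
    return (
        tree_to_rule_sets(gt_branch, path + (f"{attr} > {threshold}",))
        + tree_to_rule_sets(le_branch, path + (f"{attr} <= {threshold}",))
    )

# Hypothetical tree; Class_2 and Class_3 branches are assumed for illustration.
tree = ("A", "X", "Class_1",
        ("B", "Y",
         ("C", "Z", "Class_4", "Class_3"),
         "Class_2"))
rule_sets = tree_to_rule_sets(tree)
```

Each resulting list plays the role of one of the sets A to D, with the conditions first and the class label last.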
(S5) constructing a similarity matrix function to remove useless data information in the binary tree model data structure; constructing a decision tree function for outputting the digitalized information of the engineering;
in a specific embodiment, the similarity determination process first compares the elements of the sub-scheme. For the comparison of non-numerical elements, the judgment standard is as shown in formula 1:
[Formula 1: equation image not reproduced]
As shown in formula 1, if the sub-rule elements in the sub-schemes are the same, the similarity is taken to be 1; otherwise the similarity is taken to be 0;
the judgment standard for numerical elements is shown as formula 2:
[Formula 2: equation image not reproduced]
As shown in formula 2, a numerical element is determined mainly by an upper or lower data limit, such as {a > k1}, where k1 is the numerical value of the element. When both elements being compared use ">", a numerical similarity determination is performed, where k1 and k2 are the upper and lower numerical limits and t is the maximum tolerance permitted for the two elements to be considered similar; expressed by the formula:
[Formula 3: equation image not reproduced]
By calculating the similarity between all decision trees in the random forest algorithm, a similarity matrix of any two decision trees can be established, as shown in formula 4:
[Formula 4: equation image not reproduced]
$Sim_{n,m}$ denotes the similarity between the n-th and m-th decision trees. The similarity matrix shows the similarity of any two decision trees; repeated algorithm nodes are deleted, which improves the computational efficiency of the algorithm.
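A rough sketch of the similarity matrix follows. Only the non-numerical rule (1 if identical, otherwise 0) comes from the text. The numerical rule (same attribute and operator, thresholds within tolerance `t`) and the averaging over aligned positions are assumptions standing in for formulas 2 to 4, whose images are not available.

```python
import re

# Matches rules such as "A > 10" or "B <= 3.5"; anything else is non-numerical.
RULE = re.compile(r"^(\w+)\s*(<=|>=|<|>)\s*(-?\d+(?:\.\d+)?)$")

def element_similarity(a, b, t=1.0):
    """Similarity of two rule elements: 1.0 or 0.0."""
    ma, mb = RULE.match(a), RULE.match(b)
    if ma and mb:
        # ASSUMED numerical rule: same attribute, same operator, |k1 - k2| <= t.
        if ma.group(1) != mb.group(1) or ma.group(2) != mb.group(2):
            return 0.0
        return 1.0 if abs(float(ma.group(3)) - float(mb.group(3))) <= t else 0.0
    # Non-numerical rule from the text: identical elements score 1, others 0.
    return 1.0 if a == b else 0.0

def rule_set_similarity(r1, r2, t=1.0):
    """ASSUMED aggregation: matched positions divided by the longer length."""
    score = sum(element_similarity(a, b, t) for a, b in zip(r1, r2))
    return score / max(len(r1), len(r2))

def similarity_matrix(rule_sets, t=1.0):
    """Pairwise similarity Sim[n][m] between all rule sets."""
    return [[rule_set_similarity(a, b, t) for b in rule_sets] for a in rule_sets]

rules = [
    ["A > 10", "Class_1"],
    ["A > 10.5", "Class_1"],
    ["A <= 10", "B > 3", "Class_2"],
]
sim = similarity_matrix(rules, t=1.0)
```

Rule sets whose similarity is 1.0, such as the first two above, are candidates for the duplicate-node removal the text describes.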
The relationship between engineering digitized information input and output can be expressed as:
[Formula 5: equation image not reproduced]
where $P(x_i / y_i)$ is used for detecting the engineering digitalized information: during the similarity calculation of any two decision tree algorithms, it relates to the amount of data received, namely the probability that engineering digitalized information $x_i$ is detected while data $y_i$ is received, so the probabilities satisfy:
$\sum P(x_i / y_i) = 1$ (6)
by calculating data information between the data sending terminal and the data receiving terminal, the data receiving and transmitting conditions of the engineering digital information in the analysis process can be obtained;
Here a mathematical expectation is introduced, denoted $E(x_i)$. The prior entropy expresses the prior uncertainty of the data, and the uncertainty function $E(x_i)$ of the engineering digitalized information in the process of constructing the decision tree is expressed as:
[Formula 7: equation image not reproduced]
In formula (7), the probability distribution of the engineering digitalized information that has been imported into the decision tree model is $P(X)$; when $Y = y_i$ holds in the received data information, the probability distribution at the engineering digitalized information sending terminal is $P(X / y_i)$, and the following quantitative relationship exists:
[Formula 8: equation image not reproduced]
in equation (8), when there is external interference, the information gain for classifying data may be:
Gains(X,Y)=E(X)-E(X/Y) (9)
the uncertain factors of the engineering digitalized information in the decision tree model can be eliminated through the formula (9), and the classification capability of the engineering digitalized information is improved;
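Formula (9) can be illustrated with a short computation. The sketch assumes that formulas (7) and (8), whose images are not available, are the usual Shannon entropy and conditional entropy; the function names and sample labels are illustrative.

```python
import math
from collections import Counter

def entropy(labels):
    """E(X): Shannon entropy in bits (assumed form of formula 7)."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def conditional_entropy(labels, conditions):
    """E(X/Y): entropy of X within each value of Y, weighted by P(Y = y_i)."""
    total = len(labels)
    groups = {}
    for lab, cond in zip(labels, conditions):
        groups.setdefault(cond, []).append(lab)
    return sum(len(g) / total * entropy(g) for g in groups.values())

def information_gain(labels, conditions):
    """Gains(X, Y) = E(X) - E(X/Y), as in formula (9)."""
    return entropy(labels) - conditional_entropy(labels, conditions)

X = ["financial", "financial", "management", "management"]
Y = ["high", "high", "low", "low"]
gain = information_gain(X, Y)
```

In this toy data Y determines X completely, so the gain equals the full entropy of X; an attribute unrelated to X would give a gain of zero.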
(S6) searching the classification attributes of each root node or leaf node in the constructed decision tree model through a particle swarm optimization algorithm model, and determining whether the constructed decision tree model is reasonable by searching the root nodes or leaf nodes to obtain the locally optimal data solution.
In the above embodiment, because the standard particle swarm algorithm lacks particle diversity, particles are easily lost during the search and tend to deviate from their original path, so the whole population is divided into different sub-populations. All of the data information in the constructed decision tree model is traversed so that the fitness value of each sub-population can be calculated quickly. When a preset period is reached, the global best position is updated and the local optimal solution is finally found. Dividing the population into sub-populations in this way avoids improper selection of root nodes or leaf nodes while the decision tree model is constructed.
In the above embodiment, the bat algorithm model classifier includes a classifier main control module, and a data interface, an optimal classification node search module, an iterative computation module, a global optimal value computation module and a data output terminal which are connected to the classifier main control module.
In decision tree construction, a forest is built at random from a number of different decision trees, and the decision trees in the random forest algorithm model are not connected with one another. Once the forest model is built, each decision tree judges independently which class a new engineering digital information input sample belongs to, and the class that occurs most often is taken as the final data analysis result. The decision tree algorithm is then further upgraded so that this method governs how the multiple decision trees are generated when the random forest model is built. Branch nodes are determined by recursive branching: at each recursion, a subset of the remaining data features is extracted at random and the sub-branch is determined again. Once the nodes and sub-nodes have been determined in this way, a decision tree model is established. Each set of data samples is trained in the same way, which builds a number of different decision trees, and the constructed trees are stored as their number grows. Finally, it is judged whether the number of constructed decision trees meets the user's requirements; if not, training and learning are repeated by the same method, and the category of a new input sample is determined by voting (the minority obeys the majority).
When the user's requirements are met, the random forest model is generated. This method has to construct classifiers continuously: a first output is produced by a weak learner algorithm, several weak classifiers are then iterated repeatedly, and a strong classifier is finally output. Although this can improve the data calculation speed, the continuous calculation makes the learning process complicated. This research therefore adopts a bat algorithm model classifier, as shown in FIG. 6, which greatly improves the data classification capability.
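The voting principle (the minority obeys the majority) can be sketched as below. The three "trees" are hypothetical stand-in functions rather than trained C5.0 models, and the sample fields are invented for the example.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the most frequent class label among the predictions."""
    return Counter(predictions).most_common(1)[0][0]

def forest_predict(trees, sample):
    """Let every tree classify the sample independently, then vote."""
    return majority_vote([tree(sample) for tree in trees])

# Hypothetical stand-ins for independently built decision trees.
trees = [
    lambda s: "financial" if s["amount"] > 0 else "management",
    lambda s: "financial" if s["has_invoice"] else "service",
    lambda s: "management",
]
label = forest_predict(trees, {"amount": 120.0, "has_invoice": True})
```

With an even split between classes, `Counter.most_common` returns the label that was encountered first, so a real implementation may want an explicit tie-breaking rule.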
In the above embodiment, as shown in fig. 3, the main control module of the classifier is a main control module based on a programmable controller.
In the above embodiment, the method for the optimal classification node search module to search for the optimal classification point includes:
As shown in FIGS. 4-5, the optimal position $X^*$ of the current population of the engineering digitalized information is set, and the data positions are then updated continuously according to the following formulas, where the different data items in the engineering digitalized information are treated as particles:
[Formula 10: equation image not reproduced]
[Formula 11: equation image not reproduced]
$x_{new} = x_{old} + \varepsilon \times A^t$ (12)
where [symbol image not reproduced] represents the i-th item of engineering digitalized information at time t; $A^t$ represents the average loudness of the whole population of engineering digitalized information in the same iteration; $X^*$ represents the current locally optimal solution position of the security data information group of the computer network when updating different positions; $\varepsilon$ is a constant between 0.5 and 1; $r_i$ represents the update frequency of the engineering digitalized information during analysis; and $f_i$ represents the output frequency at which the engineering digitalized information is transmitted at rate $v_i$.
In the above embodiment, the iterative computation module implements iterative computation by continuously generating local data information in the engineering digitized information random number set, where the iteration number is greater than 50.
In another embodiment, let rand1 be a random number generated in [0, 1]. When rand1 > $r_i$, the optimal decision tree identification point of the engineering digitalized information is output, and local data information, optimal data information and the like are then generated continuously, which finally realizes the optimal data update of the engineering digitalized information. In another embodiment, let rand2 be a random number in [0, 1] for the data set in the engineering digitalized information; if rand2 < $A_i$ and $f(x_i) < f(X^*)$, the position is the best position.
In the above embodiment, the number of iterative computations of the global optimum value computation module is 100.
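The search loop described above can be sketched as follows. Formula (12), the rand1 > r_i test, the rand2 < A_i and f(x_i) < f(X*) acceptance test, and the 100 iterations come from the text. Everything else is an assumption: formulas (10) and (11) are replaced by a frequency-scaled pull toward X*, the random sign in the local step, the loudness decay of 0.9, the fixed r_i, the bounds, and the `sphere` objective are all illustrative choices.

```python
import random

def sphere(x):
    """Toy objective to minimise (stands in for a classification cost)."""
    return sum(v * v for v in x)

def bat_search(objective, dim, n_bats=20, n_iter=100, f_min=0.0, f_max=2.0,
               lower=-5.0, upper=5.0, seed=0):
    rng = random.Random(seed)
    pos = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(n_bats)]
    vel = [[0.0] * dim for _ in range(n_bats)]
    loud = [1.0] * n_bats   # A_i, per-bat loudness
    rate = [0.5] * n_bats   # r_i, update frequency (kept fixed in this sketch)
    fit = [objective(p) for p in pos]
    start = min(range(n_bats), key=lambda i: fit[i])
    best, best_fit = pos[start][:], fit[start]   # X* and f(X*)

    for _ in range(n_iter):
        mean_loud = sum(loud) / n_bats           # A^t, average loudness
        for i in range(n_bats):
            # ASSUMED stand-in for formulas (10)-(11): pull toward X*.
            freq = f_min + (f_max - f_min) * rng.random()
            vel[i] = [v + (b - x) * freq
                      for v, x, b in zip(vel[i], pos[i], best)]
            cand = [x + v for x, v in zip(pos[i], vel[i])]
            if rng.random() > rate[i]:           # rand1 > r_i: local step
                eps = rng.uniform(0.5, 1.0)      # epsilon in [0.5, 1]
                # Formula (12) around X*; the random sign is an assumption.
                cand = [b + rng.choice((-1.0, 1.0)) * eps * mean_loud
                        for b in best]
            cand = [min(max(c, lower), upper) for c in cand]
            cand_fit = objective(cand)
            pos[i], fit[i] = cand, cand_fit
            # rand2 < A_i and f(x_i) < f(X*): accept as the new best position.
            if rng.random() < loud[i] and cand_fit < best_fit:
                best, best_fit = cand[:], cand_fit
                loud[i] *= 0.9                   # assumed loudness decay
    return best, best_fit

best, best_fit = bat_search(sphere, dim=2)
```

Because loudness only decays when a bat improves the best position, the local step around X* shrinks as the search converges.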
In the above embodiment, as shown in fig. 7, the particle swarm optimization algorithm model includes a particle swarm optimization algorithm model main control module, and a Logistic chaotic mapping module, a particle velocity updating module, a fitness calculating module, and a screening module connected to the particle swarm optimization algorithm model main control module, where the Logistic chaotic mapping module is further provided with a variable interval module.
In the above embodiment, the particle swarm optimization algorithm model main control module is a main control module based on a programmable controller.
In the above embodiment, the Logistic chaotic mapping module is used to generate the initial particles, and the function model is
P_{i,n} = 4P_{i-1,n}(1 - P_{i-1,n}) (13)
where P_{i,n} represents the particle data information continuously generated from the engineering digitized information, which the Logistic chaotic mapping module maps into the interval [0, 1]; P_{i-1,n} represents the data information output by the Logistic chaotic mapping module at the previous moment; the variable interval of the engineering digitized information output by the Logistic chaotic mapping module is [a_n, b_n]; and i = 2, 3, ....
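The chaotic initialisation of formula (13) can be sketched as follows. This is an illustrative sketch only: the function names `logistic_sequence` and `to_interval` are hypothetical, and the linear rescaling from [0, 1] onto the variable interval [a_n, b_n] is an assumption, since the text names the interval but does not state the mapping.

```python
def logistic_sequence(p1, count):
    """Iterate formula (13), P_i = 4 * P_{i-1} * (1 - P_{i-1}), from seed p1.

    Returns a list of `count` values starting with the seed itself.
    """
    if not 0.0 < p1 < 1.0:
        raise ValueError("seed must lie strictly inside (0, 1)")
    values = [p1]
    for _ in range(count - 1):
        prev = values[-1]
        values.append(4.0 * prev * (1.0 - prev))
    return values


def to_interval(p, a, b):
    """Map a chaotic value p in [0, 1] onto [a, b].

    The linear form is an assumption made for this sketch.
    """
    return a + (b - a) * p


if __name__ == "__main__":
    chaos = logistic_sequence(0.3, 5)
    particles = [to_interval(p, -2.0, 2.0) for p in chaos]
    print(particles)
```

Seeds such as 0.25, 0.5 or 0.75 settle on a fixed point of the map and are normally avoided when the sequence is meant to spread the initial particles over the interval.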
In the above embodiment, the particle velocity update module is configured to continuously update the particle data information, and the update function is
Figure BDA0003633552620000111
where x_i is the vector representation of the i-th particle in the D-dimensional space, which can be written as:
x_i = (x_{i1}, x_{i2}, ..., x_{iD})^T (15)
where i = 1, 2, ..., m, and x_i denotes the position of the i-th particle in the D-dimensional vector space.
(16)
in the above embodiment, the calculation function of the fitness calculation module is:
Figure BDA0003633552620000112
where f_i denotes the fitness of the i-th particle; the average fitness of the particles in the current particle swarm is denoted by
Figure BDA0003633552620000113
and the normalization factor is denoted by f.
In the above embodiment, the fitness variance adopted by the screening module is expressed by the following formula:
Figure BDA0003633552620000114
When the quantity calculated by formula (18) is larger than the set value ε (ε > 0), the chaotic interval [0, 1] is remapped onto the variable interval [a_n, b_n].
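The screening test can be sketched as follows. The formula itself appears only as an image in this text, so the sketch assumes a form often used for swarm fitness variance: the sum over particles of ((f_i - f_avg) / f)^2, with the normalization factor f taken as max(1, max |f_i - f_avg|). Both that form and the names `fitness_variance` and `needs_remap` are assumptions for illustration, not the patent's stated definition.

```python
def fitness_variance(fitness):
    """Swarm fitness variance under the assumed form
    sum(((f_i - f_avg) / f) ** 2), where f = max(1, max |f_i - f_avg|)."""
    f_avg = sum(fitness) / len(fitness)
    deviations = [fi - f_avg for fi in fitness]
    norm = max(1.0, max(abs(d) for d in deviations))
    return sum((d / norm) ** 2 for d in deviations)


def needs_remap(fitness, threshold):
    """Return True when the variance exceeds the set value (threshold > 0),
    the condition under which the text remaps [0, 1] onto [a_n, b_n]."""
    return fitness_variance(fitness) > threshold


if __name__ == "__main__":
    swarm_fitness = [0.0, 2.0, 4.0]
    print(fitness_variance(swarm_fitness), needs_remap(swarm_fitness, 0.5))
```

Under this assumed normalization the variance is bounded by the number of particles, which makes a single threshold usable across problems with different fitness scales.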
Although specific embodiments of the invention have been described herein, it will be understood by those skilled in the art that these embodiments are merely illustrative and that various omissions, substitutions and changes in the form and details of the methods and systems described may be made by those skilled in the art without departing from the spirit and scope of the invention. For example, it is within the scope of the present invention to combine the steps of the above-described methods to perform substantially the same function in substantially the same way to achieve substantially the same result. Accordingly, the scope of the invention is to be limited only by the following claims.

Claims (10)

1. An engineering digitalized information analysis method based on a computer algorithm model, characterized in that the method comprises the following steps:
(S1) data acquisition, acquiring different types of data information from the engineering digitalized information database;
the data type at least comprises digital management information, digital service information, intelligent tool information, digital engineering data information or financial data information;
(S2) dividing the data information of different types acquired from the engineering digitalized information database into N data packets, and setting sample nodes according to different data types of the digitalized management information, the digitalized service information, the intelligent tool information, the digitalized engineering data information or the financial data information; setting attribute marks on different data sample nodes;
(S3) constructing a bat algorithm model classifier, and searching for the optimal classification data information among the engineering digital information with different data attributes by a heuristic search method, so as to improve the node searching capability for the engineering digital information;
(S4) setting a root node and a child node of the engineering digital information, and converting different engineering digital information into a data structure of a binary tree model in a tree-like manner;
(S5) constructing a similarity matrix function to eliminate useless data information in the binary tree model data structure; constructing a decision tree function for outputting the digitalized information of the engineering;
the judgment standard is shown as formula 1:
Figure FDA0003633552610000011
as shown in formula 1, if the sub-rule elements in the sub-schemes are the same, the similarity is taken to be 1; otherwise the similarity is taken to be 0;
the judgment standard of the numerical type elements is shown as formula 2:
Figure FDA0003633552610000012
as shown in formula 2, a numerical element is determined mainly by its upper and lower data limits, for example {a > k1}, where k1 is the numerical value of the element; when both elements being compared use ">", a numerical similarity judgment is performed, where k1 and k2 are the upper and lower numerical limits and t is the maximum similarity that the two elements can tolerate; expressed by the formula:
Figure FDA0003633552610000021
the similarity matrix for establishing any two decision trees is shown as formula 4:
Figure FDA0003633552610000022
the relationship between engineering digitized information input and output can be expressed as:
Figure FDA0003633552610000023
wherein P (x) i /y i ) For detecting engineering digitalized information, P (x) i /y i ) In any two decision trees
Data receiving quantity, digital information x of data detection engineering i While simultaneously receiving data y i The probability number of the time can be expressed as:
∑P(x i /y i )=1 (6)
by calculating data information between the data sending terminal and the data receiving terminal, the data receiving and transmitting conditions of the engineering digital information in the analysis process can be obtained;
uncertainty function E (x) of engineering digital information in the process of constructing decision tree i ) Expressed as:
Figure FDA0003633552610000024
in formula (7), P(X) is the distribution probability of the engineering digitalized information already imported into the decision tree model; when the received data information contains Y = y_i and the probability distribution at the engineering digitalized information transmitting terminal is P(X/y_i), the following quantitative relationship holds:
Figure FDA0003633552610000025
in equation (8), when there is external interference, the information gain for classifying data may be:
Gains(X,Y)=E(X)-E(X/Y) (9)
the uncertain factors of the engineering digitalized information in the decision tree model can be eliminated through the formula (9), and the classification capability of the engineering digitalized information is improved;
(S6) searching the different classification attributes of each root node or leaf node in the constructed decision tree model through a particle swarm optimization algorithm model, and determining whether the constructed decision tree model is reasonable by searching the root nodes or leaf nodes for the optimal local data solution.
2. The method for analyzing engineering digitalized information based on computer algorithm model according to claim 1, characterized in that: the bat algorithm model classifier comprises a classifier main control module, a data interface connected with the classifier main control module, an optimal classification node searching module, an iterative calculation module, a global optimal value calculation module and a data output terminal.
3. The method for analyzing engineering digitalized information based on computer algorithm model according to claim 2, characterized in that: the method for searching the optimal classification point by the optimal classification node searching module comprises the following steps:
let X* be the optimal position of the current population of the engineering digital information; if the different data information in the engineering digital information is treated as particle information, then:
Figure FDA0003633552610000031
Figure FDA0003633552610000032
x new =x old +ε×A t (12)
wherein
Figure FDA0003633552610000033
represents the i-th item of engineering digitized information at time t; A_t represents the average loudness of the entire population of engineering digitized information within the same iterative computation; X* represents the current local optimal solution position of the security data information group of the computer network when different positions are updated; ε is a constant between 0.5 and 1; r_i represents the update frequency of the engineering digitized information during analysis; and f_i represents the output frequency of the engineering digitized information at transmission rate v_i.
4. The method for analyzing engineering digitalized information based on computer algorithm model according to claim 2, characterized in that: the iterative calculation module implements iterative calculation by continuously generating local data information in the random number set of the engineering digitalized information, and the number of iterations is greater than 50.
5. The method for analyzing engineering digitalized information based on computer algorithm model according to claim 2, characterized in that: the number of iterative computations of the global optimal value computation module is 100.
6. The method for analyzing engineering digitalized information based on computer algorithm model according to claim 1, characterized in that: the particle swarm optimization algorithm model comprises a particle swarm optimization algorithm model main control module, and a Logistic chaotic mapping module, a particle speed updating module, a fitness calculating module and a screening module which are connected with the particle swarm optimization algorithm model main control module, wherein the Logistic chaotic mapping module is also provided with a variable interval module.
7. The method for analyzing engineering digitalized information based on computer algorithm model as claimed in claim 6, wherein: the particle swarm optimization algorithm model main control module is a main control module based on a programmable controller.
8. The method for analyzing engineering digitalized information based on computer algorithm model as claimed in claim 6, wherein: the Logistic chaotic mapping module is used for generating initial particles and has a function model of
P_{i,n} = 4P_{i-1,n}(1 - P_{i-1,n}) (13)
where P_{i,n} represents the particle data information continuously generated from the engineering digitalized information, which the Logistic chaotic mapping module maps into the interval [0, 1]; P_{i-1,n} represents the data information output by the Logistic chaotic mapping module at the previous moment; the variable interval of the engineering digitalized information output by the Logistic chaotic mapping module is [a_n, b_n]; and i = 2, 3, ....
9. The method for analyzing engineering digitalized information based on computer algorithm model as claimed in claim 6, wherein: the particle speed updating module is used for continuously updating particle data information, and the updating function is as follows:
Figure FDA0003633552610000041
where x_i is the vector representation of the i-th particle in the D-dimensional space, which can be written as:
x_i = (x_{i1}, x_{i2}, ..., x_{iD})^T (15)
where i = 1, 2, ..., m, and x_i denotes the position of the i-th particle in the D-dimensional vector space.
10. The method for analyzing engineering digitalized information based on computer algorithm model as claimed in claim 6, wherein: the calculation function of the fitness calculation module is as follows:
Figure FDA0003633552610000042
where f_i denotes the fitness of the i-th particle; the average fitness of the particles in the current particle swarm is denoted by
Figure FDA0003633552610000051
and the normalization factor is denoted by f.
The fitness variance adopted by the screening module is expressed by a formula as follows:
Figure FDA0003633552610000052
When the quantity calculated by formula (17) is larger than the set value ε (ε > 0), the chaotic interval [0, 1] is remapped onto the variable interval [a_n, b_n].
CN202210497845.0A 2022-05-09 2022-05-09 Engineering digital information analysis method based on computer algorithm model Active CN114997278B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210497845.0A CN114997278B (en) 2022-05-09 2022-05-09 Engineering digital information analysis method based on computer algorithm model

Publications (2)

Publication Number Publication Date
CN114997278A true CN114997278A (en) 2022-09-02
CN114997278B CN114997278B (en) 2023-04-07

Family

ID=83024950

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210497845.0A Active CN114997278B (en) 2022-05-09 2022-05-09 Engineering digital information analysis method based on computer algorithm model

Country Status (1)

Country Link
CN (1) CN114997278B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110131093A1 (en) * 2009-11-30 2011-06-02 Yahoo! Inc. System and method for optimizing selection of online advertisements
US20150127419A1 (en) * 2013-11-04 2015-05-07 Oracle International Corporation Item-to-item similarity generation
CN105930872A (en) * 2016-04-28 2016-09-07 上海应用技术学院 Bus driving state classification method based on class-similar binary tree support vector machine
CN108205805A (en) * 2016-12-20 2018-06-26 北京大学 The dense corresponding auto-creating method of voxel between pyramidal CT image
CN109753680A (en) * 2018-11-20 2019-05-14 南京南瑞集团公司 A kind of swarm of particles intelligent method based on chaos masking mechanism
CN110334108A (en) * 2019-06-18 2019-10-15 浙江工业大学 A kind of three-dimensional CAD model similarity calculation method based on discrete bat algorithm
CN111199154A (en) * 2019-12-20 2020-05-26 重庆邮电大学 Fault-tolerant rough set-based polysemous word expression method, system and medium
CN114022698A (en) * 2021-10-15 2022-02-08 华中科技大学 Multi-tag behavior identification method and device based on binary tree structure
CN114241381A (en) * 2021-12-17 2022-03-25 广东开放大学(广东理工职业学院) Event extraction and prediction method based on time sequence event and semantic background

Also Published As

Publication number Publication date
CN114997278B (en) 2023-04-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant