CN110378569A - Industrial relations chain building method, apparatus, equipment and storage medium - Google Patents
- Publication number
- CN110378569A (application CN201910548138.8A)
- Authority
- CN
- China
- Prior art keywords
- industry
- data
- industrial
- history
- list item
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/067—Enterprise or organisation modelling
Landscapes
- Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Human Resources & Organizations (AREA)
- Strategic Management (AREA)
- Economics (AREA)
- Theoretical Computer Science (AREA)
- Entrepreneurship & Innovation (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Tourism & Hospitality (AREA)
- Quality & Reliability (AREA)
- Operations Research (AREA)
- Marketing (AREA)
- General Business, Economics & Management (AREA)
- Game Theory and Decision Science (AREA)
- Educational Administration (AREA)
- Development Economics (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention relates to industry chain construction and discloses an industry relation chain construction method, apparatus, device, and storage medium. The method comprises: reading historical industry data and preprocessing it to obtain valid industry data; performing cluster analysis on the valid industry data with a clustering algorithm to obtain to-be-associated industry data of different cluster categories; converting the to-be-associated industry data into transaction-type data based on the cluster categories, and performing industry association analysis on the transaction-type data with an association analysis algorithm to obtain an industry association result; and constructing an industry relation chain according to the industry association result. Because the method starts from historical industry data and applies cluster analysis and association analysis to cluster the industries in that data and mine the associations among them, the industry chain finally constructed can cover almost all industries, giving broad industry coverage together with relatively high accuracy.
Description
Technical field
The present invention relates to the technical field of industry chains, and in particular to an industry relation chain construction method, apparatus, device, and storage medium.
Background art
Existing industry chain modeling builds big data models from the big data that human subjects generate in production, work, and daily life. It mines industry chains mainly from the perspective of enterprises: most methods propose industry chain modeling and analysis based on micro-level enterprise information and pay little attention to industry-wide indicators, or they analyze the industry chain of one specific industry without analyzing industries as a whole. As a result, government regulatory work lacks an effective method for identifying industry chains as a whole. How to construct industry chains comprehensively and effectively, so that different industries can be located accurately within a chain, has therefore become an urgent problem.
The above content is only intended to aid understanding of the technical solution of the present invention and is not an admission that it constitutes prior art.
Summary of the invention
The main purpose of the present invention is to provide an industry relation chain construction method, apparatus, device, and storage medium, aiming to solve the technical problem that the prior art cannot construct industry chains comprehensively and effectively or guarantee the accuracy of industry positioning.
To achieve the above object, the present invention provides an industry relation chain construction method, the method comprising the following steps:
reading historical industry data, and preprocessing the historical industry data to obtain valid industry data;
performing cluster analysis on the valid industry data using a clustering algorithm, to obtain to-be-associated industry data of different cluster categories;
converting the to-be-associated industry data into transaction-type data based on the cluster categories, and performing industry association analysis on the transaction-type data using an association analysis algorithm, to obtain an industry association result;
constructing an industry relation chain according to the industry association result.
Preferably, the step of reading historical industry data and preprocessing the historical industry data to obtain valid industry data comprises:
reading the historical industry data, and obtaining the data type of every item of industry data in the historical industry data;
checking the data types and, when a target data type that does not belong to the preset data type is detected, performing data type conversion on the target industry data corresponding to the target data type;
taking the historical industry data after data type conversion as the valid industry data.
Preferably, the step of taking the historical industry data after data type conversion as the valid industry data further comprises:
writing the historical industry data after data type conversion into a preset spreadsheet, to obtain an initial industry data table;
when a missing-value entry is detected in the initial industry data table, filling the missing-value entry with a numeric value, to obtain a complete industry data table;
converting the numeric data stored in the complete industry data table into numeric data of a preset dimension to obtain a valid industry data table, and taking the data in the valid industry data table as the valid industry data.
Preferably, the step of filling the missing-value entry with a numeric value when a missing-value entry is detected in the initial industry data table, to obtain a complete industry data table, comprises:
when a missing-value entry is detected in the initial industry data table, obtaining the value of the adjacent entry before and the value of the adjacent entry after the missing-value entry in its column;
calculating the average of the preceding adjacent value and the following adjacent value, and filling the missing-value entry with that average, to obtain the complete industry data table.
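The neighbor-average filling described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name, the use of `None` to mark a missing entry, and the decision to leave edge cases unfilled are all assumptions, since the text does not say how a gap in the first or last row, or two consecutive gaps, should be handled.

```python
from typing import List, Optional


def fill_missing_with_neighbor_mean(
    column: List[Optional[float]],
) -> List[Optional[float]]:
    """Fill each missing entry (None) with the mean of its adjacent entries.

    Only the case described in the text is handled: both the entry directly
    before and the entry directly after the gap are present in the column.
    """
    filled = list(column)
    for i, value in enumerate(column):
        if value is not None:
            continue
        prev_value = column[i - 1] if i > 0 else None
        next_value = column[i + 1] if i + 1 < len(column) else None
        if prev_value is not None and next_value is not None:
            filled[i] = (prev_value + next_value) / 2
        # Assumption: gaps without two present neighbors stay unfilled.
    return filled


# Hypothetical column of quarterly values with one missing entry.
print(fill_missing_with_neighbor_mean([100.0, None, 120.0, 130.0]))
# [100.0, 110.0, 120.0, 130.0]
```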
Preferably, the step of performing cluster analysis on the valid industry data using a clustering algorithm, to obtain to-be-associated industry data of different cluster categories, comprises:
clustering all industry data in the valid industry data according to a preset time granularity using the clustering algorithm, to obtain cluster data;
determining the similarity between the industry categories in the valid industry data according to the cluster data;
assigning industry categories whose similarity exceeds a preset threshold to the same cluster category, and traversing the cluster data to obtain the to-be-associated industry data of different cluster categories.
Preferably, the step of converting the to-be-associated industry data into transaction-type data based on the cluster categories, and performing industry association analysis on the transaction-type data using an association analysis algorithm to obtain an industry association result, comprises:
generating a transaction identifier for each industry category based on the preset time granularity and the cluster category;
binding the industry categories that share the same transaction identifier to obtain industry pairs, and saving the industry pairs to a transaction data table;
performing industry association analysis on the transaction-type data stored in the transaction data table using the association analysis algorithm, to obtain the industry association result.
Preferably, the step of constructing an industry relation chain according to the industry association result comprises:
reading the lift of each industry pair contained in the industry association result, each industry pair including at least two industry categories;
checking whether the lift exceeds a preset value and, if it does, determining that the industry pair is an associated industry pair;
determining the base industry category and the antecedent industry category in the associated industry pair according to the support and confidence of the associated industry pair;
determining the industry relation chain according to the base industry category and the antecedent industry category of each associated industry pair.
In addition, to achieve the above object, the present invention also proposes an industry relation chain construction apparatus, the apparatus comprising:
a data processing module, configured to read historical industry data and preprocess the historical industry data to obtain valid industry data;
a data clustering module, configured to perform cluster analysis on the valid industry data using a clustering algorithm, to obtain to-be-associated industry data of different cluster categories;
an association analysis module, configured to convert the to-be-associated industry data into transaction-type data based on the cluster categories, and to perform industry association analysis on the transaction-type data using an association analysis algorithm, to obtain an industry association result;
an industry association module, configured to construct an industry relation chain according to the industry association result.
In addition, to achieve the above object, the present invention also proposes an industry relation chain construction device, the device comprising: a memory, a processor, and an industry relation chain construction program that is stored on the memory and can run on the processor, the industry relation chain construction program being configured to implement the steps of the industry relation chain construction method described above.
In addition, to achieve the above object, the present invention also proposes a storage medium on which an industry relation chain construction program is stored, the industry relation chain construction program implementing the steps of the industry relation chain construction method described above when executed by a processor.
The present invention reads historical industry data and preprocesses it to obtain valid industry data; performs cluster analysis on the valid industry data using a clustering algorithm to obtain to-be-associated industry data of different cluster categories; converts the to-be-associated industry data into transaction-type data based on the cluster categories, and performs industry association analysis on the transaction-type data using an association analysis algorithm to obtain an industry association result; and constructs an industry relation chain according to the industry association result. Because the method starts from historical industry data and applies cluster analysis and association analysis to cluster the industries in that data and mine the associations among them, the industry chain finally constructed has broad coverage together with relatively high accuracy.
Brief description of the drawings
Fig. 1 is a schematic structural diagram of the industry relation chain construction device in the hardware operating environment involved in embodiments of the present invention;
Fig. 2 is a flow diagram of the first embodiment of the industry relation chain construction method of the present invention;
Fig. 3 is a flow diagram of the second embodiment of the industry relation chain construction method of the present invention;
Fig. 4 is a flow diagram of the third embodiment of the industry relation chain construction method of the present invention;
Fig. 5 is a structural block diagram of the first embodiment of the industry relation chain construction apparatus of the present invention.
The realization of the object, the functional features, and the advantages of the present invention will be further described in conjunction with the embodiments and with reference to the accompanying drawings.
Detailed description of the embodiments
It should be understood that the specific embodiments described herein are only used to explain the present invention and are not intended to limit it.
Referring to Fig. 1, Fig. 1 is a schematic structural diagram of the industry relation chain construction device in the hardware operating environment involved in embodiments of the present invention.
As shown in Fig. 1, the industry relation chain construction device may include: a processor 1001, such as a central processing unit (Central Processing Unit, CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 implements the connection and communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard; optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a Wireless Fidelity (WI-FI) interface). The memory 1005 may be a high-speed random access memory (Random Access Memory, RAM) or a stable non-volatile memory (Non-Volatile Memory, NVM), such as a disk memory. The memory 1005 may optionally also be a storage device independent of the aforementioned processor 1001.
Those skilled in the art will understand that the structure shown in Fig. 1 does not limit the industry relation chain construction device, which may include more or fewer components than illustrated, combine certain components, or use a different arrangement of components.
As shown in Fig. 1, the memory 1005, as a storage medium, may include an operating system, a data storage module, a network communication module, a user interface module, and an industry relation chain construction program.
In the industry relation chain construction device shown in Fig. 1, the network interface 1004 is mainly used for data communication with a network server, and the user interface 1003 is mainly used for data interaction with the user. The processor 1001 and the memory 1005 may be arranged in the industry relation chain construction device, which calls the industry relation chain construction program stored in the memory 1005 through the processor 1001 and executes the industry relation chain construction method provided by embodiments of the present invention.
An embodiment of the present invention provides an industry relation chain construction method. Referring to Fig. 2, Fig. 2 is a flow diagram of the first embodiment of the industry relation chain construction method of the present invention.
In this embodiment, the industry relation chain construction method comprises the following steps:
Step S10: reading historical industry data, and preprocessing the historical industry data to obtain valid industry data;
It should be noted that the method may be executed by a computing service device with data processing capability, such as a smartphone, a tablet computer, or a PC (hereinafter referred to as the modeling terminal). The historical industry data includes industry data of different dimensions and different years, such as industry name, time, quarter (or month), regional GDP, year-on-year growth rate, number of newly registered enterprises, number of enterprises above designated size, added value of enterprises above designated size, extent of losses, total profit, and finished goods inventory. The historical industry data is usually saved in tabular form, as shown in Table 1 below, the historical industry data table.
Table 1: Historical industry data table
In a specific implementation, the modeling terminal may obtain the pre-stored historical industry data table from a database, read the historical industry data from that table, and then preprocess the historical industry data to obtain the valid industry data.
Here, preprocessing means applying data type conversion, missing-value filling, and/or dimension standardization to the industry data in the historical industry data table. Data type conversion converts string data into numeric data. Missing-value filling fills in the blank entries of the table. Dimension standardization unifies numeric data of different dimensions to the same dimension.
Step S20: performing cluster analysis on the valid industry data using a clustering algorithm, to obtain to-be-associated industry data of different cluster categories;
It should be noted that in this embodiment the clustering algorithm is the k-means algorithm, which takes the number of clusters k and a database containing n data objects as input and outputs k clusters that satisfy the minimum-variance criterion. The k-means algorithm receives the input k and then partitions the n data objects into k clusters such that objects in the same cluster have high similarity and objects in different clusters have low similarity.
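As a rough illustration of the k-means procedure just described, the sketch below implements the standard assign-and-update loop in plain Python. It is not the patent's implementation: the two-feature vectors, the choice of the first k points as initial centroids, and all names are assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(
    points: List[Point], k: int, max_iterations: int = 100
) -> Tuple[List[int], List[List[float]]]:
    """Partition points into k clusters; returns (labels, centroids)."""
    # Assumption: the first k points serve as initial centroids.
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Hypothetical, already standardized feature vectors for four industries.
features = [(0.0, 0.0), (10.0, 10.0), (0.5, 0.2), (9.5, 10.1)]
labels, _ = k_means(features, k=2)
print(labels)  # [0, 1, 0, 1]
```

In practice a library implementation with better initialization would normally be preferred; the loop above only shows the two alternating steps.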
In addition, the cluster category is the category label computed by the clustering algorithm. Industry types with the same cluster category have relatively high similarity, for example agriculture and the textile industry.
As Table 1 shows, the industry data in the historical industry data table is all collected per unit of time (such as year, quarter, or month). In this embodiment, the modeling terminal can therefore use the k-means algorithm to cluster the historical industry data according to a preset time granularity (such as year or quarter), to obtain the to-be-associated industry data of different cluster categories. For example, all industry types can be clustered at the time granularity "first quarter of 2017". The industry types may be the 97 major categories of the national economic industry classification, or the industry classifications commonly used in public documents, such as the seven major strategic emerging industries or the five major future industries.
In a specific implementation, the modeling terminal may use the clustering algorithm to cluster all industry data in the valid industry data according to the preset time granularity, obtaining cluster data; then determine the similarity between the industry categories in the valid industry data according to the cluster data; then assign industry categories whose similarity exceeds a preset threshold to the same cluster category; and finally traverse the cluster data to obtain the to-be-associated industry data of different cluster categories.
Further, in this embodiment, after obtaining the to-be-associated industry data of different cluster categories, the modeling terminal may store it in tabular form, as shown in Table 2 below, the to-be-associated industry data table.
Table 2: To-be-associated industry data table
Industry category | Year | Quarter | Cluster category |
Agriculture | 2017 | 1 | 1 |
Fishery | 2017 | 1 | 1 |
Real estate | 2017 | 1 | 3 |
... | ... | ... | ... |
Step S30: converting the to-be-associated industry data into transaction-type data based on the cluster categories, and performing industry association analysis on the transaction-type data using an association analysis algorithm, to obtain an industry association result;
It should be understood that association analysis, also known as association mining, means searching transaction data, relational data, or other information carriers for frequent patterns, associations, correlations, or causal structures that exist between sets of items or sets of objects.
In this step, the modeling terminal first converts the to-be-associated industry data into transaction-type data based on the cluster categories, and then performs industry association analysis on the transaction-type data to obtain the industry association result.
Specifically, the modeling terminal may generate a transaction identifier for each industry category based on the preset time granularity and the cluster category; bind the industry categories that share the same transaction identifier to obtain industry pairs, and save the industry pairs to a transaction data table; and perform industry association analysis on the transaction-type data stored in the transaction data table using the association analysis algorithm, to obtain the industry association result.
Further, in this embodiment, the transaction identifier may be an identifying string generated according to the preset generation policy "time granularity (year/quarter) + cluster category". For example, the transaction identifier may be set as an 8-digit number in which, from the highest digit to the lowest, the first 4 digits are the year, digits 5-6 are the quarter (or month), and digits 7-8 are the cluster category; a quarter (or month) and/or cluster category shorter than 2 digits is left-padded with 0. For instance, the transaction identifier for cluster 1 in the first quarter of 2017 is 20170101.
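The 8-digit identifier layout described above maps directly onto zero-padded string formatting. The following is a small sketch; the function name and the range checks are assumptions added for illustration.

```python
def make_transaction_id(year: int, period: int, cluster: int) -> str:
    """Build an 8-digit transaction identifier: YYYY + period + cluster.

    `period` is the quarter (or month) and `cluster` is the cluster
    category; both are left-padded with 0 to two digits.
    """
    if not (0 <= year <= 9999 and 0 <= period <= 99 and 0 <= cluster <= 99):
        raise ValueError("value does not fit the 4+2+2 digit layout")
    return f"{year:04d}{period:02d}{cluster:02d}"


print(make_transaction_id(2017, 1, 1))  # 20170101
```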
In a specific implementation, after obtaining the transaction identifier of each industry category, the modeling terminal may bind the industry categories that share the same transaction identifier to obtain industry pairs, and save the industry pairs to a transaction data table, as shown in Table 3, the transaction data table.
Table 3: Transaction data table
Transaction ID | Industry categories |
20170101 | Agriculture, Fishery, ... |
20170103 | Real estate, ... |
... | ... |
It should be noted that the transaction ID (Identification) in Table 3 is the transaction identifier described above. An industry pair is a set of associated industry categories that share the same transaction identifier; for example, (Agriculture, Fishery) in Table 3 is an industry pair.
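The conversion from the to-be-associated industry data table (Table 2) to the transaction data table (Table 3) amounts to grouping industry categories by transaction identifier. Here is a minimal sketch, assuming each row is a tuple of (industry category, year, quarter, cluster category); the row layout and function name are illustrative, not taken from the source.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Tuple[str, int, int, int]  # (industry category, year, quarter, cluster)


def build_transaction_table(rows: List[Row]) -> Dict[str, List[str]]:
    """Group industry categories under their shared transaction identifier."""
    table: Dict[str, List[str]] = defaultdict(list)
    for industry, year, quarter, cluster in rows:
        transaction_id = f"{year:04d}{quarter:02d}{cluster:02d}"
        table[transaction_id].append(industry)
    return dict(table)


# The three rows shown in Table 2.
rows = [
    ("Agriculture", 2017, 1, 1),
    ("Fishery", 2017, 1, 1),
    ("Real estate", 2017, 1, 3),
]
print(build_transaction_table(rows))
# {'20170101': ['Agriculture', 'Fishery'], '20170103': ['Real estate']}
```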
Further, after obtaining the transaction data table, the modeling terminal can use the association analysis algorithm to perform industry association analysis on the transaction-type data stored in the transaction data table, to obtain the industry association result.
Step S40: constructing an industry relation chain according to the industry association result.
It should be understood that association analysis can discover connections between transactions, such as association rules or frequent itemsets. An association rule is usually measured by three indicators: support, confidence, and lift.
Support: the proportion of all transactions that contain both A and B. If P(A) denotes the proportion of transactions that contain A, then Support = P(A&B).
Confidence: among the transactions that contain A, the proportion that also contain B, that is, the ratio of transactions containing both A and B to transactions containing A. As a formula: Confidence = P(A&B) / P(A).
Lift: the ratio of "the proportion of transactions containing A that also contain B" to "the proportion of transactions that contain B". As a formula: Lift = (P(A&B) / P(A)) / P(B) = P(A&B) / (P(A) × P(B)).
Lift reflects the correlation between A and B in an association rule: lift > 1 indicates positive correlation, and the higher the value the stronger it is; lift < 1 indicates negative correlation, and the lower the value the stronger it is; lift = 1 indicates no correlation.
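The three indicators can be computed directly by counting transactions. The sketch below follows the formulas above; the function name and the sample transactions are invented for illustration and are not data from the source.

```python
from typing import List, Set, Tuple


def rule_metrics(
    transactions: List[Set[str]], a: str, b: str
) -> Tuple[float, float, float]:
    """Return (support, confidence, lift) for the rule A => B."""
    n = len(transactions)
    count_a = sum(1 for t in transactions if a in t)
    count_b = sum(1 for t in transactions if b in t)
    count_ab = sum(1 for t in transactions if a in t and b in t)
    if n == 0 or count_a == 0 or count_b == 0:
        raise ValueError("A and B must each appear in at least one transaction")
    support = count_ab / n              # P(A&B)
    confidence = count_ab / count_a     # P(A&B) / P(A)
    lift = confidence / (count_b / n)   # P(A&B) / (P(A) * P(B))
    return support, confidence, lift


# Hypothetical transactions, one set of industry categories per transaction ID.
transactions = [
    {"Agriculture", "Textile industry"},
    {"Agriculture", "Textile industry"},
    {"Agriculture"},
    {"Real estate"},
]
support, confidence, lift = rule_metrics(transactions, "Agriculture", "Textile industry")
print(round(support, 3), round(confidence, 3), round(lift, 3))  # 0.5 0.667 1.333
```

With these sample transactions the lift is above 1, so the rule "Agriculture => Textile industry" would count as positively correlated under the interpretation given above.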
In a specific implementation, after the modeling terminal performs industry association analysis on the transaction-type data stored in the transaction data table, the industry association result obtained may be as shown in Table 4 below, the association rule table.
Table 4: Association rule table
It should be noted that each association rule row in Table 4 indicates that, with support α, confidence β, and lift θ, the industry category B that is associated with industry category A can be derived from industry category A. Here α and β take values between 0 and 1. When industry category B is taken as the base industry category, industry category A is the antecedent industry category.
In a specific implementation, after obtaining the industry association result, the modeling terminal in this embodiment can read the lift of each industry pair contained in the industry association result, each industry pair including at least two industry categories; check whether the lift exceeds a preset value (generally set to 1) and, if it does, determine that the industry pair is an associated industry pair; then determine the base industry category and the antecedent industry category in the associated industry pair according to the support and confidence of the associated industry pair; and finally determine the industry relation chain according to the base industry category and the antecedent industry category of each associated industry pair.
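The lift filter in this step can be sketched as below. The `AssociationRule` record, its field names, and the numeric values are assumptions for illustration. The source does not spell out how support and confidence fix the direction of a pair, so the sketch simply takes the direction from the rule itself (A => B, with A as antecedent and B as base).

```python
from typing import List, NamedTuple, Tuple


class AssociationRule(NamedTuple):
    antecedent: str   # antecedent industry category (A in "A => B")
    base: str         # base industry category (B in "A => B")
    support: float
    confidence: float
    lift: float


def select_associated_pairs(
    rules: List[AssociationRule], min_lift: float = 1.0
) -> List[Tuple[str, str]]:
    """Keep (antecedent, base) pairs whose lift exceeds the preset value."""
    return [(r.antecedent, r.base) for r in rules if r.lift > min_lift]


# Hypothetical association results; the numbers are invented for the example.
rules = [
    AssociationRule("Agriculture", "Textile industry", 0.50, 0.67, 1.33),
    AssociationRule("Agriculture", "Real estate", 0.10, 0.13, 0.80),
]
print(select_associated_pairs(rules))  # [('Agriculture', 'Textile industry')]
```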
It should further be noted that an antecedent industry category may itself be the base industry category of another antecedent industry category. For example, "agriculture" is the antecedent industry category of "textile industry" and is also the base industry category of "wood processing industry", so the industry relation chain for these three is "wood processing industry => agriculture => textile industry". In practice, however, the two industries in an associated pair may be antecedents of each other. In the industry pair "agriculture, textile industry", for example, when the antecedent industry category is "agriculture" the base industry category is "textile industry", and when the antecedent industry category is "textile industry" the base industry category is "agriculture". The industry relation chain finally obtained for "agriculture, textile industry, wood processing industry" is therefore "wood processing industry => agriculture <=> textile industry".
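The chain notation in this example can be reproduced with a short traversal over directed (antecedent, base) pairs. This is only a sketch under simplifying assumptions: the chain is linear, a starting industry is supplied by the caller, and a pair that appears in both directions is rendered with "<=>". The function name is hypothetical.

```python
from typing import List, Tuple


def render_chain(edges: List[Tuple[str, str]], start: str) -> str:
    """Walk (antecedent, base) pairs from `start` and render the chain."""
    edge_set = set(edges)
    chain = start
    current = start
    visited = {start}
    while True:
        successors = sorted(
            base for antecedent, base in edge_set
            if antecedent == current and base not in visited
        )
        if not successors:
            break
        nxt = successors[0]  # assumption: follow one successor (linear chain)
        # Mutual antecedents are shown as a bidirectional link.
        arrow = "<=>" if (nxt, current) in edge_set else "=>"
        chain += f" {arrow} {nxt}"
        visited.add(nxt)
        current = nxt
    return chain


edges = [
    ("Wood processing industry", "Agriculture"),
    ("Agriculture", "Textile industry"),
    ("Textile industry", "Agriculture"),
]
print(render_chain(edges, "Wood processing industry"))
# Wood processing industry => Agriculture <=> Textile industry
```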
This embodiment reads historical industry data and preprocesses it to obtain valid industry data; performs cluster analysis on the valid industry data using a clustering algorithm to obtain to-be-associated industry data of different cluster categories; converts the to-be-associated industry data into transaction-type data based on the cluster categories, and performs industry association analysis on the transaction-type data using an association analysis algorithm to obtain an industry association result; and constructs an industry relation chain according to the industry association result. Because the method starts from historical industry data and applies cluster analysis and association analysis to cluster the industries in that data and mine the associations among them, the industry chain finally constructed has broad coverage together with relatively high accuracy.
Referring to Fig. 3, Fig. 3 is a flow diagram of the second embodiment of the industry relation chain construction method of the present invention.
Based on the first embodiment above, in this embodiment step S10 includes:
Step S101: reading the historical industry data, and obtaining the data type of every item of industry data in the historical industry data;
It should be understood that terminals save data in different data types in different scenarios, so in the historical industry data that the modeling terminal reads from an external terminal, not all data that relates to numeric values (such as dates and times) is stored in a numeric format (such as 2017, 2018, 1, 2, 3). The modeling terminal therefore needs to format the data that is not in a numeric format, to keep the data format uniform.
It will be appreciated that the data type in this embodiment is the data format, that is, the rule describing how data is saved in a file or record. It can be a text format in character form or a compressed format in binary form.
In a specific implementation, the modeling terminal may call the parseInt() function through a pre-written JavaScript script to obtain the data format of all industry data, and then format the data that needs formatting.
Step S102: checking the data types and, when a target data type that does not belong to the preset data type is detected, performing data type conversion on the target industry data corresponding to the target data type;
It should be noted that the preset data type may be a preset data format; in this embodiment the preset data type is preferably numeric data.
Specifically, the modeling terminal can check the data types obtained and, when it detects a target data type that does not belong to the preset data type (for example a string in binary, octal, hexadecimal, or any other base), call the parseInt() function to convert the target industry data corresponding to the target data type into data of the numeric data type.
Step S103: take the historical industry data after data type conversion as the effective industry data.
In a specific implementation, after converting all string data in the historical industry data into numeric data, the modeling terminal may take the converted historical industry data as the effective industry data for the subsequent cluster analysis.
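The type check and conversion of steps S101 to S103 can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the JavaScript script of the embodiment: Python's built-in `int(text, 0)` stands in for `parseInt()`, and the function names `to_numeric` and `convert_records`, the record layout and the sample values are hypothetical.

```python
def to_numeric(value):
    """Return value as a number, converting strings that encode numbers.

    Strings with a 0b / 0o / 0x prefix are parsed in that radix; other
    strings are parsed as decimal integers or, failing that, as floats.
    """
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return value  # already of the preset (numeric) data type
    text = str(value).strip()
    try:
        # base 0 lets int() infer the radix from a 0b / 0o / 0x prefix
        return int(text, 0)
    except ValueError:
        return float(text)  # raises ValueError if the text is not numeric


def convert_records(records, numeric_fields):
    """Convert the listed fields of every record to the numeric data type."""
    return [
        {k: (to_numeric(v) if k in numeric_fields else v) for k, v in row.items()}
        for row in records
    ]


# Hypothetical historical industry data with mixed storage formats.
history = [
    {"industry": "agriculture", "year": "2017", "quarter": "0b1", "gdp": "0xC8"},
    {"industry": "agriculture", "year": 2017, "quarter": 2, "gdp": "300"},
]
effective = convert_records(history, {"year", "quarter", "gdp"})
```

After conversion every listed field holds a number, so the records can be treated as effective industry data by the later steps.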
In this embodiment, the historical industry data after data type conversion is written into a preset spreadsheet to obtain an initial industry data table; when a missing-value entry is detected in the initial industry data table, the missing-value entry is filled with a numeric value to obtain a complete industry data table; and the numeric data stored in the complete industry data table is converted into numeric data of a preset dimension to obtain an effective industry data table, the data in which is taken as the effective industry data. As a result, the industry data on which the cluster analysis operates has a uniform dimension and high accuracy, which safeguards the efficiency of the cluster analysis and improves the accuracy of the analysis result.
Referring to Fig. 4, Fig. 4 is a schematic flowchart of a third embodiment of the industrial relations chain building method of the present invention.
Based on the above embodiments, in this embodiment, step S103 may specifically include the following steps:
Step S1030: write the historical industry data after data type conversion into a preset spreadsheet to obtain an initial industry data table;
It should be understood that the preset electronic table, also known as a spreadsheet, is a type of computer program that simulates a paper calculation sheet and displays a grid made up of rows and columns. Cells can hold numeric values, formulas or text. The preset spreadsheet in this embodiment is preferably an Excel table.
In a specific implementation, after performing data type conversion on the historical industry data, the modeling terminal may write the data into an Excel table under the configured table entries (such as industry name, year, quarter, regional GDP, etc.) to obtain the initial industry data table.
Step S1031: when a missing-value entry is detected in the initial industry data table, fill the missing-value entry with a numeric value to obtain a complete industry data table;
It should be noted that a missing-value entry is an entry in the initial spreadsheet that is vacant or has no content filled in. To ensure that the subsequent industrial chain can be built smoothly, in this embodiment the modeling terminal fills a missing-value entry with a numeric value when it detects one in the initial industry data table. Specifically, when the modeling terminal detects a missing-value entry in the initial industry data table, it obtains the preceding value and the following value of the adjacent entries in the column containing the missing-value entry, calculates the average of the preceding value and the following value, and fills the missing-value entry with that average to obtain the complete industry data table.
It should be understood that this choice is made to keep the filled values accurate: in most tables, the data in any one column shares the same dimension or type. In this embodiment, when the modeling terminal detects a missing-value entry in the initial industry data table, it therefore obtains the preceding value and the following value of the adjacent entries in the column containing the missing-value entry, calculates their average, and fills the missing-value entry with that average. For example, suppose the regional GDP of the industry type "agriculture" for the second quarter of 2017 is a missing-value entry, and the modeling terminal finds by query that the regional GDP of "agriculture" for the first and third quarters of 2017 is 20 billion and 30 billion respectively. It can then calculate the average regional GDP of the first and third quarters, (200 + 300) / 2 = 250 (in units of 100 million), and take 25 billion as the regional GDP of "agriculture" for the second quarter of 2017.
Of course, in this embodiment the missing value may also be filled by obtaining the average of the whole column containing the missing-value entry and using that column average as the fill value, or it may simply be replaced with 0. This embodiment does not restrict the specific filling method.
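The three filling options described above can be sketched as follows. This is a minimal Python sketch under stated assumptions: a column is modeled as a list in which `None` marks a missing-value entry, the function name `fill_missing` and its `strategy` parameter are hypothetical, and "adjacent" is read as the nearest known value on each side (only one side is used when the gap sits at the start or end of the column).

```python
def fill_missing(column, strategy="neighbors"):
    """Fill None entries in a list of numbers.

    strategy: "neighbors" averages the nearest known values before and
    after the gap, "column_mean" uses the mean of all known values, and
    "zero" writes 0.
    """
    known = [v for v in column if v is not None]
    if strategy != "zero" and not known:
        raise ValueError("column has no known values to average")
    filled = list(column)
    for i, value in enumerate(column):
        if value is not None:
            continue
        if strategy == "zero":
            filled[i] = 0
        elif strategy == "column_mean":
            filled[i] = sum(known) / len(known)
        else:
            before = next((v for v in reversed(column[:i]) if v is not None), None)
            after = next((v for v in column[i + 1:] if v is not None), None)
            neighbors = [v for v in (before, after) if v is not None]
            filled[i] = sum(neighbors) / len(neighbors)
    return filled


# Regional GDP of "agriculture" by quarter, in units of 100 million;
# the second quarter is missing. The fourth-quarter value is invented.
gdp_2017 = [200, None, 300, 280]
print(fill_missing(gdp_2017))                 # neighbor average
print(fill_missing(gdp_2017, "column_mean"))  # whole-column average
print(fill_missing(gdp_2017, "zero"))         # replace with 0
```

With the neighbor strategy the second quarter is filled with 250, which matches the worked example in the text.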
Step S1032: convert the numeric data stored in the complete industry data table into numeric data of a preset dimension to obtain an effective industry data table, and take the data in the effective industry data table as the effective industry data.
It should be understood that industry data has many data dimensions, and the dimension or unit differs from one to another. For example, regional GDP ranges from 0 to 100 billion, while the number of enterprises above a designated size ranges from 0 to 100, so the two do not share the same dimension. To improve the computational efficiency of the modeling terminal, in this embodiment the modeling terminal also converts the dimension of the numeric data stored in the complete industry data table, unifying it to the same preset dimension (for example, a range with mean 0 and variance 1). Alternatively, the numeric data of each column is sorted from smallest to largest and then divided into several intervals by equal-frequency binning, with each interval represented by a single number. For example, a set of enterprise-count data (21, 23, 24, 26, 32, 38, 40, 45) can be mapped by equal frequency into 4 intervals, denoted 1, 2, 3 and 4, that is: 21→1, 23→1, 24→2, 26→2, 32→3, 38→3, 40→4, 45→4, so that (21, 23, 24, 26, 32, 38, 40, 45) becomes (1, 1, 2, 2, 3, 3, 4, 4).
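Both dimension conversions mentioned above can be sketched as follows. This is a minimal Python sketch under stated assumptions: the function names `standardize` and `equal_frequency_bins` are hypothetical, standardization uses the population standard deviation, and ties between equal values are broken by position, which the text does not specify.

```python
import statistics


def standardize(values):
    """Rescale values to mean 0 and population variance 1."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # zero if all values are equal
    return [(v - mean) / std for v in values]


def equal_frequency_bins(values, n_bins):
    """Map each value to a label 1..n_bins so that bins hold equal counts."""
    # Indices of the values in ascending order of value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, index in enumerate(order):
        labels[index] = rank * n_bins // len(values) + 1
    return labels


# Enterprise-count data from the example in the text.
counts = [21, 23, 24, 26, 32, 38, 40, 45]
binned = equal_frequency_bins(counts, 4)
z = standardize(counts)
print(binned)
```

Binning the eight sample values into four intervals puts two values in each interval, reproducing the mapping given in the text.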
In a specific implementation, the modeling terminal may convert the numeric data stored in the complete industry data table into numeric data of the preset dimension to obtain the effective industry data table, and then use the data in the effective industry data table as the effective industry data for the subsequent cluster analysis.
In this embodiment, when a missing-value entry is detected in the initial industry data table, the preceding value and the following value of the adjacent entries in the column containing the missing-value entry are obtained; the average of the preceding value and the following value is calculated, and the missing-value entry is filled with that average to obtain the complete industry data table. For missing-value entries in the data table, this embodiment thus fills each one with the average of the adjacent entries in its column, which keeps the filled data accurate and reduces accidental error.
In addition, an embodiment of the present invention also proposes a storage medium on which an industrial relations chain building program is stored. When executed by a processor, the industrial relations chain building program implements the steps of the industrial relations chain building method described above.
Referring to Fig. 5, Fig. 5 is a structural block diagram of a first embodiment of the industrial relations chain building device of the present invention.
As shown in Fig. 5, the industrial relations chain building device proposed by the embodiment of the present invention includes:
a data processing module 501, configured to read historical industry data and preprocess the historical industry data to obtain effective industry data;
a data clustering module 502, configured to perform cluster analysis on the effective industry data using a clustering algorithm, to obtain industry data to be associated of different cluster categories;
an association analysis module 503, configured to convert the industry data to be associated into transaction-type data based on the cluster categories, and to perform industry association analysis on the transaction-type data using an association analysis algorithm, to obtain an industry association result;
an industry association module 504, configured to construct an industrial relations chain according to the industry association result.
In this embodiment, historical industry data is read and preprocessed to obtain effective industry data; cluster analysis is performed on the effective industry data using a clustering algorithm to obtain industry data to be associated of different cluster categories; the industry data to be associated is converted into transaction-type data based on the cluster categories, and industry association analysis is performed on the transaction-type data using an association analysis algorithm to obtain an industry association result; and an industrial relations chain is constructed according to the industry association result. Because the device works from historical industry data and applies clustering and association analysis algorithms to cluster that data by industry and mine the associations within it, the industrial chain finally constructed has broad coverage and comparatively high accuracy.
Based on the above first embodiment of the industrial relations chain building device of the present invention, a second embodiment of the industrial relations chain building device of the present invention is proposed.
In this embodiment, the data processing module 501 is further configured to read the historical industry data and obtain the data type corresponding to every item of industry data in the historical industry data; to check the data types and, when a target data type that does not belong to the preset data type is detected, perform data type conversion on the target industry data corresponding to the target data type; and to take the historical industry data after data type conversion as the effective industry data.
Further, the data processing module 501 is also configured to write the historical industry data after data type conversion into a preset spreadsheet to obtain an initial industry data table; when a missing-value entry is detected in the initial industry data table, to fill the missing-value entry with a numeric value to obtain a complete industry data table; and to convert the numeric data stored in the complete industry data table into numeric data of a preset dimension to obtain an effective industry data table, and take the data in the effective industry data table as the effective industry data.
Further, the data processing module 501 is also configured, when a missing-value entry is detected in the initial industry data table, to obtain the preceding value and the following value of the adjacent entries in the column containing the missing-value entry; and to calculate the average of the preceding value and the following value and fill the missing-value entry with that average, to obtain the complete industry data table.
Further, the data clustering module 502 is also configured to cluster all the industry data in the effective industry data according to a preset time granularity using a clustering algorithm, to obtain cluster data; to determine the similarity between the industry categories in the effective industry data according to the cluster data; and to assign industry categories whose similarity exceeds a preset threshold to the same cluster category, traversing the cluster data to obtain the industry data to be associated of different cluster categories.
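The threshold-based grouping described above can be sketched as follows. This is a minimal Python sketch under stated assumptions that this passage does not confirm: each industry category is assumed to already carry one cluster label per time period (from whatever clustering algorithm is used), similarity is taken as the share of periods in which two categories received the same label, and categories are merged transitively with a union-find structure. All function names and sample values are hypothetical.

```python
from itertools import combinations


def co_cluster_similarity(labels_a, labels_b):
    """Share of time periods in which two industries got the same label."""
    same = sum(1 for a, b in zip(labels_a, labels_b) if a == b)
    return same / len(labels_a)


def group_by_similarity(period_labels, threshold):
    """Merge industries whose similarity exceeds threshold (union-find)."""
    parent = {name: name for name in period_labels}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path halving
            name = parent[name]
        return name

    for a, b in combinations(period_labels, 2):
        if co_cluster_similarity(period_labels[a], period_labels[b]) > threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in period_labels:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical per-quarter cluster labels for four industry categories.
period_labels = {
    "agriculture": [0, 0, 1, 1],
    "food processing": [0, 0, 1, 0],
    "steel": [1, 1, 0, 0],
    "machinery": [1, 1, 0, 1],
}
groups = group_by_similarity(period_labels, threshold=0.5)
print(groups)
```

Transitive merging is one possible design choice: it lets a category join a group through any sufficiently similar member, at the cost of occasionally chaining together categories that are not directly similar.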
Further, the association analysis module 503 is also configured to generate a transaction identifier for each industry category based on the preset time granularity and the cluster category; to bind the industry categories that share the same transaction identifier to obtain industry pairs, and save the industry pairs to a transaction data table; and to perform industry association analysis on the transaction-type data stored in the transaction data table using an association analysis algorithm, to obtain the industry association result.
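The conversion into transaction-type data can be sketched as follows. This is a minimal Python sketch under stated assumptions: the identifier format (`period-Ccluster`), the record layout of (period, cluster category, industry category) tuples, and the function names are hypothetical, and a plain dictionary stands in for the transaction data table.

```python
from collections import defaultdict
from itertools import combinations


def build_transactions(records):
    """Group industry categories by a (period, cluster) transaction identifier."""
    table = defaultdict(set)
    for period, cluster, industry in records:
        table[f"{period}-C{cluster}"].add(industry)
    return {tid: sorted(items) for tid, items in table.items()}


def industry_pairs(transactions):
    """List every unordered industry pair sharing a transaction identifier."""
    return {tid: list(combinations(items, 2)) for tid, items in transactions.items()}


# Hypothetical clustered records: (time period, cluster category, industry).
records = [
    ("2017Q1", 0, "agriculture"),
    ("2017Q1", 0, "food processing"),
    ("2017Q1", 1, "steel"),
    ("2017Q2", 0, "agriculture"),
    ("2017Q2", 0, "food processing"),
    ("2017Q2", 0, "logistics"),
]
transactions = build_transactions(records)
pairs = industry_pairs(transactions)
print(pairs["2017Q2-C0"])
```

Each transaction then plays the role of a basket in association rule mining, with industry categories as its items.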
Further, the industry association module 504 is also configured to read the lift corresponding to each industry pair contained in the industry association result, each industry pair containing at least two industry categories; to detect whether the lift exceeds a preset value and, if so, determine the industry pair to be an associated industry pair; to determine the base industry category and the preceding industry category in the associated industry pair according to the support and confidence corresponding to the associated industry pair; and to determine the industrial relations chain according to the base industry category and preceding industry category corresponding to each associated industry pair.
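The lift screening and direction assignment described above can be sketched as follows. This is a minimal Python sketch that uses the standard definitions support(A) = P(A), confidence(A→B) = P(A,B) / P(A) and lift(A,B) = P(A,B) / (P(A)·P(B)). How the base and preceding categories are derived from support and confidence is not spelled out in this passage, so the rule used here (the antecedent of the higher-confidence rule is the preceding category) is an assumption, as are the function name and the sample transactions.

```python
from itertools import combinations


def association_edges(transactions, min_lift=1.0):
    """Return (preceding, base, lift) for pairs whose lift exceeds min_lift."""
    n = len(transactions)
    baskets = [set(t) for t in transactions]
    items = sorted(set().union(*baskets))
    support = {i: sum(i in b for b in baskets) / n for i in items}
    edges = []
    for a, b in combinations(items, 2):
        joint = sum(a in bk and b in bk for bk in baskets) / n
        if joint == 0:
            continue  # the pair never occurs together
        lift = joint / (support[a] * support[b])
        if lift <= min_lift:
            continue  # not an associated industry pair
        conf_ab = joint / support[a]  # confidence of rule a -> b
        conf_ba = joint / support[b]  # confidence of rule b -> a
        # Assumed convention: the antecedent of the stronger rule is the
        # preceding category and its consequent is the base category.
        edges.append((a, b, lift) if conf_ab >= conf_ba else (b, a, lift))
    return edges


# Hypothetical transactions (industry categories sharing an identifier).
transactions = [
    ["agriculture", "food processing"],
    ["agriculture", "food processing", "logistics"],
    ["food processing", "logistics"],
    ["steel", "machinery"],
    ["steel", "machinery", "logistics"],
]
edges = association_edges(transactions)
for preceding, base, lift in edges:
    print(f"{preceding} -> {base} (lift {lift:.2f})")
```

Edges that share a category can then be joined end to end, for instance agriculture → food processing → logistics in this sample, to form a candidate industrial relations chain.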
For other embodiments or specific implementations of the industrial relations chain building device of the present invention, reference may be made to the method embodiments above; details are not repeated here.
It should be noted that, in this document, the terms "include", "comprise" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article or system that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or system. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or system that includes the element.
The serial numbers of the above embodiments of the present invention are for description only and do not represent the merits of the embodiments.
From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as read-only memory/random access memory, a magnetic disk or an optical disc) and includes several instructions for causing a terminal device (which may be a mobile phone, computer, server, air conditioner, network device, etc.) to execute the methods described in the embodiments of the present invention.
The above are only preferred embodiments of the present invention and do not thereby limit the patent scope of the invention. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, whether applied directly or indirectly in other related technical fields, is likewise included in the patent protection scope of the present invention.
Claims (10)
1. An industrial relations chain building method, characterized in that the method comprises:
reading historical industry data, and preprocessing the historical industry data to obtain effective industry data;
performing cluster analysis on the effective industry data using a clustering algorithm, to obtain industry data to be associated of different cluster categories;
converting the industry data to be associated into transaction-type data based on the cluster categories, and performing industry association analysis on the transaction-type data using an association analysis algorithm, to obtain an industry association result;
constructing an industrial relations chain according to the industry association result.
2. The method according to claim 1, characterized in that the step of reading historical industry data and preprocessing the historical industry data to obtain effective industry data comprises:
reading the historical industry data, and obtaining the data type corresponding to every item of industry data in the historical industry data;
checking the data types and, when a target data type that does not belong to the preset data type is detected, performing data type conversion on the target industry data corresponding to the target data type;
taking the historical industry data after data type conversion as the effective industry data.
3. The method according to claim 2, characterized in that the step of taking the historical industry data after data type conversion as the effective industry data further comprises:
writing the historical industry data after data type conversion into a preset spreadsheet, to obtain an initial industry data table;
when a missing-value entry is detected in the initial industry data table, filling the missing-value entry with a numeric value, to obtain a complete industry data table;
converting the numeric data stored in the complete industry data table into numeric data of a preset dimension to obtain an effective industry data table, and taking the data in the effective industry data table as the effective industry data.
4. The method according to claim 3, characterized in that the step of filling the missing-value entry with a numeric value when a missing-value entry is detected in the initial industry data table, to obtain a complete industry data table, comprises:
when a missing-value entry is detected in the initial industry data table, obtaining the preceding value and the following value of the adjacent entries in the column containing the missing-value entry;
calculating the average of the preceding value and the following value, and filling the missing-value entry with that average, to obtain the complete industry data table.
5. The method according to claim 4, characterized in that the step of performing cluster analysis on the effective industry data using a clustering algorithm, to obtain industry data to be associated of different cluster categories, comprises:
clustering all the industry data in the effective industry data according to a preset time granularity using a clustering algorithm, to obtain cluster data;
determining the similarity between the industry categories in the effective industry data according to the cluster data;
assigning industry categories whose similarity exceeds a preset threshold to the same cluster category, and traversing the cluster data to obtain the industry data to be associated of different cluster categories.
6. The method according to claim 5, characterized in that the step of converting the industry data to be associated into transaction-type data based on the cluster categories, and performing industry association analysis on the transaction-type data using an association analysis algorithm, to obtain an industry association result, comprises:
generating a transaction identifier for each industry category based on the preset time granularity and the cluster category;
binding the industry categories that share the same transaction identifier to obtain industry pairs, and saving the industry pairs to a transaction data table;
performing industry association analysis on the transaction-type data stored in the transaction data table using an association analysis algorithm, to obtain the industry association result.
7. The method according to any one of claims 1 to 6, characterized in that the step of constructing an industrial relations chain according to the industry association result comprises:
reading the lift corresponding to each industry pair contained in the industry association result, each industry pair containing at least two industry categories;
detecting whether the lift exceeds a preset value and, if so, determining the industry pair to be an associated industry pair;
determining the base industry category and the preceding industry category in the associated industry pair according to the support and confidence corresponding to the associated industry pair;
determining the industrial relations chain according to the base industry category and preceding industry category corresponding to each associated industry pair.
8. An industrial relations chain building device, characterized in that the device comprises:
a data processing module, configured to read historical industry data and preprocess the historical industry data to obtain effective industry data;
a data clustering module, configured to perform cluster analysis on the effective industry data using a clustering algorithm, to obtain industry data to be associated of different cluster categories;
an association analysis module, configured to convert the industry data to be associated into transaction-type data based on the cluster categories, and to perform industry association analysis on the transaction-type data using an association analysis algorithm, to obtain an industry association result;
an industry association module, configured to construct an industrial relations chain according to the industry association result.
9. An industrial relations chain building equipment, characterized in that the equipment comprises: a memory, a processor, and an industrial relations chain building program stored in the memory and executable on the processor, the industrial relations chain building program being configured to implement the steps of the industrial relations chain building method according to any one of claims 1 to 7.
10. A storage medium, characterized in that an industrial relations chain building program is stored on the storage medium, and the industrial relations chain building program, when executed by a processor, implements the steps of the industrial relations chain building method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910548138.8A CN110378569A (en) | 2019-06-19 | 2019-06-19 | Industrial relations chain building method, apparatus, equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110378569A (en) | 2019-10-25 |
Family
ID=68250604
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910548138.8A Pending CN110378569A (en) | 2019-06-19 | 2019-06-19 | Industrial relations chain building method, apparatus, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110378569A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111209397A (en) * | 2019-12-30 | 2020-05-29 | 中伯伦(北京)信息技术有限公司 | Method for determining enterprise industry category |
CN111784150A (en) * | 2020-06-29 | 2020-10-16 | 中电工业互联网有限公司 | System and method for generating industrial chain overview chart |
CN112487021A (en) * | 2020-11-26 | 2021-03-12 | 中国人寿保险股份有限公司 | Correlation analysis method, device and equipment for business data |
CN112487021B (en) * | 2020-11-26 | 2024-04-30 | 中国人寿保险股份有限公司 | Correlation analysis method, device and equipment of business data |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090112825A1 (en) * | 2007-10-31 | 2009-04-30 | Nec (China) Co., Ltd | Entity relation mining apparatus and method |
CN105786860A (en) * | 2014-12-23 | 2016-07-20 | 华为技术有限公司 | Data processing method and device in data modeling |
CN103353880B (en) * | 2013-06-20 | 2017-03-15 | 兰州交通大学 | A kind of utilization distinctiveness ratio cluster and the data digging method for associating |
CN106802936A (en) * | 2016-12-29 | 2017-06-06 | 桂林电子科技大学 | A kind of data digging method based on item collection entropy |
CN107103050A (en) * | 2017-03-31 | 2017-08-29 | 海通安恒(大连)大数据科技有限公司 | A kind of big data Modeling Platform and method |
CN107342976A (en) * | 2017-05-18 | 2017-11-10 | 辛柯俊 | For the mobile solution platform and method of enterprise's Analysis on Industry Chain |
CN107609105A (en) * | 2017-09-12 | 2018-01-19 | 电子科技大学 | The construction method of big data accelerating structure |
CN107844496A (en) * | 2016-09-19 | 2018-03-27 | 阿里巴巴集团控股有限公司 | Statistical information output intent and device |
CN108334954A (en) * | 2018-01-22 | 2018-07-27 | 中国平安人寿保险股份有限公司 | Construction method, device, storage medium and the terminal of Logic Regression Models |
Non-Patent Citations (2)
Title |
---|
董红斌: "《协同演化算法及其在数据挖掘中的应用》", 31 July 2008, 中国水利水电出版社 * |
赵炳新: "产业关联分析中的图论模型及应用研究", 《系统工程理论与实践》 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109948641B (en) | Abnormal group identification method and device | |
CN110457302B (en) | Intelligent structured data cleaning method | |
Altuntas et al. | Analysis of patent documents with weighted association rules | |
CN107622326B (en) | User classification and available resource prediction method, device and equipment | |
CN109740642A (en) | Invoice category recognition methods, device, electronic equipment and readable storage medium storing program for executing | |
CN111159428A (en) | Method and device for automatically extracting event relation of knowledge graph in economic field | |
CN111143578A (en) | Method, device and processor for extracting event relation based on neural network | |
CN106294128B (en) | A kind of automated testing method and device exporting report data | |
CN115203167A (en) | Data detection method and device, computer equipment and storage medium | |
CN110276382A (en) | Listener clustering method, apparatus and medium based on spectral clustering | |
CN110378569A (en) | Industrial relations chain building method, apparatus, equipment and storage medium | |
CN113538154A (en) | Risk object identification method and device, storage medium and electronic equipment | |
CN107729330B (en) | Method and apparatus for acquiring data set | |
JPWO2017158802A1 (en) | Data conversion system and data conversion method | |
CN105589900A (en) | Data mining method based on multi-dimensional analysis | |
CN109144999B (en) | Data positioning method, device, storage medium and program product | |
CN114139490B (en) | Method, device and equipment for automatic data preprocessing | |
CN109614416A (en) | A kind of invoice management method and device based on data statistic analysis | |
CN113642291B (en) | Method, system, storage medium and terminal for constructing logical structure tree reported by listed companies | |
CN114021716A (en) | Model training method and system and electronic equipment | |
KR102110350B1 (en) | Domain classifying device and method for non-standardized databases | |
CN110765100B (en) | Label generation method and device, computer readable storage medium and server | |
CN115617790A (en) | Data warehouse creation method, electronic device and storage medium | |
CN112732891A (en) | Office course recommendation method and device, electronic equipment and medium | |
CN111723129A (en) | Report generation method, report generation device and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20191025 |