CN108764717A - Data processing method and device applied to industry big data analysis - Google Patents

Publication number
CN108764717A
Authority
CN
China
Prior art keywords
data
area
level
value
level index
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810522737.8A
Other languages
Chinese (zh)
Inventor
张同义
孙丹丹
韦晓
周永利
马述杰
田佳云
胡玉玲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Taihua Wisdom Industry Group Co Ltd
Original Assignee
Taihua Wisdom Industry Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Taihua Wisdom Industry Group Co Ltd
Priority to CN201810522737.8A
Publication of CN108764717A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/06: Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063: Operations research, analysis or management
    • G06Q10/0639: Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393: Score-carding, benchmarking or key performance indicator [KPI] analysis

Abstract

The present invention provides a data processing method and device applied to industry big data analysis. The method includes: crawling socioeconomic data, enterprise data, and industrial economy data within a preset crawl scope, and converting the crawled data into structured data; screening and processing the structured data, removing redundant data, and loading the result into a database; retrieving preset supplementary data and adding it to the database; selecting from the database the initial values corresponding to the t third-level indicators of a region to be evaluated; converting the initial value of each third-level indicator into a utility value; summing the utility values of all third-level indicators to obtain the comprehensive score of the region to be evaluated; and generating a data analysis report from the initial values of all third-level indicators of the region to be evaluated, the utility values of all third-level indicators, the comprehensive score of the region to be evaluated, and the historical data in the database. The invention offers a more comprehensive data scope, more rigorous data analysis, and more diverse modes of analysis.

Description

Data processing method and device applied to industry big data analysis
Technical field
The present invention relates to the field of data analysis and data processing, and in particular to a data processing method and device applied to industry big data analysis.
Background
With the introduction of national policies on industry, how to use industrial planning to drive rapid regional economic development has become a key concern for government departments. Industrial planning generates large volumes of regional industry data, such as data on the housing economy, industry projects, and economic operation. Analyzing this data rigorously is of great significance for improving the regional planning and evaluation system, promoting the regional shift from old to new growth drivers, and raising national economic output. Existing industry data analysis, however, has the following shortcomings. On the one hand, data acquisition channels are limited: the reference data comes mainly from reports supplied by the relevant departments, and these reports lack large-volume enterprise information and economic indicator information. On the other hand, the analysis does little data mining: results are shown only as simple charts, in a single form and without a systematic structure, and cannot accurately reflect the economic situation and development trend of a region.
Summary of the invention
The present invention provides a data processing method and device applied to industry big data analysis, which address the problems in the prior art that data analysis takes a single form, lacks data mining, and is inaccurate.
To solve the above technical problem, the present invention provides a data processing method applied to industry big data analysis. The method includes:
Crawling socioeconomic data, enterprise data, and industrial economy data within a preset crawl scope, and processing the socioeconomic data, enterprise data, and industrial economy data to generate structured data;
Screening and processing the structured data to remove redundant data;
Loading the structured data, with redundant data removed, into a preset database;
Retrieving preset supplementary data and adding it to the database;
Selecting from the database the initial values corresponding to the t third-level indicators of the region to be evaluated, where the third-level indicators describe the specific values of the socioeconomic data, enterprise data, and industrial economy data of the region to be evaluated;
Converting, by a preset dimensionless standardization model, the initial value of each third-level indicator into a utility value, thereby obtaining the utility values of the t third-level indicators;
Summing the utility values of all third-level indicators to obtain the comprehensive score of the region to be evaluated;
Generating a data analysis report from the initial values of all third-level indicators of the region to be evaluated, the utility values of all third-level indicators, the comprehensive score of the region to be evaluated, and the historical data in the database.
Further, the socioeconomic data includes: data describing natural resources and conditions, population and labor force, social development, urban construction, people's living standards, and economic development indicators;
The enterprise data includes: data describing listed-company information, enterprise background, production and operation information, financial status, enterprise operations, enterprise risk, innovation capability, enterprise production, news information, industrial and commercial registration information, intellectual property information, and public opinion information;
The industrial economy data includes: data describing price data, index data, world economic data, national economic data, regional economic data, industrial economy data, and economic analysis reports.
Further, loading the structured data, with redundant data removed, into the preset database specifically comprises: loading the structured data, with redundant data removed, into the preset database using the Sqoop tool.
Further, the t third-level indicators are grouped by attribute into p second-level indicators, where p < t, and the p second-level indicators are grouped by attribute into q first-level indicators. The method further includes the following steps:
Summing the utility values of all third-level indicators contained in each second-level indicator to obtain the scores of the p second-level indicators;
Summing the scores of all second-level indicators contained in each first-level indicator to obtain the scores of the q first-level indicators;
Generating a data analysis report from the initial values of all third-level indicators of the region to be evaluated, the utility values of all third-level indicators, the scores of all second-level indicators, the scores of all first-level indicators, the comprehensive score of the region to be evaluated, and the historical data in the database.
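The roll-up described in the steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `aggregate_scores`, the indicator labels (`t1`, `s1`, `p1`), and the sample utility values are all hypothetical.

```python
from typing import Dict, List, Tuple


def aggregate_scores(
    utility: Dict[str, float],
    second_to_third: Dict[str, List[str]],
    first_to_second: Dict[str, List[str]],
) -> Tuple[Dict[str, float], Dict[str, float], float]:
    """Sum utility values upward through the indicator hierarchy."""
    # Score of a second-level indicator = sum of its third-level utility values.
    second = {
        name: sum(utility[t] for t in members)
        for name, members in second_to_third.items()
    }
    # Score of a first-level indicator = sum of its second-level scores.
    first = {
        name: sum(second[s] for s in members)
        for name, members in first_to_second.items()
    }
    # Comprehensive score = sum of all third-level utility values.
    comprehensive = sum(utility.values())
    return second, first, comprehensive


# Hypothetical hierarchy: 3 third-level, 2 second-level, 1 first-level indicator.
utility = {"t1": 80.0, "t2": 90.0, "t3": 50.0}
second_to_third = {"s1": ["t1", "t2"], "s2": ["t3"]}
first_to_second = {"p1": ["s1", "s2"]}

second, first, total = aggregate_scores(utility, second_to_third, first_to_second)
```

Because every third-level indicator belongs to exactly one second-level indicator in this sketch, the sum of the first-level scores equals the comprehensive score.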
Further, the third-level indicators include at least the following:
Data describing: the number of newly registered enterprises; the growth rate of newly registered enterprises; the number of newly added individual businesses and farmer cooperatives; the year-on-year growth rate of individual businesses and farmer cooperatives; the number of enterprises preparing to list and enterprises listed on the New Third Board; the number of graduates in each city in the current year; invention patents held per 10,000 people; the number of registered scientific and technological achievements; technology contract turnover; new product sales revenue of industrial enterprises above designated size; the increase in tax revenue generated by newly established enterprises; the number of newly employed people in private enterprises and individual businesses; the employment growth rate of private enterprises and individual businesses; the number of investment and financing events; the total scale of investment and financing; the amount of venture capital raised; the R&D expenditure intensity of industrial enterprises above designated size; the share of industrial enterprises above designated size that carry out R&D activities; GDP per capita; internal R&D expenditure as a share of GDP; whole-society R&D expenditure as a share of GDP; government science and technology expenditure as a share of total government fiscal expenditure; and tax preferences of all kinds granted to enterprises as a share of total government fiscal expenditure;
Wherein the number of newly registered enterprises, the growth rate of newly registered enterprises, the number of newly added individual businesses and farmer cooperatives, the year-on-year growth rate of individual businesses and farmer cooperatives, and the number of enterprises preparing to list and enterprises listed on the New Third Board constitute the first second-level indicator;
The number of graduates in each city in the current year constitutes the second second-level indicator;
Invention patents held per 10,000 people, the number of registered scientific and technological achievements, and technology contract turnover constitute the third second-level indicator;
New product sales revenue of industrial enterprises above designated size and the increase in tax revenue generated by newly established enterprises constitute the fourth second-level indicator;
The number of newly employed people in private enterprises and individual businesses and the employment growth rate of private enterprises and individual businesses constitute the fifth second-level indicator;
The number of investment and financing events, the total scale of investment and financing, and the amount of venture capital raised constitute the sixth second-level indicator;
The R&D expenditure intensity of industrial enterprises above designated size and the share of industrial enterprises above designated size that carry out R&D activities constitute the seventh second-level indicator;
GDP per capita and internal R&D expenditure as a share of GDP constitute the eighth second-level indicator;
Whole-society R&D expenditure as a share of GDP, government science and technology expenditure as a share of total government fiscal expenditure, and tax preferences of all kinds granted to enterprises as a share of total government fiscal expenditure constitute the ninth second-level indicator;
The first second-level indicator and the second second-level indicator constitute the first first-level indicator;
The third second-level indicator, the fourth second-level indicator, and the fifth second-level indicator constitute the second first-level indicator;
The sixth second-level indicator and the seventh second-level indicator constitute the third first-level indicator;
The eighth second-level indicator and the ninth second-level indicator constitute the fourth first-level indicator.
Further, the dimensionless standardization model is:
y_ij = (x_ij - min x_j) / (max x_j - min x_j) × 100,
where x_ij is the initial value of the j-th third-level indicator of the i-th region, min x_j is the minimum among the initial values of all third-level indicators of the i-th region, max x_j is the maximum among the initial values of all third-level indicators of the i-th region, the i-th region is the region to be evaluated, j is greater than or equal to 1 and less than or equal to t, and i is a natural number.
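A minimal Python sketch of this standardization follows. It assumes the conventional min-max reading, in which the minimum and maximum for indicator j are taken over the regions being compared; the wording above can be read differently, so that choice is an assumption. The function name `to_utility_values`, the sample values, and the decision to return 0 when all values are equal are likewise illustrative and not taken from the patent.

```python
from typing import List


def to_utility_values(initial_values: List[float]) -> List[float]:
    """Min-max scale one indicator's initial values to the range 0..100.

    initial_values[i] plays the role of x_ij for a fixed indicator j,
    with one entry per region i (assumed reading, see the lead-in).
    """
    lo, hi = min(initial_values), max(initial_values)
    if hi == lo:
        # Assumption: with no spread, every region gets a utility value of 0.
        return [0.0 for _ in initial_values]
    return [(x - lo) / (hi - lo) * 100 for x in initial_values]


# Hypothetical counts of newly registered enterprises in three regions.
new_enterprises = [1000.0, 400.0, 1200.0]
utilities = to_utility_values(new_enterprises)
```

The region with the smallest initial value maps to 0 and the region with the largest maps to 100, whatever the unit or magnitude of the original indicator.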
Further, the comprehensive score model is:
where 60 points is the base score, M is the actual score of the region to be evaluated, the average term is the mean of the actual scores of all regions to be evaluated, and MAX is the maximum of the actual scores of all regions to be evaluated.
To solve the above technical problem, the present invention also provides a data processing device applied to industry big data analysis, including:
A data crawling module, for crawling socioeconomic data, enterprise data, and industrial economy data within a preset crawl scope, and processing the socioeconomic data, the enterprise data, and the industrial economy data to convert them into structured data;
A data screening module, for screening and processing the structured data to remove redundant data;
A data loading module, for loading the structured data, with redundant data removed, into a preset database;
A data supplement module, for retrieving preset supplementary data and adding it to the database;
A data retrieval module, for selecting from the database the initial values corresponding to the t third-level indicators of the region to be evaluated, where the third-level indicators describe the specific values of the socioeconomic data, enterprise data, and industrial economy data of the region to be evaluated;
A data processing module, for converting, by a preset dimensionless standardization model, the initial value of each third-level indicator into a utility value, thereby obtaining the utility values of the t third-level indicators, and for summing the utility values of all third-level indicators to obtain the comprehensive score of the region to be evaluated;
A data analysis module, for generating a data analysis report from the initial values of all third-level indicators of the region to be evaluated, the utility values of all third-level indicators, the comprehensive score of the region to be evaluated, and the historical data in the database.
Further, the socioeconomic data includes: data describing natural resources and conditions, population and labor force, social development, urban construction, people's living standards, and economic development indicators;
The enterprise data includes: data describing listed-company information, enterprise background, production and operation information, financial status, enterprise operations, enterprise risk, innovation capability, enterprise production, news information, industrial and commercial registration information, intellectual property information, and public opinion information;
The industrial economy data includes: data describing price data, index data, world economic data, national economic data, regional economic data, industrial economy data, and economic analysis reports.
Further, the data loading module specifically uses the Sqoop tool to load the structured data, with redundant data removed, into the preset database.
Compared with the prior art, the data processing method and device for industry big data analysis provided by the present invention have the following advantages:
1. The data scope is more comprehensive. The invention collects socioeconomic data, enterprise data, and industrial economy data through multiple channels, which provides strong data support and makes the data more representative.
2. The data analysis is more rigorous. By establishing an innovation and entrepreneurship ("double innovation") index model, the region to be evaluated can be scored quantitatively from many aspects and angles, which makes it convenient to analyze the region comprehensively.
3. The modes of analysis are diverse. Analysis can proceed through the data analysis report or through the comprehensive score of each region to be evaluated. The data analysis report supports in-depth analysis of a region's economic development trend and its innovation and entrepreneurship capability, and the comprehensive score makes it easy to compare the overall capability of multiple regions to be evaluated.
It should be noted that the technical solution provided by the present invention need not achieve all of the above technical effects at the same time.
Description of the drawings
Fig. 1 is a flowchart of the data processing method for industry big data analysis provided by Embodiment 1;
Fig. 2 is a flowchart of the data processing method for industry big data analysis provided by Embodiment 2;
Fig. 3 is a block diagram of the data processing device for industry big data analysis provided by Embodiment 3.
Detailed description of the embodiments
Embodiment 1:
This embodiment provides a data processing method for industry big data analysis. As shown in the flowchart of Fig. 1, the method includes the following steps:
S101: Crawl socioeconomic data, enterprise data, and industrial economy data within a preset crawl scope, and process the socioeconomic data, enterprise data, and industrial economy data to generate structured data;
Specifically, the socioeconomic data, enterprise data, and industrial economy data are crawled using distributed web crawler technology;
The socioeconomic data includes: data describing natural resources and conditions, population and labor force, social development, urban construction, people's living standards, and economic development indicators. The enterprise data includes: data describing listed-company information, enterprise background, production and operation information, financial status, enterprise operations, enterprise risk, innovation capability, enterprise production, news information, industrial and commercial registration information, intellectual property information, and public opinion information. The industrial economy data includes: data describing price data, index data, world economic data, national economic data, regional economic data, industrial economy data, and economic analysis reports.
S102: Screen and process the structured data to remove redundant data.
Specifically, the extracted structured data is screened according to its data source and data structure and placed in a data buffer, where noise reduction, data splicing, and dimension transformation are applied. The purpose is to remove redundant data from the structured data and so improve the efficiency of data processing.
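The patent does not specify how redundancy is detected. As one hedged illustration, the Python sketch below treats two records as redundant when they share the same region, indicator, and year, and keeps the first one seen. The record layout, the field names, and the function name `remove_redundant` are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Record = Dict[str, object]


def remove_redundant(
    records: Iterable[Record],
    key_fields: Tuple[str, ...] = ("region", "indicator", "year"),
) -> List[Record]:
    """Keep the first record for each key; drop later duplicates."""
    seen = set()
    unique: List[Record] = []
    for record in records:
        key = tuple(record[field] for field in key_fields)
        if key in seen:
            continue  # same region/indicator/year already kept
        seen.add(key)
        unique.append(record)
    return unique


# Hypothetical crawled records; the second one repeats the first key.
crawled = [
    {"region": "A", "indicator": "new_enterprises", "year": 2017, "value": 1000},
    {"region": "A", "indicator": "new_enterprises", "year": 2017, "value": 1000},
    {"region": "B", "indicator": "new_enterprises", "year": 2017, "value": 400},
]
cleaned = remove_redundant(crawled)
```

A real pipeline would also need a rule for conflicting values under the same key, which this sketch does not attempt.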
S103: Load the structured data, with redundant data removed, into a preset database.
S104: Retrieve preset supplementary data and add it to the database.
The preset supplementary data may be socioeconomic data or enterprise data obtained through third-party data purchases and/or business partnerships. Supplementing the database with data from different channels makes the data in the database more comprehensive and more representative.
S105: Select from the database the initial values corresponding to the t third-level indicators of the region to be evaluated.
The third-level indicators describe the specific values of the socioeconomic data, enterprise data, and industrial economy data of the region to be evaluated.
S106: Convert, by a preset dimensionless standardization model, the initial value of each third-level indicator into a utility value, thereby obtaining the utility values of the t third-level indicators.
S107: Sum the utility values of all third-level indicators to obtain the comprehensive score of the region to be evaluated.
For a region to be evaluated, all the data describing the region is captured in the initial values and the utility values of its third-level indicators. The initial value of a third-level indicator directly reflects that indicator's specific value. For example, if the initial value of the third-level indicator "number of newly registered enterprises" is 1000, the region has 1000 newly registered enterprises; if the initial value of the indicator "growth rate of newly registered enterprises" is 20%, the number of newly registered enterprises grew by 20%. These two indicators make the problem plain: 1000 and 20% differ enormously in magnitude, so if the initial values of all third-level indicators were simply added, the result would depend very little on the growth rate of newly registered enterprises and could not represent the overall condition of the region. The purpose of converting initial values into utility values is to remove such differences in form between the initial values of different third-level indicators. For example, after the dimensionless standardization model is applied, the initial value of 20% for the growth rate of newly registered enterprises might become a utility value of 80, and the value of 1000 for the number of newly registered enterprises might become a utility value of 90. The gap between the two indicators is now much smaller, so a comprehensive score obtained by summing the utility values of all third-level indicators is more rigorous and more representative.
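The argument above can be checked with a small worked example. The Python sketch below compares a naive sum of initial values with a sum of utility values for three hypothetical regions and two indicators. The region labels and all numbers are invented for illustration, and the helper `min_max_100` assumes that minimum and maximum are taken across regions for each indicator.

```python
from typing import Dict, List

regions = ["A", "B", "C"]

# Hypothetical initial values, one entry per region.
initial: Dict[str, List[float]] = {
    "new_enterprises": [1000.0, 400.0, 1200.0],  # counts
    "growth_rate_pct": [20.0, 25.0, 5.0],        # percent
}


def min_max_100(values: List[float]) -> List[float]:
    """Scale values to 0..100; assumes max(values) > min(values)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) * 100 for v in values]


# Naive sum of initial values: dominated by the large-magnitude count.
raw_sum = [sum(col[i] for col in initial.values()) for i in range(len(regions))]

# Sum of utility values: each indicator contributes on the same 0..100 scale.
utility = {name: min_max_100(col) for name, col in initial.items()}
score = [sum(col[i] for col in utility.values()) for i in range(len(regions))]

best_by_raw = regions[raw_sum.index(max(raw_sum))]
best_by_score = regions[score.index(max(score))]
```

In this example the naive sum ranks region C first purely because of its enterprise count, while the utility-based score ranks region A first because A does well on both indicators.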
S108: Generate a data analysis report from the initial values of all third-level indicators of the region to be evaluated, the utility values of all third-level indicators, the comprehensive score of the region to be evaluated, and the historical data in the database.
The data analysis report makes it possible to examine in depth the innovation and entrepreneurship capability and the economic trend of the region to be evaluated along two dimensions, horizontal and vertical. Horizontally, the socioeconomic data and enterprise data from past years in the database give a thorough understanding of the region's economic development trend. Vertically, analyzing the score of each first-level indicator and the comprehensive score of the region reflects the region's overall capability at the macro level and allows the development trend of the whole province or city to be judged. This can provide data support for the government in adjusting development strategy, drawing up talent introduction plans, attracting investment, and so on.
Embodiment 2:
This embodiment provides a data processing method applied to industry big data analysis. As shown in the flowchart of Fig. 2, the method includes the following steps:
S201: Crawl socioeconomic data, enterprise data, and industrial economy data within a preset crawl scope, and process the socioeconomic data, enterprise data, and industrial economy data to generate structured data;
Specifically, the socioeconomic data, enterprise data, and industrial economy data are crawled using distributed web crawler technology.
The socioeconomic data includes: data describing natural resources and conditions, population and labor force, social development, urban construction, people's living standards, and economic development indicators;
The enterprise data includes: data describing listed-company information, enterprise background, production and operation information, financial status, enterprise operations, enterprise risk, innovation capability, enterprise production, news information, industrial and commercial registration information, intellectual property information, and public opinion information.
The industrial economy data includes: data describing price data, index data, world economic data, national economic data, regional economic data, industrial economy data, and economic analysis reports.
S202: Screen and process the structured data to remove redundant data.
Specifically, the extracted structured data can be screened according to its data source and data structure and placed in a data buffer, where noise reduction, data splicing, and dimension transformation are applied. The purpose is to remove redundant data from the structured data and so improve the efficiency of data processing.
S203: Load the structured data, with redundant data removed, into a preset database.
Specifically, the transformed data is loaded into the database using the Sqoop tool.
S204: Retrieve preset supplementary data and add it to the database.
The preset supplementary data may be socioeconomic data or enterprise data obtained through third-party data purchases and/or business partnerships. Supplementing the database with data from different channels makes the data in the database more comprehensive and more representative.
S205: Select from the database the initial values corresponding to the t third-level indicators of the region to be evaluated.
The t third-level indicators of the region to be evaluated describe the specific values of the region's socioeconomic data, industrial economy data, and enterprise data. The t third-level indicators are grouped by attribute into p second-level indicators, where p < t, and the p second-level indicators are grouped by attribute into q first-level indicators.
S206: Convert, by a preset dimensionless standardization model, the initial value of each third-level indicator into a utility value, thereby obtaining the utility values of the t third-level indicators.
The preset dimensionless standardization model is:
y_ij = (x_ij - min x_j) / (max x_j - min x_j) × 100.
Here x_ij is the initial value of one third-level indicator of the region to be evaluated, where i denotes the region and j denotes the third-level indicator. For example, Shandong Province has 17 prefecture-level cities; the cities are numbered 1 to 17, and the t third-level indicators of Shandong Province are numbered 1 to t. If Qingdao is numbered 1, then x_19 (i = 1, j = 9) is the initial value of the 9th third-level indicator of Qingdao. min x_j is the minimum among the initial values of all third-level indicators of the region, and max x_j is the maximum among the initial values of all third-level indicators of the region. Once the initial values of all third-level indicators have been processed by the dimensionless standardization model, differences among the indicators in measurement unit, order of magnitude, and relative form are eliminated. The data become more uniform in structure and form, and the utility values of the third-level indicators become more comparable with one another.
More specifically, in this embodiment the region to be evaluated has 23 third-level indicators, which are grouped into 9 second-level indicators; the 9 second-level indicators are in turn grouped into 4 first-level indicators, as set out below:
The third-level indicators include the following:
Data describing: the number of newly registered enterprises; the growth rate of newly registered enterprises; the number of newly added individual businesses and farmer cooperatives; the year-on-year growth rate of individual businesses and farmer cooperatives; the number of enterprises preparing to list and enterprises listed on the New Third Board; the number of graduates in each city in the current year; invention patents held per 10,000 people; the number of registered scientific and technological achievements; technology contract turnover; new product sales revenue of industrial enterprises above designated size; the increase in tax revenue generated by newly established enterprises; the number of newly employed people in private enterprises and individual businesses; the employment growth rate of private enterprises and individual businesses; the number of investment and financing events; the total scale of investment and financing; the amount of venture capital raised; the R&D expenditure intensity of industrial enterprises above designated size; the share of industrial enterprises above designated size that carry out R&D activities; GDP per capita; internal R&D expenditure as a share of GDP; whole-society R&D expenditure as a share of GDP; government science and technology expenditure as a share of total government fiscal expenditure; and tax preferences of all kinds granted to enterprises as a share of total government fiscal expenditure. These items are numbered 1 to 23 in the order given. For example, x_12 is the growth rate of newly registered enterprises in the first region to be evaluated.
Wherein newly register enterprise's number, newly register enterprise's growth rate, quantity is closed in newly-increased individual, agriculture, individual, agriculture conjunction increase by a year-on-year basis Rate and quasi- listing and the listed amount of new three plate constitute the first two-level index;First two-level index is used to describe the enterprise in area to be evaluated Industry index;
Each city's current year graduates' number constitutes the second two-level index;Second two-level index is for describing area to be evaluated Talent's index;
The number of invention patents owned per ten thousand people, the number of registered scientific and technological achievements, and the technology contract transaction value constitute the third second-level index; the third second-level index describes the innovation activity of the area to be evaluated;
The new-product sales revenue of industrial enterprises above designated size and the tax revenue increment generated by newly established enterprises constitute the fourth second-level index; the fourth second-level index describes the economic benefit of the area to be evaluated;
The number of newly employed people in private enterprises and individual businesses and the employment growth rate of private enterprises and individual businesses constitute the fifth second-level index; the fifth second-level index describes employment promotion in the area to be evaluated;
The number of investment and financing events, the total scale of investment and financing, and the amount of venture capital raised constitute the sixth second-level index; the sixth second-level index describes investment and financing in the area to be evaluated;
The R&D expenditure intensity of industrial enterprises above designated size and the proportion of such enterprises that carry out R&D activities constitute the seventh second-level index; the seventh second-level index describes innovation support in the area to be evaluated;
GDP per capita and the ratio of internal R&D expenditure to GDP constitute the eighth second-level index; the eighth second-level index describes the economic base of the area to be evaluated;
The proportion of whole-society R&D expenditure in GDP, the ratio of government science and technology expenditure to total government fiscal expenditure, and the ratio of tax preferences granted to enterprises to total government fiscal expenditure constitute the ninth second-level index; the ninth second-level index describes the policy environment of the area to be evaluated;
The first second-level index and the second second-level index constitute the first first-level index; the first first-level index describes the innovation and entrepreneurship vitality of the area to be evaluated;
The third second-level index, the fourth second-level index and the fifth second-level index constitute the second first-level index; the second first-level index describes the innovation and entrepreneurship effect of the area to be evaluated;
The sixth second-level index and the seventh second-level index constitute the third first-level index; the third first-level index describes the innovation and entrepreneurship support of the area to be evaluated;
The eighth second-level index and the ninth second-level index constitute the fourth first-level index; the fourth first-level index describes the innovation and entrepreneurship environment of the area to be evaluated.
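The grouping of the nine second-level indices into four first-level indices can be written down as a small lookup table. The following is a minimal sketch only: the English labels and the names `SECOND_LEVEL`, `FIRST_LEVEL` and `parent_of` are shorthand assumed for this illustration and are not defined by the text.

```python
# Hypothetical encoding of the index hierarchy described above.
# Labels are illustrative shorthand, not terms fixed by the text.
SECOND_LEVEL = {
    1: "enterprises",
    2: "talent",
    3: "innovation activity",
    4: "economic benefit",
    5: "employment promotion",
    6: "investment and financing",
    7: "innovation support",
    8: "economic base",
    9: "policy environment",
}

# First-level index -> numbers of the second-level indices it contains.
FIRST_LEVEL = {
    "vitality": [1, 2],
    "effect": [3, 4, 5],
    "support": [6, 7],
    "environment": [8, 9],
}


def parent_of(second_level_no: int) -> str:
    """Return the first-level index that contains a second-level index."""
    for name, members in FIRST_LEVEL.items():
        if second_level_no in members:
            return name
    raise KeyError(second_level_no)


# Every second-level index belongs to exactly one first-level index.
covered = sorted(n for members in FIRST_LEVEL.values() for n in members)
assert covered == sorted(SECOND_LEVEL)

print(parent_of(5))  # effect
```

Keeping the hierarchy as data rather than hard-coding it into the scoring logic means that regrouping the indices only requires editing the table.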
S207: Add the utility values of all third-level indices contained in each second-level index to obtain the score values of the p second-level indices.
For example, the second-level index describing enterprises in the area to be evaluated contains the following third-level indices: the number of newly registered enterprises, the growth rate of newly registered enterprises, the number of new individual businesses and farmers' cooperatives, the year-on-year growth rate of individual businesses and farmers' cooperatives, and the number of companies preparing to list or listed on the New Third Board. Suppose the initial value of the third-level index "number of newly registered enterprises" is 1000, meaning that 1000 enterprises were newly registered in the area, and the initial value of "growth rate of newly registered enterprises" is 20%. The two initial values, 1000 and 20%, differ greatly in magnitude: if the initial values of all third-level indices were simply added, the result would barely reflect the growth rate and could not represent the overall condition of the area. The score value of a second-level index is therefore the sum of the utility values of all its third-level indices. Converting initial values into utility values removes the differences in form between the initial values of different third-level indices. For example, the initial value 20% may become a utility value of 80 after the dimensionless standardization model is applied, while the initial value 1000 may correspond to a utility value of 90. The gap between the two is then much smaller, so a second-level score obtained by adding the utility values of all third-level indices is more scientific and more representative.
S208: Add the score values of all second-level indices contained in each first-level index to obtain the score values of the q first-level indices;
The score values of the four first-level indices reflect the innovation and entrepreneurship capability of the area to be evaluated from four different angles.
For example, the score value of the innovation and entrepreneurship vitality index of an area is the score value of the second-level index describing enterprises plus the score value of the second-level index describing talent.
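Steps S207 and S208 are the same summation applied at two levels, which can be sketched as follows. The helper `roll_up`, the dictionary keys and the utility value 70 are assumptions made for illustration; only the values 80 and 90 come from the example in the text.

```python
from typing import Dict, List


def roll_up(children: Dict[str, List[str]], scores: Dict[str, float]) -> Dict[str, float]:
    """Sum the scores of each parent's children (used for both S207 and S208)."""
    return {parent: sum(scores[c] for c in kids) for parent, kids in children.items()}


# Hypothetical utility values: 80 and 90 echo the example in the text,
# 70 is invented for illustration.
utility = {
    "new_enterprise_growth_rate": 80.0,
    "new_enterprise_count": 90.0,
    "graduates": 70.0,
}

# Simplified hierarchy with only the indices used in this sketch.
second_to_third = {
    "enterprises": ["new_enterprise_growth_rate", "new_enterprise_count"],
    "talent": ["graduates"],
}
first_to_second = {"vitality": ["enterprises", "talent"]}

second_scores = roll_up(second_to_third, utility)        # S207
first_scores = roll_up(first_to_second, second_scores)   # S208

print(second_scores)  # {'enterprises': 170.0, 'talent': 70.0}
print(first_scores)   # {'vitality': 240.0}
```

Because the scores are plain sums, a first-level score equals the sum of the utility values of every third-level index beneath it.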
S209: Using a preset comprehensive score model, input the practical score into the comprehensive score model to obtain the comprehensive score of the innovation and entrepreneurship index of each area to be evaluated;
The preset comprehensive score model is:
where the base score is 60 points, M is the practical score of each area to be evaluated, the model also uses the average of the practical scores of all areas to be evaluated, and MAX is the maximum of the practical scores of all areas to be evaluated.
The practical score reflects the total score of all innovation and entrepreneurship indices of an area to be evaluated. Converting the practical score into a comprehensive score puts it on a hundred-point scale with a base score of 60 points. When the practical score of an area is higher than the average practical score of all areas to be evaluated, the comprehensive score obtained from the model is above 60 points, indicating that the innovation and entrepreneurship capability of the area is qualified; when the practical score is lower than that average, the comprehensive score is below 60 points, indicating that the capability is unqualified. The comprehensive score allows an overall evaluation of the innovation and entrepreneurship capability of an area to be evaluated and makes comparison between multiple areas convenient.
S210: Generate a data analysis report according to the initial values of all third-level indices of the area to be evaluated, the utility values of all third-level indices, the score values of all first-level and second-level indices, the comprehensive score of the area to be evaluated, and the historical data in the database. The report allows in-depth analysis of the innovation and entrepreneurship capability and the economic trend of the area along two dimensions, horizontal and vertical. Horizontally, the socioeconomic data and business data of past years in the database give insight into the economic development trend of the area. Vertically, analyzing the score value of each first-level index and the comprehensive score reflects the overall capability of the area at the macro level and supports judging the development trend of the whole province or city. This provides data support for the government in adjusting development strategy, formulating talent introduction plans, attracting investment, and so on.
This embodiment achieves the following beneficial effects:
1. The data coverage is more comprehensive: the present invention acquires socioeconomic data, business data and industrial economy data through multiple channels, which provides strong data support and makes the data more representative;
2. The data analysis is more scientific: by establishing the innovation and entrepreneurship index model, the area to be evaluated can be scored quantitatively from many aspects and angles, which makes comprehensive analysis of the area convenient;
3. The analysis modes are diverse: analysis can be based on the data analysis report or on the comprehensive score of each area to be evaluated. The data analysis report gives an in-depth view of the economic development trend and the innovation and entrepreneurship capability of the area, while the comprehensive score makes it easy to compare the overall capability of multiple areas to be evaluated.
Embodiment 3:
This embodiment provides a data processing device for industry big data analysis. FIG. 3 is a block diagram of the device, which includes:
Data capture module 301, for capturing socioeconomic data, business data and industrial economy data within a preset capture range, and processing the socioeconomic data, business data and industrial economy data to convert them into structured data;
The socioeconomic data include: data describing natural resources and conditions, population and labour, social development, urban construction, people's livelihood, and economic development indicators. The business data include: data describing listed-company information, business background, production and operation information, financial situation, enterprise operation, business risk, innovation capability, business production, news information, industrial and commercial registration information, intellectual property information, and public opinion information. The industrial economy data include: data describing price data, index data, world economic data, national economic data, regional economic data, industrial economy data, and economic analysis reports.
Data screening module 302, for screening and processing the structured data to remove redundant data;
Data loading module 303, for loading the structured data from which redundant data have been removed into a preset database 305; specifically, the data loading module 303 uses the Sqoop tool to load the structured data from which redundant data have been removed into the preset database 305.
Data supplement module 304, for retrieving preset supplementary data and adding them to the database 305; the preset supplementary data may be socioeconomic data or business data obtained by purchasing third-party data and/or through business cooperation, which makes the data in the database 305 more comprehensive.
Data retrieval module 306, for selecting from the database 305 the initial values corresponding to the t third-level indices of the area to be evaluated, where the third-level indices describe the specific numerical values of the socioeconomic data, business data and industrial economy data of the area to be evaluated;
Data processing module 307, for converting the initial value of each third-level index into a utility value by means of a preset dimensionless standardization model, obtaining the utility values of the t third-level indices, and adding the utility values of all third-level indices to obtain the comprehensive score of the area to be evaluated;
The preset dimensionless standardization model is:
$$Y_{ij} = \frac{x_{ij} - \min x_j}{\max x_j - \min x_j} \times 100$$
where $x_{ij}$ is the initial value of one third-level index of the area to be evaluated, i denotes the area, j denotes the third-level index, $\min x_j$ is the minimum among the initial values of all third-level indices of the area, and $\max x_j$ is the maximum among the initial values of all third-level indices of the area. After the initial values of all third-level indices are processed by the dimensionless standardization model, the differences among the third-level indices in measurement unit, order of magnitude and relative form are eliminated, so that the data structure and form are more uniform and the utility values of the third-level indices are more comparable. The comprehensive score obtained by adding the utility values of all third-level indices of the area to be evaluated reflects the overall economic situation and the innovation and entrepreneurship capability of the area, and makes it convenient to compare different areas to be evaluated.
Data analysis module 308, for generating a data analysis report according to the initial values of all third-level indices of the area to be evaluated, the utility values of all third-level indices, the comprehensive score of the area to be evaluated, and the historical data in the database. Analyzing these initial values, utility values, the comprehensive score and the historical data allows a more comprehensive and objective analysis of the economic conditions, economic trend and innovation and entrepreneurship capability of the area to be evaluated, and provides data support for the government in adjusting development strategy, formulating talent introduction plans, attracting investment, and so on.
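The conversion performed by the data processing module is a min-max scaling onto a 0 to 100 range followed by a sum. The sketch below is illustrative only: the function names, the sample values and the handling of the case where the maximum equals the minimum are assumptions, and the caller decides which set of initial values supplies the minimum and maximum.

```python
from typing import Sequence


def utility_value(x: float, reference: Sequence[float]) -> float:
    """Min-max scale x onto 0..100 against a reference set of initial values."""
    lo, hi = min(reference), max(reference)
    if hi == lo:
        # Assumption: the text does not cover max == min;
        # returning 0.0 here only avoids a division by zero.
        return 0.0
    return (x - lo) / (hi - lo) * 100


def comprehensive_score(utilities: Sequence[float]) -> float:
    """Add the utility values of all third-level indices."""
    return sum(utilities)


reference = [200.0, 600.0, 1000.0]  # invented initial values
u = [utility_value(v, reference) for v in reference]

print(u)                       # [0.0, 50.0, 100.0]
print(comprehensive_score(u))  # 150.0
```

After scaling, every utility value lies between 0 and 100 regardless of the unit of the initial value, which is what makes the subsequent addition meaningful.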
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a device, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
Although some specific embodiments of the present invention have been described in detail by way of example, those skilled in the art should understand that the above examples are for illustration only and are not intended to limit the scope of the invention. Those skilled in the art should also understand that the above embodiments may be modified without departing from the scope and spirit of the invention. The scope of the invention is defined by the appended claims.

Claims (10)

1. A data processing method applied to industry big data analysis, characterized in that it comprises the following steps:
capturing socioeconomic data, business data and industrial economy data within a preset capture range, and processing the socioeconomic data, business data and industrial economy data to generate structured data;
screening and processing the structured data to remove redundant data;
loading the structured data from which redundant data have been removed into a preset database;
retrieving preset supplementary data and adding them to the database;
selecting from the database the initial values corresponding to the t third-level indices of an area to be evaluated, wherein the third-level indices describe the specific numerical values of the socioeconomic data, business data and industrial economy data of the area to be evaluated;
converting the initial value of each third-level index into a utility value by means of a preset dimensionless standardization model, to obtain the utility values of the t third-level indices;
adding the utility values of all the third-level indices to obtain the comprehensive score of the area to be evaluated;
generating a data analysis report according to the initial values of all the third-level indices of the area to be evaluated, the utility values of all the third-level indices, the comprehensive score of the area to be evaluated, and the historical data in the database.
2. The data processing method applied to industry big data analysis according to claim 1, characterized in that
the socioeconomic data include: data describing natural resources and conditions, population and labour, social development, urban construction, people's livelihood, and economic development indicators;
the business data include: data describing listed-company information, business background, production and operation information, financial situation, enterprise operation, business risk, innovation capability, business production, news information, industrial and commercial registration information, intellectual property information, and public opinion information;
the industrial economy data include: data describing price data, index data, world economic data, national economic data, regional economic data, industrial economy data, and economic analysis reports.
3. The data processing method applied to industry big data analysis according to claim 1, characterized in that
loading the structured data from which redundant data have been removed into a preset database further comprises: using the Sqoop tool to load the structured data from which redundant data have been removed into the preset database.
4. The data processing method applied to industry big data analysis according to claim 1, characterized in that
the t third-level indices are combined according to their attributes into p second-level indices, p < t, and the p second-level indices are combined according to their attributes into q first-level indices, the method further comprising the following steps:
adding the utility values of all third-level indices contained in each second-level index to obtain the score values of the p second-level indices;
adding the score values of all second-level indices contained in each first-level index to obtain the score values of the q first-level indices;
generating a data analysis report according to the initial values of all the third-level indices of the area to be evaluated, the utility values of all the third-level indices, the score values of all the second-level indices, the score values of all the first-level indices, the comprehensive score of the area to be evaluated, and the historical data in the database.
5. The data processing method applied to industry big data analysis according to claim 4, characterized in that
the third-level indices include at least the following:
data describing the number of newly registered enterprises, the growth rate of newly registered enterprises, the number of new individual businesses and farmers' cooperatives, the year-on-year growth rate of individual businesses and farmers' cooperatives, the number of companies preparing to list or listed on the New Third Board, the number of graduates in each city in the current year, the number of invention patents owned per ten thousand people, the number of registered scientific and technological achievements, the technology contract transaction value, the new-product sales revenue of industrial enterprises above designated size, the tax revenue increment generated by newly established enterprises, the number of newly employed people in private enterprises and individual businesses, the employment growth rate of private enterprises and individual businesses, the number of investment and financing events, the total scale of investment and financing, the amount of venture capital raised, the R&D expenditure intensity of industrial enterprises above designated size, the proportion of industrial enterprises above designated size that carry out R&D activities, GDP per capita, the ratio of internal R&D expenditure to GDP, the proportion of whole-society R&D expenditure in GDP, the ratio of government science and technology expenditure to total government fiscal expenditure, and the ratio of tax preferences granted to enterprises to total government fiscal expenditure;
wherein the number of newly registered enterprises, the growth rate of newly registered enterprises, the number of new individual businesses and farmers' cooperatives, the year-on-year growth rate of individual businesses and farmers' cooperatives, and the number of companies preparing to list or listed on the New Third Board constitute the first second-level index;
the number of graduates in each city in the current year constitutes the second second-level index;
the number of invention patents owned per ten thousand people, the number of registered scientific and technological achievements, and the technology contract transaction value constitute the third second-level index;
the new-product sales revenue of industrial enterprises above designated size and the tax revenue increment generated by newly established enterprises constitute the fourth second-level index;
the number of newly employed people in private enterprises and individual businesses and the employment growth rate of private enterprises and individual businesses constitute the fifth second-level index;
the number of investment and financing events, the total scale of investment and financing, and the amount of venture capital raised constitute the sixth second-level index;
the R&D expenditure intensity of industrial enterprises above designated size and the proportion of such enterprises that carry out R&D activities constitute the seventh second-level index;
GDP per capita and the ratio of internal R&D expenditure to GDP constitute the eighth second-level index;
the proportion of whole-society R&D expenditure in GDP, the ratio of government science and technology expenditure to total government fiscal expenditure, and the ratio of tax preferences granted to enterprises to total government fiscal expenditure constitute the ninth second-level index;
the first second-level index and the second second-level index constitute the first first-level index;
the third second-level index, the fourth second-level index and the fifth second-level index constitute the second first-level index;
the sixth second-level index and the seventh second-level index constitute the third first-level index;
the eighth second-level index and the ninth second-level index constitute the fourth first-level index.
6. The data processing method applied to industry big data analysis according to claim 1, characterized in that
the dimensionless standardization model is:
$$Y_{ij} = \frac{x_{ij} - \min x_j}{\max x_j - \min x_j} \times 100,$$
wherein $x_{ij}$ is the initial value of the j-th third-level index of the i-th area, $\min x_j$ is the minimum among the initial values of all the third-level indices of the i-th area, $\max x_j$ is the maximum among the initial values of all the third-level indices of the i-th area, the i-th area is the area to be evaluated, j is greater than or equal to 1 and less than or equal to t, and i is a natural number.
7. The data processing method applied to industry big data analysis according to claim 1, characterized in that
the comprehensive score model is:
wherein the base score is 60 points, M is the practical score of the area to be evaluated, the model also uses the average of the practical scores of all areas to be evaluated, and MAX is the maximum of the practical scores of all areas to be evaluated.
8. A data processing device applied to industry big data analysis, characterized in that it comprises:
a data capture module, for capturing socioeconomic data, business data and industrial economy data within a preset capture range, and processing the socioeconomic data, the business data and the industrial economy data to convert them into structured data;
a data screening module, for screening and processing the structured data to remove redundant data;
a data loading module, for loading the structured data from which redundant data have been removed into a preset database;
a data supplement module, for retrieving preset supplementary data and adding them to the database;
a data retrieval module, for selecting from the database the initial values corresponding to the t third-level indices of an area to be evaluated, wherein the third-level indices describe the specific numerical values of the socioeconomic data, business data and industrial economy data of the area to be evaluated;
a data processing module, for converting the initial value of each third-level index into a utility value by means of a preset dimensionless standardization model, obtaining the utility values of the t third-level indices, and adding the utility values of all the third-level indices to obtain the comprehensive score of the area to be evaluated;
a data analysis module, for generating a data analysis report according to the initial values of all the third-level indices of the area to be evaluated, the utility values of all the third-level indices, the comprehensive score of the area to be evaluated, and the historical data in the database.
9. The data processing device applied to industry big data analysis according to claim 8, characterized in that
the socioeconomic data include: data describing natural resources and conditions, population and labour, social development, urban construction, people's livelihood, and economic development indicators;
the business data include: data describing listed-company information, business background, production and operation information, financial situation, enterprise operation, business risk, innovation capability, business production, news information, industrial and commercial registration information, intellectual property information, and public opinion information;
the industrial economy data include: data describing price data, index data, world economic data, national economic data, regional economic data, industrial economy data, and economic analysis reports.
10. The data processing device applied to industry big data analysis according to claim 8, characterized in that
the data loading module specifically uses the Sqoop tool to load the structured data from which redundant data have been removed into the preset database.
CN201810522737.8A 2018-05-28 2018-05-28 Data processing method and device applied to industry big data analysis Pending CN108764717A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810522737.8A CN108764717A (en) 2018-05-28 2018-05-28 Data processing method and device applied to industry big data analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810522737.8A CN108764717A (en) 2018-05-28 2018-05-28 Data processing method and device applied to industry big data analysis

Publications (1)

Publication Number Publication Date
CN108764717A true CN108764717A (en) 2018-11-06

Family

ID=64002878

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810522737.8A Pending CN108764717A (en) 2018-05-28 2018-05-28 Data processing method and device applied to industry big data analysis

Country Status (1)

Country Link
CN (1) CN108764717A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110335083A (en) * 2019-07-10 2019-10-15 山东众志电子有限公司 A kind of price big data analysis system and method based on cloud platform
CN111598459A (en) * 2020-05-19 2020-08-28 苏州云联智慧信息技术应用有限公司 Intensive utilization method and device for enterprise resources
CN111724079A (en) * 2020-06-29 2020-09-29 信阳农林学院 Industry economic data management system based on big data
CN113554299A (en) * 2021-07-19 2021-10-26 云南省烟草烟叶公司 System, method and device for evaluating comprehensive grade quality of tobacco leaves and electronic equipment
CN116502918A (en) * 2023-05-12 2023-07-28 广东省科技基础条件平台中心 Innovative capability evaluation method of technological innovation platform
CN116663750A (en) * 2023-07-31 2023-08-29 北京市科学技术研究院 Industrial chain data value evaluation analysis system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2015100771A4 (en) * 2015-04-27 2015-07-09 Xero Limited Benchmarking through data mining
CN104966172A (en) * 2015-07-21 2015-10-07 上海融甸信息科技有限公司 Large data visualization analysis and processing system for enterprise operation data analysis
CN106845829A (en) * 2017-01-20 2017-06-13 国信优易数据有限公司 A kind of civil-military inosculation data message quantified system analysis
CN107274064A (en) * 2017-05-15 2017-10-20 东南大学 Highway operation conditions Dynamic Comprehensive Evaluation method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王卓伦 等: "网络零售物流企业信用评价指标体系与风险预警", 《福建电脑》 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110335083A (en) * 2019-07-10 2019-10-15 山东众志电子有限公司 A kind of price big data analysis system and method based on cloud platform
CN111598459A (en) * 2020-05-19 2020-08-28 苏州云联智慧信息技术应用有限公司 Intensive utilization method and device for enterprise resources
CN111724079A (en) * 2020-06-29 2020-09-29 信阳农林学院 Industry economic data management system based on big data
CN113554299A (en) * 2021-07-19 2021-10-26 云南省烟草烟叶公司 System, method and device for evaluating comprehensive grade quality of tobacco leaves and electronic equipment
CN116502918A (en) * 2023-05-12 2023-07-28 广东省科技基础条件平台中心 Innovative capability evaluation method of technological innovation platform
CN116502918B (en) * 2023-05-12 2024-04-05 广东省科技基础条件平台中心 Innovative capability evaluation method of technological innovation platform
CN116663750A (en) * 2023-07-31 2023-08-29 北京市科学技术研究院 Industrial chain data value evaluation analysis system
CN116663750B (en) * 2023-07-31 2023-10-13 北京市科学技术研究院 Industrial chain data value evaluation analysis system

Similar Documents

Publication Publication Date Title
CN108764717A (en) Data processing method and device applied to industry big data analysis
Guo et al. Measuring and evaluating SDG indicators with Big Earth Data
Boss et al. Contagion flow through banking networks
Sakhno et al. A Methodological Analysis for the Impact Assessment of the Digitalisation of Economy on Agricultural Growth.
Zubiashvili et al. Labour Emigration and Employment in Georgia
Taraniuk et al. Estimation of the marketing potential of industrial enterprises in the period of re-engineering of business processes
Wang et al. Impact of farmland characteristics on grain costs and benefits in the North China Plain
Pasnicu Supporting SMEs in creating jobs
Seitzhanov et al. Innovational approach of business management in Kazakhstan
Vijayalakshmi et al. Factors determining in foreign direct investment (FDI) in India
Wang et al. Promoting mineral resources consumption efficiency: Evidence from technology of big data
Cestti et al. Indirect economic impacts of dams
Jurayevich Main directions of improvement of the process of investment attraction
Chan et al. Galvanizing the groundswell of climate actions in the developing world
Di Giorno et al. A Niche Approach for Modeling Economic Competition.
Cheng et al. Risk identification of public infrastructure projects based on VFPE
Samadi-Parviznejad et al. Identifying and evaluating smart city marketing parameters (Case study: Tabriz)
CN113128912B (en) Method for determining industry new and old kinetic energy conversion level based on electricity consumption data
Alexandru et al. Analysis of Romanian development regions-a first step to support national regional development priorities.
Agbonifi The dynamic approach of modelling regional recovery investment policies using environmentally-extended SAM Matrix
Ruchkin Model for the development of design solutions in the framework of strategic planning for the development of rural areas
Gladevich Assessment of the innovation potential of the regions of Latvia, Lithuania and Belarus
Wijaya et al. Decomposition of the Theil index in inequality analyses in Yogyakarta Indonesia
Manzhosova Digital analysis as a tool for assessing regional opportunities for the transition to digital technologies
Goncharov Green Human Capital: Problems and Development Strategy

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181106