CN108764717A - Data processing method and device applied to industry big data analysis - Google Patents
- Publication number: CN108764717A
- Application number: CN201810522737.8A
- Authority: CN (China)
- Legal status: Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
- G06Q10/06393—Score-carding, benchmarking or key performance indicator [KPI] analysis
Abstract
The present invention provides a data processing method and device applied to industry big data analysis. The method includes: capturing socioeconomic data, business data and industrial economy data within a preset crawl range, and converting the captured data into structured data; screening and processing the structured data, removing redundant data, and loading the result into a database; retrieving preset supplementary data and adding it to the database; selecting from the database the initial values corresponding to the t third-level indexes of the area to be evaluated; converting the initial value of each third-level index into a utility value; summing the utility values of all third-level indexes to obtain the comprehensive score of the area to be evaluated; and generating a data analysis report from the initial values of all third-level indexes of the area to be evaluated, the utility values of all third-level indexes, the comprehensive score of the area to be evaluated, and the historical data in the database. The invention covers a more comprehensive range of data, analyzes the data more scientifically, and supports more diverse modes of analysis.
Description
Technical field
The present invention relates to the field of data analysis and data processing, and in particular to a data processing method and device applied to industry big data analysis.
Background
With the introduction of national policies on industry, how to promote rapid regional economic development through industrial planning has become a key concern for government departments. Industrial planning generates a large amount of regional industry data, such as data on the housing economy, industry projects and economic operation. Analyzing this industry data scientifically is of great significance for improving the regional planning and assessment system, promoting the conversion from old to new growth drivers in regional industry, and raising national economic output. However, existing analysis of industry data has the following shortcomings. On the one hand, data acquisition channels are limited: the reference data is based mainly on reports provided by the relevant departments, and these report materials lack large volumes of company information and economic indicator information. On the other hand, the analysis lacks data mining: results are merely displayed as simple charts, in a single form and without a systematic structure, and cannot accurately reflect the economic conditions and development trend of the area.
Summary of the invention
The present invention provides a data processing method and device applied to industry big data analysis, which solve the problems in the prior art that data analysis takes a single form, lacks data mining, and is inaccurate.
To solve the above technical problem, the present invention provides a data processing method applied to industry big data analysis, the method including:
capturing socioeconomic data, business data and industrial economy data within a preset crawl range, and processing the socioeconomic data, business data and industrial economy data to generate structured data;
screening and processing the structured data to remove redundant data;
loading the structured data from which redundant data has been removed into a preset database;
retrieving preset supplementary data and adding it to the database;
selecting from the database the initial values corresponding to the t third-level indexes of the area to be evaluated, wherein the third-level indexes describe specific numerical values of the socioeconomic data, business data and industrial economy data of the area to be evaluated;
converting, by a preset dimensionless standardization model, the initial value of each third-level index into a utility value of that third-level index, to obtain the utility values of the t third-level indexes;
summing the utility values of all the third-level indexes to obtain the comprehensive score of the area to be evaluated; and
generating a data analysis report according to the initial values of all the third-level indexes of the area to be evaluated, the utility values of all the third-level indexes, the comprehensive score of the area to be evaluated, and the historical data in the database.
Further, the socioeconomic data includes data describing natural resources and conditions, population and labor, social development, urban construction, people's livelihood, and economic development indicators;
the business data includes data describing listed-company information, business background, production and operation information, financial situation, enterprise operation, business risk, innovation ability, business production, news information, industrial and commercial registration information, intellectual property information, and public opinion information;
the industrial economy data includes data describing price data, index data, world economic data, national economic data, regional economy data, industrial economy data, and economic analysis reports.
Further, loading the structured data from which redundant data has been removed into the preset database is specifically: loading the structured data from which redundant data has been removed into the preset database using the Sqoop tool.
Further, the t third-level indexes are combined according to their attributes into p second-level indexes, where p < t, and the p second-level indexes are combined according to their attributes into q first-level indexes; the method further includes the following steps:
summing the utility values of all the third-level indexes included in each second-level index to obtain the score values of the p second-level indexes;
summing the score values of all the second-level indexes included in each first-level index to obtain the score values of the q first-level indexes;
generating a data analysis report according to the initial values of all the third-level indexes of the area to be evaluated, the utility values of all the third-level indexes, the score values of all the second-level indexes, the score values of all the first-level indexes, the comprehensive score of the area to be evaluated, and the historical data in the database.
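The roll-up from third-level utility values to second-level and first-level score values can be sketched as follows. This is only an illustration under stated assumptions, not the patent's implementation: the function name `roll_up`, the dictionary layout, and the sample index names and values are all hypothetical.

```python
from typing import Dict, List


def roll_up(child_scores: Dict[str, float],
            groups: Dict[str, List[str]]) -> Dict[str, float]:
    """Sum the scores of the children that belong to each parent index."""
    return {parent: sum(child_scores[c] for c in children)
            for parent, children in groups.items()}


# Hypothetical utility values of t = 4 third-level indexes for one area.
third_level = {"a": 80.0, "b": 90.0, "c": 40.0, "d": 10.0}

# Hypothetical grouping: p = 2 second-level indexes, q = 1 first-level index.
second_groups = {"S1": ["a", "b"], "S2": ["c", "d"]}
first_groups = {"F1": ["S1", "S2"]}

second_level = roll_up(third_level, second_groups)  # {"S1": 170.0, "S2": 50.0}
first_level = roll_up(second_level, first_groups)   # {"F1": 220.0}
comprehensive = sum(third_level.values())           # 220.0
```

Because every level is a plain sum, the first-level score values add up to the same total as the comprehensive score whenever the groups cover each index exactly once.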
Further, the third-level indexes include at least the following:
data describing the number of newly registered enterprises, the growth rate of newly registered enterprises, the number of newly added individual businesses and farmers' cooperatives, the year-on-year growth rate of individual businesses and farmers' cooperatives, the number of enterprises preparing to list or listed on the New Third Board, the number of current-year graduates in each city, invention patents owned per 10,000 people, the number of registered scientific and technological achievements, the technology contract transaction value, the new-product sales revenue of state-owned and large non-state industrial enterprises, the tax revenue increment generated by newly established enterprises, the number of newly employed persons in private enterprises and individual businesses, the growth rate of employment in private enterprises and individual businesses, the number of investment and financing events, the total scale of investment and financing, the amount of venture capital raised, the R&D expenditure intensity of state-owned and large non-state industrial enterprises, the proportion of state-owned and large non-state industrial enterprises with R&D activities, GDP per capita, internal R&D expenditure as a proportion of GDP, whole-society R&D expenditure as a proportion of GDP, government science and technology expenditure as a proportion of total government fiscal expenditure, and tax preferences granted to enterprises as a proportion of total government fiscal expenditure;
wherein the number of newly registered enterprises, the growth rate of newly registered enterprises, the number of newly added individual businesses and farmers' cooperatives, the year-on-year growth rate of individual businesses and farmers' cooperatives, and the number of enterprises preparing to list or listed on the New Third Board constitute the first second-level index;
the number of current-year graduates in each city constitutes the second second-level index;
invention patents owned per 10,000 people, the number of registered scientific and technological achievements, and the technology contract transaction value constitute the third second-level index;
the new-product sales revenue of state-owned and large non-state industrial enterprises and the tax revenue increment generated by newly established enterprises constitute the fourth second-level index;
the number of newly employed persons in private enterprises and individual businesses and the growth rate of employment in private enterprises and individual businesses constitute the fifth second-level index;
the number of investment and financing events, the total scale of investment and financing, and the amount of venture capital raised constitute the sixth second-level index;
the R&D expenditure intensity of state-owned and large non-state industrial enterprises and the proportion of state-owned and large non-state industrial enterprises with R&D activities constitute the seventh second-level index;
GDP per capita and internal R&D expenditure as a proportion of GDP constitute the eighth second-level index;
whole-society R&D expenditure as a proportion of GDP, government science and technology expenditure as a proportion of total government fiscal expenditure, and tax preferences granted to enterprises as a proportion of total government fiscal expenditure constitute the ninth second-level index;
the first second-level index and the second second-level index constitute the first first-level index;
the third second-level index, the fourth second-level index and the fifth second-level index constitute the second first-level index;
the sixth second-level index and the seventh second-level index constitute the third first-level index;
the eighth second-level index and the ninth second-level index constitute the fourth first-level index.
Further, the dimensionless standardization model is:
Y_ij = (x_ij - min x_j) / (max x_j - min x_j) × 100,
where x_ij is the initial value of the j-th third-level index of the i-th area, min x_j is the minimum among the initial values of all the third-level indexes of the i-th area, max x_j is the maximum among the initial values of all the third-level indexes of the i-th area, the i-th area is the area to be evaluated, j is greater than or equal to 1 and less than or equal to t, and i is a natural number.
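The standardization formula can be sketched in a few lines. The sketch rests on two assumptions that are not settled by the text: it takes min x_j and max x_j over the areas being compared for one index j (as the subscript j suggests), and it returns 0 for every area when all values are equal, a case the text does not address. The function name `utility_values` and the sample figures are hypothetical.

```python
from typing import List


def utility_values(initial_values: List[float]) -> List[float]:
    """Min-max scale one index's initial values to the range 0..100."""
    lo, hi = min(initial_values), max(initial_values)
    if hi == lo:
        # Assumption: the zero-denominator case is not covered by the text;
        # this sketch returns 0 for every area.
        return [0.0 for _ in initial_values]
    return [(x - lo) / (hi - lo) * 100 for x in initial_values]


# Hypothetical initial values of one third-level index for three areas.
newly_registered = [1000, 400, 700]
print(utility_values(newly_registered))  # [100.0, 0.0, 50.0]
```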
Further, the comprehensive score model is:
[formula not reproduced in the source text]
where 60 points is the base score, M is the actual score of the area to be evaluated, [symbol missing] is the average of the actual scores of all the areas to be evaluated, and MAX is the maximum of the actual scores of all the areas to be evaluated.
To solve the above technical problem, the present invention further provides a data processing device applied to industry big data analysis, including:
a data capture module, configured to capture socioeconomic data, business data and industrial economy data within a preset crawl range, and to process the socioeconomic data, the business data and the industrial economy data to convert them into structured data;
a data screening module, configured to screen and process the structured data to remove redundant data;
a data loading module, configured to load the structured data from which redundant data has been removed into a preset database;
a data supplement module, configured to retrieve preset supplementary data and add it to the database;
a data retrieval module, configured to select from the database the initial values corresponding to the t third-level indexes of the area to be evaluated, wherein the third-level indexes describe specific numerical values of the socioeconomic data, business data and industrial economy data of the area to be evaluated;
a data processing module, configured to convert, by a preset dimensionless standardization model, the initial value of each third-level index into a utility value of that third-level index, to obtain the utility values of the t third-level indexes, and to sum the utility values of all the third-level indexes to obtain the comprehensive score of the area to be evaluated;
a data analysis module, configured to generate a data analysis report according to the initial values of all the third-level indexes of the area to be evaluated, the utility values of all the third-level indexes, the comprehensive score of the area to be evaluated, and the historical data in the database.
Further, the socioeconomic data includes data describing natural resources and conditions, population and labor, social development, urban construction, people's livelihood, and economic development indicators;
the business data includes data describing listed-company information, business background, production and operation information, financial situation, enterprise operation, business risk, innovation ability, business production, news information, industrial and commercial registration information, intellectual property information, and public opinion information;
the industrial economy data includes data describing price data, index data, world economic data, national economic data, regional economy data, industrial economy data, and economic analysis reports.
Further, the data loading module specifically uses the Sqoop tool to load the structured data from which redundant data has been removed into the preset database.
Compared with the prior art, the data processing method and device for industry big data analysis provided by the present invention have the following advantages:
1. The data range is more comprehensive: the invention collects socioeconomic data, business data and industrial economy data through multiple channels, providing strong data support and making the data more representative;
2. The data analysis is more scientific: by establishing a double-innovation (mass entrepreneurship and innovation) index model, the area to be evaluated can be scored quantitatively from many aspects and angles, making it convenient to analyze the area comprehensively;
3. The modes of analysis are diverse: analysis can be performed through the data analysis report or through the comprehensive score of each area to be evaluated; the data analysis report allows in-depth analysis of the economic development trend and double-innovation capability of the area to be evaluated, and the comprehensive score makes it easy to compare overall capability across multiple areas to be evaluated.
It should be noted that the technical solution provided by the present invention need not achieve all of the above technical effects simultaneously.
Description of the drawings
Fig. 1 is a flow chart of the data processing method for industry big data analysis provided by Embodiment 1;
Fig. 2 is a flow chart of the data processing method for industry big data analysis provided by Embodiment 2;
Fig. 3 is a block diagram of the data processing device for industry big data analysis provided by Embodiment 3.
Detailed description of the embodiments
Embodiment 1:
This embodiment provides a data processing method for industry big data analysis. Fig. 1 shows a flow chart of the method, which includes the following steps:
S101: Capture socioeconomic data, business data and industrial economy data within a preset crawl range, and process the socioeconomic data, business data and industrial economy data to generate structured data.
Specifically, the socioeconomic data, business data and industrial economy data are captured by distributed web crawler technology.
The socioeconomic data includes data describing natural resources and conditions, population and labor, social development, urban construction, people's livelihood, and economic development indicators. The business data includes data describing listed-company information, business background, production and operation information, financial situation, enterprise operation, business risk, innovation ability, business production, news information, industrial and commercial registration information, intellectual property information, and public opinion information. The industrial economy data includes data describing price data, index data, world economic data, national economic data, regional economy data, industrial economy data, and economic analysis reports.
S102: Screen and process the structured data to remove redundant data.
Specifically, the extracted structured data is screened according to its different data sources and data structures and placed in a data buffer, where noise reduction, data splicing and dimension transformation are applied. The purpose is to remove redundant data from the structured data and improve the efficiency of data processing.
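A screening pass of this kind might look like the sketch below. The text does not spell out the screening rules, so the sketch assumes that a record is redundant when it repeats the same (area, index, year) key and that a record with no value counts as noise. The function name `screen_records` and the field names are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Record = Dict[str, Optional[object]]


def screen_records(records: Iterable[Record]) -> List[Record]:
    """Keep the first record per (area, index, year); skip records with no value."""
    seen: set = set()
    kept: List[Record] = []
    for rec in records:
        if rec.get("value") is None:
            continue  # treated as noise in this sketch
        key: Tuple = (rec["area"], rec["index"], rec["year"])
        if key in seen:
            continue  # redundant copy, e.g. the same figure from a second source
        seen.add(key)
        kept.append(rec)
    return kept


raw = [
    {"area": "A", "index": "new_enterprises", "year": 2017, "value": 1000},
    {"area": "A", "index": "new_enterprises", "year": 2017, "value": 1000},
    {"area": "A", "index": "growth_rate", "year": 2017, "value": None},
    {"area": "B", "index": "new_enterprises", "year": 2017, "value": 400},
]
clean = screen_records(raw)  # two records remain: one for area A, one for area B
```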
S103: Load the structured data from which redundant data has been removed into a preset database.
S104: Retrieve preset supplementary data and add it to the database.
The preset supplementary data may be socioeconomic data or business data obtained through third-party data purchase and/or business cooperation. Supplementing the database with data from different channels makes the data in the database more comprehensive and more representative.
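Steps S103 and S104 can be illustrated with an in-memory SQLite database standing in for the preset database. SQLite, the table name `index_values`, the column layout and the sample rows are assumptions made for this sketch; the text only speaks of a preset database. Adding supplementary rows with `INSERT OR IGNORE`, so that they fill gaps without overwriting captured data, is likewise a choice of the sketch rather than something the text prescribes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE index_values (
           area TEXT, idx TEXT, year INTEGER, value REAL, source TEXT,
           PRIMARY KEY (area, idx, year))"""
)

# S103: load the screened structured data.
captured = [("A", "new_enterprises", 2017, 1000.0, "crawl"),
            ("B", "new_enterprises", 2017, 400.0, "crawl")]
conn.executemany("INSERT INTO index_values VALUES (?, ?, ?, ?, ?)", captured)

# S104: add supplementary data; rows whose key already exists are skipped.
supplement = [("B", "new_enterprises", 2017, 410.0, "purchased"),
              ("C", "new_enterprises", 2017, 700.0, "purchased")]
conn.executemany("INSERT OR IGNORE INTO index_values VALUES (?, ?, ?, ?, ?)",
                 supplement)
conn.commit()

rows = conn.execute(
    "SELECT area, value, source FROM index_values ORDER BY area").fetchall()
# Area B keeps its captured value; area C comes from the supplementary data.
```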
S105: Select from the database the initial values corresponding to the t third-level indexes of the area to be evaluated.
The third-level indexes describe specific numerical values of the socioeconomic data, business data and industrial economy data of the area to be evaluated.
S106: Convert, by a preset dimensionless standardization model, the initial value of each third-level index into a utility value of that third-level index, to obtain the utility values of the t third-level indexes.
S107: Sum the utility values of all the third-level indexes to obtain the comprehensive score of the area to be evaluated.
For an area to be evaluated, all the data describing it is embodied in the initial values and utility values of the third-level indexes. The initial value of a third-level index directly reflects its concrete numerical value. For example, if the initial value of the third-level index "number of newly registered enterprises" is 1000, the area has 1000 newly registered enterprises; if the initial value of the index "growth rate of newly registered enterprises" is 20%, newly registered enterprises grew by 20%. These two indexes show that 1000 and 20% differ greatly in magnitude: if the initial values of all third-level indexes were simply summed, the result would bear very little relation to the growth rate of newly registered enterprises and could not represent the overall condition of the area to be evaluated. The purpose of converting initial values into utility values is to remove such differences in form between the initial values of different third-level indexes. For example, the initial value 20% of the growth rate of newly registered enterprises might become a utility value of 80 after the dimensionless standardization model, and the initial value 1000 of the number of newly registered enterprises might become 90. The difference between the utility values of the two third-level indexes is then much smaller, so summing the utility values of all third-level indexes gives a comprehensive score for the area to be evaluated that is more scientific and more representative.
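The effect described above can be checked on a small made-up data set. In the sketch below the three areas, their figures and the helper `scale` are hypothetical, and the minimum and maximum are assumed to be taken per index over the three areas. Summing initial values lets the enterprise count dominate, while summing utility values lets the growth rate carry equal weight.

```python
# Hypothetical initial values for three areas: (new enterprises, growth rate).
initial = {
    "A": (1000, 0.05),
    "B": (980, 0.20),
    "C": (400, 0.10),
}


def scale(values):
    """Min-max scale a list of values to 0..100 (assumes max > min)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) * 100 for v in values]


areas = list(initial)
columns = list(zip(*initial.values()))      # one tuple of values per index
scaled_columns = [scale(col) for col in columns]

raw_sum = {a: sum(initial[a]) for a in areas}
utility_sum = {a: sum(col[i] for col in scaled_columns)
               for i, a in enumerate(areas)}

best_by_raw = max(raw_sum, key=raw_sum.get)              # "A"
best_by_utility = max(utility_sum, key=utility_sum.get)  # "B"
```

Area A leads on the raw sum only because its enterprise count is slightly higher; once both indexes are on the same 0 to 100 scale, area B's much higher growth rate puts it first.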
S108: Generate a data analysis report according to the initial values of all the third-level indexes of the area to be evaluated, the utility values of all the third-level indexes, the comprehensive score of the area to be evaluated, and the historical data in the database.
The data analysis report allows the innovation and entrepreneurship capability and the economic trend of the area to be evaluated to be analyzed in depth along two dimensions, horizontal and vertical. Horizontally, the socioeconomic data and business data of past years in the database give an in-depth understanding of the economic development trend of the area to be evaluated. Vertically, analyzing the score value of each first-level index and the comprehensive score of the area to be evaluated reflects its overall capability at the macro level and indicates the development trend of the whole province or city. This can provide data support for the government in adjusting development strategy, formulating talent introduction plans, attracting outside investment, and so on.
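One possible reading of these two kinds of analysis is sketched below: a year-over-year change computed from historical figures, and a ranking of areas by comprehensive score. The text does not define the contents of the report, so the function names, the data layout and all figures here are hypothetical.

```python
from typing import Dict, List, Tuple


def year_over_year(history: Dict[int, float]) -> Dict[int, float]:
    """Relative change of a figure versus the previous year in the history."""
    years = sorted(history)
    return {cur: (history[cur] - history[prev]) / history[prev]
            for prev, cur in zip(years, years[1:])}


def rank_areas(scores: Dict[str, float]) -> List[Tuple[str, float]]:
    """Areas ordered from highest to lowest comprehensive score."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical comprehensive scores of one area in past years.
history_a = {2015: 200.0, 2016: 220.0, 2017: 275.0}
trend = year_over_year(history_a)  # {2016: 0.1, 2017: 0.25}

# Hypothetical comprehensive scores of several areas in the current year.
ranking = rank_areas({"A": 275.0, "B": 310.0, "C": 150.0})
# [("B", 310.0), ("A", 275.0), ("C", 150.0)]
```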
Embodiment 2:
This embodiment provides a data processing method applied to industry big data analysis. Fig. 2 shows a flow chart of the method, which includes the following steps:
S201: Capture socioeconomic data, business data and industrial economy data within a preset crawl range, and process the socioeconomic data, business data and industrial economy data to generate structured data.
Specifically, the socioeconomic data, business data and industrial economy data are captured by distributed web crawler technology.
The socioeconomic data includes data describing natural resources and conditions, population and labor, social development, urban construction, people's livelihood, and economic development indicators.
The business data includes data describing listed-company information, business background, production and operation information, financial situation, enterprise operation, business risk, innovation ability, business production, news information, industrial and commercial registration information, intellectual property information, and public opinion information.
The industrial economy data includes data describing price data, index data, world economic data, national economic data, regional economy data, industrial economy data, and economic analysis reports.
S202: Screen and process the structured data to remove redundant data.
Specifically, the extracted structured data may be screened according to its different data sources and data structures and placed in a data buffer, where noise reduction, data splicing and dimension transformation are applied. The purpose is to remove redundant data from the structured data and improve the efficiency of data processing.
S203: Load the structured data from which redundant data has been removed into a preset database.
Specifically, the transformed data is loaded into the database using the Sqoop tool.
S204: Retrieve preset supplementary data and add it to the database.
The preset supplementary data may be socioeconomic data or business data obtained through third-party data purchase and/or business cooperation. Supplementing the database with data from different channels makes the data in the database more comprehensive and more representative.
S205: Select from the database the initial values corresponding to the t third-level indexes of the area to be evaluated.
The third-level indexes comprise the t third-level indexes of the area to be evaluated and describe specific numerical values of the socioeconomic data, industrial economy data and business data of the area to be evaluated. The t third-level indexes are combined according to their attributes into p second-level indexes, where p < t, and the p second-level indexes are combined according to their attributes into q first-level indexes.
S206: Convert, by a preset dimensionless standardization model, the initial value of each third-level index into a utility value of that third-level index, to obtain the utility values of the t third-level indexes.
The preset dimensionless standardization model is:
Y_ij = (x_ij - min x_j) / (max x_j - min x_j) × 100.
Here x_ij is the initial value of one third-level index of the area to be evaluated, where i denotes the area and j denotes the third-level index. For example, Shandong Province has 17 prefecture-level cities, which are numbered 1 to 17, and the t third-level indexes of Shandong Province are numbered from 1 to t. If Qingdao City is numbered 1, then x19 (i = 1, j = 9) is the initial value of the 9th third-level index of Qingdao City. min x_j is the minimum among the initial values of all the third-level indexes of this area, and max x_j is the maximum among the initial values of all the third-level indexes of this area. After the initial values of all third-level indexes have been processed by the dimensionless standardization model, differences among the third-level indexes in measurement unit, in the order of magnitude of the index values, and in relative form are eliminated. This makes the data structure and form more uniform and the utility values of the third-level indexes more comparable.
More specifically, in this embodiment 23 third-level indexes are set for the area to be evaluated; the 23 third-level indexes are combined into 7 second-level indexes, and the 7 second-level indexes are combined into 4 first-level indexes, as detailed below.
The third-level indexes include the following:
data describing the number of newly registered enterprises, the growth rate of newly registered enterprises, the number of newly added individual businesses and farmers' cooperatives, the year-on-year growth rate of individual businesses and farmers' cooperatives, the number of enterprises preparing to list or listed on the New Third Board, the number of current-year graduates in each city, invention patents owned per 10,000 people, the number of registered scientific and technological achievements, the technology contract transaction value, the new-product sales revenue of state-owned and large non-state industrial enterprises, the tax revenue increment generated by newly established enterprises, the number of newly employed persons in private enterprises and individual businesses, the growth rate of employment in private enterprises and individual businesses, the number of investment and financing events, the total scale of investment and financing, the amount of venture capital raised, the R&D expenditure intensity of state-owned and large non-state industrial enterprises, the proportion of state-owned and large non-state industrial enterprises with R&D activities, GDP per capita, internal R&D expenditure as a proportion of GDP, whole-society R&D expenditure as a proportion of GDP, government science and technology expenditure as a proportion of total government fiscal expenditure, and tax preferences granted to enterprises as a proportion of total government fiscal expenditure. These data are numbered 1 to 23 in the order listed. For example, x12 (i = 1, j = 2) is the growth rate of newly registered enterprises of the first area to be evaluated.
The number of newly registered enterprises, the growth rate of newly registered enterprises, the number of newly added individual businesses and farmers' cooperatives, the year-on-year growth rate of individual businesses and farmers' cooperatives, and the number of enterprises preparing to list or listed on the New Third Board constitute the first second-level index, which describes the enterprise index of the area to be evaluated;
the number of current-year graduates in each city constitutes the second second-level index, which describes the talent index of the area to be evaluated;
invention patents owned per 10,000 people, the number of registered scientific and technological achievements, and the technology contract transaction value constitute the third second-level index, which describes the innovation activity index of the area to be evaluated;
the new-product sales revenue of state-owned and large non-state industrial enterprises and the tax revenue increment generated by newly established enterprises constitute the fourth second-level index, which describes the economic benefit index of the area to be evaluated;
the number of newly employed persons in private enterprises and individual businesses and the growth rate of employment in private enterprises and individual businesses constitute the fifth second-level index, which describes the employment-driving index of the area to be evaluated;
the number of investment and financing events, the total scale of investment and financing, and the amount of venture capital raised constitute the sixth second-level index, which describes the investment and financing index of the area to be evaluated;
the R&D expenditure intensity of state-owned and large non-state industrial enterprises and the proportion of state-owned and large non-state industrial enterprises with R&D activities constitute the seventh second-level index, which describes the innovation support index of the area to be evaluated;
GDP per capita and internal R&D expenditure as a proportion of GDP constitute the eighth second-level index, which describes the economic base index of the area to be evaluated;
whole-society R&D expenditure as a proportion of GDP, government science and technology expenditure as a proportion of total government fiscal expenditure, and tax preferences granted to enterprises as a proportion of total government fiscal expenditure constitute the ninth second-level index, which describes the policy environment index of the area to be evaluated;
the first second-level index and the second second-level index constitute the first first-level index, which describes the double-innovation (mass entrepreneurship and innovation) vitality index of the area to be evaluated;
the third second-level index, the fourth second-level index and the fifth second-level index constitute the second first-level index, which describes the double-innovation effect index of the area to be evaluated;
the sixth second-level index and the seventh second-level index constitute the third first-level index, which describes the double-innovation support index of the area to be evaluated;
the eighth second-level index and the ninth second-level index constitute the fourth first-level index, which describes the double-innovation environment index of the area to be evaluated.
S207: The utility values of all the third-level indices included in each second-level index are added to obtain the scores of the p second-level indices.
For example, the second-level index describing enterprises in the area to be evaluated includes the following third-level indices: the number of newly registered enterprises, the growth rate of newly registered enterprises, the number of new individual businesses and farmers' cooperatives, the year-on-year growth rate of individual businesses and farmers' cooperatives, and the number of enterprises preparing for listing or listed on the New Third Board. Suppose the initial value of the number of newly registered enterprises is 1000, meaning that 1000 enterprises were newly registered in the area to be evaluated, and the initial value of the growth rate of newly registered enterprises is 20%. The two values, 1000 and 20%, clearly differ greatly in magnitude. If the initial values of all the third-level indices were simply added, the result would be dominated by the enterprise count, would have very little correlation with the growth rate, and could not represent the overall situation of the area to be evaluated. The score of a second-level index is therefore the sum of the utility values of all its third-level indices. The purpose of converting initial values into utility values is to remove the differences in form between the initial values of different third-level indices. For example, the initial value of 20% for the growth rate of newly registered enterprises might become a utility value of 80 after the dimensionless standardization model is applied, and the utility value corresponding to 1000 newly registered enterprises might be 90. The gap between the two utility values is far smaller than the gap between the initial values, so a second-level score obtained by adding the utility values of all the third-level indices is more scientific and more representative.
S208: The scores of all the second-level indices included in each first-level index are added to obtain the scores of the q first-level indices;
The scores of the four first-level indices reflect the innovation and entrepreneurship capability of the area to be evaluated from four different angles.
For example, the score of the innovation and entrepreneurship vitality index of the area to be evaluated is the score of the second-level index describing enterprises plus the score of the second-level index describing talent.
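The roll-up in S207 and S208 can be sketched as two passes of the same summation. This is a minimal illustration rather than the patent's implementation: the key names, the helper `roll_up`, and the utility value 70 for graduates are hypothetical, while 90 and 80 are the illustrative utility values used in the text, and only a subset of the third-level indices is shown.

```python
def roll_up(scores, grouping):
    """Return, for each group, the sum of the scores of its members."""
    return {
        group: sum(scores[member] for member in members)
        for group, members in grouping.items()
    }


# Utility values of some third-level indices for one area (illustrative only).
third_level_utility = {
    "newly_registered_enterprises": 90.0,             # example value from the text
    "newly_registered_enterprise_growth_rate": 80.0,  # example value from the text
    "graduates_this_year": 70.0,                      # hypothetical value
}

# S207: third-level indices grouped into second-level indices (subset shown).
second_level_members = {
    "enterprise": [
        "newly_registered_enterprises",
        "newly_registered_enterprise_growth_rate",
    ],
    "talent": ["graduates_this_year"],
}

# S208: the first and second second-level indices form the vitality index.
first_level_members = {"vitality": ["enterprise", "talent"]}

second_level = roll_up(third_level_utility, second_level_members)
first_level = roll_up(second_level, first_level_members)
print(second_level)  # {'enterprise': 170.0, 'talent': 70.0}
print(first_level)   # {'vitality': 240.0}
```

Because both steps are plain sums, the same helper serves each level; only the grouping table changes.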
S209: Using a preset comprehensive score model, the actual score is input into the comprehensive score model to obtain the comprehensive score of the innovation and entrepreneurship index of each area to be evaluated;
The preset comprehensive score model is:
(formula not reproduced in this text)
where 60 points is the base score, M is the actual score of each area to be evaluated, the mean term of the formula is the average of the actual scores of all the areas to be evaluated, and MAX is the maximum of the actual scores of all the areas to be evaluated.
The actual score reflects the total score of all the innovation and entrepreneurship indices of the area to be evaluated. The purpose of converting the actual score into the comprehensive score is to express it on a hundred-point scale with a base score of 60 points. When the actual score of an area to be evaluated is greater than the average actual score of all the areas to be evaluated, the comprehensive score obtained from the model is greater than 60 points, indicating that the innovation and entrepreneurship capability of the area is qualified. When the actual score of an area to be evaluated is less than the average actual score of all the areas to be evaluated, the comprehensive score obtained from the model is less than 60 points, indicating that the innovation and entrepreneurship capability of the area is unqualified. The comprehensive score allows an overall evaluation of the innovation and entrepreneurship capability of an area to be evaluated and makes it easy to compare multiple areas to be evaluated.
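The formula of the comprehensive score model is not shown in this text, so the following is only a guess at one form that matches the behaviour described above: a base of 60 points, a score above 60 for areas above the average, and a score below 60 for areas below it. The linear form, the 100-point ceiling reached at MAX, and the function name `comprehensive_score` are all assumptions, not the patent's model.

```python
def comprehensive_score(actual, all_actual, base=60.0, full=100.0):
    """Map an actual score onto a hundred-point scale (assumed linear form).

    Assumption: an area at the average gets `base`, the area with the
    maximum actual score gets `full`, and other areas are placed linearly.
    """
    mean = sum(all_actual) / len(all_actual)
    top = max(all_actual)
    if top == mean:
        # All areas have the same actual score; give every area the base.
        return base
    return base + (actual - mean) / (top - mean) * (full - base)


actual_scores = [50.0, 70.0, 90.0]  # hypothetical actual scores of three areas
converted = [comprehensive_score(m, actual_scores) for m in actual_scores]
print(converted)  # [20.0, 60.0, 100.0]
```

Under this assumed form an area exactly at the average scores 60, which is consistent with the qualified/unqualified threshold described in the text.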
S210: A data analysis report is generated according to the initial values of all the third-level indices of the area to be evaluated, the utility values of all the third-level indices, the scores of all the first-level and second-level indices, the comprehensive score of the area to be evaluated, and the historical data in the database. The data analysis report makes it possible to examine the innovation and entrepreneurship capability and the economic trend of the area to be evaluated in depth along two dimensions, horizontal and longitudinal. Horizontally, the socioeconomic data and business data of past years in the database give a detailed picture of the economic development trend of the area to be evaluated. Longitudinally, analysing the score of each first-level index and the comprehensive score of the area to be evaluated reflects its overall capability at the macro level and helps judge the development trend of the whole province or city. The report can provide data support for the government in adjusting development strategy, formulating talent introduction plans, attracting investment, and so on.
This embodiment achieves the following advantageous effects:
1. Broader data coverage: the present invention collects socioeconomic data, business data and industrial economy data through multiple channels, which provides strong data support and makes the data more representative;
2. More scientific data analysis: by establishing an innovation and entrepreneurship index model, the area to be evaluated can be scored quantitatively from many aspects and angles, which makes a comprehensive analysis of the area convenient;
3. More diverse analysis: the analysis can be carried out through the data analysis report or through the comprehensive score of each area to be evaluated; the data analysis report supports in-depth analysis of the economic development trend and the innovation and entrepreneurship capability of the area to be evaluated, and the comprehensive score makes it easy to compare the overall capability of multiple areas to be evaluated.
Embodiment 3:
This embodiment provides a data processing device for industry big data analysis. Fig. 3 is a block diagram of the data processing device for industry big data analysis. The device includes:
Data capture module 301, for capturing socioeconomic data, business data and industrial economy data within a preset capture range, and processing the socioeconomic data, business data and industrial economy data to convert them into structured data;
The socioeconomic data includes: data describing natural resources and conditions, population and labor force, social development, urban construction, people's livelihood, and economic development indicators. The business data includes: data describing listed-company information, business background, production and operation information, financial status, enterprise operations, business risk, innovation capability, business production, news information, industrial and commercial registration information, intellectual property information, and public opinion information. The industrial economy data includes: data describing price data, index data, world economic data, national economic data, regional economic data, industry economic data, and economic analysis reports.
Data screening module 302, for screening and processing the structured data to remove redundant data;
Data loading module 303, for loading the structured data from which redundant data has been removed into the preset database 305; specifically, data loading module 303 uses the Sqoop tool to load the structured data from which redundant data has been removed into the preset database 305.
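The text names Sqoop but gives no command, so the following is only a sketch of how such a load might be invoked, assuming the cleaned structured data sits in an HDFS directory and is pushed to a relational database with `sqoop export`. The JDBC URL, table name, directory and user name are hypothetical placeholders; `--connect`, `--table`, `--export-dir` and `--username` are standard Sqoop export arguments.

```python
def build_sqoop_export_args(jdbc_url, table, export_dir, username=None):
    """Build the argument list for a `sqoop export` call (nothing is executed)."""
    args = [
        "sqoop", "export",
        "--connect", jdbc_url,       # JDBC connection string of the target database
        "--table", table,            # target table that receives the rows
        "--export-dir", export_dir,  # HDFS directory holding the cleaned data
    ]
    if username:
        args += ["--username", username]
    return args


# All values below are hypothetical placeholders.
sqoop_args = build_sqoop_export_args(
    "jdbc:mysql://db.example.com/industry",
    "structured_data",
    "/data/cleaned/structured",
    username="loader",
)
print(" ".join(sqoop_args))
```

On a machine with Sqoop installed, the list could be passed to `subprocess.run(sqoop_args, check=True)`; building the list separately keeps the sketch runnable without a Hadoop cluster.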
Data complementary module 304, for retrieving preset supplementary data and adding it to the database 305; the preset supplementary data may be socioeconomic data or business data obtained through third-party data purchase and/or business cooperation, which makes the data in the database 305 more comprehensive.
Data transfer module 306, for selecting from the database 305 the initial values respectively corresponding to t third-level indices of the area to be evaluated, wherein the third-level indices describe specific values of the socioeconomic data, business data and industrial economy data of the area to be evaluated;
Data processing module 307, for converting the initial value of each third-level index into the utility value of that third-level index by means of a preset dimensionless standardization model, to obtain the utility values of the t third-level indices, and for adding the utility values of all the third-level indices to obtain the comprehensive score of the area to be evaluated;
The preset dimensionless standardization model is:
Yij = (xij - min xj) / (max xj - min xj) × 100
where xij is the initial value of one of the third-level indices of the area to be evaluated, i denotes the area, j denotes the third-level index, min xj is the minimum among the initial values of all the third-level indices of the area, and max xj is the maximum among the initial values of all the third-level indices of the area. Passing the initial values of all the third-level indices through the dimensionless standardization model eliminates the differences among the third-level indices in unit of measurement, in order of magnitude, and in relative form, so that the data are more uniform in structure and form and the utility values of the third-level indices are more comparable with one another. The comprehensive score obtained by adding the utility values of all the third-level indices of the area to be evaluated reflects the macroeconomic situation and the innovation and entrepreneurship capability of the area, and different areas to be evaluated can conveniently be compared by their comprehensive scores.
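The dimensionless standardization model above is a min-max scaling to the range 0 to 100, and a minimal sketch follows. One point is an assumption: the text defines min xj and max xj over the third-level indices of the area, whereas the sketch takes them, for each index j, over the areas being compared, which is the usual min-max convention. The function names, area labels and numbers are hypothetical, and returning 0 when all areas share the same value is also an assumption, since the text does not cover that case.

```python
def min_max_utility(values, scale=100.0):
    """Scale one index's initial values (area -> value) to utility values."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        # Assumption: with no spread, every area gets a utility value of 0.
        return {area: 0.0 for area in values}
    return {area: (v - lo) / (hi - lo) * scale for area, v in values.items()}


def comprehensive_scores(initial):
    """Sum the utility values of all indices for each area."""
    totals = {}
    for per_area in initial.values():
        for area, utility in min_max_utility(per_area).items():
            totals[area] = totals.get(area, 0.0) + utility
    return totals


# Hypothetical initial values: index -> {area -> initial value}.
initial = {
    "newly_registered_enterprises": {"A": 500, "B": 1000, "C": 1500},
    "growth_rate_percent": {"A": 10, "B": 20, "C": 15},
}
scores = comprehensive_scores(initial)
print(scores)  # {'A': 0.0, 'B': 150.0, 'C': 150.0}
```

After scaling, a count in the thousands and a rate in percent both lie between 0 and 100, so neither dominates the sum.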
Data analysis module 308, for generating a data analysis report according to the initial values of all the third-level indices of the area to be evaluated, the utility values of all the third-level indices, the comprehensive score of the area to be evaluated, and the historical data in the database. Analysing the initial values of all the third-level indices of the area to be evaluated, the utility values of all the third-level indices, the comprehensive score of the area to be evaluated, and the historical data in the database gives a more comprehensive and objective view of the economic conditions, economic trend, and innovation and entrepreneurship capability of the area to be evaluated, and can provide data support for the government in adjusting development strategy, formulating talent introduction plans, attracting investment, and so on.
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a device, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical memory, etc.) that contain computer-usable program code.
Although some specific embodiments of the present invention have been described in detail by way of example, those skilled in the art should understand that the above examples are for illustration only and are not intended to limit the scope of the present invention. Those skilled in the art should understand that the above embodiments may be modified without departing from the scope and spirit of the present invention. The scope of the present invention is defined by the appended claims.
Claims (10)
1. A data processing method applied to industry big data analysis, characterized by comprising the following steps:
capturing socioeconomic data, business data and industrial economy data within a preset capture range, and processing the socioeconomic data, business data and industrial economy data to generate structured data;
screening and processing the structured data to remove redundant data;
loading the structured data from which redundant data has been removed into a preset database;
retrieving preset supplementary data and adding it to the database;
selecting from the database the initial values respectively corresponding to t third-level indices of an area to be evaluated, wherein the third-level indices describe specific values of the socioeconomic data, business data and industrial economy data of the area to be evaluated;
converting the initial value of each third-level index into a utility value of that third-level index by means of a preset dimensionless standardization model, to obtain the utility values of the t third-level indices;
adding the utility values of all the third-level indices to obtain a comprehensive score of the area to be evaluated;
generating a data analysis report according to the initial values of all the third-level indices of the area to be evaluated, the utility values of all the third-level indices, the comprehensive score of the area to be evaluated, and the historical data in the database.
2. The data processing method applied to industry big data analysis according to claim 1, characterized in that
the socioeconomic data includes: data describing natural resources and conditions, population and labor force, social development, urban construction, people's livelihood, and economic development indicators;
the business data includes: data describing listed-company information, business background, production and operation information, financial status, enterprise operations, business risk, innovation capability, business production, news information, industrial and commercial registration information, intellectual property information, and public opinion information;
the industrial economy data includes: data describing price data, index data, world economic data, national economic data, regional economic data, industry economic data, and economic analysis reports.
3. The data processing method applied to industry big data analysis according to claim 1, characterized in that
loading the structured data from which redundant data has been removed into the preset database further comprises: using the Sqoop tool to load the structured data from which redundant data has been removed into the preset database.
4. The data processing method applied to industry big data analysis according to claim 1, characterized in that
the t third-level indices are combined according to their attributes into p second-level indices, p < t, and the p second-level indices are combined according to their attributes into q first-level indices, the method further comprising the following steps:
adding the utility values of all the third-level indices included in each second-level index to obtain the scores of the p second-level indices;
adding the scores of all the second-level indices included in each first-level index to obtain the scores of the q first-level indices;
generating a data analysis report according to the initial values of all the third-level indices of the area to be evaluated, the utility values of all the third-level indices, the scores of all the second-level indices, the scores of all the first-level indices, the comprehensive score of the area to be evaluated, and the historical data in the database.
5. The data processing method applied to industry big data analysis according to claim 4, characterized in that
the third-level indices include at least the following:
data describing the number of newly registered enterprises, the growth rate of newly registered enterprises, the number of new individual businesses and farmers' cooperatives, the year-on-year growth rate of individual businesses and farmers' cooperatives, the number of enterprises preparing for listing or listed on the New Third Board, the number of graduates in each city in the current year, the number of invention patents owned per ten thousand people, the number of registered scientific and technological achievements, the technology contract transaction value, the new product sales revenue of state-owned and large non-state industrial enterprises, the tax revenue increment generated by newly established enterprises, the number of newly employed persons in private enterprises and individual businesses, the employment growth rate of private enterprises and individual businesses, the number of investment and financing events, the total scale of investment and financing, the amount of venture capital raised, the R&D expenditure intensity of state-owned and large non-state industrial enterprises, the proportion of state-owned and large non-state industrial enterprises that carry out R&D activities, GDP per capita, the ratio of internal R&D expenditure to GDP, the proportion of whole-society R&D expenditure in GDP, the ratio of government science and technology expenditure to total government fiscal expenditure, and the ratio of tax preferences granted to enterprises to total government fiscal expenditure;
wherein the number of newly registered enterprises, the growth rate of newly registered enterprises, the number of new individual businesses and farmers' cooperatives, the year-on-year growth rate of individual businesses and farmers' cooperatives, and the number of enterprises preparing for listing or listed on the New Third Board constitute the first second-level index;
the number of graduates in each city in the current year constitutes the second second-level index;
the number of invention patents owned per ten thousand people, the number of registered scientific and technological achievements, and the technology contract transaction value constitute the third second-level index;
the new product sales revenue of state-owned and large non-state industrial enterprises and the tax revenue increment generated by newly established enterprises constitute the fourth second-level index;
the number of newly employed persons in private enterprises and individual businesses and the employment growth rate of private enterprises and individual businesses constitute the fifth second-level index;
the number of investment and financing events, the total scale of investment and financing, and the amount of venture capital raised constitute the sixth second-level index;
the R&D expenditure intensity of state-owned and large non-state industrial enterprises and the proportion of state-owned and large non-state industrial enterprises that carry out R&D activities constitute the seventh second-level index;
GDP per capita and the ratio of internal R&D expenditure to GDP constitute the eighth second-level index;
the proportion of whole-society R&D expenditure in GDP, the ratio of government science and technology expenditure to total government fiscal expenditure, and the ratio of tax preferences granted to enterprises to total government fiscal expenditure constitute the ninth second-level index;
the first second-level index and the second second-level index constitute the first first-level index;
the third second-level index, the fourth second-level index and the fifth second-level index constitute the second first-level index;
the sixth second-level index and the seventh second-level index constitute the third first-level index;
the eighth second-level index and the ninth second-level index constitute the fourth first-level index.
6. The data processing method applied to industry big data analysis according to claim 1, characterized in that
the dimensionless standardization model is:
Yij = (xij - min xj) / (max xj - min xj) × 100,
where xij is the initial value of the j-th third-level index of the i-th area, min xj is the minimum among the initial values of all the third-level indices of the i-th area, max xj is the maximum among the initial values of all the third-level indices of the i-th area, the i-th area is the area to be evaluated, j is greater than or equal to 1 and less than or equal to t, and i is a natural number.
7. The data processing method applied to industry big data analysis according to claim 1, characterized in that
the comprehensive score model is:
(formula not reproduced in this text)
where 60 points is the base score, M is the actual score of the area to be evaluated, the mean term of the formula is the average of the actual scores of all the areas to be evaluated, and MAX is the maximum of the actual scores of all the areas to be evaluated.
8. A data processing device applied to industry big data analysis, characterized by comprising:
a data capture module, for capturing socioeconomic data, business data and industrial economy data within a preset capture range, and processing the socioeconomic data, the business data and the industrial economy data to convert them into structured data;
a data screening module, for screening and processing the structured data to remove redundant data;
a data loading module, for loading the structured data from which redundant data has been removed into a preset database;
a data complementary module, for retrieving preset supplementary data and adding it to the database;
a data transfer module, for selecting from the database the initial values respectively corresponding to t third-level indices of an area to be evaluated, wherein the third-level indices describe specific values of the socioeconomic data, business data and industrial economy data of the area to be evaluated;
a data processing module, for converting the initial value of each third-level index into a utility value of that third-level index by means of a preset dimensionless standardization model, to obtain the utility values of the t third-level indices, and for adding the utility values of all the third-level indices to obtain a comprehensive score of the area to be evaluated;
a data analysis module, for generating a data analysis report according to the initial values of all the third-level indices of the area to be evaluated, the utility values of all the third-level indices, the comprehensive score of the area to be evaluated, and the historical data in the database.
9. The data processing device applied to industry big data analysis according to claim 8, characterized in that
the socioeconomic data includes: data describing natural resources and conditions, population and labor force, social development, urban construction, people's livelihood, and economic development indicators;
the business data includes: data describing listed-company information, business background, production and operation information, financial status, enterprise operations, business risk, innovation capability, business production, news information, industrial and commercial registration information, intellectual property information, and public opinion information;
the industrial economy data includes: data describing price data, index data, world economic data, national economic data, regional economic data, industry economic data, and economic analysis reports.
10. The data processing device applied to industry big data analysis according to claim 8, characterized in that
the data loading module specifically uses the Sqoop tool to load the structured data from which redundant data has been removed into the preset database.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810522737.8A CN108764717A (en) | 2018-05-28 | 2018-05-28 | Data processing method and device applied to industry big data analysis |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108764717A true CN108764717A (en) | 2018-11-06 |
Family
ID=64002878
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810522737.8A Pending CN108764717A (en) | 2018-05-28 | 2018-05-28 | Data processing method and device applied to industry big data analysis |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108764717A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20181106 |