CN105956015A - Service platform integration method based on big data - Google Patents
- Publication number
- CN105956015A CN105956015A CN201610254729.0A CN201610254729A CN105956015A CN 105956015 A CN105956015 A CN 105956015A CN 201610254729 A CN201610254729 A CN 201610254729A CN 105956015 A CN105956015 A CN 105956015A
- Authority
- CN
- China
- Prior art keywords
- data
- stored
- hbase
- database
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
- G06F16/285—Clustering or classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/254—Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
Abstract
The invention discloses a service platform integration method based on big data. The method comprises the following steps: (1) collecting multi-source heterogeneous data; (2) integrating the collected multi-source heterogeneous data and storing the integrated data in an HBase database; (3) using Hive to perform ETL processing on the integrated data stored in the HBase database, writing the processed data back to HBase, then cleaning the data stored in HBase to obtain clean data and storing the clean data in HBase; (4) performing modeling analysis on the clean data based on Hadoop technology and storing the analysis results in a Hive database; (5) establishing a data exchange and sharing service bus using a service-oriented architecture (SOA), building a data exchange architecture on the service bus, and pushing the analysis results stored in the Hive database to the databases of business application systems through the data exchange architecture. The method effectively reduces communication and time costs and improves the effective utilization rate of data.
Description
Technical field
The present invention relates to the field of big data technology, and in particular to a service platform integration method based on big data.
Background technology
Big data is the commanding height of next-generation information technology in the information industry. Smart city construction has been put on the domestic agenda, and smart cities contain massive amounts of data, serving government affairs, enterprises, and citizens as a new generation of application technology. However, existing big data integration in smart cities still cannot serve government affairs and the public in all aspects of urban life, mainly because of the following limitations:
(1) For the complex computational models of big data, current work mostly performs only attribute and regularity analysis on multi-source data; there is still no complete system of application methods.
(2) Structured data is scarce while unstructured data is abundant, and there are as yet no advanced techniques or means for processing unstructured and semi-structured data.
(3) The methods for describing the complexity and uncertainty of big data, and for system modeling of big data, are not yet mature.
(4) Current big data mining largely remains at the stage of extracting coarse knowledge in a single pass; no refined secondary mining methods have been developed to provide knowledge-driven decision support for decision makers.
Summary of the invention
To overcome the above disadvantages, the object of the present invention is to provide a service platform integration method based on big data. The invention is a method for collecting, integrating, storing, cleaning, modeling, analyzing, and applying urban multi-source heterogeneous data; through this method a bottom-up data processing pipeline is formed. Compared with conventional data processing approaches, the integrated multiple data sources increase the effective utilization rate of the data and effectively reduce communication and time costs.
To achieve the above object, the invention provides a service platform integration method based on big data, comprising the following steps:
Step 1: collect multi-source heterogeneous data;
Step 2: integrate the collected multi-source heterogeneous data and store the integrated data in an HBase database;
Step 3: use Hive to perform ETL processing on the integrated data stored in the HBase database and write the result back to HBase; clean the data stored in HBase to obtain clean data, and store the clean data in the HBase database;
Step 4: perform modeling analysis on the clean data stored in HBase based on Hadoop technology, and store the analysis results in a Hive database;
Step 5: establish a data exchange and sharing service bus using a service-oriented architecture (SOA), then build a data exchange architecture on the service bus, and push the analysis results stored in the Hive database to the databases of business application systems through the data exchange architecture, so that the analysis results can be applied in the corresponding business systems.
Preferably, the collection of multi-source heterogeneous data in step 1 comprises the following steps:
Step 1.1: configure multiple distributed data sources;
Step 1.2: encapsulate the multiple distributed data sources into data components;
Step 1.3: read the encapsulated data components and convert them into global objects;
Step 1.4: combine the data components converted into global objects to realize a unified component platform for accessing multi-source heterogeneous data;
Step 1.5: collect the multi-source heterogeneous data through the component platform and transmit them to the data center, completing the collection of the multi-source heterogeneous data.
Preferably, cleaning the data stored in the HBase database in step 3 to obtain clean data comprises the following steps:
Step 3.1: perform duplicate checking on the data stored in the HBase database;
Step 3.2: perform interpolation on the missing data remaining after duplicate checking;
Step 3.3: perform cluster analysis on the filled data and identify data points lying at the cluster edges; set a valid range according to each data type, exclude out-of-range values, obtain the clean data, and store them in the HBase database.
Preferably, the modeling analysis of the clean data stored in the HBase database based on Hadoop technology in step 4 includes:
performing cluster analysis on the clean data stored in the HBase database based on Hadoop technology and storing the clustered data in the Hive database for later use, with the following detailed process:
(1) create an initialization set by randomly selecting k objects from the clean data stored in the HBase database and taking these objects as cluster centers; (2) compute the distance between each remaining clean data point in the HBase database and each cluster center; (3) assign each remaining clean data point to the nearest cluster center; (4) whenever a data object joins or leaves a cluster, automatically recompute the mean of that cluster, and reassign the data point if the minimum-distance condition is not satisfied; (5) repeat the above steps until the cluster centers no longer change, and record the result; (6) store the result in the Hive database.
Performing collaborative recommendation analysis on the clean data stored in the HBase database based on Hadoop technology, and storing the analyzed data in the Hive database for later use, with the following detailed process:
(1) obtain the clean data stored in the HBase database and convert them into a dataset in the format required for analysis;
(2) divide the dataset into a training dataset and a test dataset;
(3) train a recommendation model with the training dataset;
(4) assess the precision of the recommendation model with the test dataset;
(5) when the precision of the recommendation model meets the requirement, make recommendations and output the result; otherwise retrain the model and re-evaluate it until data meeting the requirement are obtained;
(6) store the output result in the Hive database.
Performing classification analysis on the clean data stored in the HBase database based on Hadoop technology, storing the classified data in the Hive database, and attaching different labels to different data for later use, with the following detailed process:
(1) obtain the clean data stored in the HBase database and convert them into a dataset in the format required for analysis;
(2) assign feature attributes to the dataset, divide the dataset into multiple items to be classified according to the feature attributes, classify a portion of the items, and form a training sample set;
(3) count the frequency of each class in the training sample set and estimate the conditional probability of each feature attribute for each class, obtaining a classifier;
(4) use the classifier to classify the data that need classification and output the result;
(5) save the result in the Hive database.
Preferably, the duplicate checking of the data stored in the HBase database in step 3.1 comprises the following steps:
Step 3.1.1: query the data stored in the HBase database for duplicates and filter out records in which all fields are identical; retain one record and remove the other fully duplicated records;
Step 3.1.2: query for duplicates on key fields and filter out records whose key fields are identical; compare the completeness of the duplicated records, retain the record with the more complete field data, and remove the remaining duplicates.
Preferably, the interpolation of the missing data after duplicate checking in step 3.2 comprises the following steps:
Step 3.2.1: for regularly missing and unimportant data, delete the records with missing values; for regularly missing but relatively important data, compute weighted values from the complete data and fill them in; for irregularly missing data, process each case according to the type of the missing data;
Step 3.2.2: fill missing values with the mean of the same attribute, or with the value of highest occurrence probability for that attribute; for randomly missing data of different attributes, first generate candidate interpolation values for each missing value, perform statistical analysis on the complete data formed by the candidate interpolation values, evaluate the analysis results, and form the final interpolation value with which the missing value is filled.
Preferably, encapsulating the multiple distributed data sources into data components in step 1.2 comprises the following steps:
Step 1.2.1: prepare component objects using the database table structures;
Step 1.2.2: query the database to obtain the list of tables;
Step 1.2.3: for each table in the list, query the database fields and the field data structure of the table;
Step 1.2.4: read each data table as an object according to its table structure, and set the data field attributes as the basic attribute information of the table object;
Step 1.2.5: encapsulate the object table into a component that can be queried by attribute fields.
Compared with the prior art, the beneficial effects of the present invention are as follows: the invention is a method for collecting, integrating, storing, cleaning, modeling, analyzing, and applying urban multi-source heterogeneous data; through this method a bottom-up data processing pipeline is formed, and the sources and processing history of the data are made traceable and auditable. Compared with conventional data processing approaches, the integrated multiple data sources increase the effective utilization rate of the data and effectively reduce communication and time costs.
Brief description of the drawings
Fig. 1 is a flow chart of the present invention;
Fig. 2 shows the Sqoop-based ETL module.
Detailed description of the invention
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments.
As shown in Fig. 1 and Fig. 2, the invention provides a service platform integration method based on big data, comprising the following steps:
Step 1: collect multi-source heterogeneous data; the collection process comprises the following steps:
Step 1.1: configure multiple distributed data sources; configure front-end processors and manual-entry terminal devices in the departments from which data are collected (including land, environmental protection, water conservancy, meteorology, forestry, work safety supervision, quality supervision, etc.), and configure front-end processors, application servers, database servers, web servers, system monitoring terminals, operation terminals, and other equipment at the data center to handle the business processing of the collected data;
Step 1.2: encapsulate the multiple distributed data sources into data components, comprising the following steps:
Step 1.2.1: prepare component objects using the database table structures;
Step 1.2.2: query the database to obtain the list of tables;
Step 1.2.3: for each table in the list, query the database fields and the field data structure of the table;
Step 1.2.4: read each data table as an object according to its table structure, and set the data field attributes as the basic attribute information of the table object;
Step 1.2.5: encapsulate the object table into a component that can be queried by attribute fields;
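The table-to-component encapsulation of steps 1.2.1 to 1.2.5 can be sketched in Python. This is a minimal illustration using SQLite as a stand-in data source; the `TableComponent` class, the table name, and the fields are hypothetical and not part of the patent.

```python
import sqlite3

class TableComponent:
    """Illustrative wrapper: exposes one database table as a
    component that can be queried by attribute (field) values."""

    def __init__(self, conn, table):
        self.conn, self.table = conn, table
        # Steps 1.2.3/1.2.4: query the field structure and keep it as
        # the component's basic attribute information.
        cur = conn.execute(f"PRAGMA table_info({table})")
        self.fields = [row[1] for row in cur.fetchall()]

    def query(self, **attrs):
        # Step 1.2.5: allow lookups by any attribute field.
        sql = f"SELECT * FROM {self.table}"
        if attrs:
            sql += " WHERE " + " AND ".join(f"{k} = ?" for k in attrs)
        return self.conn.execute(sql, tuple(attrs.values())).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weather (station TEXT, temp REAL)")
conn.executemany("INSERT INTO weather VALUES (?, ?)",
                 [("s1", 21.5), ("s2", 19.0)])
comp = TableComponent(conn, "weather")
print(comp.fields)             # field structure read from the table
print(comp.query(station="s1"))
```

Combining several such components behind one interface is what steps 1.3 and 1.4 call the unified access component platform.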
Step 1.3: read the encapsulated data components and convert them into global objects;
Step 1.4: combine the data components converted into global objects to realize a unified component platform for accessing multi-source heterogeneous data;
Step 1.5: collect the multi-source heterogeneous data through the component platform and transmit them to the data center, completing the collection of the multi-source heterogeneous data; the concrete steps are:
Step 1.5.1: set the data acquisition mode, including acquisition frequency, acquisition nodes, and acquisition range;
Step 1.5.2: read out the data to be collected through the component platform;
Step 1.5.3: transmit the read data to the data center over the network.
Step 2: integrate the collected multi-source heterogeneous data and store the integrated data in the HBase database.
Step 3: use Hive to perform ETL processing on the integrated data stored in the HBase database and write the result back to HBase; clean the data stored in HBase to obtain clean data, and store the clean data in the HBase database. The detailed process is:
As shown in Fig. 2, Hive is used to perform ETL processing on the integrated data stored in the HBase database. The Sqoop-based ETL module first establishes a connection to the data source through JDBC (Java Database Connectivity) and verifies the metadata information of the data source; it then converts the SQL-typed data obtained at the JDBC end into Sqoop records in Java class form and submits them as formatted input to a MapReduce job. Finally, by launching the appropriate numbers of Map and Reduce tasks, the data are written into HDFS: the client node calls the HDFS API, splits the whole file into packets, manages the packets in a data queue while they await processing, then applies to the NameNode for new data blocks and obtains a group of DataNodes to actually store the block replicas. The DataNodes form a pipeline, and each packet is written to the DataNodes in turn; when the last DataNode has finished writing, acknowledgements are returned back along the pipeline, and completion is finally reported to the NameNode.
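The Sqoop-based extraction described above is normally driven from the `sqoop import` command line. The following Python sketch only assembles such a command (it does not run it, since that requires a live cluster); the JDBC URL, table name, and target directory are placeholders, not values from the patent.

```python
# Build (but do not execute) a Sqoop import command that would pull a
# relational table into HDFS over JDBC, mirroring the module above.
def sqoop_import_cmd(jdbc_url, table, target_dir, num_mappers=4):
    return [
        "sqoop", "import",
        "--connect", jdbc_url,       # JDBC connection to the source
        "--table", table,            # source table to extract
        "--target-dir", target_dir,  # HDFS directory for the output
        "--num-mappers", str(num_mappers),  # parallel Map tasks
    ]

cmd = sqoop_import_cmd("jdbc:mysql://dbhost/city", "sensor_readings",
                       "/data/raw/sensor_readings")
print(" ".join(cmd))
```

In a real deployment this command line would be handed to `subprocess.run` or a workflow scheduler on a node with Sqoop installed.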
Cleaning the data stored in the HBase database to obtain clean data comprises the following steps:
Step 3.1: perform duplicate checking on the data stored in the HBase database, comprising:
Step 3.1.1: query the data stored in the HBase database for duplicates and filter out records in which all fields are identical; retain one record and remove the other fully duplicated records;
Step 3.1.2: query for duplicates on key fields and filter out records whose key fields are identical; compare the completeness of the duplicated records, retain the record with the more complete field data, and remove the remaining duplicates.
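Steps 3.1.1 and 3.1.2 can be illustrated on plain Python records; in the described system the same logic would run over HBase via Hive or MapReduce, and the field names below are hypothetical.

```python
def deduplicate(records, key_fields):
    """Step 3.1.1: drop records identical in all fields.
    Step 3.1.2: among records sharing key fields, keep the most complete."""
    seen_full, by_key = set(), {}
    for rec in records:
        full = tuple(sorted(rec.items()))
        if full in seen_full:          # exact duplicate of an earlier record
            continue
        seen_full.add(full)
        key = tuple(rec.get(k) for k in key_fields)
        completeness = sum(v is not None for v in rec.values())
        kept = by_key.get(key)
        if kept is None or completeness > kept[0]:
            by_key[key] = (completeness, rec)   # keep the more complete one
    return [rec for _, rec in by_key.values()]

records = [
    {"id": 1, "name": "a", "city": "X"},
    {"id": 1, "name": "a", "city": "X"},   # full duplicate, removed
    {"id": 2, "name": "b", "city": None},
    {"id": 2, "name": "b", "city": "Y"},   # key duplicate, more complete
]
clean = deduplicate(records, key_fields=["id"])
print(clean)
```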
Step 3.2: perform interpolation on the missing data after duplicate checking, comprising:
Step 3.2.1: for regularly missing and unimportant data, delete the records with missing values; for regularly missing but relatively important data, compute weighted values from the complete data and fill them in; for irregularly missing data, process each case according to the type of the missing data;
Step 3.2.2: fill missing values with the mean of the same attribute, or with the value of highest occurrence probability for that attribute; for randomly missing data of different attributes, first generate candidate interpolation values for each missing value, perform statistical analysis on the complete data formed by the candidate interpolation values, evaluate the analysis results, and form the final interpolation value with which the missing value is filled.
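A minimal sketch of the mean and most-frequent-value filling in step 3.2.2, assuming a single column represented as a Python list with `None` marking missing entries:

```python
from statistics import mean
from collections import Counter

def impute(column):
    """Fill missing numeric values with the column mean, and missing
    categorical values with the most frequent (modal) value."""
    present = [v for v in column if v is not None]
    if all(isinstance(v, (int, float)) for v in present):
        fill = mean(present)                          # mean imputation
    else:
        fill = Counter(present).most_common(1)[0][0]  # modal value
    return [fill if v is None else v for v in column]

print(impute([3.0, None, 5.0]))        # numeric column
print(impute(["A", "B", None, "A"]))   # categorical column
```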
Step 3.3: perform cluster analysis on the filled data and identify data points lying at the cluster edges; set a valid range according to each data type, exclude out-of-range values, obtain the clean data, and store them in the HBase database.
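The valid-range exclusion in step 3.3 might look like the following sketch; the data types and range limits are illustrative placeholders, not values from the patent.

```python
# Per-type valid ranges (illustrative): values outside are excluded
# after cluster analysis has flagged edge points.
VALID_RANGES = {"temperature": (-50.0, 60.0), "humidity": (0.0, 100.0)}

def in_valid_range(record):
    lo, hi = VALID_RANGES[record["type"]]
    return lo <= record["value"] <= hi

data = [
    {"type": "temperature", "value": 21.3},
    {"type": "temperature", "value": 999.0},   # sensor glitch, excluded
    {"type": "humidity", "value": 45.0},
]
clean = [r for r in data if in_valid_range(r)]
print(len(clean))
```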
Step 4: perform modeling analysis on the clean data stored in the HBase database based on Hadoop technology, and store the analysis results in the Hive database, including:
performing cluster analysis based on Hadoop technology: the data stored in the HBase database are clustered into multiple classes such that objects within the same class have high similarity while objects in different classes differ considerably; the clustered data are stored in the Hive database for later use. K-means clustering is a widely used partition-based clustering algorithm, and the concrete process is:
(1) create an initialization set by randomly selecting k objects from the clean data stored in the HBase database and taking these objects as cluster centers; (2) compute the distance between each remaining clean data point in the HBase database and each cluster center; (3) assign each remaining clean data point to the nearest cluster center; (4) whenever a data object joins or leaves a cluster, automatically recompute the mean of that cluster, and reassign the data point if the minimum-distance condition is not satisfied; (5) repeat the above steps until the cluster centers no longer change, and record the result; (6) store the result in the Hive database.
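The k-means procedure in items (1) to (5) can be sketched on one-dimensional points; a production version would run as a distributed job over HBase rather than as in-memory Python, and the sample points are invented.

```python
import random

def kmeans(points, k, iters=100, seed=0):
    """Minimal k-means: random init, nearest-center assignment,
    mean recomputation, stop when centers stabilize."""
    random.seed(seed)
    centers = random.sample(points, k)        # (1) random initial centers
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:                      # (2)-(3) assign to nearest center
            i = min(range(k), key=lambda i: abs(p - centers[i]))
            clusters[i].append(p)
        new = [sum(c) / len(c) if c else centers[i]
               for i, c in enumerate(clusters)]  # (4) recompute cluster means
        if new == centers:                    # (5) centers no longer change
            break
        centers = new
    return centers, clusters

centers, clusters = kmeans([1.0, 1.2, 0.8, 9.0, 9.5, 8.5], k=2)
print(sorted(round(c, 2) for c in centers))
```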
Performing collaborative recommendation analysis based on Hadoop technology: according to users' usage habits and data-customized labels, corresponding data are recommended to different user types or objects, and the trained results are stored in the Hive database. Collaborative filtering is a technique widely used in recommender systems; it makes recommendations mainly by considering the similarity between users and between items. Collaborative filtering with ALS-WR (alternating least squares with weighted regularization) is a commonly used recommendation algorithm. Its core idea is to regard all users and items as a two-dimensional table in which cell (i, j) holds the rating of the i-th user for the j-th item; the algorithm then uses the cells that contain data to predict the empty cells. The predicted values are the users' ratings for the items, and recommendations are made by ranking the predicted ratings from high to low. The concrete process is:
(1) obtain the clean data stored in the HBase database and convert them into a dataset in the format required for analysis;
(2) divide the dataset into a training dataset and a test dataset;
(3) train the recommendation model with the training dataset;
(4) assess the precision of the recommendation model with the test dataset;
(5) when the precision of the recommendation model meets the requirement, make recommendations and output the result; otherwise retrain the model and re-evaluate it until data meeting the requirement are obtained;
(6) store the output result in the Hive database.
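As an illustration of ALS on the user-item table described above, here is a toy rank-1 alternating least squares with regularization in pure Python. It shows the fit-and-evaluate idea only, not the distributed ALS-WR implementation on Hadoop, and the rating table is invented.

```python
# Sparse rating table: (user, item) -> rating; empty cells are absent.
ratings = {(0, 0): 5.0, (0, 1): 4.0, (1, 0): 1.0, (1, 2): 2.0, (2, 1): 4.5}

def als(ratings, n_users, n_items, lam=0.05, iters=20):
    """Rank-1 ALS: each user i and item j gets one latent factor;
    fixing one side, the other has a closed-form regularized solution."""
    u, v = [1.0] * n_users, [1.0] * n_items
    for _ in range(iters):
        for i in range(n_users):   # fix v, solve u[i] by least squares
            num = sum(r * v[j] for (a, j), r in ratings.items() if a == i)
            den = lam + sum(v[j] ** 2 for (a, j), _ in ratings.items() if a == i)
            u[i] = num / den
        for j in range(n_items):   # fix u, solve v[j]
            num = sum(r * u[i] for (i, b), r in ratings.items() if b == j)
            den = lam + sum(u[i] ** 2 for (i, b), _ in ratings.items() if b == j)
            v[j] = num / den
    return u, v

u, v = als(ratings, 3, 3)
rmse = (sum((r - u[i] * v[j]) ** 2 for (i, j), r in ratings.items())
        / len(ratings)) ** 0.5
pred = u[2] * v[0]   # predicted rating for an empty cell (user 2, item 0)
print(round(rmse, 3), round(pred, 2))
```

Step (4)'s precision check corresponds to computing this error on held-out test ratings rather than on the training cells.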
Performing classification based on Hadoop technology: the data collected for training are classified and stored in the Hive database, and different labels are attached to different data for subsequent use. Naive Bayes classification is a commonly used classification algorithm; its core idea is that, for a given item to be classified, the probability of each class conditioned on the item's occurrence is computed, and the item is considered to belong to the class with the largest probability. The concrete process is:
(1) obtain the clean data stored in the HBase database and convert them into a dataset in the format required for analysis;
(2) assign feature attributes to the dataset, divide the dataset into multiple items to be classified according to the feature attributes, classify a portion of the items, and form a training sample set;
(3) count the frequency of each class in the training sample set and estimate the conditional probability of each feature attribute for each class, obtaining the classifier;
(4) use the classifier to classify the data that need classification and output the result;
(5) save the result in the Hive database.
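Steps (2) to (4) of the naive Bayes procedure can be sketched with class priors and per-class feature counts (using add-one smoothing); the traffic-related features and labels below are invented examples, not data from the patent.

```python
from collections import Counter, defaultdict

def train(samples):
    """Step (3): class frequencies plus per-class feature value counts."""
    labels = Counter(label for _, label in samples)
    feat = defaultdict(Counter)    # (label, attribute) -> value counts
    for features, label in samples:
        for attr, val in features.items():
            feat[(label, attr)][val] += 1
    return labels, feat

def classify(labels, feat, features):
    """Step (4): pick the class maximizing prior * smoothed likelihoods."""
    total = sum(labels.values())
    best, best_p = None, -1.0
    for label, n in labels.items():
        p = n / total                      # prior P(class)
        for attr, val in features.items():
            counts = feat[(label, attr)]
            p *= (counts[val] + 1) / (n + len(counts) + 1)  # add-one smoothing
        if p > best_p:
            best, best_p = label, p
    return best

samples = [({"traffic": "heavy", "rain": "yes"}, "congested"),
           ({"traffic": "heavy", "rain": "no"}, "congested"),
           ({"traffic": "light", "rain": "no"}, "clear"),
           ({"traffic": "light", "rain": "yes"}, "clear")]
model = train(samples)
print(classify(*model, {"traffic": "heavy", "rain": "yes"}))
```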
Step 5: establish a data exchange and sharing service bus using a service-oriented architecture (SOA), then build a data exchange architecture on the service bus, and push the analysis results stored in the Hive database to the databases of business application systems through the data exchange architecture, so that the analysis results can be applied in the corresponding business systems.
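The push in step 5 can be sketched as a function that delivers analysis rows into a business-system database. SQLite stands in here for both the Hive output and the business database, and all table and column names are placeholders rather than identifiers from the patent.

```python
import sqlite3

def push_results(results, business_db):
    """Deliver analysis rows into the business application database,
    as the data exchange architecture would over the service bus."""
    business_db.execute(
        "CREATE TABLE IF NOT EXISTS analysis_results (label TEXT, score REAL)")
    business_db.executemany(
        "INSERT INTO analysis_results VALUES (?, ?)", results)
    business_db.commit()

hive_results = [("cluster_1", 0.92), ("cluster_2", 0.81)]  # stand-in rows
biz = sqlite3.connect(":memory:")
push_results(hive_results, biz)
rows = biz.execute("SELECT COUNT(*) FROM analysis_results").fetchone()[0]
print(rows)
```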
Claims (7)
1. A service platform integration method based on big data, characterized by comprising the following steps:
Step 1: collecting multi-source heterogeneous data;
Step 2: integrating the collected multi-source heterogeneous data and storing the integrated data in an HBase database;
Step 3: using Hive to perform ETL processing on the integrated data stored in the HBase database and writing the result back to HBase; cleaning the data stored in HBase to obtain clean data, and storing the clean data in the HBase database;
Step 4: performing modeling analysis on the clean data stored in HBase based on Hadoop technology, and storing the analysis results in a Hive database;
Step 5: establishing a data exchange and sharing service bus using a service-oriented architecture (SOA), then building a data exchange architecture on the service bus, and pushing the analysis results stored in the Hive database to the databases of business application systems through the data exchange architecture, so that they can be applied in the corresponding business systems.
2. The service platform integration method based on big data according to claim 1, characterized in that the collection of multi-source heterogeneous data in step 1 comprises the following steps:
Step 1.1: configuring multiple distributed data sources;
Step 1.2: encapsulating the multiple distributed data sources into data components;
Step 1.3: reading the encapsulated data components and converting them into global objects;
Step 1.4: combining the data components converted into global objects to realize a unified component platform for accessing multi-source heterogeneous data;
Step 1.5: collecting the multi-source heterogeneous data through the component platform and transmitting them to the data center, completing the collection of the multi-source heterogeneous data.
3. The service platform integration method based on big data according to claim 1, characterized in that cleaning the data stored in the HBase database in step 3 to obtain clean data comprises the following steps:
Step 3.1: performing duplicate checking on the data stored in the HBase database;
Step 3.2: performing interpolation on the missing data after duplicate checking;
Step 3.3: performing cluster analysis on the filled data and identifying data points lying at the cluster edges; setting a valid range according to each data type, excluding out-of-range values, obtaining the clean data, and storing them in the HBase database.
4. The service platform integration method based on big data according to claim 1, characterized in that the modeling analysis of the clean data stored in the HBase database based on Hadoop technology in step 4 includes:
performing cluster analysis on the clean data stored in the HBase database based on Hadoop technology and storing the clustered data in the Hive database for later use, with the following detailed process:
(1) creating an initialization set by randomly selecting k objects from the clean data stored in the HBase database and taking these objects as cluster centers; (2) computing the distance between each remaining clean data point in the HBase database and each cluster center; (3) assigning each remaining clean data point to the nearest cluster center; (4) whenever a data object joins or leaves a cluster, automatically recomputing the mean of that cluster, and reassigning the data point if the minimum-distance condition is not satisfied; (5) repeating the above steps until the cluster centers no longer change, and recording the result; (6) storing the result in the Hive database;
performing collaborative recommendation analysis on the clean data stored in the HBase database based on Hadoop technology and storing the analyzed data in the Hive database for later use, with the following detailed process:
(1) obtaining the clean data stored in the HBase database and converting them into a dataset in the format required for analysis;
(2) dividing the dataset into a training dataset and a test dataset;
(3) training the recommendation model with the training dataset;
(4) assessing the precision of the recommendation model with the test dataset;
(5) when the precision of the recommendation model meets the requirement, making recommendations and outputting the result; otherwise retraining and re-evaluating the model until data meeting the requirement are obtained;
(6) storing the output result in the Hive database;
performing classification analysis on the clean data stored in the HBase database based on Hadoop technology, storing the classified data in the Hive database, and attaching different labels to different data for later use, with the following detailed process:
(1) obtaining the clean data stored in the HBase database and converting them into a dataset in the format required for analysis;
(2) assigning feature attributes to the dataset, dividing the dataset into multiple items to be classified according to the feature attributes, classifying a portion of the items, and forming a training sample set;
(3) counting the frequency of each class in the training sample set and estimating the conditional probability of each feature attribute for each class, obtaining the classifier;
(4) using the classifier to classify the data that need classification and outputting the result;
(5) saving the result in the Hive database.
5. The service platform integration method based on big data according to claim 3, characterized in that the duplicate checking of the data stored in the HBase database in step 3.1 comprises the following steps:
Step 3.1.1: querying the data stored in the HBase database for duplicates and filtering out records in which all fields are identical; retaining one record and removing the other fully duplicated records;
Step 3.1.2: querying for duplicates on key fields and filtering out records whose key fields are identical; comparing the completeness of the duplicated records, retaining the record with the more complete field data, and removing the remaining duplicates.
6. The service platform integration method based on big data according to claim 3, characterized in that the interpolation of the missing data after duplicate checking in step 3.2 comprises the following steps:
Step 3.2.1: for regularly missing and unimportant data, deleting the records with missing values; for regularly missing but relatively important data, computing weighted values from the complete data and filling them in; for irregularly missing data, processing each case according to the type of the missing data;
Step 3.2.2: filling missing values with the mean of the same attribute, or with the value of highest occurrence probability for that attribute; for randomly missing data of different attributes, first generating candidate interpolation values for each missing value, performing statistical analysis on the complete data formed by the candidate interpolation values, evaluating the analysis results, and forming the final interpolation value with which the missing value is filled.
The service platform integration method based on big data according to claim 2, characterized in that packaging the multiple distributed data sources into data components in step 1.2 comprises the following steps:
Step 1.2.1: preparing a component object from the database table structure;
Step 1.2.2: querying the database for its list of tables;
Step 1.2.3: querying, for each data table in the list, the database fields and the data structure of each field;
Step 1.2.4: reading each data table into a table object according to its table structure, and setting the data field attributes as the basic attribute information of the table object;
Step 1.2.5: packaging the table object into a component that can be queried by attribute field.
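Steps 1.2.1 through 1.2.5 amount to introspecting a database and wrapping each table as a queryable component. The patent does not name a DBMS, so the sketch below uses SQLite purely for illustration; the class and function names (`TableComponent`, `package_components`) are assumptions.

```python
import sqlite3

class TableComponent:
    """Wraps one database table as a component queryable by attribute field
    (steps 1.2.1, 1.2.3-1.2.5)."""
    def __init__(self, conn, table):
        self.conn, self.table = conn, table
        # Step 1.2.3: query the field names and data types of this table.
        self.fields = {row[1]: row[2]
                       for row in conn.execute(f"PRAGMA table_info({table})")}

    def query(self, **attrs):
        # Step 1.2.5: query the component by attribute field.
        where = " AND ".join(f"{k} = ?" for k in attrs)
        sql = f"SELECT * FROM {self.table}" + (f" WHERE {where}" if attrs else "")
        return self.conn.execute(sql, tuple(attrs.values())).fetchall()

def package_components(conn):
    # Step 1.2.2: query the database for its list of tables.
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table'")]
    # Step 1.2.4: read each table out as a table object keyed by name.
    return {t: TableComponent(conn, t) for t in tables}
```

Each component carries its field attributes as basic attribute information and exposes a uniform attribute-field query interface, which is what lets heterogeneous sources be integrated behind one access pattern.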
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610254729.0A CN105956015A (en) | 2016-04-22 | 2016-04-22 | Service platform integration method based on big data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610254729.0A CN105956015A (en) | 2016-04-22 | 2016-04-22 | Service platform integration method based on big data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105956015A true CN105956015A (en) | 2016-09-21 |
Family
ID=56914723
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610254729.0A Pending CN105956015A (en) | 2016-04-22 | 2016-04-22 | Service platform integration method based on big data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105956015A (en) |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030233369A1 (en) * | 2002-06-17 | 2003-12-18 | Fujitsu Limited | Data classifying device, and active learning method used by data classifying device and active learning program of data classifying device |
CN101546325A (en) * | 2008-12-23 | 2009-09-30 | 重庆邮电大学 | Grid heterogeneous data integrating method based on SOA |
CN102170449A (en) * | 2011-04-28 | 2011-08-31 | 浙江大学 | Web service QoS prediction method based on collaborative filtering |
CN104616180A (en) * | 2015-03-09 | 2015-05-13 | 浪潮集团有限公司 | Method for predicting hot sellers |
CN104932895A (en) * | 2015-06-26 | 2015-09-23 | 南京邮电大学 | Middleware based on SOA (Service-Oriented Architecture) and information publishing method thereof |
CN105184424A (en) * | 2015-10-19 | 2015-12-23 | 国网山东省电力公司菏泽供电公司 | Mapreduced short period load prediction method of multinucleated function learning SVM realizing multi-source heterogeneous data fusion |
Cited By (53)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106446255A (en) * | 2016-10-18 | 2017-02-22 | 安徽天达网络科技有限公司 | Data processing method based on cloud server |
CN106651188A (en) * | 2016-12-27 | 2017-05-10 | 贵州电网有限责任公司贵阳供电局 | Electric transmission and transformation device multi-source state assessment data processing method and application thereof |
CN106649801A (en) * | 2016-12-29 | 2017-05-10 | 广东精规划信息科技股份有限公司 | Time-space relationship analysis system based on multi-source internet-of-things position awareness |
CN106844585A (en) * | 2017-01-10 | 2017-06-13 | 广东精规划信息科技股份有限公司 | A kind of time-space relationship analysis system based on multi-source Internet of Things location aware |
CN106844636A (en) * | 2017-01-21 | 2017-06-13 | 亚信蓝涛(江苏)数据科技有限公司 | A kind of unstructured data processing method based on deep learning |
CN107103050A (en) * | 2017-03-31 | 2017-08-29 | 海通安恒(大连)大数据科技有限公司 | A kind of big data Modeling Platform and method |
CN107291951B (en) * | 2017-07-24 | 2020-09-29 | 北京都在哪智慧城市科技有限公司 | Data processing method, device, storage medium and processor |
CN107291951A (en) * | 2017-07-24 | 2017-10-24 | 北京都在哪智慧城市科技有限公司 | Data processing method, device, storage medium and processor |
CN107590225A (en) * | 2017-09-05 | 2018-01-16 | 江苏电力信息技术有限公司 | A kind of Visualized management system based on distributed data digging algorithm |
CN107465692A (en) * | 2017-09-15 | 2017-12-12 | 湖北省楚天云有限公司 | Unification user identity identifying method, system and storage medium |
CN107465692B (en) * | 2017-09-15 | 2019-12-20 | 湖北省楚天云有限公司 | Unified user identity authentication method, system and storage medium |
CN107656995A (en) * | 2017-09-20 | 2018-02-02 | 温州市鹿城区中津先进科技研究院 | Towards the data management system of big data |
CN107807956A (en) * | 2017-09-30 | 2018-03-16 | 平安科技(深圳)有限公司 | Electronic installation, data processing method and computer-readable recording medium |
CN107909493B (en) * | 2017-12-04 | 2020-07-17 | 泰康保险集团股份有限公司 | Policy information processing method and device, computer equipment and storage medium |
CN107909493A (en) * | 2017-12-04 | 2018-04-13 | 泰康保险集团股份有限公司 | Policy information processing method, device, computer equipment and storage medium |
CN108052574A (en) * | 2017-12-08 | 2018-05-18 | 南京中新赛克科技有限责任公司 | Slave ftp server based on Kafka technologies imports the ETL system and implementation method of mass data |
CN108121508A (en) * | 2017-12-15 | 2018-06-05 | 华中师范大学 | Multi-source heterogeneous data collecting system and processing method based on education big data |
CN108459842A (en) * | 2018-01-29 | 2018-08-28 | 北京奇艺世纪科技有限公司 | A kind of model configuration method, device and electronic equipment |
CN108459842B (en) * | 2018-01-29 | 2021-05-14 | 北京奇艺世纪科技有限公司 | Model configuration method and device and electronic equipment |
CN108520003A (en) * | 2018-03-12 | 2018-09-11 | 新华三大数据技术有限公司 | A kind of storing process scheduling system and method |
CN108595480A (en) * | 2018-03-13 | 2018-09-28 | 广州市优普科技有限公司 | A kind of big data ETL tool systems and application process based on cloud computing |
CN108595480B (en) * | 2018-03-13 | 2022-01-21 | 广州市优普科技有限公司 | Big data ETL tool system based on cloud computing and application method |
CN110377598A (en) * | 2018-04-11 | 2019-10-25 | 西安邮电大学 | A kind of multi-source heterogeneous date storage method based on intelligence manufacture process |
CN110427357A (en) * | 2018-04-28 | 2019-11-08 | 新疆金风科技股份有限公司 | Anemometer tower data processing method and device |
CN110597801B (en) * | 2018-05-23 | 2021-09-17 | 杭州海康威视数字技术股份有限公司 | Database system and establishing method and device thereof |
CN110597801A (en) * | 2018-05-23 | 2019-12-20 | 杭州海康威视数字技术股份有限公司 | Database system and establishing method and device thereof |
WO2019223601A1 (en) * | 2018-05-23 | 2019-11-28 | 杭州海康威视数字技术股份有限公司 | Database system, and establishment method and apparatus therefor |
CN109033174A (en) * | 2018-06-21 | 2018-12-18 | 北京国网信通埃森哲信息技术有限公司 | A kind of power quality data cleaning method and device |
CN109145031A (en) * | 2018-08-20 | 2019-01-04 | 国网安徽省电力有限公司合肥供电公司 | A kind of multi-source data multidimensional reconstructing method of service-oriented market access demand |
CN109033454A (en) * | 2018-08-27 | 2018-12-18 | 广东电网有限责任公司 | Data filling method, apparatus, equipment and storage medium based on attributes similarity |
CN109635026A (en) * | 2018-11-29 | 2019-04-16 | 宝晟(广州)生物信息技术有限公司 | A kind of biological sample bank data distributing nodes sharing method, system and device |
WO2020135048A1 (en) * | 2018-12-29 | 2020-07-02 | 颖投信息科技(上海)有限公司 | Data merging method and apparatus for knowledge graph |
CN109800220A (en) * | 2019-01-29 | 2019-05-24 | 浙江国贸云商企业服务有限公司 | A kind of big data cleaning method, system and relevant apparatus |
CN110059952A (en) * | 2019-04-12 | 2019-07-26 | 中国人民财产保险股份有限公司 | Vehicle insurance methods of risk assessment, device, equipment and storage medium |
CN110347480A (en) * | 2019-06-26 | 2019-10-18 | 联动优势科技有限公司 | The preferred access path method and device of data source containing coincidence data item label |
CN110309152A (en) * | 2019-06-26 | 2019-10-08 | 广州探迹科技有限公司 | A kind of date storage method and device based on HBase |
CN110347480B (en) * | 2019-06-26 | 2021-06-25 | 联动优势科技有限公司 | Data source preferred access path method and device containing coincident data item label |
CN110457300A (en) * | 2019-07-15 | 2019-11-15 | 中国平安人寿保险股份有限公司 | A kind of method for cleaning and device, electronic equipment in common test library |
CN110457300B (en) * | 2019-07-15 | 2024-02-02 | 中国平安人寿保险股份有限公司 | Method and device for cleaning public test library and electronic equipment |
CN111126661B (en) * | 2019-11-21 | 2023-11-24 | 格创东智(深圳)科技有限公司 | Self-help modeling method and system based on data analysis platform |
CN111126661A (en) * | 2019-11-21 | 2020-05-08 | 格创东智(深圳)科技有限公司 | Self-service modeling method and system based on data analysis platform |
CN111200590A (en) * | 2019-12-09 | 2020-05-26 | 杭州安恒信息技术股份有限公司 | Algorithm for checking consistency of multiple period statistical data |
CN111200590B (en) * | 2019-12-09 | 2022-08-19 | 杭州安恒信息技术股份有限公司 | Algorithm for checking consistency of multiple period statistical data |
CN111680082B (en) * | 2020-04-30 | 2023-08-18 | 四川弘智远大科技有限公司 | Government financial data acquisition system and method based on data integration |
CN111680082A (en) * | 2020-04-30 | 2020-09-18 | 四川弘智远大科技有限公司 | Government financial data acquisition system and data acquisition method based on data integration |
WO2022000169A1 (en) * | 2020-06-29 | 2022-01-06 | 深圳大学 | Data analysis method and apparatus spanning data centers, and device and storage medium |
CN112100525B (en) * | 2020-11-02 | 2021-02-12 | 中国人民解放军国防科技大学 | Multi-source heterogeneous aerospace information resource storage method, retrieval method and device |
CN112100525A (en) * | 2020-11-02 | 2020-12-18 | 中国人民解放军国防科技大学 | Multi-source heterogeneous aerospace information resource storage method, retrieval method and device |
CN112506930A (en) * | 2020-12-15 | 2021-03-16 | 北京三维天地科技股份有限公司 | Data insight platform based on machine learning technology |
CN112597225A (en) * | 2020-12-22 | 2021-04-02 | 南京三眼精灵信息技术有限公司 | Data acquisition method and device based on distributed model |
CN112783962B (en) * | 2021-02-01 | 2021-12-28 | 盐城郅联空间科技有限公司 | ETL technology-based time-space big data artificial intelligence analysis method and system |
CN112783962A (en) * | 2021-02-01 | 2021-05-11 | 盐城郅联空间科技有限公司 | ETL technology-based time-space big data artificial intelligence analysis method and system |
CN113360493A (en) * | 2021-07-12 | 2021-09-07 | 兰州领新网络信息科技有限公司 | Innovative entrepreneurship big data service platform |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105956015A (en) | Service platform integration method based on big data | |
CN106709035B (en) | A kind of pretreatment system of electric power multidimensional panoramic view data | |
CN104820670B (en) | A kind of acquisition of power information big data and storage method | |
CN102521386B (en) | Method for grouping space metadata based on cluster storage | |
WO2016101628A1 (en) | Data processing method and device in data modeling | |
CN110502509B (en) | Traffic big data cleaning method based on Hadoop and Spark framework and related device | |
CN107193967A (en) | A kind of multi-source heterogeneous industry field big data handles full link solution | |
CN102591917B (en) | Data processing method and system and related device | |
CN104809244B (en) | Data digging method and device under a kind of big data environment | |
CN104462222A (en) | Distributed storage method and system for checkpoint vehicle pass data | |
CN105512167A (en) | Multi-business user data managing system based on mixed database and method for same | |
CN104317789A (en) | Method for building passenger social network | |
CN104156403A (en) | Clustering-based big data normal-mode extracting method and system | |
CN104679827A (en) | Big data-based public information association method and mining engine | |
CN106846082B (en) | Travel cold start user product recommendation system and method based on hardware information | |
CN105488211A (en) | Method for determining user group based on feature analysis | |
Scannapieco et al. | Placing big data in official statistics: a big challenge | |
CN102750367A (en) | Big data checking system and method thereof on cloud platform | |
CN105956932A (en) | Distribution and utilization data fusion method and system | |
CN104143006A (en) | Method and device for processing city data | |
CN104615734A (en) | Community management service big data processing system and processing method thereof | |
Karim et al. | Spatiotemporal Aspects of Big Data. | |
CN113254517A (en) | Service providing method based on internet big data | |
CN110597796B (en) | Big data real-time modeling method and system based on full life cycle | |
CN110826845B (en) | Multidimensional combination cost allocation device and method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20160921 |