CN105956015A - Service platform integration method based on big data - Google Patents

Service platform integration method based on big data

Info

Publication number
CN105956015A
CN105956015A
Authority
CN
China
Prior art keywords
data
stored
HBase
database
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610254729.0A
Other languages
Chinese (zh)
Inventor
向富强
曾逸
杨雪琴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SICHUAN ZHONGRUAN TECHNOLOGY Co Ltd
Original Assignee
SICHUAN ZHONGRUAN TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SICHUAN ZHONGRUAN TECHNOLOGY Co Ltd
Priority to CN201610254729.0A
Publication of CN105956015A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/28 - Databases characterised by their database models, e.g. relational or object models
    • G06F 16/284 - Relational databases
    • G06F 16/285 - Clustering or classification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/21 - Design, administration or maintenance of databases
    • G06F 16/215 - Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/25 - Integrating or interfacing systems involving database management systems
    • G06F 16/254 - Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses

Abstract

The invention discloses a service platform integration method based on big data. The method comprises the following steps: (1) acquiring multi-source heterogeneous data; (2) integrating the acquired multi-source heterogeneous data and storing the integrated data in an HBase database; (3) using Hive to perform ETL processing on the integrated data stored in the HBase database and storing the result back in HBase, then cleaning the data stored in the HBase database to obtain clean data and storing the clean data in the HBase database; (4) performing modeling and analysis on the clean data based on Hadoop technology and storing the analysis results in a Hive database; (5) establishing a data exchange and sharing service bus using a service-oriented architecture (SOA), building a data exchange architecture on top of the service bus, and pushing the analysis results stored in the Hive database to the business application system databases through the data exchange architecture. The method effectively reduces communication cost and time cost and improves the effective utilization rate of data.

Description

Service platform integration method based on big data
Technical field
The present invention relates to the field of big data technology, and in particular to a service platform integration method based on big data.
Background technology
Big data is the commanding height of next-generation information technology in the information industry, and smart city construction has been placed on the agenda in China. A smart city contains a huge volume of data and is a new generation of application technology serving government, enterprises, and citizens. However, existing big data integration in smart cities still cannot serve municipal government affairs and the public in all respects, mainly because of the following limitations:
(1) the computational models for complex big data mostly perform only attribute analysis and regularity exploration on multi-source data, and there is still no complete application processing system;
(2) structured data is scarce while unstructured data is abundant, and there are still no advanced technologies or means to process unstructured and semi-structured data;
(3) system modeling that explores the complexity of big data, the description of uncertain features, and the characterization methods for big data are not yet mature;
(4) current big data mining largely remains at the stage of one-pass extraction of coarse knowledge, and no mature secondary mining methods have been developed to provide knowledge-driven decision support for decision makers.
Summary of the invention
To overcome the above shortcomings, the object of the present invention is to provide a service platform integration method based on big data. The invention is a method for acquiring, integrating, storing, cleaning, modeling and analyzing, and applying multi-source heterogeneous urban data; through this method a bottom-up data processing pipeline is formed. Compared with conventional data processing approaches, the integrated multi-source data increase the effective utilization rate of data and effectively reduce communication cost and time cost.
To achieve the above object, the invention provides a service platform integration method based on big data, comprising the following steps:
Step 1: acquire multi-source heterogeneous data;
Step 2: integrate the acquired multi-source heterogeneous data and store the integrated data in an HBase database;
Step 3: use Hive to perform ETL processing on the integrated data stored in the HBase database and store the result back in HBase; clean the data stored in the HBase database to obtain clean data, and store the clean data in the HBase database;
Step 4: perform modeling and analysis on the clean data stored in the HBase database based on Hadoop technology, and store the analysis results in the Hive database;
Step 5: establish a data exchange and sharing service bus using a service-oriented architecture (SOA), then build a data exchange architecture on top of the service bus, and push the analysis results stored in the Hive database to the business application system databases through the data exchange architecture, so that the analysis results can be applied in the corresponding business systems.
Preferably, acquiring the multi-source heterogeneous data in step 1 comprises the following steps:
Step 1.1: configure multiple distributed data sources;
Step 1.2: wrap the distributed data sources into data components;
Step 1.3: read the wrapped data components and convert them into global objects;
Step 1.4: combine the data components converted into global objects to realize a unified access component platform for the multi-source heterogeneous data;
Step 1.5: acquire the multi-source heterogeneous data through the component platform and transmit them to the data center, completing the acquisition of the multi-source heterogeneous data.
Preferably, cleaning the data stored in the HBase database in step 3 to obtain clean data comprises the following steps:
Step 3.1: perform de-duplication on the data stored in the HBase database;
Step 3.2: fill in the missing data remaining after de-duplication;
Step 3.3: perform cluster analysis on the gap-filled data and identify data lying at the cluster boundaries; set valid ranges according to the different data types, exclude out-of-range values, obtain the clean data, and store them in the HBase database.
Preferably, performing modeling and analysis on the clean data stored in the HBase database based on Hadoop technology in step 4 includes:
performing cluster analysis on the clean data stored in the HBase database based on Hadoop technology and storing the clustered data in the Hive database for later use, the detailed process being as follows:
(1) create initial points by randomly selecting k objects from the clean data stored in the HBase database and taking these objects as cluster centers; (2) compute the distance between each remaining clean data object in the HBase database and each cluster center; (3) assign each remaining clean data object to the nearest cluster center; (4) whenever a data object joins or leaves a cluster, automatically recompute the mean of that cluster, and if the minimum-distance condition is no longer satisfied, reassign the data to clusters; (5) repeat the above steps until the cluster centers no longer change, then record the result; (6) store the result in the Hive database.
Performing collaborative-recommendation analysis on the clean data stored in the HBase database based on Hadoop technology and storing the analyzed data in the Hive database for later use, the detailed process being as follows:
(1) obtain the clean data stored in the HBase database and convert them into a dataset in the format required by the analysis;
(2) split the dataset into a training dataset and a test dataset;
(3) train the recommendation model with the training dataset;
(4) evaluate the accuracy of the recommendation model with the test dataset;
(5) when the accuracy of the recommendation model meets the requirement, generate recommendations and output the results; otherwise retrain the model and re-evaluate it until data meeting the requirement are obtained;
(6) store the output results in the Hive database.
Performing classification analysis on the clean data stored in the HBase database based on Hadoop technology, storing the classified data in the Hive database, and labeling different data with different tags for later use, the detailed process being as follows:
(1) obtain the clean data stored in the HBase database and convert them into a dataset in the format required by the analysis;
(2) assign feature attributes to the dataset, divide the dataset into multiple items to be classified according to the feature attributes, classify a portion of the items, and form a training sample set;
(3) count the frequency of each class in the training sample set and estimate the conditional probability of each feature attribute for each class, thereby obtaining a classifier;
(4) use the classifier to classify the data that need to be classified and output the results;
(5) save the results in the Hive database.
Preferably, performing de-duplication on the data stored in the HBase database in step 3.1 comprises the following steps:
Step 3.1.1: query the data stored in the HBase database for duplicates and filter out records in which all fields are identical; keep one record and remove the other fully duplicated records;
Step 3.1.2: query for duplicates on key fields and filter out the records whose key fields are identical; compare the completeness of the duplicated records, keep the record with the more complete fields, and remove the remaining duplicates.
Preferably, filling in the missing data after de-duplication in step 3.2 comprises the following steps:
Step 3.2.1: for regularly missing and unimportant data, delete the records with missing values; for regularly missing but important data, compute weighted values from the complete data to fill the gaps; for irregularly missing data, handle each case according to the type of missing data;
Step 3.2.2: fill a missing value with the value of the same attribute whose mean has the highest probability of co-occurring with that attribute value; for data missing at random across different attributes, first generate candidate imputation values for each missing value, perform statistical analysis on the complete data formed by the candidate values, evaluate the analysis results, and use the resulting final imputation value to fill the missing value.
Preferably, wrapping the distributed data sources into data components in step 1.2 comprises the following steps:
Step 1.2.1: prepare component objects from the database table structures;
Step 1.2.2: query the database to obtain the list of tables;
Step 1.2.3: for each table in the list, query the database fields and the field data structures;
Step 1.2.4: read each data table out as an object and set the data field attributes as the basic attribute information of the table object;
Step 1.2.5: wrap the object table into a component that can be queried by attribute fields.
Compared with the prior art, the beneficial effects of the present invention are: the invention is a method for acquiring, integrating, storing, cleaning, modeling and analyzing, and applying multi-source heterogeneous urban data, and through this method a bottom-up data processing pipeline is formed; the sources of the data and the processing steps that produce them are traceable and follow clear procedures; compared with conventional data processing approaches, the integrated multi-source data increase the effective utilization rate of data and effectively reduce communication cost and time cost.
Brief description of the drawings
Fig. 1 is a flow chart of the present invention;
Fig. 2 shows the Sqoop-based ETL module.
Detailed description of the invention
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings.
As shown in Figs. 1 and 2, the invention provides a service platform integration method based on big data, comprising the following steps:
Step 1: acquire multi-source heterogeneous data; the acquisition process comprises the following steps:
Step 1.1: configure multiple distributed data sources; configure front-end processors and manual-entry terminals at the departments where data are collected (including land, environmental protection, water conservancy, meteorology, forestry, work-safety supervision, quality supervision, etc.), and configure front-end processors, application servers, database servers, web servers, system monitoring terminals, operation terminals, and other equipment at the data center to handle the business processing of the acquired data;
Step 1.2: wrap the distributed data sources into data components (an illustrative sketch of this wrapping is given after step 1.5.3 below); the concrete steps are:
Step 1.2.1: prepare component objects from the database table structures;
Step 1.2.2: query the database to obtain the list of tables;
Step 1.2.3: for each table in the list, query the database fields and the field data structures;
Step 1.2.4: read each data table out as an object and set the data field attributes as the basic attribute information of the table object;
Step 1.2.5: wrap the object table into a component that can be queried by attribute fields;
Step 1.3: read the wrapped data components and convert them into global objects;
Step 1.4: combine the data components converted into global objects to realize a unified access component platform for the multi-source heterogeneous data;
Step 1.5: acquire the multi-source heterogeneous data through the component platform and transmit them to the data center, completing the acquisition; the concrete steps are:
Step 1.5.1: set the data acquisition parameters, including acquisition frequency, acquisition nodes, and acquisition range;
Step 1.5.2: read out the data to be acquired through the component platform;
Step 1.5.3: transmit the data read out to the data center over the network.
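The wrapping of a relational source into a queryable data component (steps 1.2.1 to 1.2.5) and the read-out of step 1.5.2 can be illustrated with a short Java sketch. This is only an illustration under assumptions, not the patented implementation: the JDBC URL, the TableComponent class, and the queryByAttribute method are hypothetical names introduced here, while the DatabaseMetaData calls are standard JDBC API.

```java
import java.sql.*;
import java.util.*;

/** Hypothetical data component wrapping one source table (steps 1.2.1-1.2.5). */
class TableComponent {
    final String tableName;
    final Map<String, String> fieldTypes = new LinkedHashMap<>(); // basic attribute info of the table object

    TableComponent(String tableName) { this.tableName = tableName; }

    /** Query the component by an attribute field (step 1.2.5); illustration only, no input sanitizing. */
    List<Map<String, Object>> queryByAttribute(Connection c, String field, Object value) throws SQLException {
        List<Map<String, Object>> rows = new ArrayList<>();
        String sql = "SELECT * FROM " + tableName + " WHERE " + field + " = ?";
        try (PreparedStatement ps = c.prepareStatement(sql)) {
            ps.setObject(1, value);
            try (ResultSet rs = ps.executeQuery()) {
                ResultSetMetaData md = rs.getMetaData();
                while (rs.next()) {
                    Map<String, Object> row = new LinkedHashMap<>();
                    for (int i = 1; i <= md.getColumnCount(); i++)
                        row.put(md.getColumnName(i), rs.getObject(i));
                    rows.add(row);
                }
            }
        }
        return rows;
    }
}

public class SourceWrapper {
    /** Wraps every table of one distributed source into a component (steps 1.2.2-1.2.4). */
    static List<TableComponent> wrap(Connection c) throws SQLException {
        List<TableComponent> components = new ArrayList<>();
        DatabaseMetaData meta = c.getMetaData();
        try (ResultSet tables = meta.getTables(null, null, "%", new String[]{"TABLE"})) {
            while (tables.next()) {
                TableComponent comp = new TableComponent(tables.getString("TABLE_NAME"));
                try (ResultSet cols = meta.getColumns(null, null, comp.tableName, "%")) {
                    while (cols.next())
                        comp.fieldTypes.put(cols.getString("COLUMN_NAME"), cols.getString("TYPE_NAME"));
                }
                components.add(comp);
            }
        }
        return components;
    }

    public static void main(String[] args) throws SQLException {
        // Hypothetical JDBC URL of one departmental source.
        try (Connection c = DriverManager.getConnection("jdbc:mysql://source-host/dept_db", "user", "pwd")) {
            for (TableComponent comp : wrap(c))
                System.out.println(comp.tableName + " -> " + comp.fieldTypes.keySet());
        }
    }
}
```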
Step 2: integrate the acquired multi-source heterogeneous data and store the integrated data in the HBase database.
Step 3: use Hive to perform ETL processing on the integrated data stored in the HBase database and store the result back in HBase; clean the data stored in the HBase database to obtain clean data and store the clean data in the HBase database. The detailed process is:
As shown in Fig. 2, Hive is used to perform ETL processing on the integrated data stored in the HBase database. The Sqoop-based ETL module first establishes a connection to the data source through JDBC, obtains the metadata of the connection, and validates the data source; it then converts the SQL-typed data obtained through JDBC into Sqoop records in the form of Java classes and submits them as formatted input to a MapReduce job; finally it launches the appropriate number of Map and Reduce tasks and writes the data into HDFS. The client node calls the HDFS API, splits the whole file into packets, and manages the packets in a data queue while they wait to be processed; it then requests a new data block from the NameNode and obtains a group of DataNodes that will actually store the block replicas. The DataNodes form a pipeline, and the packets are written to the DataNodes in turn; when the last DataNode has finished writing, acknowledgements are returned along the pipeline in the reverse direction, and completion is finally reported to the NameNode to indicate that the write has finished.
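One possible way to realize the Hive-over-HBase ETL step described above is to map the HBase table into Hive through the HBase storage handler and then run an INSERT ... SELECT that writes the transformed rows to a Hive-managed table. The Java sketch below, issued over the Hive JDBC driver, is an illustration under assumptions: the table names, the column-family mapping, and the HiveServer2 address are hypothetical, and the Sqoop/MapReduce internals of the module are not reproduced here.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class HiveOverHBaseEtl {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        // HiveServer2 endpoint is a placeholder.
        try (Connection conn = DriverManager.getConnection("jdbc:hive2://hive-host:10000/default", "hive", "");
             Statement st = conn.createStatement()) {

            // Expose the integrated HBase table ("integrated_data") to Hive via the HBase storage handler.
            st.execute("CREATE EXTERNAL TABLE IF NOT EXISTS integrated_data_hbase (" +
                       " rowkey STRING, source STRING, payload STRING, ts STRING)" +
                       " STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'" +
                       " WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,d:source,d:payload,d:ts')" +
                       " TBLPROPERTIES ('hbase.table.name' = 'integrated_data')");

            // A Hive-managed staging table for the transformed (ETL'd) records.
            st.execute("CREATE TABLE IF NOT EXISTS etl_data" +
                       " (rowkey STRING, source STRING, payload STRING, ts STRING)");

            // Minimal transform/load step: trim fields and drop rows without a payload.
            st.execute("INSERT OVERWRITE TABLE etl_data" +
                       " SELECT rowkey, trim(source), trim(payload), ts" +
                       " FROM integrated_data_hbase WHERE payload IS NOT NULL");
        }
    }
}
```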
Cleaning the data stored in the HBase database to obtain clean data comprises the following steps:
Step 3.1: perform de-duplication on the data stored in the HBase database; the concrete steps are:
Step 3.1.1: query the data stored in the HBase database for duplicates and filter out records in which all fields are identical; keep one record and remove the other fully duplicated records;
Step 3.1.2: query for duplicates on key fields and filter out the records whose key fields are identical; compare the completeness of the duplicated records, keep the record with the more complete fields, and remove the remaining duplicates.
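A minimal in-memory sketch of the two-pass de-duplication in steps 3.1.1 and 3.1.2 (drop exact duplicates, then keep the most complete record among key-field duplicates) is shown below; the record layout and the key fields are assumptions for illustration, and the HBase scan that would feed real data into it is omitted.

```java
import java.util.*;

public class DedupSketch {
    /** Step 3.1.1: remove records whose fields are all identical, keeping one copy. */
    static List<Map<String, String>> dropExactDuplicates(List<Map<String, String>> records) {
        return new ArrayList<>(new LinkedHashSet<>(records));
    }

    /** Step 3.1.2: among records sharing the same key fields, keep the most complete one. */
    static List<Map<String, String>> dedupByKeyFields(List<Map<String, String>> records, List<String> keyFields) {
        Map<String, Map<String, String>> best = new LinkedHashMap<>();
        for (Map<String, String> rec : records) {
            StringBuilder key = new StringBuilder();
            for (String f : keyFields) key.append(rec.get(f)).append('|');
            Map<String, String> kept = best.get(key.toString());
            if (kept == null || completeness(rec) > completeness(kept)) best.put(key.toString(), rec);
        }
        return new ArrayList<>(best.values());
    }

    /** Completeness = number of non-empty fields. */
    static long completeness(Map<String, String> rec) {
        return rec.values().stream().filter(v -> v != null && !v.isEmpty()).count();
    }

    public static void main(String[] args) {
        List<Map<String, String>> records = new ArrayList<>();
        records.add(new LinkedHashMap<>(Map.of("id", "1", "name", "pump-A", "site", "")));
        records.add(new LinkedHashMap<>(Map.of("id", "1", "name", "pump-A", "site", "plant-3"))); // more complete
        records.add(new LinkedHashMap<>(Map.of("id", "2", "name", "pump-B", "site", "plant-1")));
        records.add(new LinkedHashMap<>(Map.of("id", "2", "name", "pump-B", "site", "plant-1"))); // exact duplicate

        List<Map<String, String>> clean = dedupByKeyFields(dropExactDuplicates(records), List.of("id", "name"));
        clean.forEach(System.out::println);  // keeps the "plant-3" record for id 1 and one record for id 2
    }
}
```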
Step 3.2: fill in the missing data remaining after de-duplication; the concrete steps are:
Step 3.2.1: for regularly missing and unimportant data, delete the records with missing values; for regularly missing but important data, compute weighted values from the complete data to fill the gaps; for irregularly missing data, handle each case according to the type of missing data;
Step 3.2.2: fill a missing value with the value of the same attribute whose mean has the highest probability of co-occurring with that attribute value; for data missing at random across different attributes, first generate candidate imputation values for each missing value, perform statistical analysis on the complete data formed by the candidate values, evaluate the analysis results, and use the resulting final imputation value to fill the missing value.
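Step 3.2 is essentially gap filling. The sketch below fills a missing numeric attribute with the mean of the observed values of that attribute; it is a deliberately simplified stand-in for the weighted fill and multiple-candidate imputation described in steps 3.2.1 and 3.2.2, and the attribute name and values are invented for illustration.

```java
import java.util.*;

public class ImputationSketch {
    /**
     * Fill missing (null) numeric values of one attribute with the mean of the
     * observed values of that attribute - a simplified stand-in for the weighted
     * fill and candidate-based imputation described in steps 3.2.1 / 3.2.2.
     */
    static void fillWithMean(List<Map<String, Double>> records, String attribute) {
        double sum = 0;
        int n = 0;
        for (Map<String, Double> rec : records) {
            Double v = rec.get(attribute);
            if (v != null) { sum += v; n++; }
        }
        if (n == 0) return;                       // nothing observed, leave the gaps
        double mean = sum / n;
        for (Map<String, Double> rec : records)
            if (rec.get(attribute) == null) rec.put(attribute, mean);
    }

    public static void main(String[] args) {
        List<Map<String, Double>> records = new ArrayList<>();
        records.add(new HashMap<>(Map.of("pm25", 38.0)));
        records.add(new HashMap<>(Map.of("pm25", 42.0)));
        Map<String, Double> gap = new HashMap<>();
        gap.put("pm25", null);                    // the missing value to be imputed
        records.add(gap);

        fillWithMean(records, "pm25");
        System.out.println(records);              // the gap is filled with 40.0
    }
}
```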
Step 3.3: perform cluster analysis on the gap-filled data and identify data lying at the cluster boundaries; set valid ranges according to the different data types, exclude out-of-range values, obtain the clean data, and store them in the HBase database.
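The final action of step 3.3, excluding values that fall outside a valid range configured per data type, is a plain filter; a minimal sketch follows, with the range bounds chosen purely for illustration.

```java
import java.util.*;
import java.util.stream.Collectors;

public class RangeFilterSketch {
    record Range(double min, double max) {
        boolean contains(double v) { return v >= min && v <= max; }
    }

    /** Keep only records whose attributes fall inside the configured valid ranges. */
    static List<Map<String, Double>> filterByRange(List<Map<String, Double>> records, Map<String, Range> ranges) {
        return records.stream()
                .filter(rec -> ranges.entrySet().stream()
                        .allMatch(e -> !rec.containsKey(e.getKey()) || e.getValue().contains(rec.get(e.getKey()))))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Range> ranges = Map.of("humidity", new Range(0, 100), "temperature", new Range(-50, 60));
        List<Map<String, Double>> records = List.of(
                Map.of("humidity", 55.0, "temperature", 21.0),
                Map.of("humidity", 340.0, "temperature", 19.0));  // out-of-range humidity, excluded
        System.out.println(filterByRange(records, ranges));
    }
}
```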
Step 4: perform modeling and analysis on the clean data stored in the HBase database based on Hadoop technology and store the analysis results in the Hive database, including:
Cluster analysis based on Hadoop technology: the data stored in the HBase database are clustered; the data objects fall into several classes, with high similarity between objects within the same class and large differences between objects of different classes, and the clustered data are stored in the Hive database for later use. K-means is a widely used partition-based clustering algorithm; the concrete process is:
(1) create initial points by randomly selecting k objects from the clean data stored in the HBase database and taking these objects as cluster centers; (2) compute the distance between each remaining clean data object in the HBase database and each cluster center; (3) assign each remaining clean data object to the nearest cluster center; (4) whenever a data object joins or leaves a cluster, automatically recompute the mean of that cluster, and if the minimum-distance condition is no longer satisfied, reassign the data to clusters; (5) repeat the above steps until the cluster centers no longer change, then record the result; (6) store the result in the Hive database.
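The k-means loop of items (1) to (5) can be written compactly in Java. The sketch below runs on in-memory numeric vectors and only illustrates the clustering step; reading the clean data from HBase and writing the result to Hive are omitted, and k and the sample points are arbitrary.

```java
import java.util.*;

public class KMeansSketch {
    /** One run of Lloyd's k-means on d-dimensional points (items (1)-(5) of the description). */
    static int[] kMeans(double[][] points, int k, int maxIter, Random rnd) {
        int n = points.length, d = points[0].length;
        double[][] centers = new double[k][];
        for (int c = 0; c < k; c++) centers[c] = points[rnd.nextInt(n)].clone();  // (1) random initial centers
        int[] assign = new int[n];
        for (int iter = 0; iter < maxIter; iter++) {
            boolean changed = false;
            for (int i = 0; i < n; i++) {                                        // (2)+(3) nearest-center assignment
                int bestC = 0;
                double bestDist = Double.MAX_VALUE;
                for (int c = 0; c < k; c++) {
                    double dist = 0;
                    for (int j = 0; j < d; j++)
                        dist += (points[i][j] - centers[c][j]) * (points[i][j] - centers[c][j]);
                    if (dist < bestDist) { bestDist = dist; bestC = c; }
                }
                if (assign[i] != bestC) { assign[i] = bestC; changed = true; }
            }
            if (!changed) break;                                                 // (5) centers stable, stop
            double[][] sums = new double[k][d];
            int[] counts = new int[k];
            for (int i = 0; i < n; i++) {                                        // (4) recompute cluster means
                counts[assign[i]]++;
                for (int j = 0; j < d; j++) sums[assign[i]][j] += points[i][j];
            }
            for (int c = 0; c < k; c++)
                if (counts[c] > 0)
                    for (int j = 0; j < d; j++) centers[c][j] = sums[c][j] / counts[c];
        }
        return assign;
    }

    public static void main(String[] args) {
        double[][] points = {{1, 1}, {1.2, 0.8}, {0.9, 1.1}, {8, 8}, {8.2, 7.9}, {7.8, 8.1}};
        System.out.println(Arrays.toString(kMeans(points, 2, 100, new Random(42))));  // two clear clusters
    }
}
```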
Collaborative-recommendation analysis based on Hadoop technology: according to the users' usage habits and data-customized labels, appropriate data are recommended to different user types or objects, and the results are stored in the Hive database after training. Collaborative filtering is a technique widely used in recommender systems; it makes recommendations mainly by considering the similarity between users and between items. ALS-WR (alternating least squares with weighted regularization) is a commonly used recommendation algorithm. Its core idea is to view all users and items as a two-dimensional table in which cell (i, j) holds the rating of user i for item j; the cells that already contain data are then used to predict the empty cells. The predicted values are the users' ratings of the items, and recommendations are made by ranking the predicted ratings from high to low. The concrete process is:
(1) obtain the clean data stored in the HBase database and convert them into a dataset in the format required by the analysis;
(2) split the dataset into a training dataset and a test dataset;
(3) train the recommendation model with the training dataset;
(4) evaluate the accuracy of the recommendation model with the test dataset;
(5) when the accuracy of the recommendation model meets the requirement, generate recommendations and output the results; otherwise retrain the model and re-evaluate it until data meeting the requirement are obtained;
(6) store the output results in the Hive database.
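Items (1) to (6) describe a train, evaluate, and recommend loop around ALS-WR. The sketch below follows that loop on a toy rating table but uses plain stochastic-gradient matrix factorization instead of the ALS-WR solver named above, and it evaluates on the observed cells rather than on a held-out test split, so it should be read as a simplified stand-in; the ratings, factor rank, learning rate, and accuracy threshold are illustrative assumptions.

```java
import java.util.*;

public class RecommenderSketch {
    // ratings[u][i] > 0 means user u has rated item i; 0 means unknown (to be predicted).
    static double[][] ratings = {
            {5, 4, 0, 1},
            {4, 5, 1, 0},
            {0, 1, 5, 4},
            {1, 0, 4, 5}};

    public static void main(String[] args) {
        int users = ratings.length, items = ratings[0].length, rank = 2;
        Random rnd = new Random(7);
        double[][] U = randomMatrix(users, rank, rnd), V = randomMatrix(items, rank, rnd);

        // (3) train the factor model on the observed cells (SGD stand-in for ALS-WR).
        double lr = 0.05, reg = 0.02;
        for (int epoch = 0; epoch < 2000; epoch++)
            for (int u = 0; u < users; u++)
                for (int i = 0; i < items; i++)
                    if (ratings[u][i] > 0) {
                        double err = ratings[u][i] - dot(U[u], V[i]);
                        for (int f = 0; f < rank; f++) {
                            double uf = U[u][f];
                            U[u][f] += lr * (err * V[i][f] - reg * uf);
                            V[i][f] += lr * (err * uf - reg * V[i][f]);
                        }
                    }

        // (4) evaluate the fit; (5) predict the empty cells when the fit is good enough.
        double rmse = rmseOnObserved(U, V);
        System.out.printf("RMSE on observed ratings: %.3f%n", rmse);
        if (rmse < 0.5)
            for (int u = 0; u < users; u++)
                for (int i = 0; i < items; i++)
                    if (ratings[u][i] == 0)
                        System.out.printf("predict user %d, item %d -> %.2f%n", u, i, dot(U[u], V[i]));
    }

    static double[][] randomMatrix(int rows, int cols, Random rnd) {
        double[][] m = new double[rows][cols];
        for (double[] row : m) for (int j = 0; j < cols; j++) row[j] = rnd.nextDouble();
        return m;
    }

    static double dot(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += a[i] * b[i];
        return s;
    }

    static double rmseOnObserved(double[][] U, double[][] V) {
        double se = 0;
        int n = 0;
        for (int u = 0; u < ratings.length; u++)
            for (int i = 0; i < ratings[0].length; i++)
                if (ratings[u][i] > 0) { se += Math.pow(ratings[u][i] - dot(U[u], V[i]), 2); n++; }
        return Math.sqrt(se / n);
    }
}
```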
Classification based on Hadoop technology: the data collected for training are classified and stored in the Hive database, and different data are labeled with different tags for subsequent use. Naive Bayes classification is a commonly used classification algorithm; its core idea is, for a given item to be classified, to compute the probability of each class given that the item occurs and to assign the item to the class with the highest probability. The concrete process is:
(1) obtain the clean data stored in the HBase database and convert them into a dataset in the format required by the analysis;
(2) assign feature attributes to the dataset, divide the dataset into multiple items to be classified according to the feature attributes, classify a portion of the items, and form a training sample set;
(3) count the frequency of each class in the training sample set and estimate the conditional probability of each feature attribute for each class, thereby obtaining a classifier;
(4) use the classifier to classify the data that need to be classified and output the results;
(5) save the results in the Hive database.
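A compact naive Bayes classifier over categorical features, matching items (2) to (4) (count class frequencies, estimate per-class feature-value probabilities with add-one smoothing, then classify), is sketched below; the feature names and training rows are invented purely for illustration.

```java
import java.util.*;

public class NaiveBayesSketch {
    Map<String, Integer> classCounts = new HashMap<>();
    // classLabel -> featureIndex -> featureValue -> count
    Map<String, Map<Integer, Map<String, Integer>>> featureCounts = new HashMap<>();
    int total = 0;

    /** Items (2)-(3): accumulate class frequencies and per-class feature-value counts. */
    void train(String[] features, String label) {
        total++;
        classCounts.merge(label, 1, Integer::sum);
        Map<Integer, Map<String, Integer>> perFeature = featureCounts.computeIfAbsent(label, k -> new HashMap<>());
        for (int i = 0; i < features.length; i++)
            perFeature.computeIfAbsent(i, k -> new HashMap<>()).merge(features[i], 1, Integer::sum);
    }

    /** Item (4): pick the class maximizing P(class) * prod_i P(feature_i | class), with add-one smoothing. */
    String classify(String[] features) {
        String best = null;
        double bestLog = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Integer> e : classCounts.entrySet()) {
            double logP = Math.log((double) e.getValue() / total);
            for (int i = 0; i < features.length; i++) {
                Map<String, Integer> counts = featureCounts.get(e.getKey()).getOrDefault(i, Map.of());
                int count = counts.getOrDefault(features[i], 0);
                logP += Math.log((count + 1.0) / (e.getValue() + counts.size() + 1.0));  // add-one smoothing
            }
            if (logP > bestLog) { bestLog = logP; best = e.getKey(); }
        }
        return best;
    }

    public static void main(String[] args) {
        NaiveBayesSketch nb = new NaiveBayesSketch();
        // Toy training sample set: {sensor type, reading level} -> data label.
        nb.train(new String[]{"air", "high"}, "alert");
        nb.train(new String[]{"air", "low"}, "normal");
        nb.train(new String[]{"water", "high"}, "alert");
        nb.train(new String[]{"water", "low"}, "normal");
        System.out.println(nb.classify(new String[]{"air", "high"}));   // expected: alert
    }
}
```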
Step 5: establish a data exchange and sharing service bus using a service-oriented architecture (SOA), then build a data exchange architecture on top of the service bus, and push the analysis results stored in the Hive database to the business application system databases through the data exchange architecture, so that the analysis results can be applied in the corresponding business systems.
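The last hop of step 5, reading the analysis results from Hive and delivering them to a business application database, can be sketched with the Hive JDBC driver on the source side and an ordinary JDBC target on the other. The table names, connection URLs, and column layout below are assumptions, and the SOA service-bus layer itself (message routing, service contracts) is not modeled here.

```java
import java.sql.*;

public class HiveToBusinessDbPush {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        // Source: analysis results stored in Hive (hypothetical table "analysis_result").
        try (Connection hive = DriverManager.getConnection("jdbc:hive2://hive-host:10000/default", "hive", "");
             // Target: a business application system database (hypothetical MySQL instance).
             Connection biz = DriverManager.getConnection("jdbc:mysql://biz-host/app_db", "app", "pwd");
             Statement src = hive.createStatement();
             PreparedStatement dst = biz.prepareStatement(
                     "INSERT INTO pushed_result (category, score, label) VALUES (?, ?, ?)")) {

            try (ResultSet rs = src.executeQuery("SELECT category, score, label FROM analysis_result")) {
                while (rs.next()) {
                    dst.setString(1, rs.getString("category"));
                    dst.setDouble(2, rs.getDouble("score"));
                    dst.setString(3, rs.getString("label"));
                    dst.addBatch();
                }
            }
            dst.executeBatch();   // push the analysis results into the business system database
        }
    }
}
```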

Claims (7)

1. A service platform integration method based on big data, characterized by comprising the following steps:
Step 1: acquire multi-source heterogeneous data;
Step 2: integrate the acquired multi-source heterogeneous data and store the integrated data in an HBase database;
Step 3: use Hive to perform ETL processing on the integrated data stored in the HBase database and store the result back in HBase; clean the data stored in the HBase database to obtain clean data, and store the clean data in the HBase database;
Step 4: perform modeling and analysis on the clean data stored in the HBase database based on Hadoop technology, and store the analysis results in a Hive database;
Step 5: establish a data exchange and sharing service bus using a service-oriented architecture (SOA), then build a data exchange architecture on top of the service bus, and push the analysis results stored in the Hive database to the business application system databases through the data exchange architecture, so that they can be applied in the corresponding business systems.
2. The service platform integration method based on big data according to claim 1, characterized in that acquiring the multi-source heterogeneous data in step 1 comprises the following steps:
Step 1.1: configure multiple distributed data sources;
Step 1.2: wrap the distributed data sources into data components;
Step 1.3: read the wrapped data components and convert them into global objects;
Step 1.4: combine the data components converted into global objects to realize a unified access component platform for the multi-source heterogeneous data;
Step 1.5: acquire the multi-source heterogeneous data through the component platform and transmit them to the data center, completing the acquisition of the multi-source heterogeneous data.
3. The service platform integration method based on big data according to claim 1, characterized in that cleaning the data stored in the HBase database in step 3 to obtain clean data comprises the following steps:
Step 3.1: perform de-duplication on the data stored in the HBase database;
Step 3.2: fill in the missing data remaining after de-duplication;
Step 3.3: perform cluster analysis on the gap-filled data and identify data lying at the cluster boundaries; set valid ranges according to the different data types, exclude out-of-range values, obtain the clean data, and store them in the HBase database.
4. The service platform integration method based on big data according to claim 1, characterized in that performing modeling and analysis on the clean data stored in the HBase database based on Hadoop technology in step 4 comprises:
performing cluster analysis on the clean data stored in the HBase database based on Hadoop technology and storing the clustered data in the Hive database for later use, the detailed process being as follows:
(1) create initial points by randomly selecting k objects from the clean data stored in the HBase database and taking these objects as cluster centers; (2) compute the distance between each remaining clean data object in the HBase database and each cluster center; (3) assign each remaining clean data object to the nearest cluster center; (4) whenever a data object joins or leaves a cluster, automatically recompute the mean of that cluster, and if the minimum-distance condition is no longer satisfied, reassign the data to clusters; (5) repeat the above steps until the cluster centers no longer change, then record the result; (6) store the result in the Hive database;
performing collaborative-recommendation analysis on the clean data stored in the HBase database based on Hadoop technology and storing the analyzed data in the Hive database for later use, the detailed process being as follows:
(1) obtain the clean data stored in the HBase database and convert them into a dataset in the format required by the analysis;
(2) split the dataset into a training dataset and a test dataset;
(3) train the recommendation model with the training dataset;
(4) evaluate the accuracy of the recommendation model with the test dataset;
(5) when the accuracy of the recommendation model meets the requirement, generate recommendations and output the results; otherwise retrain the model and re-evaluate it until data meeting the requirement are obtained;
(6) store the output results in the Hive database;
performing classification analysis on the clean data stored in the HBase database based on Hadoop technology, storing the classified data in the Hive database, and labeling different data with different tags for later use, the detailed process being as follows:
(1) obtain the clean data stored in the HBase database and convert them into a dataset in the format required by the analysis;
(2) assign feature attributes to the dataset, divide the dataset into multiple items to be classified according to the feature attributes, classify a portion of the items, and form a training sample set;
(3) count the frequency of each class in the training sample set and estimate the conditional probability of each feature attribute for each class, thereby obtaining a classifier;
(4) use the classifier to classify the data that need to be classified and output the results;
(5) save the results in the Hive database.
5. The service platform integration method based on big data according to claim 3, characterized in that performing de-duplication on the data stored in the HBase database in step 3.1 comprises the following steps:
Step 3.1.1: query the data stored in the HBase database for duplicates and filter out records in which all fields are identical; keep one record and remove the other fully duplicated records;
Step 3.1.2: query for duplicates on key fields and filter out the records whose key fields are identical; compare the completeness of the duplicated records, keep the record with the more complete fields, and remove the remaining duplicates.
6. The service platform integration method based on big data according to claim 3, characterized in that filling in the missing data after de-duplication in step 3.2 comprises the following steps:
Step 3.2.1: for regularly missing and unimportant data, delete the records with missing values; for regularly missing but important data, compute weighted values from the complete data to fill the gaps; for irregularly missing data, handle each case according to the type of missing data;
Step 3.2.2: fill a missing value with the value of the same attribute whose mean has the highest probability of co-occurring with that attribute value; for data missing at random across different attributes, first generate candidate imputation values for each missing value, perform statistical analysis on the complete data formed by the candidate values, evaluate the analysis results, and use the resulting final imputation value to fill the missing value.
7. The service platform integration method based on big data according to claim 2, characterized in that wrapping the distributed data sources into data components in step 1.2 comprises the following steps:
Step 1.2.1: prepare component objects from the database table structures;
Step 1.2.2: query the database to obtain the list of tables;
Step 1.2.3: for each table in the list, query the database fields and the field data structures;
Step 1.2.4: read each data table out as an object and set the data field attributes as the basic attribute information of the table object;
Step 1.2.5: wrap the object table into a component that can be queried by attribute fields.
CN201610254729.0A 2016-04-22 2016-04-22 Service platform integration method based on big data Pending CN105956015A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610254729.0A CN105956015A (en) 2016-04-22 2016-04-22 Service platform integration method based on big data

Publications (1)

Publication Number Publication Date
CN105956015A true CN105956015A (en) 2016-09-21

Family

ID=56914723

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610254729.0A Pending CN105956015A (en) 2016-04-22 2016-04-22 Service platform integration method based on big data

Country Status (1)

Country Link
CN (1) CN105956015A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030233369A1 (en) * 2002-06-17 2003-12-18 Fujitsu Limited Data classifying device, and active learning method used by data classifying device and active learning program of data classifying device
CN101546325A (en) * 2008-12-23 2009-09-30 重庆邮电大学 Grid heterogeneous data integrating method based on SOA
CN102170449A (en) * 2011-04-28 2011-08-31 浙江大学 Web service QoS prediction method based on collaborative filtering
CN104616180A (en) * 2015-03-09 2015-05-13 浪潮集团有限公司 Method for predicting hot sellers
CN104932895A (en) * 2015-06-26 2015-09-23 南京邮电大学 Middleware based on SOA (Service-Oriented Architecture) and information publishing method thereof
CN105184424A (en) * 2015-10-19 2015-12-23 国网山东省电力公司菏泽供电公司 Mapreduced short period load prediction method of multinucleated function learning SVM realizing multi-source heterogeneous data fusion

Cited By (53)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106446255A (en) * 2016-10-18 2017-02-22 安徽天达网络科技有限公司 Data processing method based on cloud server
CN106651188A (en) * 2016-12-27 2017-05-10 贵州电网有限责任公司贵阳供电局 Electric transmission and transformation device multi-source state assessment data processing method and application thereof
CN106649801A (en) * 2016-12-29 2017-05-10 广东精规划信息科技股份有限公司 Time-space relationship analysis system based on multi-source internet-of-things position awareness
CN106844585A (en) * 2017-01-10 2017-06-13 广东精规划信息科技股份有限公司 A kind of time-space relationship analysis system based on multi-source Internet of Things location aware
CN106844636A (en) * 2017-01-21 2017-06-13 亚信蓝涛(江苏)数据科技有限公司 A kind of unstructured data processing method based on deep learning
CN107103050A (en) * 2017-03-31 2017-08-29 海通安恒(大连)大数据科技有限公司 A kind of big data Modeling Platform and method
CN107291951B (en) * 2017-07-24 2020-09-29 北京都在哪智慧城市科技有限公司 Data processing method, device, storage medium and processor
CN107291951A (en) * 2017-07-24 2017-10-24 北京都在哪智慧城市科技有限公司 Data processing method, device, storage medium and processor
CN107590225A (en) * 2017-09-05 2018-01-16 江苏电力信息技术有限公司 A kind of Visualized management system based on distributed data digging algorithm
CN107465692A (en) * 2017-09-15 2017-12-12 湖北省楚天云有限公司 Unification user identity identifying method, system and storage medium
CN107465692B (en) * 2017-09-15 2019-12-20 湖北省楚天云有限公司 Unified user identity authentication method, system and storage medium
CN107656995A (en) * 2017-09-20 2018-02-02 温州市鹿城区中津先进科技研究院 Towards the data management system of big data
CN107807956A (en) * 2017-09-30 2018-03-16 平安科技(深圳)有限公司 Electronic installation, data processing method and computer-readable recording medium
CN107909493B (en) * 2017-12-04 2020-07-17 泰康保险集团股份有限公司 Policy information processing method and device, computer equipment and storage medium
CN107909493A (en) * 2017-12-04 2018-04-13 泰康保险集团股份有限公司 Policy information processing method, device, computer equipment and storage medium
CN108052574A (en) * 2017-12-08 2018-05-18 南京中新赛克科技有限责任公司 Slave ftp server based on Kafka technologies imports the ETL system and implementation method of mass data
CN108121508A (en) * 2017-12-15 2018-06-05 华中师范大学 Multi-source heterogeneous data collecting system and processing method based on education big data
CN108459842A (en) * 2018-01-29 2018-08-28 北京奇艺世纪科技有限公司 A kind of model configuration method, device and electronic equipment
CN108459842B (en) * 2018-01-29 2021-05-14 北京奇艺世纪科技有限公司 Model configuration method and device and electronic equipment
CN108520003A (en) * 2018-03-12 2018-09-11 新华三大数据技术有限公司 A kind of storing process scheduling system and method
CN108595480A (en) * 2018-03-13 2018-09-28 广州市优普科技有限公司 A kind of big data ETL tool systems and application process based on cloud computing
CN108595480B (en) * 2018-03-13 2022-01-21 广州市优普科技有限公司 Big data ETL tool system based on cloud computing and application method
CN110377598A (en) * 2018-04-11 2019-10-25 西安邮电大学 A kind of multi-source heterogeneous date storage method based on intelligence manufacture process
CN110427357A (en) * 2018-04-28 2019-11-08 新疆金风科技股份有限公司 Anemometer tower data processing method and device
CN110597801B (en) * 2018-05-23 2021-09-17 杭州海康威视数字技术股份有限公司 Database system and establishing method and device thereof
CN110597801A (en) * 2018-05-23 2019-12-20 杭州海康威视数字技术股份有限公司 Database system and establishing method and device thereof
WO2019223601A1 (en) * 2018-05-23 2019-11-28 杭州海康威视数字技术股份有限公司 Database system, and establishment method and apparatus therefor
CN109033174A (en) * 2018-06-21 2018-12-18 北京国网信通埃森哲信息技术有限公司 A kind of power quality data cleaning method and device
CN109145031A (en) * 2018-08-20 2019-01-04 国网安徽省电力有限公司合肥供电公司 A kind of multi-source data multidimensional reconstructing method of service-oriented market access demand
CN109033454A (en) * 2018-08-27 2018-12-18 广东电网有限责任公司 Data filling method, apparatus, equipment and storage medium based on attributes similarity
CN109635026A (en) * 2018-11-29 2019-04-16 宝晟(广州)生物信息技术有限公司 A kind of biological sample bank data distributing nodes sharing method, system and device
WO2020135048A1 (en) * 2018-12-29 2020-07-02 颖投信息科技(上海)有限公司 Data merging method and apparatus for knowledge graph
CN109800220A (en) * 2019-01-29 2019-05-24 浙江国贸云商企业服务有限公司 A kind of big data cleaning method, system and relevant apparatus
CN110059952A (en) * 2019-04-12 2019-07-26 中国人民财产保险股份有限公司 Vehicle insurance methods of risk assessment, device, equipment and storage medium
CN110347480A (en) * 2019-06-26 2019-10-18 联动优势科技有限公司 The preferred access path method and device of data source containing coincidence data item label
CN110309152A (en) * 2019-06-26 2019-10-08 广州探迹科技有限公司 A kind of date storage method and device based on HBase
CN110347480B (en) * 2019-06-26 2021-06-25 联动优势科技有限公司 Data source preferred access path method and device containing coincident data item label
CN110457300A (en) * 2019-07-15 2019-11-15 中国平安人寿保险股份有限公司 A kind of method for cleaning and device, electronic equipment in common test library
CN110457300B (en) * 2019-07-15 2024-02-02 中国平安人寿保险股份有限公司 Method and device for cleaning public test library and electronic equipment
CN111126661B (en) * 2019-11-21 2023-11-24 格创东智(深圳)科技有限公司 Self-help modeling method and system based on data analysis platform
CN111126661A (en) * 2019-11-21 2020-05-08 格创东智(深圳)科技有限公司 Self-service modeling method and system based on data analysis platform
CN111200590A (en) * 2019-12-09 2020-05-26 杭州安恒信息技术股份有限公司 Algorithm for checking consistency of multiple period statistical data
CN111200590B (en) * 2019-12-09 2022-08-19 杭州安恒信息技术股份有限公司 Algorithm for checking consistency of multiple period statistical data
CN111680082B (en) * 2020-04-30 2023-08-18 四川弘智远大科技有限公司 Government financial data acquisition system and method based on data integration
CN111680082A (en) * 2020-04-30 2020-09-18 四川弘智远大科技有限公司 Government financial data acquisition system and data acquisition method based on data integration
WO2022000169A1 (en) * 2020-06-29 2022-01-06 深圳大学 Data analysis method and apparatus spanning data centers, and device and storage medium
CN112100525B (en) * 2020-11-02 2021-02-12 中国人民解放军国防科技大学 Multi-source heterogeneous aerospace information resource storage method, retrieval method and device
CN112100525A (en) * 2020-11-02 2020-12-18 中国人民解放军国防科技大学 Multi-source heterogeneous aerospace information resource storage method, retrieval method and device
CN112506930A (en) * 2020-12-15 2021-03-16 北京三维天地科技股份有限公司 Data insight platform based on machine learning technology
CN112597225A (en) * 2020-12-22 2021-04-02 南京三眼精灵信息技术有限公司 Data acquisition method and device based on distributed model
CN112783962B (en) * 2021-02-01 2021-12-28 盐城郅联空间科技有限公司 ETL technology-based time-space big data artificial intelligence analysis method and system
CN112783962A (en) * 2021-02-01 2021-05-11 盐城郅联空间科技有限公司 ETL technology-based time-space big data artificial intelligence analysis method and system
CN113360493A (en) * 2021-07-12 2021-09-07 兰州领新网络信息科技有限公司 Innovative entrepreneurship big data service platform

Similar Documents

Publication Publication Date Title
CN105956015A (en) Service platform integration method based on big data
CN106709035B (en) A kind of pretreatment system of electric power multidimensional panoramic view data
CN104820670B (en) A kind of acquisition of power information big data and storage method
CN102521386B (en) Method for grouping space metadata based on cluster storage
WO2016101628A1 (en) Data processing method and device in data modeling
CN110502509B (en) Traffic big data cleaning method based on Hadoop and Spark framework and related device
CN107193967A (en) A kind of multi-source heterogeneous industry field big data handles full link solution
CN102591917B (en) Data processing method and system and related device
CN104809244B (en) Data digging method and device under a kind of big data environment
CN104462222A (en) Distributed storage method and system for checkpoint vehicle pass data
CN105512167A (en) Multi-business user data managing system based on mixed database and method for same
CN104317789A (en) Method for building passenger social network
CN104156403A (en) Clustering-based big data normal-mode extracting method and system
CN104679827A (en) Big data-based public information association method and mining engine
CN106846082B (en) Travel cold start user product recommendation system and method based on hardware information
CN105488211A (en) Method for determining user group based on feature analysis
Scannapieco et al. Placing big data in official statistics: a big challenge
CN102750367A (en) Big data checking system and method thereof on cloud platform
CN105956932A (en) Distribution and utilization data fusion method and system
CN104143006A (en) Method and device for processing city data
CN104615734A (en) Community management service big data processing system and processing method thereof
Karim et al. Spatiotemporal Aspects of Big Data.
CN113254517A (en) Service providing method based on internet big data
CN110597796B (en) Big data real-time modeling method and system based on full life cycle
CN110826845B (en) Multidimensional combination cost allocation device and method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20160921