CN109241107A - Big data controlling device based on Hadoop - Google Patents
- Publication number: CN109241107A (application number CN201810879556.0A)
- Authority: CN (China)
- Prior art keywords: data, tables, information, module, source
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Landscapes: Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a big data governance device based on Hadoop, comprising: a data governance information management module for maintaining the data governance operation information of each data source; a data source selection module for performing governance operations on data imported into the big data platform; a data preview module for displaying the basic information of each data table from the perspective of a structured database; a metadata management module for presenting the metadata information in data tables to the user along multiple dimensions; a data quality management module for checking the specific missing information of each field in a data table and setting corresponding filling rules to complete the filling of the missing information; and a multi-source data fusion module for re-fusing and aggregating multiple data tables from multiple data sources to obtain a new data table, which is then further analyzed. The device implements multiple functional modules using big data components and provides a highly reliable data basis for subsequent analysis and querying.
Description
Technical field
The present invention relates to the technical field of data processing, and in particular to a big data governance device based on Hadoop.
Background
At present, with the spread of big data technology and related applications, data has become another key asset alongside manpower, physical goods, finance, technology, intellectual property, and relationships. By analyzing existing data, an enterprise can understand its recent operating conditions, user usage, and so on more clearly, and thereby optimize its operations more precisely. Under present conditions, however, because the true state of business data is not well understood, analysts must spend a great deal of time studying business database documentation or consulting business staff, and data preparation additionally requires dedicated data engineers to perform ETL. As a result, delivery easily falls behind, and problems readily arise in intermediate steps. Once information systems have developed to a certain stage, data resources become strategic assets, and effective data governance is a necessary condition for data assets to form. Effective data governance is essential for ensuring that data is accurate, appropriately shared, and protected. As enterprises pay increasing attention to data governance, some commercial data governance devices have appeared, mainly comprising functional modules such as metadata management, data standards management, and data quality management.
The related art includes the following technical solutions. (1) Define metadata; import the metadata; govern and analyze the metadata to obtain an analysis result; and obtain a metadata map according to at least the analysis result. (2) First propose that standard data resources independent of applications be gathered, integrated, given functions, and turned into a platform, forming a global, distributed data standardization support and quality control center; through large-scale centralization and unified resource-oriented processing of the metadata, meta-models, metadata elements, and so on of each field, realize normalization, standardization, and quality control of the data resources of each application layer; data standardization processing mainly targets thousands of metadata standards and the object classes, definition classes, property classes, representation classes, value domain classes, and application and management suitability of the data standards, and the data of each application field S1~Sn is standardized and checked for suitability by comparison against the normative data in the "standard data source" retrieved through an interface library. (3) Obtain at least one data table, where the at least one data table comes from at least one hospital information system (HIS); determine the feature of the data in each of the at least one data table, the feature indicating the category of the data; and determine the result of the data in each data table according to a stored correspondence between features and data results, where the correspondence is determined by machine learning from the features of the data in each data table and the data results obtained before the current time.
However, the big data governance devices of the related art focus essentially on metadata management: they unify standards for the definition, use, and analysis of metadata in order to achieve standardized governance of metadata information. These management schemes are overly specialized, and only users with the relevant professional knowledge can understand them. Moreover, data governance in big data scenarios is not limited to metadata management; it also includes data quality management, multi-source data fusion, data modeling, and other links, which are equally important to subsequent analysis and mining operations. In addition, current big data governance devices each target one specific usage scenario and are therefore limited in use, management, and extension.
Summary of the invention
The present invention aims to solve, at least to some extent, one of the technical problems in the related art.
To this end, an object of the present invention is to propose a big data governance device based on Hadoop that effectively improves the applicability and practicability of big data governance and is simple and easy to implement.
To achieve the above object, an embodiment of one aspect of the present invention proposes a big data governance device based on Hadoop, comprising: a data governance information management module for maintaining the data governance operation information of each data source and providing a copy function for governance operations; a data source selection module for performing governance operations on data imported into the big data platform, supporting governance operations on the MySQL and Hive data source types of structured databases; a data preview module for displaying the basic information of each data table from the perspective of the structured database; a metadata management module for presenting the metadata information in data tables to the user along multiple dimensions; a data quality management module for checking the specific missing information of each field in the data table and setting corresponding filling rules to complete the filling of the missing information; and a multi-source data fusion module for re-fusing and aggregating multiple data tables from multiple data sources to obtain a new data table, which is then further analyzed.
The big data governance device based on Hadoop of the embodiment of the present invention uses big data components to implement functional modules such as data preview, metadata management, multi-source data fusion, and data quality, helping the user understand the real meaning of the data from multiple angles and providing a highly reliable data basis for subsequent analysis and querying. At the same time, complicated operations are hidden in the background and a clickable interface is exposed, so that users without professional big data skills can also complete data governance operations. This reflects the practicability of the device, effectively improves the applicability and practicability of big data governance, and is simple and easy to implement.
In addition, the big data governance device based on Hadoop according to the above embodiment of the present invention may also have the following additional technical features:
Further, in one embodiment of the invention, the data preview module is further configured to display the basic information in table form and in bar chart form, where the bar chart reflects the number of records in each data table and the table form shows the detailed basic information of the data table.
Further, in one embodiment of the invention, the data preview module also provides change history information and output information based on the current data source.
Further, in one embodiment of the invention, the multi-source data fusion module is further configured to aggregate and fuse different data tables of the same data source according to any primary attribute, and/or to fuse different data tables of different data sources according to any primary attribute.
Further, in one embodiment of the invention, the multi-source data fusion module performs fusion through SQL statements, based on the data obtained after processing by the data quality management module.
Additional aspects and advantages of the present invention will be set forth in part in the following description, and in part will become apparent from that description or be learned through practice of the invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments taken with the accompanying drawings, in which:
Fig. 1 is a structural schematic diagram of the big data governance device based on Hadoop according to one embodiment of the invention;
Fig. 2 is a structural schematic diagram of the big data governance device based on Hadoop according to a specific embodiment of the invention.
Detailed description of the embodiments
Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they are intended to explain the present invention and should not be construed as limiting it.
The big data governance device based on Hadoop proposed according to embodiments of the present invention is described below with reference to the accompanying drawings.
Fig. 1 is a structural schematic diagram of the big data governance device based on Hadoop of one embodiment of the invention.
As shown in Fig. 1, the big data governance device 10 based on Hadoop includes: a data governance information management module 100, a data source selection module 200, a data preview module 300, a metadata management module 400, a data quality management module 500, and a multi-source data fusion module 600.
The data governance information management module 100 maintains the data governance operation information of each data source and provides a copy function for governance operations. The data source selection module 200 performs governance operations on data imported into the big data platform and supports governance operations on the MySQL and Hive data source types of structured databases. The data preview module 300 displays the basic information of each data table from the perspective of a structured database. The metadata management module 400 presents the metadata information in data tables to the user along multiple dimensions. The data quality management module 500 checks the specific missing information of each field in a data table and sets corresponding filling rules to complete the filling of the missing information. The multi-source data fusion module 600 re-fuses and aggregates multiple data tables from multiple data sources to obtain a new data table, which is then further analyzed. The device 10 of the embodiment of the present invention implements multiple functional modules using big data components, helps the user understand the real meaning of the data from multiple angles, and provides a highly reliable data basis for subsequent analysis and querying.
It can be understood that, as shown in Fig. 2, the device 10 of the embodiment of the present invention includes the data governance information management module 100, the data source selection module 200, the data preview module 300, the metadata management module 400, the data quality management module 500, and the multi-source data fusion module 600. Each module performs a different data governance operation, helping the user better understand the business data while preparing high-quality data for subsequent analysis. The device 10 addresses many of the data problems that industries and enterprises encounter in data mining and data analysis scenarios. Data governance information management is the unified management module of the device 10 and manages the basic information of data governance operations. When the user creates a new data governance operation, the modules are executed in turn following the workflow of data source selection, data preview, metadata management, data quality management, and multi-source data fusion, completing the full data governance process for a given data source. The specific functions of each module are described in detail below.
In one embodiment of the invention, the data governance information management module 100 is responsible for data governance information management, which means maintaining the data governance operation information of each data source so that the user can iterate on governance operations performed earlier. The module 100 also provides a copy function for governance operations: a governance operation that has reached consensus, or whose processing has been refined, can be copied, reducing repetitive work for the user.
Specifically, the data governance information management module 100 is the unified management module of the whole device 10. The data governance operation for each data source forms a unique record in the data source basic information table (data_source_info) and is displayed on the home page of the device platform. The user can either continue a data governance operation on an existing data source or choose "new data governance" to operate on a data source that has not yet undergone any data governance operation. The module 100 also performs statistics over the data sources, summarizing the operation information of each data source on the platform as a whole, specifically including:
(1) Total count statistics, i.e. the number of data sources on which data governance operations are in progress or completed on this platform. This figure is obtained by counting the total number of records in the data source basic information table (data_source_info);
(2) Table storage statistics, i.e. the names of the ten data tables imported into the big data platform that occupy the most storage. This information is obtained by sorting on the Storage_num (physical storage space occupied) field in the data table basic information table (table_basic_info);
(3) Popular table statistics, i.e. the ten data tables operated on most often in this platform. This information is obtained by sorting on the Table_id (data table ID) field in the data change history table (data_modify_info).
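The three statistics above can be sketched as plain SQL queries. The following is a minimal illustration only, not the patented implementation: an in-memory SQLite database stands in for the platform's MySQL store, and every column other than Storage_num and Table_id (for example source_name, Table_name, modify_id) is a hypothetical name, since the database structure table is not reproduced in this text. Counting rows per Table_id with GROUP BY is likewise an assumed reading of "sorting on the Table_id field".

```python
import sqlite3

# In-memory SQLite stands in for the platform's MySQL store (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data_source_info (source_id INTEGER PRIMARY KEY, source_name TEXT);
CREATE TABLE table_basic_info (Table_id INTEGER PRIMARY KEY, Table_name TEXT,
                               Storage_num INTEGER);
CREATE TABLE data_modify_info (modify_id INTEGER PRIMARY KEY, Table_id INTEGER);
INSERT INTO data_source_info VALUES (1, 'hr_mysql'), (2, 'sales_hive');
INSERT INTO table_basic_info VALUES (1, 'employee', 4096), (2, 'orders', 65536),
                                    (3, 'region', 512);
INSERT INTO data_modify_info (Table_id) VALUES (2), (2), (1), (2), (3), (1);
""")

def total_sources(conn):
    """(1) Number of governed data sources = record count of data_source_info."""
    return conn.execute("SELECT COUNT(*) FROM data_source_info").fetchone()[0]

def top_tables_by_storage(conn, limit=10):
    """(2) Table names ordered by Storage_num, largest first."""
    rows = conn.execute(
        "SELECT Table_name FROM table_basic_info "
        "ORDER BY Storage_num DESC LIMIT ?", (limit,))
    return [row[0] for row in rows]

def most_operated_tables(conn, limit=10):
    """(3) (Table_id, operation count) pairs, most-operated first."""
    rows = conn.execute(
        "SELECT Table_id, COUNT(*) AS ops FROM data_modify_info "
        "GROUP BY Table_id ORDER BY ops DESC, Table_id LIMIT ?", (limit,))
    return rows.fetchall()

print(total_sources(conn))
print(top_tables_by_storage(conn))
print(most_operated_tables(conn))
```

The same queries should carry over to MySQL with little change, since they use only COUNT, ORDER BY, GROUP BY, and LIMIT.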
Further, the data source selection module 200 performs governance operations on data that has already been imported into the big data platform and supports governance operations on two structured database data source types, MySQL and Hive. Specifically, the embodiment of the present invention supports data governance operations on both the MySQL and Hive data source types. In the data source selection module 200, the user therefore first selects the data source type on which data governance is to be carried out, and the device 10 reads all data table names under the current data source from the data table basic information table (table_basic_info) so that the user can select the name of the data source to be governed.
Further, in one embodiment of the invention, the data preview module 300 is further configured to display the basic information in table form and in bar chart form, where the bar chart reflects the number of records in each data table and the table form shows the detailed basic information of the data table.
In one embodiment of the invention, the data preview module 300 also provides change history information and output information based on the current data source.
It can be understood that data preview presents the basic information of each data table to the user from the perspective of the database. The data preview module 300 provides two display forms, table form and bar chart form. The bar chart reflects how many records each data table holds, while the table form shows the detailed basic information of the data table. The device 10 also provides the user with change history information and output information based on the current data source, so that the user can better understand how the data source is being used.
Specifically, the data preview module 300 helps the user understand the business data as a whole, including the total record count and output information of each data table under the currently selected data source, the change history information of the data source, and so on, specifically including:
(1) The total record count of each data table is obtained from the Row_num (quantity held) field in the data table basic information table (table_basic_info), and the front end displays it as a bar chart so that the number of records in each data table can be viewed more intuitively.
(2) The output information of a data table is summarized by counting the insertion time of each record in the original data table. It is likewise presented as a bar chart, making it easy to review the operations on a given data table over a period of time.
(3) The change history information of the data is obtained from the data change history table (data_modify_info).
(4) The first ten records of a data table are obtained by querying the original data table directly with MySQL or HiveQL.
Further, the metadata management module 400 presents to the user, along multiple dimensions, the detailed metadata information in a specific data table (field information, partition information, index information, and so on). This lets the user understand the meaning of the business data more clearly and makes it easier to obtain metadata information from the business data environment as needed. The metadata management module 400 also provides several display forms to help the user quickly understand the true state of the data.
Specifically, an important part of data governance is how to present the data information of hundreds or thousands of data tables in a database, or of different databases, to the user in a visual way. The main function of the metadata management module 400 is to help the user quickly understand the specific state of the business data and the meaning of its metadata.
For a Hive database, the embodiment of the present invention sets the corresponding configuration file parameters so that the metadata of the database is stored in the specified MySQL database Hive. This database contains the metadata tables related to Hive databases (DBS, DATABASE_PARAMS), the metadata tables related to Hive tables and views (TBLS, TABLE_PARAMS, TBL_PRIVS), the metadata tables related to Hive file storage information (SDS, SD_PARAMS, SERDES, SERDE_PARAMS), and so on. For a MySQL database, the metadata information is stored in the information_schema database, which contains basic data information (TABLES, COLUMNS, VIEWS), partition information (PARTITIONS), and so on. For both database types, the information is obtained from Java by connecting to the database through JDBC. In addition, the module 400 provides several display forms (table form, data cloud map, lineage management):
(1) The table form shows the details of a data table's metadata information. The user can modify the data information as needed; modified information such as the stored SQL type or the column description is updated in the corresponding field of the metadata field information table (table_field_info), and the corresponding field in the original data source is updated as well. The record of this operation is also inserted into the data change history table (data_modify_info).
(2) The data cloud map is implemented with knowledge graph technology. In a knowledge graph, each node represents an "entity" that exists in the real world, and each edge is a "relationship" between entities. Borrowing this form of presentation, the system stores the already divided themes and the data tables as nodes, and the association relationships between data tables as edges, in a graph database (such as Neo4j). By browsing the graph, the user can understand the relationships between data tables more intuitively.
(3) Lineage management mainly involves parsing source tables and target tables. For example, source tables are parsed through table naming conventions, while target tables are parsed mainly from statements such as "insert into table" and "insert overwrite table", finally yielding the relationships between tables.
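The lineage parsing in item (3) can be illustrated with a small sketch. The target-table pattern follows the "insert into table" and "insert overwrite table" statements named above. The text does not spell out the naming conventions used to find source tables, so this sketch instead assumes that source tables are the names following FROM or JOIN; the function name parse_lineage and the example table names are hypothetical.

```python
import re

# Target tables: "insert into table X" / "insert overwrite table X" (from the text).
TARGET_RE = re.compile(r"\binsert\s+(?:into|overwrite)\s+table\s+([\w.]+)",
                       re.IGNORECASE)
# Source tables: names after FROM / JOIN. This is an assumption standing in for
# the naming-convention parsing, which the text does not detail.
SOURCE_RE = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.IGNORECASE)

def parse_lineage(sql):
    """Return sorted (source_table, target_table) edges found in one statement."""
    targets = TARGET_RE.findall(sql)
    sources = SOURCE_RE.findall(sql)
    return sorted({(src, tgt) for tgt in targets for src in sources if src != tgt})

statement = (
    "INSERT OVERWRITE TABLE dw.order_summary "
    "SELECT o.id, c.name FROM ods.orders o "
    "JOIN ods.customers c ON o.cid = c.id"
)
for source, target in parse_lineage(statement):
    print(f"{source} -> {target}")
```

A regular expression is enough for a sketch like this but does not handle subqueries, common table expressions, or comments; a production lineage parser would more plausibly work from a real SQL parse tree.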
It should be noted that the design of the database is shown in Table 1, which is the database structure table.
Table 1
Further, the data quality management module 500 addresses the situation where, because of how data entry is set up in business systems, business data may contain large numbers of missing or wrongly entered values, making the missing rate of the data too high. Through this module, the user checks the specific missing situation of each field in a data table and sets corresponding filling rules to complete the filling.
Specifically, the main function of the data quality management module 500 is to query the number of missing values in each data table and calculate the missing rate. This function can be implemented by calling pandas library functions in Python; the results are stored in the data table missing information table (data_missing_info) and shown to the user. For a field with a high missing rate, the user can choose to configure a filling rule. The system offers filling by median, filling by mode, or a custom filling rule. The first two modes calculate the median or the mode of the field's existing values and fill with it; custom filling uses content defined by the user, such as filling with "-1". According to the filling rule the user selects, the specified field is filled through pandas library functions, the missing rate is then recalculated, and the corresponding field's missing information in the data table missing information table is updated.
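The missing-rate calculation and the three filling rules can be sketched as follows. The text states that the device uses pandas; to stay dependency-free, this illustration substitutes Python's standard statistics module, so it shows the logic rather than the actual calls. The function names missing_rate and fill_missing and the sample column are hypothetical.

```python
from statistics import median, mode

def missing_rate(values):
    """Fraction of entries in a column that are missing (None)."""
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)

def fill_missing(values, rule, custom=None):
    """Fill missing entries by 'median', 'mode', or a user-supplied 'custom' value.

    Median and mode are computed from the values already present, as the text
    describes; a column with no present values cannot use those two rules.
    """
    present = [v for v in values if v is not None]
    if rule == "median":
        fill = median(present)
    elif rule == "mode":
        fill = mode(present)
    elif rule == "custom":
        fill = custom
    else:
        raise ValueError(f"unknown fill rule: {rule!r}")
    return [fill if v is None else v for v in values]

ages = [30, None, 25, 30, None, 41]
print(f"missing rate before: {missing_rate(ages):.2%}")
filled = fill_missing(ages, "median")
print(f"missing rate after:  {missing_rate(filled):.2%}")
```

In pandas the same steps would typically be expressed with column-level operations such as isna, median, mode, and fillna on a DataFrame loaded from the source table.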
Further, in one embodiment of the invention, the multi-source data fusion module 600 is further configured to aggregate and fuse different data tables of the same data source according to any primary attribute, and/or to fuse different data tables of different data sources according to any primary attribute.
It can be understood that, in real big data analysis and mining scenarios, analysis may not be limited to the data tables that already exist in a single data source. Multiple data tables from multiple data sources may need to be re-fused and aggregated, and the resulting new data table then analyzed. The multi-source data fusion module 600 in the device 10 of the embodiment of the present invention is therefore intended to solve the problem of cross-matching data from different data sources.
Multi-source data fusion has two main modes: single-database fusion and multi-database fusion. Single-database fusion aggregates and fuses different data tables of the same data source according to a certain primary attribute (for example, number information, regional information, job category information, and so on). Multi-database fusion fuses different data tables of different data sources according to a certain primary attribute. The module 600 performs fusion through SQL statements, based on the data obtained after processing by the data quality management module 500.
Specifically, the multi-source data fusion module 600 is a workflow-style operation module. Its design and implementation process is as follows:
(1) Select the fusion type (single-database fusion or multi-database fusion). If "single-database fusion" is selected, the user selects the names of the data tables to be fused. If "multi-database fusion" is selected, the system reads from the data source basic information table (data_source_info) the names of other data sources of the same type as the current data source for the user to choose from, and the user then selects the names of the data tables to be fused, as in single-database fusion.
(2) Configure fusion rules. Each fusion rule is configured on two of the selected data tables. The system obtains all attribute names of the selected data tables from the metadata field information table (table_field_info), and the user selects the field names that the two data tables need in the fusion rule. A configured fusion rule has the form "Table_A.Column_A=Table_B.Column_B".
(3) Configure advanced rules, i.e. whether the fused data table is allowed to retain duplicate records or duplicate fields.
(4) Configure storage, i.e. the name and storage location of the fused table, which is stored under the current data source by default. The system also lists the names of other data sources of the same type as the current data source for the user to choose from.
(5) Complete the fusion. The rules configured in steps (1)-(4) are converted into the corresponding MySQL or HiveQL statements, which operate on the data tables in the corresponding data sources; the fusion produces a new data table that is stored under the specified data source. The basic information about this fusion process is stored in the data fusion basic information table (data_fusion_info) and displayed on the home page of the multi-source data fusion module 600, helping the user understand the data fusion operations already completed on the current data source.
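Steps (2), (3), and (5) above amount to turning a configured rule into a SQL statement. The sketch below shows one way this could look for single-database fusion; it is an illustration under stated assumptions, not the device's actual generator. SQLite stands in for MySQL or Hive, an inner join is assumed because the text does not name the join type, the keep_duplicates flag models only the duplicate-record part of the advanced rule, and build_fusion_sql and all table and column names are hypothetical.

```python
import re
import sqlite3

# Identifiers are interpolated into SQL, so accept only plain names (optionally
# qualified as table.column) to avoid injection through a fusion rule.
_IDENT = re.compile(r"^[A-Za-z_]\w*(\.[A-Za-z_]\w*)?$")

def _ident(name):
    name = name.strip()
    if not _IDENT.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name

def build_fusion_sql(target, left, right, rule, keep_duplicates=True):
    """Turn a rule like 'Table_A.Column_A=Table_B.Column_B' into CREATE TABLE AS."""
    lhs, rhs = (_ident(side) for side in rule.split("="))
    select = "SELECT *" if keep_duplicates else "SELECT DISTINCT *"
    return (f"CREATE TABLE {_ident(target)} AS {select} "
            f"FROM {_ident(left)} JOIN {_ident(right)} ON {lhs} = {rhs}")

# Two sample tables in one data source; salary contains a duplicate record.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (emp_no INTEGER, name TEXT);
CREATE TABLE salary (staff_no INTEGER, amount INTEGER);
INSERT INTO employee VALUES (1, 'Li'), (2, 'Wang'), (3, 'Zhao');
INSERT INTO salary VALUES (1, 5000), (2, 6000), (2, 6000);
""")
rule = "employee.emp_no=salary.staff_no"
conn.execute(build_fusion_sql("fused_all", "employee", "salary", rule))
conn.execute(build_fusion_sql("fused_dedup", "employee", "salary", rule,
                              keep_duplicates=False))
for table in ("fused_all", "fused_dedup"):
    count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    print(table, count)
```

Multi-database fusion would additionally need the tables to be reachable from one query engine, for example by qualifying table names with their database, which this sketch does not attempt.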
In summary, the device of the embodiment of the present invention addresses the problem that existing big data governance systems essentially cover only metadata management and do not perform operations on big data resources such as missing-value filling and multi-source data fusion from the data itself. To provide sufficient data preparation for subsequent analysis and mining, it adds data preview, data quality management, multi-source data fusion, and other modules on top of metadata management. The device is suitable for a variety of data governance scenarios, for example: helping the user quickly understand the metadata information of business data; filling missing data in business data to obtain high-quality data; and joining the data tables of multiple data sources to improve the E-R relationships of each data table.
In addition, considering the demands that timely processing of massive data places on the processing environment in big data scenarios, the device of the embodiment of the present invention uses Hadoop ecosystem components to carry out each link of big data governance. The precondition for use is that the data has been imported into a structured database (MySQL) or into Hive in the Hadoop ecosystem. Based on the imported data, the user can understand its basic state (including the data tables and the metadata information of the data in the tables) and, on this basis, complete data quality management and multi-source data fusion operations to prepare sufficient data for the subsequent analysis stage. From the import of user data to the output of the governed result data, the embodiment of the present invention goes through five steps in total: data source selection, data preview, metadata management, data quality management, and multi-source data fusion. In addition, so that the user can conveniently iterate on governance operations performed earlier, the device also includes a data governance information management module that manages the data governance operation information of each data source.
The big data governance device based on Hadoop proposed according to embodiments of the present invention uses big data components to implement functional modules such as data preview, metadata management, multi-source data fusion, and data quality, helping the user understand the real meaning of the data from multiple angles and providing a highly reliable data basis for subsequent analysis and querying. At the same time, complicated operations are hidden in the background and a clickable interface is exposed, so that users without professional big data skills can also complete data governance operations. This reflects the practicability of the device, effectively improves the applicability and practicability of big data governance, and is simple and easy to implement.
In addition, the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or as implicitly indicating the number of the technical features referred to. Thus, a feature defined by "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "plurality" means at least two, for example two or three, unless otherwise specifically defined.
In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "an example", "a specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not conflict with one another, those skilled in the art may combine the different embodiments or examples described in this specification, and the features of those embodiments or examples.
Although embodiments of the present invention have been shown and described above, it should be understood that the above embodiments are exemplary and shall not be construed as limiting the present invention; those of ordinary skill in the art may change, modify, replace, and vary the above embodiments within the scope of the present invention.
Claims (5)
1. A big data controlling device based on Hadoop, characterized by comprising:
a data governance information management module, configured to maintain the data governance operation information of each data source and to provide a governance operation copy function;
a data source selection module, configured to perform governance operations on the data imported into the big data platform, and supporting governance operations on the structured-database MySQL data source type and the Hive data source type;
a data preview module, configured to show the basic information of each data table from the perspective of the structured database;
a metadata management module, configured to present the metadata information of the data tables to the user in multiple dimensions;
a data quality management module, configured to check the specific missing information of each field in the data tables and to set corresponding filling rules to complete the filling of the missing information; and
a multi-source data fusion module, configured to fuse and aggregate multiple data tables of multiple data sources so that, after a new data table is obtained, the new data table can be further analyzed.
2. The big data controlling device based on Hadoop according to claim 1, characterized in that the data preview module is further configured to show the basic information in table form and in bar chart form, wherein the bar chart reflects the number of records in each data table, and the table shows the detailed basic information of the data tables.
3. The big data controlling device based on Hadoop according to claim 1 or 2, characterized in that the data preview module further provides change history information and output information based on the current data source.
4. The big data controlling device based on Hadoop according to claim 1, characterized in that the multi-source data fusion module is further configured to aggregate and fuse different data tables of the same data source according to any primary attribute; and/or to fuse different data tables of different data sources according to any primary attribute.
5. The big data controlling device based on Hadoop according to claim 4, characterized in that the multi-source data fusion module performs the fusion through SQL statements, based on the data obtained after processing by the data quality management module.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810879556.0A CN109241107A (en) | 2018-08-03 | 2018-08-03 | Big data controlling device based on Hadoop |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810879556.0A CN109241107A (en) | 2018-08-03 | 2018-08-03 | Big data controlling device based on Hadoop |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109241107A true CN109241107A (en) | 2019-01-18 |
Family
ID=65070544
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810879556.0A Pending CN109241107A (en) | 2018-08-03 | 2018-08-03 | Big data controlling device based on Hadoop |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109241107A (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104111996A (en) * | 2014-07-07 | 2014-10-22 | 山大地纬软件股份有限公司 | Health insurance outpatient clinic big data extraction system and method based on hadoop platform |
US20150026114A1 (en) * | 2013-07-18 | 2015-01-22 | Dania M. Triff | System and method of automatically extracting data from plurality of data sources and loading the same to plurality of target databases |
US20160253340A1 (en) * | 2015-02-27 | 2016-09-01 | Podium Data, Inc. | Data management platform using metadata repository |
CN106547892A (en) * | 2016-11-01 | 2017-03-29 | 山东浪潮云服务信息科技有限公司 | A kind of data resource management platform gathered based on internet data |
CN107103050A (en) * | 2017-03-31 | 2017-08-29 | 海通安恒(大连)大数据科技有限公司 | A kind of big data Modeling Platform and method |
CN107657052A (en) * | 2017-10-17 | 2018-02-02 | 上海计算机软件技术开发中心 | A kind of data governing system based on metadata management |
CN107704608A (en) * | 2017-10-17 | 2018-02-16 | 北京览群智数据科技有限责任公司 | A kind of OLAP multidimensional analyses and data digging system |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110019150A (en) * | 2019-04-11 | 2019-07-16 | 软通动力信息技术有限公司 | A kind of data administering method, system and electronic equipment |
CN112434023A (en) * | 2019-08-26 | 2021-03-02 | 长鑫存储技术有限公司 | Process data analysis method and device, storage medium and computer equipment |
CN110825744A (en) * | 2019-10-31 | 2020-02-21 | 武汉工程大学 | Air quality monitoring big data partition storage method based on cluster environment |
CN110825744B (en) * | 2019-10-31 | 2023-06-20 | 武汉工程大学 | Cluster environment-based air quality monitoring big data partition storage method |
CN112506906A (en) * | 2020-12-04 | 2021-03-16 | 北京三维天地科技股份有限公司 | Data governance platform based on artificial intelligence technique |
CN112632178A (en) * | 2021-01-05 | 2021-04-09 | 上海明略人工智能(集团)有限公司 | Method and system for visualizing treatment data |
CN112700157A (en) * | 2021-01-07 | 2021-04-23 | 杭州数梦工场科技有限公司 | Data asset generation method and device and electronic equipment |
CN113223726A (en) * | 2021-04-23 | 2021-08-06 | 武汉大学 | Visualized interactive system for data treatment mode and treatment result in medical big data |
CN112988734A (en) * | 2021-04-29 | 2021-06-18 | 贵州数据宝网络科技有限公司 | Multi-element and multi-dimensional data fusion treatment method and system |
CN114595214A (en) * | 2022-03-03 | 2022-06-07 | 江苏鼎驰电子科技有限公司 | Big data management system |
CN115048452A (en) * | 2022-05-07 | 2022-09-13 | 网诺数据科技(云南)集团有限公司 | Big data management system based on block chain |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109241107A (en) | Big data controlling device based on Hadoop | |
US10878064B2 (en) | Clinical data management system | |
US7464087B2 (en) | Method and system of unifying data | |
CN108038222B (en) | System of entity-attribute framework for information system modeling and data access | |
CN105849726B (en) | For efficiently supporting the general index of the extemporaneous inquiry by demixing marking data | |
US8255368B2 (en) | Apparatus and method for positioning user-created data in OLAP data sources | |
JP6492008B2 (en) | Cohort identification system | |
US7668888B2 (en) | Converting object structures for search engines | |
DE102014103279A1 (en) | Pivot facets for text mining and search | |
CN111145855A (en) | Automatic generation method and system for clinical PDF report | |
KR20060067812A (en) | Complex data access | |
US20050027674A1 (en) | Metadata modelling for reporting | |
US20130060733A1 (en) | Method and querying and controlling database | |
US11068459B2 (en) | Computer implemented and computer controlled method, computer program product and platform for arranging data for processing and storage at a data storage engine | |
US10437872B2 (en) | Computer implemented and computer controlled method, computer program product and platform for arranging data for processing and storage at a data storage engine | |
US7702643B2 (en) | System and method for metamodel-based gap analysis | |
CA2814328C (en) | Standardized database access system and method | |
US20070022137A1 (en) | Data source business component generator | |
CN114443120A (en) | Intelligent embedded point management system and method | |
US20150331565A1 (en) | Method for Generating Database Components Code | |
Azarm | Tool Support and Data Management for Business Analytics |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |