CN106951475A - Big data distributed approach and system based on cloud computing - Google Patents
Big data distributed approach and system based on cloud computing
- Publication number
- CN106951475A (application number CN201710130418.8A)
- Authority
- CN
- China
- Prior art keywords
- file
- value
- relation
- mapping
- input
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/505—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/5015—Service provider selection
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Computing Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
A big data distributed processing method based on cloud computing, comprising the following steps: S1, receive an input file, divide it into input splits according to the input file size, and assign one map task to each input split, each input split storing the split length and an array recording the positions of the data; S2, run a pre-written map function on the data storage nodes to obtain intermediate files; S3, merge the duplicate key-value pairs in the intermediate files; S4, allocate a circular memory buffer in memory, the circular memory buffer being used to output the map output files; create a configuration file in the circular memory buffer; a daemon thread pauses writing data into memory and writes a spill file in memory, the spill file determining the files to be written to disk, and the files in the circular memory buffer are written to disk until all map output files have been output; S5, merge all map output files and store them on a distributed file storage system.
Description
Technical Field
The present invention relates to the technical field of big data and cloud computing, and in particular to a big data distributed processing method and system based on cloud computing.
Background
With the arrival of the cloud era, big data has attracted increasing attention. The term is commonly used to describe large amounts of unstructured and semi-structured data that are downloaded into relational databases for analysis. Big data analysis is often associated with cloud computing, because real-time analysis of large data sets requires a framework such as MapReduce to distribute work across tens, hundreds, or even thousands of computers. Big data requires special techniques to process large volumes of data efficiently within a tolerable elapsed time. Technologies applicable to big data include massively parallel processing (MPP) databases, data mining systems, distributed file systems, distributed databases, cloud computing platforms, the Internet, and scalable storage systems.
In a big data environment, data sources are abundant and data types are diverse, the volume of data to be stored, analyzed, and mined is huge, the requirements for data presentation are high, and great importance is placed on the efficiency and availability of data processing. Traditional data processing methods, however, have the following shortcomings: 1. Traditional data acquisition draws on a single source, and the volume of data stored, managed, and analyzed is relatively small; most of it is handled with relational databases and parallel data warehouses. As for raising data processing speed through parallel computing, traditional parallel database technology pursues high consistency and fault tolerance and, according to the CAP theorem, can hardly guarantee availability and scalability. 2. Traditional data processing is processor-centric, which greatly increases computation overhead and cannot meet the demand for processing the large amounts of unstructured data found in big data.
Summary of the Invention
In view of this, the present invention proposes a big data distributed processing method and system based on cloud computing.
A big data distributed processing method based on cloud computing, comprising the following steps:
S1: Receive an input file, divide it into input splits according to the input file size, and assign one map task to each input split; each input split stores the split length and an array recording the positions of the data;
S2: Run a pre-written map function on the data storage nodes to obtain intermediate files;
S3: Merge the duplicate key-value pairs in the intermediate files to reduce redundancy in the map output files; serialize the merged key-value pairs to obtain map cache files; automatically obtain the computational load value of each compute node, and assign each map cache file to a compute node according to those load values;
S4: Allocate a circular memory buffer in memory, the circular memory buffer being used to output the map output files; create a configuration file in the circular memory buffer and configure the memory occupancy threshold of the buffer in the configuration file; when the memory occupancy of the circular memory buffer is greater than or equal to the occupancy threshold, a daemon thread pauses writing data into memory and writes a spill file in memory, the spill file determining the files to be written to disk; the files in the circular memory buffer are written to disk until all map output files have been output;
S5: Merge all map output files and store them on a distributed file storage system.
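Step S1 can be illustrated with a minimal Python sketch. It assumes newline-delimited records and a fixed split size in bytes; the names `InputSplit`, `make_splits`, and `split_size` are hypothetical and do not come from the patent, which does not specify how record positions are determined.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InputSplit:
    """One input split; the field names are illustrative assumptions."""
    start: int            # byte offset of the split within the input file
    length: int           # split length in bytes
    positions: List[int]  # byte offsets of the records that begin inside the split


def make_splits(data: bytes, split_size: int) -> List[InputSplit]:
    """Cut `data` into fixed-size splits and index newline-delimited records."""
    if split_size <= 0:
        raise ValueError("split_size must be positive")
    # A record starts at offset 0 and at every byte that follows a newline.
    record_starts = [0] if data else []
    record_starts += [i + 1 for i, b in enumerate(data)
                      if b == 0x0A and i + 1 < len(data)]
    splits = []
    for start in range(0, len(data), split_size):
        end = min(start + split_size, len(data))
        positions = [p for p in record_starts if start <= p < end]
        splits.append(InputSplit(start, end - start, positions))
    return splits


splits = make_splits(b"a,1\nb,2\nc,3\n", split_size=8)
# One map task per input split, as in step S1 (task names are made up).
map_tasks = {f"map-{i}": s for i, s in enumerate(splits)}
```

In this sketch the split boundaries depend only on the file size, so a record may straddle two splits; how such records are handled is left open here, as it is in the text.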
In the cloud-computing-based big data distributed processing method of the present invention, dividing the input file into input splits according to its size in step S1 includes:
Establishing an association table; splitting the input file into a position relation value, an activity relation value, a structure relation value, a function relation value, a functional relation value, a behavior relation value, and other relation values; and writing the correspondence of the relation values of each input file into the association table;
Each input split contains the data corresponding to each relation value.
In the cloud-computing-based big data distributed processing method of the present invention, step S2 includes:
Mapping the input splits according to the map tasks using the pre-written map function. The mapping includes aligning the content of each input split into a list according to a preset data format, and determining whether the position relation value, activity relation value, structure relation value, function relation value, functional relation value, behavior relation value, and other relation values exist. A relation value that exists is retained as is; if one or more relation values are missing, the missing relation values are set to null. The relation values are kept in a consistent order.
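The list alignment described for step S2 can be sketched as follows. This is an illustration under assumptions: records are modeled as dictionaries, the English field names are assumed renderings of the seven relation values, and `None` stands for the null value; none of these choices is stated in the patent.

```python
from typing import Dict, List, Optional

# Fixed column order; the field names are assumed renderings of the
# seven relation values listed in the text.
RELATION_FIELDS = (
    "position", "activity", "structure", "function",
    "functional", "behavior", "other",
)


def align_record(record: Dict[str, str]) -> List[Optional[str]]:
    """Return the relation values in fixed order, with None for missing ones."""
    return [record.get(field) for field in RELATION_FIELDS]


def map_split(records: List[Dict[str, str]]) -> List[List[Optional[str]]]:
    """Map function for one input split: align every record to the same layout."""
    return [align_record(r) for r in records]


rows = map_split([
    {"position": "rack-3", "behavior": "write"},
    {"activity": "login", "structure": "json", "other": "n/a"},
])
```

Because every row has the same length and field order, downstream nodes can address a relation value by index without inspecting each record's structure.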
In the cloud-computing-based big data distributed processing method of the present invention, step S5 includes:
Querying the association table for all index information corresponding to each map output file, and inserting the segment of data corresponding to each map output file into a segment list; and recording the position relation value, activity relation value, structure relation value, function relation value, functional relation value, behavior relation value, and other relation values of the segment data.
In the cloud-computing-based big data distributed processing method of the present invention, mapping the input splits according to the map tasks using the pre-written map function in step S2 further includes determining, according to the association table, whether an input split contains a logic error, and discarding the input split if it does.
The present invention also provides a big data distributed processing system based on cloud computing, comprising the following units:
A splitting unit, configured to receive an input file, divide it into input splits according to the input file size, and assign one map task to each input split; each input split stores the split length and an array recording the positions of the data;
A mapping unit, configured to run a pre-written map function on the data storage nodes to obtain intermediate files;
A computing unit, configured to merge the duplicate key-value pairs in the intermediate files to reduce redundancy in the map output files; serialize the merged key-value pairs to obtain map cache files; automatically obtain the computational load value of each compute node; and assign each map cache file to a compute node according to those load values;
An output unit, configured to allocate a circular memory buffer in memory, the circular memory buffer being used to output the map output files; create a configuration file in the circular memory buffer and configure the memory occupancy threshold of the buffer in the configuration file; when the memory occupancy of the circular memory buffer is greater than or equal to the occupancy threshold, a daemon thread pauses writing data into memory and writes a spill file in memory, the spill file determining the files to be written to disk; the files in the circular memory buffer are written to disk until all map output files have been output;
A merging and storage unit, configured to merge all map output files and store them on a distributed file storage system.
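The spill behavior of the output unit can be sketched in Python. This is a simplified, single-threaded illustration rather than the patented mechanism: a `deque` stands in for the circular buffer, the spill runs synchronously instead of on a daemon thread, and the threshold is passed as a constructor argument instead of being read from a configuration file. The class name `SpillingBuffer` and the file naming are hypothetical.

```python
import os
import tempfile
from collections import deque
from typing import List


class SpillingBuffer:
    """Bounded in-memory buffer that spills to disk at an occupancy threshold."""

    def __init__(self, capacity: int, threshold: float, spill_dir: str):
        self.capacity = capacity      # buffer size in bytes (assumed unit)
        self.threshold = threshold    # fraction of capacity that triggers a spill
        self.spill_dir = spill_dir
        self.records = deque()        # stand-in for the circular buffer
        self.used = 0
        self.spill_files: List[str] = []

    def write(self, record: bytes) -> None:
        self.records.append(record)
        self.used += len(record)
        # Occupancy >= threshold: spill before accepting any further writes.
        if self.used >= self.capacity * self.threshold:
            self.spill()

    def spill(self) -> None:
        if not self.records:
            return
        path = os.path.join(self.spill_dir, f"spill-{len(self.spill_files)}.out")
        with open(path, "wb") as f:
            while self.records:
                f.write(self.records.popleft())
        self.spill_files.append(path)
        self.used = 0

    def close(self) -> None:
        self.spill()  # flush whatever is left once all map output is written


with tempfile.TemporaryDirectory() as tmp:
    buf = SpillingBuffer(capacity=16, threshold=0.5, spill_dir=tmp)
    for rec in (b"k1\t1\n", b"k2\t1\n", b"k3\t1\n"):
        buf.write(rec)
    buf.close()
    spill_count = len(buf.spill_files)
    spilled = b""
    for p in buf.spill_files:
        with open(p, "rb") as f:
            spilled += f.read()
```

A threaded variant would let the map task keep writing into the free part of the buffer while the spill proceeds, blocking only when the buffer is full; the sketch omits that to stay short.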
In the cloud-computing-based big data distributed processing system of the present invention, dividing the input file into input splits according to its size in the splitting unit includes:
Establishing an association table; splitting the input file into a position relation value, an activity relation value, a structure relation value, a function relation value, a functional relation value, a behavior relation value, and other relation values; and writing the correspondence of the relation values of each input file into the association table;
Each input split contains the data corresponding to each relation value.
In the cloud-computing-based big data distributed processing system of the present invention, the mapping unit includes:
Mapping the input splits according to the map tasks using the pre-written map function. The mapping includes aligning the content of each input split into a list according to a preset data format, and determining whether the position relation value, activity relation value, structure relation value, function relation value, functional relation value, behavior relation value, and other relation values exist. A relation value that exists is retained as is; if one or more relation values are missing, the missing relation values are set to null. The relation values are kept in a consistent order.
In the cloud-computing-based big data distributed processing system of the present invention, the merging and storage unit includes:
Querying the association table for all index information corresponding to each map output file, and inserting the segment of data corresponding to each map output file into a segment list; and recording the position relation value, activity relation value, structure relation value, function relation value, functional relation value, behavior relation value, and other relation values of the segment data.
In the cloud-computing-based big data distributed processing system of the present invention, mapping the input splits according to the map tasks using the pre-written map function in the mapping unit further includes determining, according to the association table, whether an input split contains a logic error, and discarding the input split if it does.
Compared with the prior art, the cloud-computing-based big data distributed processing method and system provided by the present invention have the following beneficial effects: massive big data is divided into several parts according to preset rules and distributed to multiple processors for parallel processing; the results from each processor are then aggregated to obtain the final result. As a result, large amounts of unstructured data can be processed, and both the range of data types handled and the processing speed are improved.
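The divide, process in parallel, and aggregate pattern described above can be shown with a small Python sketch. It is only an illustration under assumptions: threads stand in for the multiple processors, word counting stands in for the real workload, and the helper names `split_parts` and `process_part` are made up for the example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import List


def split_parts(items: List[str], parts: int) -> List[List[str]]:
    """Divide the input into roughly equal contiguous parts."""
    size = max(1, -(-len(items) // parts))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def process_part(part: List[str]) -> Counter:
    """Per-worker processing; word counting stands in for the real workload."""
    return Counter(part)


words = ["cloud", "data", "cloud", "map", "data", "cloud"]
with ThreadPoolExecutor(max_workers=3) as pool:
    partials = list(pool.map(process_part, split_parts(words, 3)))
# Aggregate the per-worker results into the final result.
total = sum(partials, Counter())
```

For CPU-bound work on a single machine, `ProcessPoolExecutor` from the same module could replace the thread pool; on a cluster the same three phases would be spread across nodes.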
Brief Description of the Drawings
Fig. 1 is a flow chart of the language transmission method in the improved wireless communication process according to an embodiment of the present invention.
Detailed Description
As shown in Fig. 1, a big data distributed processing method based on cloud computing comprises the following steps:
S1: Receive an input file, divide it into input splits according to the input file size, and assign one map task to each input split; each input split stores the split length and an array recording the positions of the data;
S2: Run a pre-written map function on the data storage nodes to obtain intermediate files;
S3: Merge the duplicate key-value pairs in the intermediate files to reduce redundancy in the map output files; serialize the merged key-value pairs to obtain map cache files; automatically obtain the computational load value of each compute node, and assign each map cache file to a compute node according to those load values;
S4: Allocate a circular memory buffer in memory, the circular memory buffer being used to output the map output files; create a configuration file in the circular memory buffer and configure the memory occupancy threshold of the buffer in the configuration file; when the memory occupancy of the circular memory buffer is greater than or equal to the occupancy threshold, a daemon thread pauses writing data into memory and writes a spill file in memory, the spill file determining the files to be written to disk; the files in the circular memory buffer are written to disk until all map output files have been output;
S5: Merge all map output files and store them on a distributed file storage system.
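Step S3 combines three operations: merging duplicate keys, serializing the result, and placing each cache file on a compute node according to load. The Python sketch below illustrates one possible reading. The merge rule (summing values), the use of `pickle` for serialization, and the cost model in which a node's load grows by the size of each assigned file are all assumptions, as are the function and node names.

```python
import heapq
import pickle
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def combine(pairs: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Merge pairs that share a key; summing is an assumed merge rule."""
    merged: Dict[str, int] = defaultdict(int)
    for key, value in pairs:
        merged[key] += value
    return dict(merged)


def assign_by_load(files: List[bytes], loads: Dict[str, float]) -> Dict[str, List[bytes]]:
    """Give each cache file to the node with the lowest current load."""
    heap = [(load, node) for node, load in loads.items()]
    heapq.heapify(heap)
    placement: Dict[str, List[bytes]] = {node: [] for node in loads}
    for blob in files:
        load, node = heapq.heappop(heap)
        placement[node].append(blob)
        # Assumed cost model: load grows with the size of the assigned file.
        heapq.heappush(heap, (load + len(blob), node))
    return placement


intermediate = [[("a", 1), ("b", 1), ("a", 1)], [("c", 1), ("c", 1)]]
cache_files = [pickle.dumps(combine(pairs)) for pairs in intermediate]
placement = assign_by_load(cache_files, {"node-1": 0.5, "node-2": 0.0})
```

Here the first cache file goes to `node-2`, the idle node, which then becomes the more loaded one, so the second file goes to `node-1`. A real system would refresh the load values from the nodes instead of estimating them locally.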
In the cloud-computing-based big data distributed processing method of the present invention, dividing the input file into input splits according to its size in step S1 includes:
Establishing an association table; splitting the input file into a position relation value, an activity relation value, a structure relation value, a function relation value, a functional relation value, a behavior relation value, and other relation values; and writing the correspondence of the relation values of each input file into the association table;
Each input split contains the data corresponding to each relation value.
By implementing this embodiment of the present invention, data of various types can be split uniformly into the relation values, even if some relation values are absent for particular data types. Distributed processing is then performed on each relation value, which can greatly improve data processing capacity.
In the cloud-computing-based big data distributed processing method of the present invention, step S2 includes:
Mapping the input splits according to the map tasks using the pre-written map function. The mapping includes aligning the content of each input split into a list according to a preset data format, and determining whether the position relation value, activity relation value, structure relation value, function relation value, functional relation value, behavior relation value, and other relation values exist. A relation value that exists is retained as is; if one or more relation values are missing, the missing relation values are set to null. The relation values are kept in a consistent order.
By implementing this embodiment, the content of the input splits is aligned into a list according to the preset data format, so that subsequent compute nodes consume fewer processing resources.
In the cloud-computing-based big data distributed processing method of the present invention, step S5 includes:
Querying the association table for all index information corresponding to each map output file, and inserting the segment of data corresponding to each map output file into a segment list; and recording the position relation value, activity relation value, structure relation value, function relation value, functional relation value, behavior relation value, and other relation values of the segment data.
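The segment list of step S5 can be sketched as below. The sketch assumes the association table is a dictionary keyed by map output file name and that merging means concatenating segments in file-name order; the patent does not state either detail, and `Segment`, `build_segment_list`, and `merge_segments` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Relations = Tuple[Optional[str], ...]


@dataclass
class Segment:
    source: str           # name of the map output file
    data: bytes           # segment payload
    relations: Relations  # relation values recorded with the segment


def build_segment_list(outputs: Dict[str, bytes],
                       association: Dict[str, Relations]) -> List[Segment]:
    """Look up each map output file in the association table and queue its segment."""
    segments = []
    for name in sorted(outputs):  # deterministic order; an assumption
        relations = association.get(name, ())
        segments.append(Segment(name, outputs[name], relations))
    return segments


def merge_segments(segments: List[Segment]) -> bytes:
    """Concatenate the segments into one blob ready for distributed storage."""
    return b"".join(s.data for s in segments)


outputs = {"map-1.out": b"c\t2\n", "map-0.out": b"a\t2\nb\t1\n"}
association = {"map-0.out": ("rack-3", None), "map-1.out": (None, "login")}
segments = build_segment_list(outputs, association)
merged = merge_segments(segments)
```

Keeping the relation values next to each segment means the merged file can still be traced back to the entries in the association table.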
In the cloud-computing-based big data distributed processing method of the present invention, mapping the input splits according to the map tasks using the pre-written map function in step S2 further includes determining, according to the association table, whether an input split contains a logic error, and discarding the input split if it does.
By implementing this embodiment, redundancy and error checks can be performed on the data, reducing the amount of computation.
The present invention also provides a big data distributed processing system based on cloud computing, comprising the following units:
A splitting unit, configured to receive an input file, divide it into input splits according to the input file size, and assign one map task to each input split; each input split stores the split length and an array recording the positions of the data;
A mapping unit, configured to run a pre-written map function on the data storage nodes to obtain intermediate files;
A computing unit, configured to merge the duplicate key-value pairs in the intermediate files to reduce redundancy in the map output files; serialize the merged key-value pairs to obtain map cache files; automatically obtain the computational load value of each compute node; and assign each map cache file to a compute node according to those load values;
An output unit, configured to allocate a circular memory buffer in memory, the circular memory buffer being used to output the map output files; create a configuration file in the circular memory buffer and configure the memory occupancy threshold of the buffer in the configuration file; when the memory occupancy of the circular memory buffer is greater than or equal to the occupancy threshold, a daemon thread pauses writing data into memory and writes a spill file in memory, the spill file determining the files to be written to disk; the files in the circular memory buffer are written to disk until all map output files have been output;
A merging and storage unit, configured to merge all map output files and store them on a distributed file storage system.
In the cloud-computing-based big data distributed processing system of the present invention, dividing the input file into input splits according to its size in the splitting unit includes:
Establishing an association table; splitting the input file into a position relation value, an activity relation value, a structure relation value, a function relation value, a functional relation value, a behavior relation value, and other relation values; and writing the correspondence of the relation values of each input file into the association table;
Each input split contains the data corresponding to each relation value.
In the cloud-computing-based big data distributed processing system of the present invention, the mapping unit includes:
Mapping the input splits according to the map tasks using the pre-written map function. The mapping includes aligning the content of each input split into a list according to a preset data format, and determining whether the position relation value, activity relation value, structure relation value, function relation value, functional relation value, behavior relation value, and other relation values exist. A relation value that exists is retained as is; if one or more relation values are missing, the missing relation values are set to null. The relation values are kept in a consistent order.
In the cloud-computing-based big data distributed processing system of the present invention, the merging and storage unit includes:
Querying the association table for all index information corresponding to each map output file, and inserting the segment of data corresponding to each map output file into a segment list; and recording the position relation value, activity relation value, structure relation value, function relation value, functional relation value, behavior relation value, and other relation values of the segment data.
In the cloud-computing-based big data distributed processing system of the present invention, mapping the input splits according to the map tasks using the pre-written map function in the mapping unit further includes determining, according to the association table, whether an input split contains a logic error, and discarding the input split if it does.
Compared with the prior art, the cloud-computing-based big data distributed processing method and system provided by the present invention have the following beneficial effects: massive big data is divided into several parts according to preset rules and distributed to multiple processors for parallel processing; the results from each processor are then aggregated to obtain the final result. As a result, large amounts of unstructured data can be processed, and both the range of data types handled and the processing speed are improved. The invention can be applied in fields such as intelligent robot control and rail transit control, and has broad application prospects.
It will be understood that a person of ordinary skill in the art can make various other corresponding changes and modifications according to the technical concept of the present invention, and all such changes and modifications shall fall within the protection scope of the claims of the present invention.
Claims (10)
1. A big data distributed processing method based on cloud computing, characterized in that it comprises the following steps:
S1: Receive an input file, divide it into input splits according to the input file size, and assign one map task to each input split; each input split stores the split length and an array recording the positions of the data;
S2: Run a pre-written map function on the data storage nodes to obtain intermediate files;
S3: Merge the duplicate key-value pairs in the intermediate files to reduce redundancy in the map output files; serialize the merged key-value pairs to obtain map cache files; automatically obtain the computational load value of each compute node, and assign each map cache file to a compute node according to those load values;
S4: Allocate a circular memory buffer in memory, the circular memory buffer being used to output the map output files; create a configuration file in the circular memory buffer and configure the memory occupancy threshold of the buffer in the configuration file; when the memory occupancy of the circular memory buffer is greater than or equal to the occupancy threshold, a daemon thread pauses writing data into memory and writes a spill file in memory, the spill file determining the files to be written to disk; the files in the circular memory buffer are written to disk until all map output files have been output;
S5: Merge all map output files and store them on a distributed file storage system.
2. The big data distributed processing method based on cloud computing as claimed in claim 1, characterized in that dividing the input file into input splits according to its size in step S1 includes:
Establishing an association table; splitting the input file into a position relation value, an activity relation value, a structure relation value, a function relation value, a functional relation value, a behavior relation value, and other relation values; and writing the correspondence of the relation values of each input file into the association table;
Each input split contains the data corresponding to each relation value.
3. The big data distributed processing method based on cloud computing as claimed in claim 2, characterized in that step S2 includes:
Mapping the input splits according to the map tasks using the pre-written map function, the mapping including aligning the content of each input split into a list according to a preset data format, and determining whether the position relation value, activity relation value, structure relation value, function relation value, functional relation value, behavior relation value, and other relation values exist; a relation value that exists is retained as is; if one or more relation values are missing, the missing relation values are set to null; the relation values are kept in a consistent order.
4. The big data distributed processing method based on cloud computing as claimed in claim 3, characterized in that step S5 includes:
Querying the association table for all index information corresponding to each map output file, and inserting the segment of data corresponding to each map output file into a segment list; and recording the position relation value, activity relation value, structure relation value, function relation value, functional relation value, behavior relation value, and other relation values of the segment data.
5. The big data distributed processing method based on cloud computing as claimed in claim 3, characterized in that mapping the input splits according to the map tasks using the pre-written map function in step S2 further includes determining, according to the association table, whether an input split contains a logic error, and discarding the input split if it does.
6. A big data distributed processing system based on cloud computing, characterized in that it comprises the following units:
A splitting unit, configured to receive an input file, divide it into input splits according to the input file size, and assign one map task to each input split; each input split stores the split length and an array recording the positions of the data;
A mapping unit, configured to run a pre-written map function on the data storage nodes to obtain intermediate files;
A computing unit, configured to merge the duplicate key-value pairs in the intermediate files to reduce redundancy in the map output files; serialize the merged key-value pairs to obtain map cache files; automatically obtain the computational load value of each compute node; and assign each map cache file to a compute node according to those load values;
An output unit, configured to allocate a circular memory buffer in memory, the circular memory buffer being used to output the map output files; create a configuration file in the circular memory buffer and configure the memory occupancy threshold of the buffer in the configuration file; when the memory occupancy of the circular memory buffer is greater than or equal to the occupancy threshold, a daemon thread pauses writing data into memory and writes a spill file in memory, the spill file determining the files to be written to disk; the files in the circular memory buffer are written to disk until all map output files have been output;
A merging and storage unit, configured to merge all map output files and store them on a distributed file storage system.
7. the big data distributed processing system(DPS) as claimed in claim 6 based on cloud computing, it is characterised in that the fractionation list
Carrying out input burst to input file size in member includes:
Incidence relation table is set up, input file is split as position relationship value, activity relationship value, structural relation value, functional relationship
Value, functional relationship value, behavior relation value and other relation values, and by the corresponding relation of each relation value of each input file
Write in incidence relation table;
The corresponding data of each relation value are included in input burst.
8. The big data distributed approach based on cloud computing as claimed in claim 7, characterised in that the mapping
unit includes:
mapping the input shards according to mapping tasks by means of a pre-written mapping function, the mapping including aligning the content of each input
shard in rows and columns according to a preset data format, and judging whether the position relation value, activity relation value, structural relation
value, functional relation value, functional relation value, behavior relation value and other relation values exist; each relation value that exists
is retained directly; if one or several relation values do not exist, the missing relation values are set to empty; the ordering of the relation
values is kept consistent.
9. The big data distributed processing system based on cloud computing as claimed in claim 8, characterised in that
the merging and storage unit includes:
querying from the association table all index information corresponding to each mapping output file, and inserting each mapping output file
into a segment list as one corresponding data segment; recording the position relation value, activity relation value, structural relation
value, functional relation value, functional relation value, behavior relation value and other relation values of each data segment.
10. The big data distributed processing system based on cloud computing as claimed in claim 9, characterised in that
mapping the input shards according to mapping tasks by means of the pre-written mapping function in the mapping unit also
includes judging, according to the association table, whether an input shard contains a logic error, and discarding the input shard if it does.
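The mapping behaviour recited in claims 8 to 10 (fixed ordering of relation values, empty placeholders for missing values, discarding of inconsistent shards, one segment per mapped output) can be sketched in Python as follows. Everything named here is an assumption made for illustration: the keys in `RELATION_FIELDS` are hypothetical labels for the kinds of relation value the claims name (the repeated functional entry is modeled once), and because the claims do not define what counts as a logic error, `has_logic_error` uses a stand-in rule that flags a shard carrying a relation value the association table never registered for it.

```python
# Hypothetical labels for the relation values, in one fixed order.
RELATION_FIELDS = ("position", "activity", "structure", "function", "behavior", "other")


def align_shard(shard):
    """Return the relation values in fixed order; missing ones become None."""
    return tuple(shard.get(field) for field in RELATION_FIELDS)


def has_logic_error(shard_id, shard, relation_table):
    """Stand-in consistency rule (an assumption, not taken from the claims):
    a shard is inconsistent if it holds a field the table did not register."""
    registered = relation_table.get(shard_id, set())
    return any(field not in registered for field in shard)


def map_shards(shards, relation_table):
    """Map each consistent shard to an aligned segment; drop inconsistent ones."""
    segment_list = []
    for shard_id, shard in shards.items():
        if has_logic_error(shard_id, shard, relation_table):
            continue  # discard the shard, as in claim 10
        segment_list.append((shard_id, align_shard(shard)))
    return segment_list


if __name__ == "__main__":
    shards = {
        "s1": {"position": "p1", "activity": "a1", "structure": "st1",
               "function": "f1", "behavior": "b1", "other": "o1"},
        "s2": {"position": "p2", "behavior": "b2"},
        "s3": {"position": "p3", "activity": "a3"},
    }
    # Association table: which relation values were registered for each shard.
    relation_table = {
        "s1": set(RELATION_FIELDS),
        "s2": {"position", "behavior"},
        "s3": {"position"},  # "activity" was never registered, so s3 is dropped
    }
    for segment in map_shards(shards, relation_table):
        print(segment)
```

Keeping every segment the same length and in the same field order means later merging can treat the segments as rows of one table without per-shard schema checks.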
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710130418.8A CN106951475A (en) | 2017-03-07 | 2017-03-07 | Big data distributed approach and system based on cloud computing |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106951475A (en) | 2017-07-14 |
Family
ID=59467025
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710130418.8A Pending CN106951475A (en) | 2017-03-07 | 2017-03-07 | Big data distributed approach and system based on cloud computing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106951475A (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108132970A (en) * | 2017-12-04 | 2018-06-08 | 洛阳师范学院 | Big data distributed approach and system based on cloud computing |
CN109033137A (en) * | 2018-06-06 | 2018-12-18 | 千寻位置网络有限公司 | Dynamic RINEX data storage method and device |
CN109117275A (en) * | 2018-08-31 | 2019-01-01 | 平安科技(深圳)有限公司 | Account checking method, device, computer equipment and storage medium based on data fragmentation |
CN109584068A (en) * | 2018-11-02 | 2019-04-05 | 深圳市快付通金融网络科技服务有限公司 | A kind of distribution of funds formula liquidation method, apparatus and system |
CN110019234A (en) * | 2017-12-28 | 2019-07-16 | 中国电信股份有限公司 | Method and system for fragment storing data |
CN110955637A (en) * | 2019-11-27 | 2020-04-03 | 集奥聚合(北京)人工智能科技有限公司 | Method for realizing ordering of oversized files based on low memory |
CN111339041A (en) * | 2020-03-10 | 2020-06-26 | 中国建设银行股份有限公司 | File parsing and warehousing and file generating method and device |
CN112416865A (en) * | 2020-11-20 | 2021-02-26 | 中国建设银行股份有限公司 | File processing method and device based on big data |
CN112529736A (en) * | 2020-12-28 | 2021-03-19 | 成都工百利自动化设备有限公司 | Online wave recording method and system for distributed power grid |
CN112653771A (en) * | 2021-03-15 | 2021-04-13 | 浙江贵仁信息科技股份有限公司 | Water conservancy data fragment storage method, on-demand method and processing system |
CN113608775A (en) * | 2021-06-18 | 2021-11-05 | 天津津航计算技术研究所 | Flow configuration method based on direct memory read-write |
CN113835634A (en) * | 2021-09-23 | 2021-12-24 | 中国自然资源航空物探遥感中心 | Multi-parameter data synchronous recording method and device based on annular memory double buffering |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140379632A1 (en) * | 2013-06-19 | 2014-12-25 | International Business Machines Corporation | Smarter big data processing using collaborative map reduce frameworks |
CN106202278A (en) * | 2016-07-01 | 2016-12-07 | 武汉泰迪智慧科技有限公司 | A kind of public sentiment based on data mining technology monitoring system |
2017-03-07: CN application CN201710130418.8A, published as CN106951475A (en), status: Pending
Cited By (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108132970A (en) * | 2017-12-04 | 2018-06-08 | 洛阳师范学院 | Big data distributed approach and system based on cloud computing |
CN110019234A (en) * | 2017-12-28 | 2019-07-16 | 中国电信股份有限公司 | Method and system for fragment storing data |
CN109033137B (en) * | 2018-06-06 | 2021-11-05 | 千寻位置网络有限公司 | Dynamic RINEX data storage method and device |
CN109033137A (en) * | 2018-06-06 | 2018-12-18 | 千寻位置网络有限公司 | Dynamic RINEX data storage method and device |
CN109117275A (en) * | 2018-08-31 | 2019-01-01 | 平安科技(深圳)有限公司 | Account checking method, device, computer equipment and storage medium based on data fragmentation |
WO2020042427A1 (en) * | 2018-08-31 | 2020-03-05 | 平安科技(深圳)有限公司 | Reconciliation method and apparatus based on data fragments, computer device, and storage medium |
CN109117275B (en) * | 2018-08-31 | 2024-05-28 | 平安科技(深圳)有限公司 | Account checking method and device based on data slicing, computer equipment and storage medium |
CN109584068A (en) * | 2018-11-02 | 2019-04-05 | 深圳市快付通金融网络科技服务有限公司 | A kind of distribution of funds formula liquidation method, apparatus and system |
CN110955637A (en) * | 2019-11-27 | 2020-04-03 | 集奥聚合(北京)人工智能科技有限公司 | Method for realizing ordering of oversized files based on low memory |
CN111339041A (en) * | 2020-03-10 | 2020-06-26 | 中国建设银行股份有限公司 | File parsing and warehousing and file generating method and device |
CN111339041B (en) * | 2020-03-10 | 2024-01-12 | 中国建设银行股份有限公司 | File analysis and storage method and device and file generation method and device |
CN112416865A (en) * | 2020-11-20 | 2021-02-26 | 中国建设银行股份有限公司 | File processing method and device based on big data |
CN112529736A (en) * | 2020-12-28 | 2021-03-19 | 成都工百利自动化设备有限公司 | Online wave recording method and system for distributed power grid |
CN112653771B (en) * | 2021-03-15 | 2021-06-01 | 浙江贵仁信息科技股份有限公司 | Water conservancy data fragment storage method, on-demand method and processing system |
CN112653771A (en) * | 2021-03-15 | 2021-04-13 | 浙江贵仁信息科技股份有限公司 | Water conservancy data fragment storage method, on-demand method and processing system |
CN113608775A (en) * | 2021-06-18 | 2021-11-05 | 天津津航计算技术研究所 | Flow configuration method based on direct memory read-write |
CN113608775B (en) * | 2021-06-18 | 2023-10-13 | 天津津航计算技术研究所 | Flow configuration method based on memory direct reading and writing |
CN113835634A (en) * | 2021-09-23 | 2021-12-24 | 中国自然资源航空物探遥感中心 | Multi-parameter data synchronous recording method and device based on annular memory double buffering |
CN113835634B (en) * | 2021-09-23 | 2024-09-17 | 中国自然资源航空物探遥感中心 | Multi-parameter data synchronous recording method and device based on annular memory double buffering |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106951475A (en) | Big data distributed approach and system based on cloud computing | |
WO2018214388A1 (en) | Multi-platform big data system and method for aviation electronics | |
CN107766402A (en) | A kind of building dictionary cloud source of houses big data platform | |
CN110674154B (en) | Spark-based method for inserting, updating and deleting data in Hive | |
Gilbert | Simulation: A new way of doing social science | |
CN107544984A (en) | A kind of method and apparatus of data processing | |
CN107301214A (en) | Data migration method, device and terminal device in HIVE | |
CN104036029A (en) | Big data consistency comparison method and system | |
CN103699660A (en) | Large-scale network streaming data cache-write method | |
CN108255966A (en) | A kind of data migration method and storage medium | |
CN106570145B (en) | Distributed database result caching method based on hierarchical mapping | |
WO2018214387A1 (en) | Distributed mining system and method for aviation-oriented electronic data | |
CN106528898A (en) | Method and device for converting data of non-relational database into relational database | |
Jun et al. | Cloud computing based solution to decision making | |
CN106055590A (en) | Power grid data processing method and system based on big data and graph database | |
US20230106106A1 (en) | Text backup method, apparatus, and device, and computer-readable storage medium | |
CN104219088A (en) | Hive-based network alarm information OLAP method | |
KR101955376B1 (en) | Processing method for a relational query in distributed stream processing engine based on shared-nothing architecture, recording medium and device for performing the method | |
Jin et al. | Association rules redundancy processing algorithm based on hypergraph in data mining | |
CN108132970A (en) | Big data distributed approach and system based on cloud computing | |
CN109947743A (en) | A kind of the NoSQL big data storage method and system of optimization | |
CN107679133B (en) | Mining method applicable to massive real-time PMU data | |
Ravichandran | Big Data processing with Hadoop: a review | |
Anusha et al. | Big data techniques for efficient storage and processing of weather data | |
CN116227989A (en) | Multidimensional business informatization supervision method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 20170714 |