CN110134704B - Big data cluster transaction implementation method based on distributed cache - Google Patents

Publication number
CN110134704B
Authority
CN
China
Prior art keywords
transaction
big data
access
distributed
distributed cache
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910467807.9A
Other languages
Chinese (zh)
Other versions
CN110134704A (en)
Inventor
朱喜娜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen University Tan Kah Kee College
Original Assignee
Xiamen University Tan Kah Kee College
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen University Tan Kah Kee College
Priority to CN201910467807.9A
Publication of CN110134704A
Application granted
Publication of CN110134704B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2308 Concurrency control
    • G06F16/2336 Pessimistic concurrency control approaches, e.g. locking or multiple versions without time stamps
    • G06F16/2343 Locking methods, e.g. distributed locking or locking implementation details
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/235 Update request formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2365 Ensuring data consistency and integrity

Abstract

The invention relates to a distributed-cache-based method for implementing transactions on a big data cluster. Access proxy modules perform syntax parsing on all end-user access requests to the big data platform and intercept transaction-related access. Transaction-related operation requests, including adding rows, deleting rows, changing column data and the like, are intercepted by a transaction interception module, and a distributed cache transaction management cluster manages transaction consistency. The databases and tables of the big data warehouse that are undergoing transaction operations are locked, thereby achieving distributed transaction consistency on the big data platform.

Description

Big data cluster transaction implementation method based on distributed cache
Technical Field
The invention relates to the technical field of big data, in particular to a big data cluster transaction implementation method based on distributed cache.
Background
The difficulty with prior-art big data platform architectures is that transaction consistency cannot be achieved; that is, distributed transactions cannot currently be implemented on a big data cluster platform. The prior art is mainly the single-engine database transaction, in which a group of database commands is executed against a single database as one logical unit of work, and the operations are either all executed or not executed at all. This technique, however, can only implement transactions in a non-distributed environment.
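For illustration only, the all-or-nothing behaviour of a single-database transaction can be sketched with Python's built-in sqlite3 module. The `account` table, the `transfer` function and the balances are hypothetical and do not come from the patent; the sketch only shows the kind of non-distributed transaction the background refers to.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` between two rows as one logical unit of work."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            row = conn.execute(
                "SELECT balance FROM account WHERE name = ?", (src,)
            ).fetchone()
            if row[0] < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("a", 100), ("b", 0)])
conn.commit()

ok = transfer(conn, "a", "b", 30)       # both updates are applied
failed = transfer(conn, "a", "b", 500)  # the first update is rolled back
balances = dict(conn.execute("SELECT name, balance FROM account"))
```

The second call leaves the balances untouched, which is the "either all executed or not executed at all" property; nothing here coordinates more than one database node.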
Disclosure of Invention
In view of this, the present invention provides a method for implementing big data cluster transaction based on distributed cache, which can implement distributed database transaction on a big data platform.
The invention is realized by adopting the following scheme: a big data cluster transaction implementation method based on distributed cache specifically comprises the following steps:
step S1: providing an access proxy module that intercepts and proxies the access requests of all end users;
step S2: providing a syntax parsing module that parses the access requests from step S1 and passes those that contain a transaction operation on to the next step;
step S3: providing a transaction interception module that receives from the syntax parsing module the table involved in a transaction operation, places a transaction lock on that table, and prohibits queries and operations on the table for the duration of the transaction operation;
step S4: defining the transaction attributes of the table through the transaction interception module and storing them in the distributed cache cluster, so that access requests from every node of the distributed big data cluster can obtain the transaction attributes of the distributed big data database table directly from the distributed cache, thereby realizing distributed transactions. The distributed cache transaction management cluster can serve different access proxy modules or other access interfaces, achieving transaction consistency and avoiding transaction conflicts.
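The four steps can be wired together roughly as follows. This is a minimal sketch under stated assumptions, not the patented implementation: `SharedCache` is a plain dictionary standing in for the distributed cache cluster, the `TRANSACTIONAL` keyword set and the `"in_transaction"` attribute value are hypothetical, and the parsing is deliberately crude.

```python
from dataclasses import dataclass, field

# Assumed set of statement verbs treated as transaction operations.
TRANSACTIONAL = {"INSERT", "UPDATE", "DELETE", "ALTER"}

@dataclass
class SharedCache:
    """Stand-in for the distributed cache cluster (a plain dict here)."""
    attrs: dict = field(default_factory=dict)

@dataclass
class AccessProxy:
    """S1: every end-user request enters through handle()."""
    cache: SharedCache

    def parse(self, sql):
        """S2 (crude): return (is_transactional, table_name)."""
        words = sql.replace("(", " ").split()
        verb = words[0].upper()
        # In these simple statements the table follows one of these keywords.
        markers = {"INTO", "FROM", "UPDATE", "TABLE"}
        table = next(
            words[i + 1] for i, w in enumerate(words) if w.upper() in markers
        )
        return verb in TRANSACTIONAL, table.lower()

    def handle(self, sql):
        is_txn, table = self.parse(sql)
        if self.cache.attrs.get(table) == "in_transaction":
            return "blocked"                            # S3: table is locked
        if is_txn:
            self.cache.attrs[table] = "in_transaction"  # S4: publish attribute
            return "transaction started"
        return "executed"

    def finish(self, table):
        """Clear the transaction attribute once the transaction has ended."""
        self.cache.attrs.pop(table, None)

cache = SharedCache()
proxy_a, proxy_b = AccessProxy(cache), AccessProxy(cache)
r1 = proxy_a.handle("INSERT INTO orders VALUES (1)")
r2 = proxy_b.handle("SELECT * FROM orders")
r3 = proxy_b.handle("SELECT * FROM customers")
proxy_a.finish("orders")
r4 = proxy_b.handle("SELECT * FROM orders")
```

Because both proxies consult the same cache, a transaction started through `proxy_a` is visible to requests arriving at `proxy_b`, which is the role the text assigns to the distributed cache.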
Further, in step S1, the access proxy module is the sole mandatory interface through which external end users access the big data, so that all transaction-related access can be effectively intercepted and processed and transaction consistency is achieved for all users.
Preferably, the syntax parsing module defines a unified SQL standard and syntax and parses SQL uniformly, converting an input character string into a structure that describes the string, so that the computer can more easily understand what the user has entered. This stage comprises three processes: lexical analysis, syntax analysis, and output of an abstract syntax tree.
1) Lexical analysis: a deterministic finite automaton (DFA) converts the input character stream into words (tokens) according to the defined lexical rules.
2) Syntax analysis: syntax analysis follows lexical analysis and takes its result as input. On that basis it determines whether the words entered by the user form a statement containing a transaction operation; for example, INSERT INTO table VALUES is such a statement. The table being operated on is then handed to the transaction interception module for processing. Syntax parsing can also analyze transactions at the finer granularity of a table partition, so that transaction interception applies to the partition alone; other partitions remain accessible and are not locked, which improves transaction concurrency.
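The lexical-analysis step can be illustrated with a small table-driven deterministic finite automaton. This is only a sketch: the state names, character classes and token kinds below are assumptions chosen for the example, and a production SQL lexer would also need string literals, comments and multi-character operators.

```python
def char_class(ch):
    """Map a character to the input class the automaton works with."""
    if ch.isalpha() or ch == "_":
        return "letter"
    if ch.isdigit():
        return "digit"
    if ch.isspace():
        return "space"
    return "symbol"

# Transition table: (state, character class) -> next state.
# A missing entry means "the current token ends here".
TRANSITIONS = {
    ("start", "letter"): "word",
    ("start", "digit"): "number",
    ("start", "symbol"): "symbol",
    ("word", "letter"): "word",
    ("word", "digit"): "word",
    ("number", "digit"): "number",
}

def tokenize(text):
    """Return a list of (kind, text) tokens for the input string."""
    tokens, state, current = [], "start", ""
    for ch in text + " ":              # trailing space flushes the last token
        cls = char_class(ch)
        nxt = TRANSITIONS.get((state, cls))
        if nxt is None:
            if current:
                tokens.append((state, current))
            state, current = "start", ""
            nxt = TRANSITIONS.get(("start", cls))
            if nxt is None:            # whitespace: nothing to start
                continue
        state = nxt
        current += ch
    return tokens

tokens = tokenize("INSERT INTO t1 VALUES (42)")
```

Each (state, class) pair has at most one successor, which is what makes the automaton deterministic; the resulting token list is the input that syntax analysis consumes.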
Further, in step S3, prohibiting modification and query operations on the table for the duration of the transaction operation further includes: if the table has partitions, transaction-level locking is applied only to the partition concerned, which achieves transaction consistency.
Further, in step S3, the transaction interception module filters the access requests of all terminals and performs a transaction consistency check on transaction-related queries and operations; if the corresponding table is undergoing a transaction operation, the access request enters a blocked state and is executed only after the transaction has finished.
In particular, the access proxy module performs syntax parsing on all end-user access requests to the big data platform and intercepts transaction-related access. Transaction-related operation requests, including adding rows, deleting rows, changing column data and the like, are intercepted by the transaction interception module, and the distributed cache transaction management cluster manages transaction consistency. The databases and tables of the big data warehouse that are undergoing transaction operations are locked, thereby achieving distributed transaction consistency on the big data platform.
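The blocking behaviour described for the transaction interception module can be sketched within a single process using a condition variable. `TransactionGate` and its method names are hypothetical; a real deployment would have to coordinate across nodes through the distributed cache rather than through an in-process lock.

```python
import threading

class TransactionGate:
    """Blocks access to a table while a transaction on it is in progress."""

    def __init__(self):
        self._cond = threading.Condition()
        self._busy = set()          # tables currently inside a transaction

    def begin(self, table):
        with self._cond:
            while table in self._busy:
                self._cond.wait()
            self._busy.add(table)

    def end(self, table):
        with self._cond:
            self._busy.discard(table)
            self._cond.notify_all()  # wake every blocked request

    def access(self, table, action):
        # The action runs while the condition's lock is held, which keeps the
        # sketch simple but serializes all accesses going through this gate.
        with self._cond:
            while table in self._busy:
                self._cond.wait()
            return action()

gate = TransactionGate()
events = []

gate.begin("orders")
reader = threading.Thread(
    target=lambda: gate.access("orders", lambda: events.append("query ran"))
)
reader.start()                       # blocks until the transaction ends
events.append("transaction work")
gate.end("orders")
reader.join()
```

The query thread cannot proceed until `end("orders")` is called, so the transaction's work is always recorded before the query runs.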
Compared with the prior art, the invention has the following beneficial effects:
1. The invention realizes distributed transaction management and solves the problem of inconsistent transactions in a distributed big data warehouse.
2. The invention achieves distributed transaction consistency in a distributed big data environment, and achieves transaction consistency for distributed applications through the transaction attributes held in the cluster's distributed cache.
3. The transaction granularity of the invention is the table partition: because locking is finer-grained, a transaction can lock a single table partition without affecting access to, or transaction operations on, other partitions, which yields better transaction concurrency.
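The effect of partition-level granularity can be expressed as a conflict test between lock keys. In this sketch a lock key is a hypothetical `(table, partition)` pair with `None` meaning the whole table; treating a whole-table request as conflicting with any locked partition of that table is an assumption, since the text does not address that case.

```python
def conflicts(held, requested):
    """Return True if `requested` must wait for the lock described by `held`.

    Each argument is a (table, partition) pair; partition None means the
    request or lock covers the whole table (assumed convention).
    """
    held_table, held_part = held
    req_table, req_part = requested
    if held_table != req_table:
        return False
    return held_part is None or req_part is None or held_part == req_part

held = ("sales", "dt=2019-05-31")
same_partition = conflicts(held, ("sales", "dt=2019-05-31"))
other_partition = conflicts(held, ("sales", "dt=2019-06-01"))
whole_table = conflicts(held, ("sales", None))
other_table = conflicts(held, ("orders", None))
```

Only requests touching the locked partition wait; a request for a different partition of the same table passes straight through, which is the source of the concurrency gain claimed above.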
Drawings
FIG. 1 is a schematic block diagram of an embodiment of the present invention.
Detailed Description
The invention is further explained below with reference to the drawings and the embodiments.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the example embodiments of the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should further be understood that the terms "comprises" and/or "comprising", when used in this specification, specify the presence of the stated features, steps, operations, devices, components, and/or combinations thereof.
As shown in fig. 1, this embodiment provides a method for implementing a big data cluster transaction based on a distributed cache, which specifically includes the following steps:
step S1: providing an access proxy module that intercepts and proxies the access requests of all end users;
step S2: providing a syntax parsing module that parses the access requests from step S1 and passes those that contain a transaction operation on to the next step;
step S3: providing a transaction interception module that receives from the syntax parsing module the table involved in a transaction operation, places a transaction lock on that table, and prohibits queries and operations on the table for the duration of the transaction operation;
step S4: defining the transaction attributes of the table through the transaction interception module and storing them in the distributed (in-memory) cache cluster, so that access requests from every node of the distributed big data cluster can obtain the transaction attributes of the distributed big data database table directly from the distributed cache, thereby realizing distributed transactions. The distributed cache transaction management cluster can serve different access proxy modules or other access interfaces, achieving transaction consistency and avoiding transaction conflicts.
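One way to picture the transaction attributes held in the cache is an atomic set-if-absent operation keyed by table. The class below is an in-process stand-in for a distributed cache, and the attribute fields `owner` and `state` are hypothetical; the patent does not specify the attribute format or the cache product.

```python
import threading

class TransactionAttributeCache:
    """In-process stand-in for the distributed cache cluster."""

    def __init__(self):
        self._lock = threading.Lock()
        self._attrs = {}

    def put_if_absent(self, table, attrs):
        """Register transaction attributes unless the table already has some."""
        with self._lock:
            if table in self._attrs:
                return False
            self._attrs[table] = dict(attrs)
            return True

    def get(self, table):
        """Return a copy of the table's attributes, or None if it is free."""
        with self._lock:
            found = self._attrs.get(table)
            return dict(found) if found is not None else None

    def remove(self, table, owner):
        """Clear the attributes, but only for the node that registered them."""
        with self._lock:
            current = self._attrs.get(table)
            if current is not None and current.get("owner") == owner:
                del self._attrs[table]
                return True
            return False

cache = TransactionAttributeCache()
first = cache.put_if_absent("orders", {"owner": "node-1", "state": "locked"})
second = cache.put_if_absent("orders", {"owner": "node-2", "state": "locked"})
seen = cache.get("orders")
wrong_release = cache.remove("orders", "node-2")
released = cache.remove("orders", "node-1")
```

Only one of two competing registrations succeeds, and every node reading the cache sees the same owner, which is how conflicting transactions on the same table are avoided in this sketch.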
In this embodiment, in step S1, the access proxy module is the sole mandatory interface through which external end users access the big data, so that all transaction-related access can be effectively intercepted and processed and transaction consistency is achieved for all users.
Preferably, in this embodiment, the syntax parsing module defines a unified SQL standard and syntax and parses SQL uniformly, converting an input character string into a structure that describes the string, so that the computer can more easily understand what the user has entered. This stage comprises three processes: lexical analysis, syntax analysis, and output of an abstract syntax tree.
1) Lexical analysis: a deterministic finite automaton (DFA) converts the input character stream into words (tokens) according to the defined lexical rules.
2) Syntax analysis: syntax analysis follows lexical analysis and takes its result as input. On that basis it determines whether the words entered by the user form a statement containing a transaction operation; for example, INSERT INTO table VALUES is such a statement. The table being operated on is then handed to the transaction interception module for processing. Syntax parsing can also analyze transactions at the finer granularity of a table partition, so that transaction interception applies to the partition alone; other partitions remain accessible and are not locked, which improves transaction concurrency.
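The outcome of syntax analysis, a structure naming the operation, the table and optionally the partition, might look like the following. This is a sketch using a regular expression rather than a full grammar; the `Statement` fields are hypothetical, and the Hive-style `PARTITION (...)` clause is an assumption, since the patent does not name an SQL dialect.

```python
import re
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Statement:
    operation: str            # INSERT, DELETE, UPDATE or SELECT
    table: str
    partition: Optional[str]  # partition spec text, or None
    transactional: bool       # True if the interceptor must handle it

PATTERN = re.compile(
    r"""^\s*(?P<op>INSERT\s+INTO|DELETE\s+FROM|UPDATE|SELECT\b.*?\bFROM)\s+
        (?:TABLE\s+)?(?P<table>\w+)
        (?:\s+PARTITION\s*\((?P<part>[^)]*)\))?""",
    re.IGNORECASE | re.VERBOSE | re.DOTALL,
)

def analyze(sql):
    """Classify a statement and extract the table and partition it touches."""
    m = PATTERN.match(sql)
    if m is None:
        raise ValueError(f"unsupported statement: {sql!r}")
    op = m.group("op").split()[0].upper()
    part = m.group("part")
    return Statement(
        op,
        m.group("table").lower(),
        part.replace(" ", "") if part else None,
        transactional=op != "SELECT",
    )

write = analyze("INSERT INTO TABLE sales PARTITION (dt='2019-05-31') VALUES (1)")
read = analyze("select * from sales")
```

The `table` and `partition` fields are what a transaction interception module would need in order to lock only the affected partition.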
In this embodiment, in step S3, prohibiting modification and query operations on the table for the duration of the transaction operation further includes: if the table has partitions, transaction-level locking is applied only to the partition concerned, which achieves transaction consistency.
In this embodiment, in step S3, the transaction interception module filters the access requests of all terminals and performs a transaction consistency check on transaction-related queries and operations; if the corresponding table is undergoing a transaction operation, the access request enters a blocked state and is executed only after the transaction has finished.
In particular, in this embodiment, the access proxy module performs syntax parsing on all end-user access requests to the big data platform and intercepts transaction-related access. Transaction-related operation requests, including adding rows, deleting rows, changing column data and the like, are intercepted by the transaction interception module, and the distributed cache transaction management cluster manages transaction consistency. The databases and tables of the big data warehouse that are undergoing transaction operations are locked, thereby achieving distributed transaction consistency on the big data platform.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing is directed to preferred embodiments of the present invention; other and further embodiments may be devised without departing from its basic scope, which is determined by the claims that follow. Any simple modification, equivalent change, or adaptation of the above embodiments made according to the technical essence of the present invention falls within the protection scope of the technical solution of the present invention.

Claims (3)

1. A distributed-cache-based method for implementing transactions on a big data cluster, characterized by comprising the following steps:
step S1: providing an access proxy module that intercepts and proxies the access requests of all end users;
step S2: providing a syntax parsing module that parses the access requests from step S1 and passes those that contain a transaction operation on to the next step;
step S3: providing a transaction interception module that receives from the syntax parsing module the table involved in a transaction operation, places a transaction lock on that table, and prohibits modification and query operations on the table for the duration of the transaction operation;
step S4: defining the transaction attributes of the table through the transaction interception module and storing them on the distributed cache cluster, so that access requests from every node of the distributed big data cluster can obtain the transaction attributes of the distributed big data database table directly from the distributed cache, thereby realizing distributed transactions;
wherein, in step S3, prohibiting modification and query operations on the table for the duration of the transaction operation further includes: if the table has partitions, transaction-level locking is applied only to the partition concerned, which achieves transaction consistency.
2. The distributed-cache-based method for implementing transactions on a big data cluster according to claim 1, wherein in step S1 the access proxy module is the sole mandatory interface through which external end users access the big data.
3. The method according to claim 1, wherein in step S3 the transaction interception module filters the access requests of all terminals and performs a transaction consistency check on transaction-related modification and query operations; if the corresponding table is undergoing a transaction operation, the access request enters a blocked state and is executed only after the transaction has finished.
CN201910467807.9A 2019-05-31 2019-05-31 Big data cluster transaction implementation method based on distributed cache Active CN110134704B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910467807.9A CN110134704B (en) 2019-05-31 2019-05-31 Big data cluster transaction implementation method based on distributed cache

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910467807.9A CN110134704B (en) 2019-05-31 2019-05-31 Big data cluster transaction implementation method based on distributed cache

Publications (2)

Publication Number Publication Date
CN110134704A CN110134704A (en) 2019-08-16
CN110134704B CN110134704B (en) 2021-11-02

Family

ID=67583425

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910467807.9A Active CN110134704B (en) 2019-05-31 2019-05-31 Big data cluster transaction implementation method based on distributed cache

Country Status (1)

Country Link
CN (1) CN110134704B (en)


Also Published As

Publication number Publication date
CN110134704A (en) 2019-08-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20190816

Assignee: ZHANGZHOU KEHUI SPECIAL VEHICLE MANUFACTURING CO.,LTD.

Assignor: XIAMEN UNIVERSITY TAN KAH KEE College

Contract record no.: X2023350000214

Denomination of invention: A Method for Implementing Big Data Cluster Transactions Based on Distributed Caching

Granted publication date: 20211102

License type: Common License

Record date: 20230428

EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20190816

Assignee: Zhangzhou Ouli Food Co.,Ltd.

Assignor: XIAMEN UNIVERSITY TAN KAH KEE College

Contract record no.: X2023980052503

Denomination of invention: A Transaction Implementation Method for Big Data Clusters Based on Distributed Caching

Granted publication date: 20211102

License type: Common License

Record date: 20231218

Application publication date: 20190816

Assignee: Longhai Oubeiluo Food Co.,Ltd.

Assignor: XIAMEN UNIVERSITY TAN KAH KEE College

Contract record no.: X2023980052501

Denomination of invention: A Transaction Implementation Method for Big Data Clusters Based on Distributed Caching

Granted publication date: 20211102

License type: Common License

Record date: 20231218

EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20190816

Assignee: Longhai Dongzhen Food Co.,Ltd.

Assignor: XIAMEN UNIVERSITY TAN KAH KEE College

Contract record no.: X2023980052499

Denomination of invention: A Transaction Implementation Method for Big Data Clusters Based on Distributed Caching

Granted publication date: 20211102

License type: Common License

Record date: 20231221

Application publication date: 20190816

Assignee: Longhai Songcheng Food Co.,Ltd.

Assignor: XIAMEN UNIVERSITY TAN KAH KEE College

Contract record no.: X2023980050354

Denomination of invention: A Transaction Implementation Method for Big Data Clusters Based on Distributed Caching

Granted publication date: 20211102

License type: Common License

Record date: 20231221

Application publication date: 20190816

Assignee: Fujian Yihao Construction Machinery Co.,Ltd.

Assignor: XIAMEN UNIVERSITY TAN KAH KEE College

Contract record no.: X2023980050291

Denomination of invention: A Transaction Implementation Method for Big Data Clusters Based on Distributed Caching

Granted publication date: 20211102

License type: Common License

Record date: 20231221

EE01 Entry into force of recordation of patent licensing contract