CN103761255A - Method and system for optimizing data storage of NoSQL mode - Google Patents

Method and system for optimizing data storage of NoSQL mode

Info

Publication number
CN103761255A
Authority
CN
China
Prior art keywords
cache table
local cache
time
database
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201310741340.5A
Other languages
Chinese (zh)
Inventor
崔晶晶
林佳婕
刘立娜
张帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING GEO POLYMERIZATION NETWORK TECHNOLOGY Co Ltd
Original Assignee
BEIJING GEO POLYMERIZATION NETWORK TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIJING GEO POLYMERIZATION NETWORK TECHNOLOGY Co Ltd
Priority to CN201310741340.5A
Publication of CN103761255A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2365 Ensuring data consistency and integrity
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24552 Database cache management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Computer Security & Cryptography (AREA)
  • Memory System Of A Hierarchy Structure (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of databases and discloses a method and a system for optimizing data storage in NoSQL mode. The method comprises a query flow, an aging flow and a synchronization flow. The query flow uses asynchronous queries, separating the query process from the process of returning results to improve query efficiency. The aging flow reclaims data nodes that have not been accessed for a long time, so that memory is used effectively. The synchronization flow updates the local cache table in real time whenever the database is updated, to ensure that the cached data is valid.

Description

Method and system for optimizing NoSQL-mode data storage
Technical field
The present invention relates to the technical field of data storage, and in particular to optimizing data storage in NoSQL mode.
Background
Large business application systems generally store their data in a database. Database access, however, is constrained by I/O and the network, so read and write speed hits a bottleneck. This slows business processing and directly affects the whole business flow, and the problem is especially pronounced in high-speed streaming data processing.
To improve query speed for high-speed data, existing solutions include:
1. using higher-performance servers and increasing network bandwidth and throughput;
2. using a distributed database or a NoSQL database;
3. providing an in-memory caching mechanism.
These existing solutions have the following drawbacks:
Higher-performance servers and more network bandwidth are very costly and are still limited by the network. Speed improves to some extent, but not enough to meet the demands of high-volume stream processing.
A distributed database adds synchronization complexity and does not solve the network bandwidth problem. A NoSQL database can, to some extent, meet the demands of large data volumes and highly concurrent reads and writes, but it likewise adds synchronization complexity.
An in-memory cache effectively solves the data transfer problem, but when the data volume is too large the cache is limited by available memory, which directly affects the business processing flow.
Therefore, how to speed up the transfer of large volumes of high-speed data without being limited by the network or by storage capacity is a problem in urgent need of a solution.
Summary of the invention
In view of this, the invention provides a method and a system for optimizing NoSQL-mode data storage to address the problems above.
To this end, the invention provides a method for optimizing NoSQL-mode data storage, comprising a query flow, an aging flow and a synchronization flow, wherein:
The query flow uses asynchronous queries, separating the query process from the process of returning results to improve query efficiency;
The aging flow reclaims data nodes that have not been accessed for a long time, so that memory is used effectively;
The synchronization flow updates the local cache table in real time whenever the database is updated, to ensure that the cached data is valid.
Preferably, the query flow is as follows: every query is made by KEY value. The KEY is first looked up in the local cache table. If a result is found, the access timestamp of that KEY in the local cache table is updated and the result is returned directly. If no result is found, a query request is sent to the remote NoSQL database in asynchronous mode: the caller does not wait for the query result and returns immediately. Once the database has found the result, the local cache table is updated automatically, so the next query for the same entry finds it normally.
Preferably, the aging flow is as follows: a daemon thread repeatedly traverses the last-access time of every node, and a node is deleted once the time since its last access exceeds the configured aging period.
Preferably, the synchronization flow is as follows: when the local machine performs a write operation on data, it writes the local cache table and, at the same time, writes the NoSQL database and sends the synchronization content to the cache tables of the other machines, so that the cache tables on multiple machines are synchronized with the data in the NoSQL database.
In addition, the invention provides a system for optimizing NoSQL-mode data storage, comprising a NoSQL database, a local cache table, a remote cache table, a query module, an aging module and a synchronization module. The local cache table is interconnected with the NoSQL database, the remote cache table, the query module, the aging module and the synchronization module, wherein:
The query module uses asynchronous queries, separating the query process from the process of returning results to improve query efficiency;
The aging module reclaims data nodes that have not been accessed for a long time, so that memory is used effectively;
The synchronization module updates the local cache table in real time whenever the database is updated, to ensure that the cached data is valid.
Preferably, the query module makes every query by KEY value. The KEY is first looked up in the local cache table. If a result is found, the access timestamp of that KEY in the local cache table is updated and the result is returned directly. If no result is found, a query request is sent to the remote NoSQL database in asynchronous mode: the caller does not wait for the query result and returns immediately. Once the database has found the result, the local cache table is updated automatically, so the next query for the same entry finds it normally.
Preferably, the aging module uses a daemon thread to repeatedly traverse the last-access time of every node, and deletes a node once the time since its last access exceeds the configured aging period.
Preferably, in the synchronization module, when the local machine performs a write operation on data, it writes the local cache table and, at the same time, writes the NoSQL database and sends the synchronization content to the cache tables of the other machines, so that the cache tables on multiple machines are synchronized with the data in the NoSQL database.
With the method and system provided by the invention, data access becomes more efficient, handling data volumes on the order of millions per second. Asynchronous database operation improves the processing speed of the system. The local cache, with its timestamp-based node aging, reclaims data nodes so that good query performance is obtained with limited memory. The synchronization mechanism keeps data consistent when multiple devices perform reads and writes. Finally, because a NoSQL database is used directly, baseline query efficiency is also assured.
Brief description of the drawings
Fig. 1 is a flowchart of the method for optimizing NoSQL-mode data storage according to an embodiment of the invention;
Fig. 2 is a structural diagram of the system for optimizing NoSQL-mode data storage according to an embodiment of the invention.
Detailed description of the embodiments
A NoSQL database can effectively improve data query speed and is a good choice when query demand exceeds the order of tens of thousands, but on its own it still cannot meet query demand of higher orders of magnitude. The invention therefore adds a local cache mechanism: after each query, the result from the remote database is cached in local memory, so that data streams of millions to tens of millions of items per second can be processed.
In some cases the query result does not need to be known synchronously, and asynchronous queries can be used: the query process is separated from the process of returning results. Some worker threads are responsible for sending requests and other worker threads for receiving results, so the main flow gets a fast response and efficiency improves.
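The separation of sending and receiving can be illustrated with a minimal Python sketch. It is an illustration only, not the implementation described in the application: the dictionary `REMOTE_DB` stands in for the remote NoSQL database, and the names `sender`, `receiver` and `submit` are hypothetical.

```python
import queue
import threading

# Hypothetical stand-in for the remote NoSQL database.
REMOTE_DB = {"k1": "v1", "k2": "v2"}

request_queue = queue.Queue()   # keys waiting to be looked up
result_queue = queue.Queue()    # (key, value) pairs coming back
results = {}                    # filled in by the receiver thread


def sender():
    """Worker that takes keys off the request queue and queries the database."""
    while True:
        key = request_queue.get()
        if key is None:              # sentinel: pass shutdown on and stop
            result_queue.put(None)
            break
        result_queue.put((key, REMOTE_DB.get(key)))


def receiver():
    """Worker that collects results independently of whoever asked for them."""
    while True:
        item = result_queue.get()
        if item is None:
            break
        key, value = item
        results[key] = value


def submit(key):
    """Main flow: enqueue the request and return at once, without the result."""
    request_queue.put(key)


threads = [threading.Thread(target=sender), threading.Thread(target=receiver)]
for t in threads:
    t.start()
submit("k1")
submit("k2")
request_queue.put(None)          # ask both workers to shut down
for t in threads:
    t.join()
```

The main flow only ever touches `request_queue`, so its latency does not depend on how long the database takes to answer.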
This approach improves efficiency at the cost of memory. Memory is typically at most a few tens of gigabytes, so if too many nodes are cached, after some time new nodes may no longer fit in the cache. An aging mechanism is therefore added. It evicts the nodes whose last access lies furthest in the past, reclaiming data nodes that have not been accessed for a long time, so that good query performance is obtained with limited memory.
Note also that data changes constantly. To keep the cached data consistent with the data in the database, real-time synchronization is performed: whenever the database is updated, the local cache table is updated as well, to ensure that the cached data is valid.
The basic flow of the method is described in detail below with reference to Fig. 1.
Query flow: every query is made by KEY value. The KEY is first looked up in the local table. If a result is found, the access timestamp of that KEY in the local table is updated and the result is returned directly. If no result is found, a query request is sent to the remote NoSQL database in asynchronous mode: the caller does not wait for the query result and returns immediately. Once the database has found the result, the local table is updated automatically, so the next query for the same entry finds it normally.
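The query flow can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the applicant's code: the class `AsyncCache`, its `remote_lookup` callable and the `wait_idle` demo helper are all hypothetical, and one background thread per miss is used only to keep the example short.

```python
import threading
import time


class AsyncCache:
    """Local cache table whose misses are filled in asynchronously."""

    def __init__(self, remote_lookup):
        self._remote_lookup = remote_lookup   # hypothetical: key -> value or None
        self._table = {}                      # key -> [value, last_access_time]
        self._inflight = {}                   # key -> thread fetching that key
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            entry = self._table.get(key)
            if entry is not None:
                entry[1] = time.monotonic()   # refresh the access timestamp
                return entry[0]
            if key not in self._inflight:     # issue one remote query per key
                worker = threading.Thread(
                    target=self._fetch, args=(key,), daemon=True
                )
                self._inflight[key] = worker
                worker.start()
        return None                           # miss: return without waiting

    def _fetch(self, key):
        value = self._remote_lookup(key)      # the slow remote query
        with self._lock:
            if value is not None:
                self._table[key] = [value, time.monotonic()]
            self._inflight.pop(key, None)

    def wait_idle(self):
        """Demo helper only: block until outstanding fetches have finished."""
        while True:
            with self._lock:
                workers = list(self._inflight.values())
            if not workers:
                return
            for worker in workers:
                worker.join()


remote = {"10.0.0.1": "alice"}        # stand-in for the remote NoSQL database
cache = AsyncCache(remote.get)
first = cache.get("10.0.0.1")         # miss: None is returned immediately
cache.wait_idle()                     # in real use the caller would not wait
second = cache.get("10.0.0.1")        # hit: served from the local table
```

The first lookup for a key always comes back empty; the design accepts this in exchange for a main flow that never blocks on the network.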
Aging flow: a daemon thread repeatedly traverses the last-access time of every node, and a node is deleted once the time since its last access exceeds the configured aging period.
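A minimal Python sketch of the aging flow is given below. The function names, the `(value, last_access_time)` entry layout and the parameters `max_idle_seconds` and `interval_seconds` are assumptions made for illustration; the application does not specify them.

```python
import threading
import time


def purge_expired(table, max_idle_seconds, now=None):
    """Delete every entry idle for longer than max_idle_seconds.

    `table` maps key -> (value, last_access_time). Returns the deleted keys.
    """
    now = time.monotonic() if now is None else now
    expired = [
        key for key, (_, last_access) in table.items()
        if now - last_access > max_idle_seconds
    ]
    for key in expired:
        del table[key]
    return expired


def start_aging_thread(table, lock, max_idle_seconds, interval_seconds, stop_event):
    """Start a daemon thread that sweeps the table every interval_seconds."""
    def loop():
        # Event.wait returns False on timeout, True once stop_event is set.
        while not stop_event.wait(interval_seconds):
            with lock:
                purge_expired(table, max_idle_seconds)

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread


# Deterministic demonstration with an explicit clock value.
table = {"old": ("v", 0.0), "fresh": ("v", 95.0)}
removed = purge_expired(table, max_idle_seconds=60.0, now=100.0)
```

In a running system `start_aging_thread` would be called once at startup with the same lock that guards lookups, and the timestamps would come from `time.monotonic()`. A full traversal costs time proportional to the table size, so the sweep interval trades memory reclaimed against CPU spent.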
Synchronization flow: when the local machine performs a write operation on data, it writes the local table and, at the same time, writes the NoSQL database and sends the synchronization content to the cache tables of the other machines, so that the cache tables on multiple machines are synchronized with the data in the NoSQL database.
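The synchronization flow can be sketched as a write-through to three places. In the following Python sketch the class `Node`, its `peers` list and the method `apply_sync` are hypothetical, a shared dictionary stands in for the NoSQL database, and a direct method call stands in for whatever network message carries the synchronization content.

```python
class Node:
    """One machine with its own local cache table."""

    def __init__(self, name, database):
        self.name = name
        self.database = database   # shared dict standing in for the NoSQL database
        self.cache = {}            # this machine's local cache table
        self.peers = []            # other machines to keep in sync

    def write(self, key, value):
        self.cache[key] = value            # 1. write the local cache table
        self.database[key] = value         # 2. write the NoSQL database
        for peer in self.peers:            # 3. send the change to other machines
            peer.apply_sync(key, value)

    def apply_sync(self, key, value):
        """Apply a change that originated on another machine."""
        self.cache[key] = value


db = {}
a, b = Node("a", db), Node("b", db)
a.peers.append(b)
b.peers.append(a)
a.write("u1", "online")
```

After the write, both cache tables and the database hold the same value for `u1`. Failure handling and the ordering of concurrent writes from different machines are outside the scope of this sketch.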
Fig. 2 shows the structure of the system for optimizing NoSQL-mode data storage according to an embodiment of the invention, which is described in detail below.
The system comprises a NoSQL database, a local cache table, a remote cache table, a query module, an aging module and a synchronization module. The local cache table is interconnected with the NoSQL database, the remote cache table, the query module, the aging module and the synchronization module. The interaction of the query module, the aging module and the synchronization module with the cache tables and the database is described in detail below.
The query module makes every query by KEY value. The KEY is first looked up in the local table. If a result is found, the access timestamp of that KEY in the local table is updated and the result is returned directly. If no result is found, a query request is sent to the remote NoSQL database in asynchronous mode: the caller does not wait for the query result and returns immediately. Once the database has found the result, the local table is updated automatically, so the next query for the same entry finds it normally.
The aging module uses a daemon thread to repeatedly traverse the last-access time of every node, and deletes a node once the time since its last access exceeds the configured aging period.
When the local machine performs a write operation on data, it writes the local table and, at the same time, the synchronization module writes the NoSQL database and sends the synchronization content to the cache tables of the other machines, so that the cache tables on multiple machines are synchronized with the data in the NoSQL database.
The following describes in detail how the method and system of the invention are used to associate HTTP logs with users quickly.
Most access on the live network is dial-up, so an IP address does not map one-to-one to a user account. When HTTP messages are analyzed for logging, the user's unique identification information therefore has to be looked up in reverse from the user IP in the message. HTTP messages arrive at a rate on the order of millions per second, which an ordinary database cannot handle; the method of the invention solves this problem. The detailed process is as follows:
IP-to-user identification: an IP->USER_INFO cache table is set up. The user source IP at layer 3 of the HTTP message is used to query the IP->USER_INFO cache table. If the entry is in the cache table, the user information is returned directly. If it is not, a query request is sent to the remote TT database and the call returns without waiting for the query result. When the database query result comes back, it is added to the local cache table, so that when another HTTP message from this IP arrives, the local table lookup finds the user account information normally.
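The IP-to-user lookup can be traced step by step with a small single-threaded simulation. It replaces real asynchronous I/O with an explicit `complete_pending` call so that the order of events is visible; the names `REMOTE_TT`, `tag_message` and `complete_pending` and the sample address and user are hypothetical.

```python
# Hypothetical stand-in for the remote TT database.
REMOTE_TT = {"10.1.2.3": "user-42"}

user_cache = {}   # local IP->USER_INFO cache table
pending = set()   # source IPs whose lookup has been issued but not answered


def tag_message(src_ip):
    """Return the user info for an HTTP message's source IP, or None on a miss."""
    user = user_cache.get(src_ip)
    if user is None:
        pending.add(src_ip)       # issue the request and do not wait for it
    return user


def complete_pending():
    """Simulate the database replies arriving some time later."""
    for ip in list(pending):
        user = REMOTE_TT.get(ip)
        if user is not None:
            user_cache[ip] = user  # add the result to the local cache table
        pending.discard(ip)


first = tag_message("10.1.2.3")    # first message from this IP: miss
complete_pending()                 # reply arrives, local table is updated
second = tag_message("10.1.2.3")   # next message from this IP: hit
```

The first message from an address is processed without user information; every later message from the same address is resolved from local memory until the entry ages out.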
Because users on the live network number in the millions, caching every user in the local table would waste a great deal of memory. The aging mechanism is therefore used to delete nodes: when a user is found to have made no query request for a long time, that user's node in the local cache is deleted.
When a user's online or offline message arrives, the user's information is updated in the local table and in the NoSQL database and synchronized to the local tables of the other machines, keeping the data consistent.
Although the invention has been disclosed above by way of preferred embodiments, they are not intended to limit it. Those skilled in the art may make changes and modifications without departing from the spirit and scope of the invention. The scope of protection of the invention is defined solely by the appended claims.

Claims (8)

1. A method for optimizing NoSQL-mode data storage, characterized in that the method comprises a query flow, an aging flow and a synchronization flow, wherein:
The query flow uses asynchronous queries, separating the query process from the process of returning results to improve query efficiency;
The aging flow reclaims data nodes that have not been accessed for a long time, so that memory is used effectively;
The synchronization flow updates the local cache table in real time whenever the database is updated, to ensure that the cached data is valid.
2. The method of claim 1, characterized in that:
The query flow is as follows: every query is made by KEY value. The KEY is first looked up in the local cache table. If a result is found, the access timestamp of that KEY in the local cache table is updated and the result is returned directly. If no result is found, a query request is sent to the remote NoSQL database in asynchronous mode: the caller does not wait for the query result and returns immediately. Once the database has found the result, the local cache table is updated automatically, so the next query for the same entry finds it normally.
3. The method of claim 1 or 2, characterized in that:
The aging flow is as follows: a daemon thread repeatedly traverses the last-access time of every node, and a node is deleted once the time since its last access exceeds the configured aging period.
4. The method of claim 1 or 2, characterized in that:
The synchronization flow is as follows: when the local machine performs a write operation on data, it writes the local cache table and, at the same time, writes the NoSQL database and sends the synchronization content to the cache tables of the other machines, so that the cache tables on multiple machines are synchronized with the data in the NoSQL database.
5. A system for optimizing NoSQL-mode data storage, characterized in that the system comprises a NoSQL database, a local cache table, a remote cache table, a query module, an aging module and a synchronization module. The local cache table is interconnected with the NoSQL database, the remote cache table, the query module, the aging module and the synchronization module, wherein:
The query module uses asynchronous queries, separating the query process from the process of returning results to improve query efficiency;
The aging module reclaims data nodes that have not been accessed for a long time, so that memory is used effectively;
The synchronization module updates the local cache table in real time whenever the database is updated, to ensure that the cached data is valid.
6. The system of claim 5, characterized in that:
The query module makes every query by KEY value. The KEY is first looked up in the local cache table. If a result is found, the access timestamp of that KEY in the local cache table is updated and the result is returned directly. If no result is found, a query request is sent to the remote NoSQL database in asynchronous mode: the caller does not wait for the query result and returns immediately. Once the database has found the result, the local cache table is updated automatically, so the next query for the same entry finds it normally.
7. The system of claim 5 or 6, characterized in that:
The aging module uses a daemon thread to repeatedly traverse the last-access time of every node, and deletes a node once the time since its last access exceeds the configured aging period.
8. The system of claim 5 or 6, characterized in that:
In the synchronization module, when the local machine performs a write operation on data, it writes the local cache table and, at the same time, writes the NoSQL database and sends the synchronization content to the cache tables of the other machines, so that the cache tables on multiple machines are synchronized with the data in the NoSQL database.
CN201310741340.5A 2013-12-27 2013-12-27 Method and system for optimizing data storage of NoSQL mode Pending CN103761255A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310741340.5A CN103761255A (en) 2013-12-27 2013-12-27 Method and system for optimizing data storage of NoSQL mode

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310741340.5A CN103761255A (en) 2013-12-27 2013-12-27 Method and system for optimizing data storage of NoSQL mode

Publications (1)

Publication Number Publication Date
CN103761255A true CN103761255A (en) 2014-04-30

Family

ID=50528493

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310741340.5A Pending CN103761255A (en) 2013-12-27 2013-12-27 Method and system for optimizing data storage of NoSQL mode

Country Status (1)

Country Link
CN (1) CN103761255A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104615763A (en) * 2015-02-13 2015-05-13 百度在线网络技术(北京)有限公司 Intermediate table updating method and device
CN109254995A (en) * 2018-08-02 2019-01-22 浪潮通用软件有限公司 A method of utilizing cache synchronization data
CN110795457A (en) * 2019-09-24 2020-02-14 苏宁云计算有限公司 Data caching processing method and device, computer equipment and storage medium
CN114974605A (en) * 2022-05-24 2022-08-30 山东浪潮智慧医疗科技有限公司 Method for inquiring multiple nucleic acid reports in high-concurrency scene
CN117076466A (en) * 2023-10-18 2023-11-17 河北因朵科技有限公司 Rapid data indexing method for large archive database

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
卢益阳: "NoSQL数据管理系统综述", 《企业科技与发展》 *
申德荣等: "支持大数据管理的NoSQL系统研究综述", 《软件学报》 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104615763A (en) * 2015-02-13 2015-05-13 百度在线网络技术(北京)有限公司 Intermediate table updating method and device
CN104615763B (en) * 2015-02-13 2018-02-13 百度在线网络技术(北京)有限公司 Middle table updating method and device
CN109254995A (en) * 2018-08-02 2019-01-22 浪潮通用软件有限公司 A method of utilizing cache synchronization data
CN110795457A (en) * 2019-09-24 2020-02-14 苏宁云计算有限公司 Data caching processing method and device, computer equipment and storage medium
CN114974605A (en) * 2022-05-24 2022-08-30 山东浪潮智慧医疗科技有限公司 Method for inquiring multiple nucleic acid reports in high-concurrency scene
CN117076466A (en) * 2023-10-18 2023-11-17 河北因朵科技有限公司 Rapid data indexing method for large archive database
CN117076466B (en) * 2023-10-18 2023-12-29 河北因朵科技有限公司 Rapid data indexing method for large archive database

Similar Documents

Publication Publication Date Title
CN101510209B (en) Method, system and server for implementing real time search
CN102902730B (en) Based on data reading method and the device of data buffer storage
CN105224546B (en) Data storage and query method and equipment
CN102169507B (en) Implementation method of distributed real-time search engine
CN101493826B (en) Database system based on WEB application and data management method thereof
CN103761255A (en) Method and system for optimizing data storage of NoSQL mode
CN103020204A (en) Method and system for carrying out multi-dimensional regional inquiry on distribution type sequence table
CN104636500A (en) Method and device for querying heat data
CN101169790A (en) Matrix type data caching method and device based on WEB application
CN101930472A (en) Parallel query method for distributed database
CN103164525B (en) WEB application dissemination method and device
CN102779138B (en) The hard disk access method of real time data
CN106528847A (en) Multi-dimensional processing method and system for massive data
CN111597160A (en) Distributed database system, distributed data processing method and device
CN104331428A (en) Storage and access method of small files and large files
CN103353873A (en) Method and system for optimization realization based on time dimension data real-time inquiry service
CN101923571B (en) Method and device for managing terminal data logging
CN103544261A (en) Method and device for managing global indexes of mass structured log data
CN103645904A (en) Cache realization method of interface calling
CN103744913A (en) Database retrieval method based on search engine technology
CN103116627A (en) Database access method with high concurrency service-oriented architecture (SOA) technology and system
CN110245134B (en) Increment synchronization method applied to search service
CN103390045A (en) Time sequence storage method and time sequence storage device for monitoring system
CN109299111A (en) A kind of metadata query method, apparatus, equipment and computer readable storage medium
CN109271449A (en) A kind of distributed storage inquiry system file-based and querying method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C53 Correction of patent of invention or patent application
CB03 Change of inventor or designer information

Inventor after: Cui Jingjing

Inventor after: Lin Jiajie

Inventor after: Zhang Shuai

Inventor before: Cui Jingjing

Inventor before: Lin Jiajie

Inventor before: Liu Lina

Inventor before: Zhang Shuai

COR Change of bibliographic data

Free format text: CORRECT: INVENTOR; FROM: CUI JINGJING LIN JIAJIE LIU LINA ZHANG SHUAI TO: CUI JINGJING LIN JIAJIE ZHANG SHUAI

RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20140430