CN114995764A - Data storage method and device based on stream computing

Data storage method and device based on stream computing

Info

Publication number
CN114995764A
Authority
CN
China
Prior art keywords
data
result data
storage
target service
storage position
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210628473.0A
Other languages
Chinese (zh)
Inventor
郝应涛
王美青
吕军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jingdong Technology Holding Co Ltd
Original Assignee
Jingdong Technology Holding Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jingdong Technology Holding Co Ltd
Priority to CN202210628473.0A
Publication of CN114995764A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0626 Reducing size or complexity of storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data storage method and device based on stream computing, and relates to the technical field of big data. One embodiment of the method comprises: in response to receiving a calculation request for streaming data of a target service, acquiring historical result data corresponding to the target service from a first storage location; determining current result data corresponding to the target service according to the streaming data and the historical result data, and storing the current result data in the first storage location; determining whether a data synchronization event is triggered; and, when a data synchronization event is triggered, synchronizing the current result data to a second storage location, so that the second storage location is used to respond to a user's data query request for the target service. In high-performance, large-data-volume scenarios, this embodiment can reduce the storage resources consumed by data storage and improve data query performance.

Description

Data storage method and device based on stream computing
Technical Field
The invention relates to the technical field of big data, in particular to a data storage method and device based on stream computing.
Background
In existing streaming computing, either the detail data or the aggregated real-time result data is stored in a database. However, in a high-performance scenario, storing detail data consumes a large amount of storage and increases hardware cost, and transmitting large volumes of data also degrades performance. In addition, when a large amount of data is computed, the aggregated real-time result data is kept in in-memory storage and is read and written very frequently; in mainstream in-memory stores a write takes longer than a read, so write operations occupy a large share of resources and degrade read performance.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method and an apparatus for storing data based on streaming computing, which can reduce storage resources consumed by data storage and ensure performance of data query in a high-performance and large-data-volume scenario.
To achieve the above object, according to an aspect of an embodiment of the present invention, there is provided a method for storing data based on streaming computing, including:
in response to receiving a calculation request for streaming data of a target service, acquiring historical result data corresponding to the target service from a first storage location;
determining current result data corresponding to the target service according to the streaming data and the historical result data, and storing the current result data in the first storage location;
determining whether a data synchronization event is triggered;
and, when a data synchronization event is triggered, synchronizing the current result data to a second storage location, so that the second storage location is used to respond to a user's data query request for the target service.
Optionally, the data synchronization event includes:
the historical result data and the current result data are not in the same preset data range; or the time interval between the historical result data and the current result data is greater than a preset duration.
Optionally, before determining whether the data synchronization event is triggered, the method further includes: determining the preset data range or the preset duration according to the numerical value of the streaming data and/or the request frequency of the calculation request.
Optionally, before determining whether the data synchronization event is triggered, the method further includes: setting a step threshold of the preset data range.
Optionally, acquiring historical result data corresponding to the target service from the first storage location includes:
determining a current data period corresponding to the calculation request, and acquiring historical result data of the target service in the current data period from the first storage location.
Optionally, the method further comprises:
traversing each historical result datum stored in the historical data period in the first storage location; judging whether result data which are not synchronized to a second storage position exist in each historical result data; and if so, synchronizing the result data which is not synchronized to the second storage position.
Optionally, the first storage location and the second storage location are both Redis clusters.
According to still another aspect of the embodiments of the present invention, there is provided an apparatus for storing data based on streaming computing, including:
an obtaining module, configured to acquire, in response to receiving a calculation request for streaming data of a target service, historical result data corresponding to the target service from a first storage location;
a determining module, configured to determine current result data corresponding to the target service according to the streaming data and the historical result data, and to store the current result data in the first storage location;
a judging module, configured to determine whether a data synchronization event is triggered and, when a data synchronization event is triggered, to synchronize the current result data to a second storage location, so that the second storage location is used to respond to a user's data query request for the target service.
According to another aspect of an embodiment of the present invention, there is provided an electronic device including:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the data storage method based on stream computing provided by the present invention.
According to a further aspect of the embodiments of the present invention, there is provided a computer-readable medium on which a computer program is stored, the program, when executed by a processor, implementing the data storage method based on stream computing provided by the present invention.
One embodiment of the above invention has the following advantages or benefits: when a calculation request for streaming data of a target service is received, historical result data of the target service is acquired from a first storage location, current result data is then obtained by calculation, whether a data synchronization event is triggered is determined, and the current result data is synchronized to a second storage location when the data synchronization event is triggered, so that a user can query the data corresponding to the target service at the second storage location. By setting a data synchronization event in the streaming computation process and synchronizing the aggregated real-time result data from the first storage location to the second storage location only when that event is triggered, the method saves storage, isolates the data storage function from the data query function, and reduces the influence of write operations on read operations, thereby improving data query performance and meeting high-performance requirements.
Further effects of the above optional implementations are described below in connection with the embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram of a main flow of a method for data storage based on streaming computing according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a main flow of another method for data storage based on streaming computing according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a data storage structure based on stream computing according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a flow of data storage and data query based on streaming computing according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the major modules of an apparatus for streaming-based data storage according to an embodiment of the present invention;
FIG. 6 is an exemplary system architecture diagram in which embodiments of the present invention may be applied;
FIG. 7 is a schematic block diagram of a computer system suitable for use in implementing a terminal device or server of an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 is a schematic diagram of a main flow of a method for storing data based on streaming computing according to an embodiment of the present invention. As shown in Fig. 1, the method includes the following steps:
Step S101: in response to receiving a calculation request for streaming data of the target service, acquiring historical result data corresponding to the target service from a first storage location;
Step S102: determining current result data corresponding to the target service according to the streaming data and the historical result data, and storing the current result data in the first storage location;
Step S103: determining whether a data synchronization event is triggered; if yes, go to step S104, otherwise go to step S105;
Step S104: synchronizing the current result data to a second storage location, so that the second storage location is used to respond to a user's data query request for the target service;
Step S105: the current result data is not synchronized to the second storage location.
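The following is a minimal sketch of steps S101 to S105, assuming that in-memory dictionaries stand in for the first and second storage locations and that the aggregation is a sum, as in the cumulative example in this description. The class name `StreamingResultStore`, the method `handle_request`, the `should_sync` callback, and the default of 0 for a missing historical result are illustrative assumptions and are not taken from the patent.

```python
from typing import Callable, Dict


class StreamingResultStore:
    """Hypothetical stand-in for the two storage locations (plain dicts here)."""

    def __init__(self, should_sync: Callable[[float, float], bool]) -> None:
        self.first_storage: Dict[str, float] = {}   # computing storage
        self.second_storage: Dict[str, float] = {}  # query storage
        self.should_sync = should_sync              # data synchronization event test

    def handle_request(self, service: str, value: float) -> float:
        # S101: read the historical result (assumed to be 0 when none exists yet).
        historical = self.first_storage.get(service, 0.0)
        # S102: aggregate by summation and store the current result.
        current = historical + value
        self.first_storage[service] = current
        # S103/S104: synchronize only when the event is triggered; S105: otherwise skip.
        if self.should_sync(historical, current):
            self.second_storage[service] = current
        return current


# Example trigger (an assumption): fire when the cumulative value passes 10.
store = StreamingResultStore(should_sync=lambda old, new: old <= 10 < new)
store.handle_request("user-42", 5)    # stored in the first storage location only
store.handle_request("user-42", 10)   # cumulative 15 is also synchronized
```

Passing the trigger as a callback keeps the storage flow independent of which synchronization event (data range or time interval) is configured.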
The method provided by the embodiment of the invention can be applied to high-performance, large-data-volume scenarios. The streaming computation may be a cumulative computation over a large data stream. For example, when the cumulative requested amount of a user within 24 hours is computed, each requested amount is one item of streaming data, that is, the streaming data is an incremental value. When a calculation request for the streaming data is received, the historical result data is acquired, and the current result data (the cumulative result) is obtained by calculation or aggregation, for example by summing the historical result data and the streaming data. The current result data is stored in the first storage location, so the result data of the streaming computation process is kept in the first storage location; when a data synchronization event is triggered, the current result data that meets the condition is synchronized to the second storage location.
In the embodiment of the present invention, the data synchronization event may be that the historical result data and the current result data are not in the same preset data range, that is, when the historical result data and the current result data are not in the same preset data range, the data synchronization event is triggered, and if the historical result data and the current result data are in the same preset range, the data synchronization event is not triggered.
In the embodiment of the present invention, before determining whether the data synchronization event is triggered, a plurality of preset data ranges are set. The preset data ranges may be continuous or discontinuous, and they do not overlap. For example, they may be continuous ranges such as 0-10, 10-15, 15-25, and 25-50, or discontinuous ranges such as 0-10, 15-25, and 30-50, where each preset data range includes its right end point (i.e., the step threshold) and excludes its left end point. Optionally, the preset data ranges may be set according to the numerical size of the streaming data. If the values of the streaming data are between 5 and 10, the preset data ranges can be set to 0-10, 10-15, 15-25, and so on; if the values of the streaming data are between 100 and 150, the preset data ranges can be set to 0-150, 150-300, 300-500, and so on.
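The range test described above can be sketched as follows, assuming each preset data range is a `(left, right)` pair with the left end point excluded and the right end point included. The function names `find_range` and `in_same_range` are hypothetical, and the handling of values that fall in no range (they are treated as sharing one "no range" bucket) is an assumption, since the description does not specify it.

```python
from typing import Optional, Sequence, Tuple

Range = Tuple[float, float]  # (left, right]: left excluded, right (step threshold) included


def find_range(value: float, ranges: Sequence[Range]) -> Optional[int]:
    """Return the index of the preset data range containing value, or None."""
    for index, (left, right) in enumerate(ranges):
        if left < value <= right:
            return index
    return None  # value lies in a gap or outside every range


def in_same_range(historical: float, current: float, ranges: Sequence[Range]) -> bool:
    """True when both results fall in the same preset data range (no sync needed)."""
    return find_range(historical, ranges) == find_range(current, ranges)


CONTINUOUS = [(0, 10), (10, 15), (15, 25), (25, 50)]
DISCONTINUOUS = [(0, 10), (15, 25), (30, 50)]
```

A linear scan is used because it works for both continuous and discontinuous ranges; with only a handful of ranges the cost is negligible.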
Optionally, before determining whether to trigger the data synchronization event, a step threshold of a preset data range is set, that is, the data ranges are stepped. If the historical result data and the current result data are not in the same preset data range, it is indicated that the current result data and the historical result data are not on the same step, that is, the current result data exceeds or crosses a step threshold of the preset data range in which the historical result data is located, and at this time, a data synchronization event is triggered.
Optionally, the step threshold may also be set according to the requirement of the target service. For example, the calculation task is "calculate the requested amount of the user within 24 hours", and the target service is configured to return true when the requested amount of the user within 24 hours is greater than 10; the step threshold can then be set to 10 so that subsequent service operations can be performed. When the user's first requested amount within 24 hours is 5, 5 is stored in the first storage location; when the user's second requested amount is 10, the cumulative result 15 is stored in the first storage location, and because 15 is greater than the step threshold 10, a data synchronization event is triggered. Because data synchronization is triggered only when the cumulative requested amount crosses a step threshold, the synchronization frequency is reduced, data query performance is improved, and the accuracy of the data seen by the target service is guaranteed.
In the embodiment of the present invention, the data synchronization event may alternatively be that the time interval between the historical result data and the current result data is greater than a preset duration. That is, when the interval between the time at which the historical result data was determined and the time at which the current result data is determined exceeds the preset duration, the data synchronization event is triggered; if the interval does not exceed the preset duration, the data synchronization event is not triggered. In other words, the data synchronization event is a timed trigger that fires once every preset duration; for example, if the data synchronization event is triggered once every 2 hours, it is not triggered again within that 2-hour interval.
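A minimal sketch of the time-based trigger, assuming each result carries the timestamp at which it was determined. The function name `interval_exceeded` and the 2-hour value of `PRESET_DURATION` are illustrative assumptions (the 2-hour figure follows the example in the text).

```python
from datetime import datetime, timedelta

PRESET_DURATION = timedelta(hours=2)  # assumed value, matching the 2-hour example


def interval_exceeded(historical_time: datetime,
                      current_time: datetime,
                      preset_duration: timedelta = PRESET_DURATION) -> bool:
    """True when the gap between the two results is greater than the preset duration."""
    return current_time - historical_time > preset_duration


start = datetime(2022, 6, 6, 8, 0)
within_interval = interval_exceeded(start, start + timedelta(hours=1))  # no sync yet
past_interval = interval_exceeded(start, start + timedelta(hours=3))    # sync is due
```

The comparison is strict, so a gap of exactly the preset duration does not trigger synchronization, which matches the "greater than" wording of the description.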
Optionally, before determining whether the data synchronization event is triggered, the preset duration is determined according to the request frequency of the calculation request. If the request frequency is high, a shorter preset duration can be set; for example, if a calculation request arrives once every 2-3 hours, the preset duration can be set to 6 hours. If the request frequency is low, a longer preset duration can be set; for example, if a calculation request arrives once every 10-12 hours, the preset duration can be set to 30 hours. Because the preset duration is set according to the request frequency and the data synchronization event is triggered only when the preset duration is exceeded, the second storage location holds less data than the first storage location, the synchronization frequency is reduced, the influence of data synchronization on data query is reduced, and data query performance is improved.
In this embodiment of the present invention, acquiring historical result data corresponding to a target service from a first storage location includes: determining a current data period corresponding to the calculation request, and acquiring historical result data of the target service in the current data period from the first storage location.
In the embodiment of the present invention, the historical result data may be the latest historical result data of the target service. The current data period can be set according to the service requirement, the current data period can be a preset time range, and historical result data of the target service in the preset time range can be acquired from the first storage location by determining the preset time range corresponding to the calculation request. Optionally, obtaining historical result data corresponding to the target service from the first storage location may also include: and acquiring the historical result data of the target service which is closest to the current time from the first storage position.
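One way to locate the historical result data of the current data period is to derive the storage key from the period in which the request falls. The sketch below assumes fixed, non-overlapping periods aligned to the Unix epoch in UTC; the helper names `period_start` and `result_key` and the `service:period` key layout are hypothetical, and the description does not say whether the period is a fixed or a sliding window.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def period_start(request_time: datetime, period: timedelta) -> datetime:
    """Start of the fixed data period containing request_time (must be timezone-aware)."""
    elapsed_periods = (request_time - EPOCH) // period  # whole periods since the epoch
    return EPOCH + elapsed_periods * period


def result_key(service: str, request_time: datetime, period: timedelta) -> str:
    """Key under which the result data for this service and period is stored (assumed layout)."""
    return f"{service}:{period_start(request_time, period).isoformat()}"


DAY = timedelta(hours=24)
morning = datetime(2022, 6, 6, 9, 15, tzinfo=timezone.utc)
evening = datetime(2022, 6, 6, 21, 45, tzinfo=timezone.utc)
next_day = datetime(2022, 6, 7, 0, 5, tzinfo=timezone.utc)
```

Requests in the same period map to the same key, so the historical result read in step S101 is always the one belonging to the current data period.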
In an embodiment of the present invention, as shown in fig. 2, the method for storing data based on streaming computing further includes:
Step S201: traversing each historical result datum stored in the historical data period in the first storage location;
Step S202: determining whether any of the historical result data has not been synchronized to the second storage location; if yes, go to step S203; if not, the flow ends;
Step S203: synchronizing the result data that has not been synchronized to the second storage location.
Since the data in the second storage location is only part of the result data obtained by real-time aggregation or calculation in the first storage location, a query against the second storage location may fail to return a result. In this case, to meet the query requirement, historical result data in the first storage location that has not been synchronized to the second storage location may be synchronized to it. Optionally, the synchronization may be performed per historical data period. For example, at 24:00 each day, the historical result data of the target service for the period from 00:00 to 24:00 may be acquired, and the result data among it that has not been synchronized to the second storage location is then synchronized, so as to meet the user's data query requirements at the second storage location.
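Steps S201 to S203 can be sketched as a reconciliation pass over one historical data period. The sketch assumes dictionaries stand in for the two storage locations, that keys begin with a period prefix such as `2022-06-06:`, and that "not synchronized" means the key is missing from the second storage location or holds a different value; the function name `sync_missing` and the key layout are hypothetical.

```python
from typing import Dict


def sync_missing(first_storage: Dict[str, float],
                 second_storage: Dict[str, float],
                 period_prefix: str) -> int:
    """Copy unsynchronized results of one period to the second storage; return the count."""
    synced = 0
    # S201: traverse the historical result data of the given period.
    for key, value in first_storage.items():
        if not key.startswith(period_prefix):
            continue
        # S202: check whether this result is already present in the second storage.
        if second_storage.get(key) != value:
            # S203: synchronize the result that is missing or stale.
            second_storage[key] = value
            synced += 1
    return synced


first = {"2022-06-06:a": 5, "2022-06-06:b": 20, "2022-06-07:a": 3}
second = {"2022-06-06:b": 20}
copied = sync_missing(first, second, "2022-06-06:")
```

Only the result for key `2022-06-06:a` is copied here: `2022-06-06:b` is already synchronized and `2022-06-07:a` belongs to a different period.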
In the embodiment of the present invention, the first storage location and the second storage location are both Redis clusters; other databases, such as MySQL, may also be used.
The first storage location and the second storage location correspond to different Redis clusters, so that the two storage locations are isolated from each other and data write operations and data read operations are served by different storage locations. The first storage location is mainly used for storing computation data and can be called the computing storage; the second storage location is mainly used for data query and can be called the query storage. With this dual-storage mode, high-frequency write and read operations no longer compete for the CPU, network, and storage resources of a single store, so their impact on query performance is avoided.
In the embodiment of the invention, the first storage position is used for storing the result data obtained by real-time aggregation or calculation in the stream type calculation process, so that the storage resource is saved, the cost is reduced, and the defect of performance reduction caused by mass data transmission can be effectively avoided; the second storage position is used for storing part of result data synchronized by the first storage position, so that storage resources are saved, the data volume stored by the second storage position is reduced, the influence of data synchronization on query is reduced, the query performance of data is improved, and other result data can be synchronized when other query requirements exist.
Fig. 3 is a schematic diagram of a data storage structure based on stream computing according to an embodiment of the present invention. The task is to calculate the requested amount of a user within 24 hours. A plurality of preset ranges 0-10, 10-15, 15-25, and 25-35 are set, with corresponding step thresholds 10, 15, 25, and 35, and the data synchronization event is defined as the current result data and the historical result data not being in the same preset range. Each requested amount triggers a write, and the cumulative result is written to the first storage location. The first requested amount is 5, so 5 is written to the first storage location. The second requested amount is 15, so the cumulative result 5 + 15 = 20 is written to the first storage location; 20 and 5 are not on the same step, since 20 crosses the step threshold 15, so a data synchronization event is triggered and 20 is synchronized to the second storage location, from which the result can be queried. The third requested amount is 30, so the cumulative result 20 + 30 = 50 is written to the first storage location; 50 and 20 are not on the same step, since 50 crosses the step threshold 35, so a data synchronization event is triggered and 50 is synchronized to the second storage location, from which the result can be queried.
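The Fig. 3 example can be replayed with a short sketch. It assumes the step of a value is the number of step thresholds strictly below it (so a value equal to a threshold stays on the lower step, consistent with ranges that include their right end point) and that the historical result before the first request is 0; the function name `replay` is hypothetical.

```python
from bisect import bisect_left
from typing import List, Sequence, Tuple

STEP_THRESHOLDS = [10, 15, 25, 35]  # right end points of 0-10, 10-15, 15-25, 25-35


def replay(amounts: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Return the values written to the first storage and those synchronized to the second."""
    first_writes: List[float] = []
    second_writes: List[float] = []
    historical = 0.0  # assumed starting value before any request
    for amount in amounts:
        current = historical + amount
        first_writes.append(current)  # every request writes to the first storage
        # Different step index means the two results are not in the same preset range.
        if bisect_left(STEP_THRESHOLDS, historical) != bisect_left(STEP_THRESHOLDS, current):
            second_writes.append(current)
        historical = current
    return first_writes, second_writes


first_writes, second_writes = replay([5, 15, 30])
```

For the requested amounts 5, 15, and 30, the first storage location receives 5, 20, and 50, while only 20 and 50 are synchronized to the second storage location, as in the example above.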
Fig. 4 is a schematic flow diagram of data storage and data query based on streaming computing according to an embodiment of the present invention.
Data storage flow based on stream computing: in response to receiving a calculation request for streaming data of the target service, acquiring historical result data corresponding to the target service from the first storage location; obtaining current result data according to the historical result data and the streaming data, and storing the current result data in the first storage location; determining whether the current result data and the historical result data are in the same preset data range; if they are not, synchronizing the current result data to the second storage location; if they are, ending the flow.
Data query flow based on stream computing: in response to receiving a data query request for the target service, generating a query identifier according to the data query request, querying the corresponding query result from the second storage location according to the query identifier, obtaining the task result corresponding to the data query request according to the query result, returning the task result, and ending the flow.
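The query flow can be sketched as below, assuming a dictionary stands in for the second storage location and that the task result is the boolean "requested amount is greater than a configured limit" from the earlier 24-hour example. The identifier format `service:user_id`, the function names, and the choice to return `False` when no result has been synchronized yet are all assumptions.

```python
from typing import Dict


def query_identifier(service: str, user_id: str) -> str:
    """Build the query identifier from the data query request (assumed format)."""
    return f"{service}:{user_id}"


def query_task_result(second_storage: Dict[str, float],
                      service: str,
                      user_id: str,
                      limit: float) -> bool:
    """Look up the synchronized result and derive the task result from it."""
    result = second_storage.get(query_identifier(service, user_id))
    if result is None:
        # Nothing synchronized yet: the cumulative value has not crossed a threshold.
        return False
    return result > limit


query_storage = {"amount_24h:user-42": 15}
over_limit = query_task_result(query_storage, "amount_24h", "user-42", limit=10)
unknown_user = query_task_result(query_storage, "amount_24h", "user-7", limit=10)
```

The query path reads only the second storage location, so it never contends with the high-frequency writes going to the first storage location.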
The data storage method based on stream computing provided by the embodiment of the invention implements data storage and query through a dual-storage mode: data storage at the first storage location is isolated from data query at the second storage location, and data is synchronized only when a data synchronization event is triggered. This saves storage, reduces the influence of write operations on read operations, and improves data query performance. The computing storage retains more complete data, while the query storage ensures better query performance, so the read and write performance requirements of high-performance, large-data-volume scenarios are met.
As shown in fig. 5, another aspect of the embodiment of the present invention provides an apparatus 500 for storing data based on streaming computing, including:
the obtaining module 501, in response to receiving a calculation request for streaming data of a target service, obtains historical result data corresponding to the target service from a first storage location;
a determining module 502, configured to determine current result data corresponding to the target service according to the streaming data and the historical result data, and store the current result data in a first storage location;
the judging module 503 judges whether to trigger a data synchronization event; if so, synchronizing the current result data to a second storage position so as to respond to a data query request of the user for the target service by using the second storage position; otherwise, the current result data is not synchronized to the second storage location.
In an embodiment of the present invention, the data synchronization event includes: the historical result data and the current result data are not in the same preset data range; or determining that the time interval between the historical result data and the current result data is greater than the preset time.
In this embodiment of the present invention, the determining module 503 is further configured to: and before judging whether a data synchronization event is triggered, determining a preset data range or preset duration according to the numerical value of the streaming data and/or the request frequency of the calculation request.
In this embodiment of the present invention, the determining module 503 is further configured to: before judging whether a data synchronization event is triggered, a step threshold value of a preset data range is set.
In this embodiment of the present invention, the obtaining module 501 is further configured to: determine a current data period corresponding to the calculation request, and obtain historical result data of the target service in the current data period from the first storage location.
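Scoping the lookup to a data period can be done by including the period in the storage key. The sketch below assumes a daily period and a `service:YYYYMMDD` key layout; neither is specified here, and `period_key` and `history_for_current_period` are hypothetical names.

```python
from datetime import datetime, timezone
from typing import Dict


def period_key(service: str, event_time: datetime) -> str:
    """Build a storage key for the data period containing event_time.

    A daily period is assumed; the key layout is illustrative only.
    """
    return f"{service}:{event_time.strftime('%Y%m%d')}"


def history_for_current_period(first_store: Dict[str, float], service: str,
                               event_time: datetime) -> float:
    """Return the historical result for the current period, or 0.0 if none exists yet."""
    return first_store.get(period_key(service, event_time), 0.0)
```

Because the period is part of the key, the first request of a new period finds no history and starts from the default value instead of continuing the previous period's result.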
In this embodiment of the present invention, the judging module 503 is further configured to: traverse each piece of historical result data stored for a historical data period in the first storage location; judge whether any of that historical result data has not been synchronized to the second storage location; and if so, synchronize the unsynchronized result data to the second storage location.
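The catch-up pass over a historical data period can be sketched as a scan that compares the two stores. The sketch assumes dictionaries as the stores, keys that end in a period suffix such as `:20220605`, and "not synchronized" meaning that the second store lacks the key or holds a different value; all of these, and the name `sync_missing`, are assumptions.

```python
from typing import Dict, List


def sync_missing(first_store: Dict[str, float], second_store: Dict[str, float],
                 period_suffix: str) -> List[str]:
    """Copy results of one historical period that the second store lacks.

    Returns the keys that were synchronized by this pass.
    """
    synced: List[str] = []
    for key, value in first_store.items():
        if not key.endswith(period_suffix):
            continue  # belongs to a different data period
        if second_store.get(key) != value:
            second_store[key] = value  # missing or outdated in the query store
            synced.append(key)
    return synced
```

Such a pass ensures that the final result of a finished period reaches the query store even if the last few calculations of that period did not trigger a synchronization event.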
In the embodiment of the present invention, both the first storage location and the second storage location are Redis clusters.
In another aspect, an embodiment of the present invention provides an electronic device, including: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method for storing data based on streaming computing according to the embodiment of the invention.
Yet another aspect of the embodiments of the present invention provides a computer-readable medium, on which a computer program is stored, where the computer program, when executed by a processor, implements a method for data storage based on streaming computing according to the embodiments of the present invention.
Fig. 6 illustrates an exemplary system architecture 600 of a method for streaming-based data storage or an apparatus for streaming-based data storage to which embodiments of the present invention may be applied.
As shown in fig. 6, the system architecture 600 may include terminal devices 601, 602, 603, a network 604, and a server 605. The network 604 serves to provide a medium for communication links between the terminal devices 601, 602, 603 and the server 605. The network 604 may include various types of connections, such as wired or wireless communication links, or fiber optic cables.
A user may use the terminal devices 601, 602, 603 to interact with a server 605, via a network 604, to receive or send messages or the like. The terminal devices 601, 602, 603 may have installed thereon various communication client applications, such as shopping applications, web browser applications, search applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only).
The terminal devices 601, 602, 603 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 605 may be a server providing various services, such as a background management server (by way of example only) providing support for shopping websites browsed by users using the terminal devices 601, 602, 603. The background management server may analyze and otherwise process received data such as a product information query request, and feed back a processing result (for example, target push information or product information, by way of example only) to the terminal device.
It should be noted that the method for storing data based on streaming computing provided by the embodiment of the present invention is generally executed by the server 605, and accordingly, the apparatus for storing data based on streaming computing is generally disposed in the server 605.
It should be understood that the number of terminal devices, networks, and servers in fig. 6 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for an implementation.
Referring now to fig. 7, a block diagram of a computer system 700 suitable for implementing a terminal device of an embodiment of the present invention is shown. The terminal device shown in fig. 7 is only an example, and should not impose any limitation on the functions and scope of use of the embodiment of the present invention.
As shown in fig. 7, the computer system 700 includes a Central Processing Unit (CPU)701, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data necessary for the operation of the system 700 are also stored. The CPU 701, the ROM 702, and the RAM 703 are connected to each other via a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
The following components are connected to the I/O interface 705: an input section 706 including a keyboard, a mouse, and the like; an output section 707 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage section 708 including a hard disk and the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 710 as necessary, so that a computer program read out therefrom is installed into the storage section 708 as necessary.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer-readable medium, the computer program comprising program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 709, and/or installed from the removable medium 711. The computer program performs the above-described functions defined in the system of the present invention when executed by the Central Processing Unit (CPU) 701.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor including an obtaining module, a determining module, and a judging module. In some cases the names of the modules do not limit the modules themselves; for example, the obtaining module may also be described as a "module for obtaining historical result data corresponding to a target service from a first storage location in response to receiving a calculation request for streaming data of the target service".
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer-readable medium carries one or more programs which, when executed by a device, cause the device to: in response to receiving a calculation request for streaming data of a target service, obtain historical result data corresponding to the target service from a first storage location; determine current result data corresponding to the target service according to the streaming data and the historical result data, and store the current result data in the first storage location; judge whether a data synchronization event is triggered; and, when a data synchronization event is triggered, synchronize the current result data to a second storage location, so that the second storage location is used to respond to a user's data query request for the target service.
According to the technical solution of the embodiment of the invention, data is stored and queried using dual storage: data storage in the first storage location is isolated from data query in the second storage location, and data is synchronized only when a data synchronization event is triggered. This saves storage, reduces the impact of write operations on read operations, and improves query performance. More complete data is retained for computation, better performance is ensured for queries, and read and write performance requirements are met in high-performance, large-data-volume scenarios.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A method for streaming-based data storage, comprising:
in response to receiving a calculation request for streaming data of a target service, obtaining historical result data corresponding to the target service from a first storage location;
determining current result data corresponding to the target service according to the streaming data and the historical result data, and storing the current result data in the first storage location;
judging whether a data synchronization event is triggered; and
when a data synchronization event is triggered, synchronizing the current result data to a second storage location, so that the second storage location is used to respond to a user's data query request for the target service.
2. The method of claim 1, wherein the data synchronization event comprises:
the historical result data and the current result data are not in the same preset data range; or the time interval between the historical result data and the current result data is greater than a preset duration.
3. The method of claim 2, further comprising, prior to judging whether a data synchronization event is triggered: determining the preset data range or the preset duration according to the numerical value of the streaming data and/or the request frequency of the calculation request.
4. The method of claim 2, further comprising, prior to judging whether a data synchronization event is triggered: setting a step threshold of the preset data range.
5. The method of claim 1, wherein obtaining historical result data corresponding to the target service from the first storage location comprises:
determining a current data period corresponding to the calculation request, and obtaining historical result data of the target service in the current data period from the first storage location.
6. The method of claim 5, further comprising:
traversing each piece of historical result data stored for a historical data period in the first storage location; judging whether any of the historical result data has not been synchronized to the second storage location; and if so, synchronizing the unsynchronized result data to the second storage location.
7. The method of any of claims 1-6, wherein the first storage location and the second storage location are both Redis clusters.
8. An apparatus for streaming-based data storage, comprising:
an obtaining module, configured to, in response to receiving a calculation request for streaming data of a target service, obtain historical result data corresponding to the target service from a first storage location;
a determining module, configured to determine current result data corresponding to the target service according to the streaming data and the historical result data, and store the current result data in the first storage location; and
a judging module, configured to judge whether a data synchronization event is triggered, and, when a data synchronization event is triggered, synchronize the current result data to a second storage location, so that the second storage location is used to respond to a user's data query request for the target service.
9. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-7.
10. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210628473.0A CN114995764A (en) 2022-06-06 2022-06-06 Data storage method and device based on stream computing

Publications (1)

Publication Number Publication Date
CN114995764A (en) 2022-09-02

Family

ID=83031046

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210628473.0A Pending CN114995764A (en) 2022-06-06 2022-06-06 Data storage method and device based on stream computing

Country Status (1)

Country Link
CN (1) CN114995764A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination