CN113779061A - SQL statement processing method, device, medium and electronic equipment - Google Patents

SQL statement processing method, device, medium and electronic equipment

Info

Publication number
CN113779061A
Authority
CN
China
Prior art keywords
time
window
data
data processing
sql statement
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110153387.4A
Other languages
Chinese (zh)
Inventor
何会远
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Wodong Tianjun Information Technology Co Ltd filed Critical Beijing Jingdong Century Trading Co Ltd
Priority to CN202110153387.4A priority Critical patent/CN113779061A/en
Publication of CN113779061A publication Critical patent/CN113779061A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/451 Execution arrangements for user interfaces
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/542 Event management; Broadcasting; Multicasting; Notifications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method, a device, a medium and electronic equipment for processing SQL statements. In big data processing, a window offset and a periodic output interval are added to the SQL statement of a task. When the SQL statement is processed, the start time of the data processing window is determined according to the window offset, and a trigger for periodically outputting results is determined according to the periodic output interval. After data enters the window operation, intermediate results of the data processing are output according to the start time of the window and the trigger. The scheme allows the start time of the window to be adjusted, and the periodically output intermediate results make it possible to follow changes in the data in real time.

Description

SQL statement processing method, device, medium and electronic equipment
Technical Field
The present invention relates to the field of big data technologies, and in particular, to a method, an apparatus, a medium, and an electronic device for processing Structured Query Language (SQL) statements.
Background
In recent years, with the development of big data technology, capable big data computing engine frameworks such as Storm, Spark and Flink have emerged, and Flink has become increasingly important in the big data field. In this field, SQL is the easiest interface to use and serves as the user-facing Application Programming Interface (API) layer. In traditional stream computing, for example, Storm and Spark Streaming provide Function or DataStream APIs, and users write business logic in Java or Scala. This approach has problems: it has a certain entry threshold, is difficult to tune, and, as versions are updated, the APIs become incompatible in many places. Flink SQL is a development language that conforms to standard SQL semantics; it was designed for Flink real-time computing to simplify the computing model and lower the threshold for users of real-time computing. At present, both Spark and Flink are actively moving toward SQL implementations. An important concept in an SQL stream computing engine is the window, and both engines support windows such as the sliding window, the rolling (tumbling) window and the session window.
However, current technical solutions do not support adjusting the start time of a window, and they provide no way to follow the intermediate result state or changes in the data in real time.
Disclosure of Invention
Embodiments of the invention provide a method, a device, a medium and electronic equipment for processing an SQL statement, to solve the problems that the prior art does not support adjusting the start time of a window and cannot follow the intermediate result state or changes in the data in real time.
In a first aspect, an embodiment of the present invention provides a method for processing an SQL statement, including:
acquiring a Structured Query Language (SQL) statement to be processed, wherein the SQL statement comprises a window offset and a periodic output interval;
determining the starting time of a window for data processing according to the window offset;
determining a trigger of a periodic output result according to the periodic output interval;
and after the data enters the window operation, outputting an intermediate result of data processing according to the starting time of the window and the trigger.
In one embodiment, the method further comprises:
outputting a time field each time an intermediate result of the data processing is output.
In one embodiment, the method further comprises:
acquiring a time attribute in the data processing task to which the SQL statement belongs, wherein the time attribute is used for indicating that the type of the window time field is processing time or event time.
In a specific embodiment, the time attribute is used to indicate that the type of the window time field is processing time, and the outputting an intermediate result of data processing according to the start time of the time window and the trigger after the data enters the window operation includes:
after the data enters the window operation, registering a timer according to the trigger and the system time, wherein the system time is the system time when the data enters the window operation;
triggering said timer at a first integer multiple of said periodic output interval after said system time;
triggering a callback function when the timer time is reached, outputting an intermediate result of data processing, triggering the timer again after the periodic output interval, and repeating this step until the trigger time of the timer exceeds the end time of the window.
In a specific embodiment, the time attribute is used to indicate that the type of the window time field is event time, and the outputting, after the data enters the window operation, an intermediate result of the data processing according to the start time of the time window and the periodic output interval includes:
after the data enters the window operation, registering a timer according to the trigger and the data time, wherein the data time is the time included in the data;
triggering the timer one periodic output interval after the start time of the window;
triggering a callback function when the timer time is reached, outputting an intermediate result of data processing, triggering the timer again after the periodic output interval, and repeating this step until the trigger time of the timer exceeds the end time of the window.
In a specific embodiment, the SQL statement further includes a time field and a window size;
correspondingly, the determining the start time of the window of the data processing according to the window offset includes:
determining the start time of the window of data processing according to the time field and the window offset.
In one embodiment, the method further comprises:
determining the end time of the window for data processing according to the start time and the window size.
In one embodiment, the method further comprises:
receiving a data processing task sent by a WEB client, wherein the data processing task comprises the SQL statement and the time attribute.
In a second aspect, an embodiment of the present invention provides an apparatus for processing an SQL statement, including:
an acquisition module, used for acquiring an SQL statement to be processed, wherein the SQL statement comprises a window offset and a periodic output interval;
the processing module is used for determining the starting time of a window for data processing according to the window offset;
the processing module is also used for determining a trigger of a periodic output result according to the periodic output interval;
and the output module is used for outputting an intermediate result of data processing according to the starting time of the window and the trigger after the data enters the window operation.
In a third aspect, an embodiment of the present invention provides an electronic device, including:
a processor and an interactive interface; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the method of processing the SQL statement of any of the first aspect via execution of the executable instructions.
In a fourth aspect, an embodiment of the present invention provides a storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the processing method of the SQL statement according to any one of the first aspects.
In a fifth aspect, an embodiment of the present invention provides a computer program product, which includes a computer program, and when the computer program is executed by a processor, the computer program is used to implement the processing method of the SQL statement in any one of the first aspects.
According to the SQL statement processing method, device, medium and electronic equipment provided by the embodiments of the invention, in big data processing a window offset and a periodic output interval are added to the SQL statement of a task. When the SQL statement is processed, the start time of the data processing window is determined according to the window offset, and a trigger for periodically outputting results is determined according to the periodic output interval. After data enters the window operation, intermediate results of the data processing are output according to the start time of the window and the trigger. The scheme supports adjusting the start time of the window, and the periodically output intermediate results make it possible to follow changes in the data in real time.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 is a schematic flow chart of a first embodiment of a method for processing an SQL statement according to the present invention;
Fig. 2 is a schematic flow chart of a second embodiment of a method for processing an SQL statement according to the present invention;
Fig. 3 is a schematic flow chart of a third embodiment of a method for processing an SQL statement according to the present invention;
Fig. 4 is a schematic structural diagram of a first embodiment of an SQL statement processing device provided in the present invention;
Fig. 5 is a schematic structural diagram of an electronic device provided in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments based on the embodiments in the present invention, which can be made by those skilled in the art in light of the present disclosure, are within the scope of the present invention.
In this specification and in other sections of the specification, the terms "comprises" and "comprising," and any variations thereof, are intended to cover non-exclusive inclusions, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Spark and Flink in the prior art are actively moving toward SQL implementations; however, several problems remain:
(1) Flink SQL supports the Tumble Window, Hop Window and Session Window, but does not support a user-defined window start time.
(2) The windows supported by Flink SQL do not allow a user-defined interval parameter according to which the intermediate aggregation result of an unfinished window would be output periodically. As a result, the intermediate result state cannot be followed in real time and reports cannot be displayed in real time.
(3) If the window size used in Flink SQL is at the day level (a multiple of 24 hours), aggregation statistics are wrong. For example, suppose the window size is 24 hours and the only two records on 2020-01-01 have the times 2020-01-01 06:30:30 and 2020-01-01 16:30:30. The correct aggregated count for the window is 2, but because of the East Eight (UTC+8) time zone, the two records cannot both be allocated to the window (2020-01-01 00:00:00, 2020-01-02 00:00:00), so the aggregated result is wrong.
In the above, Flink is an open-source computing engine from Apache that can process both streaming and batch tasks, and supports consuming and processing data with SQL and persisting it to an external storage system.
Time zone: the Flink SQL platform runs in Kubernetes, with the default time zone being the East Eight zone (UTC+8). When a day-level window is computed, the window runs from 08:00 each day to 08:00 the next day, which causes the aggregation statistics error; the start and end times of windows that do not span a day are normal.
To address these problems, the invention provides a method for processing SQL statements that supports adjusting the start time of rolling windows and sliding windows in both Event Time and Processing Time scenarios, supports periodic output at a set interval for both window types, and solves the aggregation statistics error for day-level windows.
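The day-level misallocation described above can be reproduced with plain epoch arithmetic. The following is a minimal sketch rather than the patent's implementation: it assumes windows are aligned to multiples of the window size counted from the Unix epoch in UTC, and it uses the window start formula given later in this description. The class name, the helper epochMillisUtc8 and the choice of a -8 hour offset to align windows with local midnight in UTC+8 are illustrative assumptions.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;

final class DayWindowTimeZoneDemo {
    static final long DAY_MS = 24L * 60 * 60 * 1000;

    // Start of the window containing timestamp, for windows of windowSize shifted by offset.
    static long windowStart(long timestamp, long offset, long windowSize) {
        return timestamp - (timestamp - offset + windowSize) % windowSize;
    }

    // Epoch milliseconds of a local date-time interpreted in the UTC+8 zone (hypothetical helper).
    static long epochMillisUtc8(String localDateTime) {
        return LocalDateTime.parse(localDateTime)
                .toInstant(ZoneOffset.ofHours(8))
                .toEpochMilli();
    }

    public static void main(String[] args) {
        long a = epochMillisUtc8("2020-01-01T06:30:30");
        long b = epochMillisUtc8("2020-01-01T16:30:30");

        // Without an offset the 24-hour windows begin at 00:00 UTC, i.e. 08:00 in UTC+8,
        // so the two records of the same local day land in different windows.
        System.out.println(windowStart(a, 0L, DAY_MS) == windowStart(b, 0L, DAY_MS)); // prints false

        // Assumed offset of -8 hours: windows begin at local midnight in UTC+8.
        long offset = -8L * 60 * 60 * 1000;
        System.out.println(windowStart(a, offset, DAY_MS) == windowStart(b, offset, DAY_MS)); // prints true
    }
}
```

With the offset applied, both records fall into the window that starts at 2020-01-01 00:00:00 local time, so a count over that window would be 2.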
The following is a processing method of the SQL statement provided by the present application by several specific embodiments.
Fig. 1 is a schematic flow diagram of a first embodiment of the method for processing an SQL statement provided by the present invention. As shown in Fig. 1, the method may be applied to an electronic device having a data processing function, and comprises the following steps:
S101: Acquire an SQL statement to be processed, wherein the SQL statement comprises a window offset and a periodic output interval.
Generally, in the SQL functions available to users, the rolling window supports only two parameters (time field and window size), and the sliding window supports only three (time field, sliding interval and window size).
In this step, two further parameters are introduced into the definition and processing of the SQL statement while remaining compatible with the native syntax: the window offset and the periodic output interval. The SQL statement entered by the user includes both parameters, which are used to adjust the start time of the window and to support intermediate output during data processing.
In a specific implementation of this solution, the user may enter the SQL statement through a client or a web page, and the electronic device receives the SQL statement sent from there.
The electronic device receives a data processing task sent by a WEB client, the task comprising the SQL statement and a time attribute. The time attribute indicates whether the type of the window time field is processing time or event time.
S102: Determine the start time of the data processing window according to the window offset.
In this step, after the electronic device acquires the SQL statement, it determines the start time of the data processing window according to the window offset in the statement. Adding a window offset changes both the start time and the end time of the window.
In a particular implementation, the electronic device may determine the start time of the window from the time field and the window offset.
The window offset moves the window forward or backward along the time dimension, thereby changing its start and end times. For example, with a window size of 1 hour and a data time of 2020-01-01 12:30:00, the window is (2020-01-01 12:00:00, 2020-01-01 13:00:00); introducing a window offset of 5 minutes makes the window (2020-01-01 12:05:00, 2020-01-01 13:05:00). Similarly, if the window is 0:00 to 24:00 and the window offset is 8 h, the time window becomes 8:00 to 8:00 of the next day, which gives the new window start time.
S103: Determine a trigger for periodically outputting results according to the periodic output interval.
In this step, the electronic device determines a trigger according to the periodic output interval. The trigger periodically triggers the output of intermediate results, with the periodic output interval as the period. It registers a timer with the timing scheduler; when the registered time is reached, a callback function is triggered automatically, and the main job of that function is to output the intermediate result of the window.
S104: After the data enters the window operation, output an intermediate result of the data processing according to the start time of the window and the trigger.
In this step, once the start time of the window and the trigger have been determined, results of the data processing are output periodically after the data enters the window operation. The trigger-based callback processing differs according to the type of time.
Periodic output of intermediate results within a window means the following. Suppose the window size is 24 hours. Ordinarily the window outputs a result only when it ends. In this scheme, to show the intermediate result of the window closer to real time, a periodic time interval parameter (incrementInterval) can be defined, for example 1 hour, so that the current result of the window is output once every hour.
In a specific implementation, each time the electronic device outputs an intermediate result of the data processing, it also outputs a time field indicating the time at which that intermediate result was obtained.
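To make the idea of periodic output with an attached time field concrete, here is a minimal sketch that simulates a single window and emits a (time, count) pair at every interval boundary and once more at the window end. It is an illustration under stated assumptions, not the described system: the aggregate is a simple record count, timestamps are given in minutes, records are assumed sorted and inside the window, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class PeriodicWindowOutputDemo {

    // Returns one {emitTime, countSoFar} pair per periodic output, ending with the output at `end`.
    // Assumes sortedTimestamps is ascending and every value lies in [start, end).
    static List<long[]> intermediateCounts(long[] sortedTimestamps, long start, long end, long interval) {
        if (interval <= 0 || end <= start) {
            throw new IllegalArgumentException("interval must be positive and end must follow start");
        }
        List<long[]> outputs = new ArrayList<>();
        int seen = 0;
        for (long t = start + interval; ; t += interval) {
            long emitTime = Math.min(t, end);
            // Count every record that arrived before this output time.
            while (seen < sortedTimestamps.length && sortedTimestamps[seen] < emitTime) {
                seen++;
            }
            outputs.add(new long[] {emitTime, seen});
            if (emitTime == end) {
                break; // the final output is the ordinary end-of-window result
            }
        }
        return outputs;
    }

    public static void main(String[] args) {
        // A 24-hour window in minutes, hourly output, records at 06:30 and 16:30.
        for (long[] out : intermediateCounts(new long[] {390L, 990L}, 0L, 1440L, 60L)) {
            System.out.println("time=" + out[0] + " count=" + out[1]);
        }
    }
}
```

Each output carries the time at which it was produced, so a downstream report can tell an intermediate count at 07:00 from the final count at the end of the day.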
In a specific implementation of this step, if the time attribute in the data processing task to which the SQL statement belongs indicates that the window time field type is processing time, after the data enters the window operation, a timer is registered according to the trigger and system time, where the system time is the system time when the data enters the window operation.
The timer is triggered at the first integer multiple of the periodic output interval after the system time. When the timer time is reached, a callback function is triggered and an intermediate result of the data processing is output; the timer is then triggered again after the periodic output interval, and this step is repeated until the trigger time of the timer exceeds the end time of the window.
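The processing-time schedule just described can be sketched as a small calculation. This is a hedged illustration only: it assumes that multiples of the periodic output interval are counted from the window start, that integer division is used, and that a separate final output happens at the window end; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class ProcessingTimeTimers {

    // First timer: the first multiple of `interval`, counted from `start`, strictly after `timestamp`.
    static long firstTimer(long start, long timestamp, long interval) {
        return start + interval * (1 + (timestamp - start) / interval);
    }

    // All firing times for a record that enters the window at system time `timestamp`:
    // the periodic timers that fall before `end`, followed by the final timer at `end`.
    static List<Long> firingTimes(long start, long end, long timestamp, long interval) {
        List<Long> times = new ArrayList<>();
        for (long t = firstTimer(start, timestamp, interval); t < end; t += interval) {
            times.add(t);
        }
        times.add(end);
        return times;
    }

    public static void main(String[] args) {
        // One-hour window, 10-minute interval, record arriving 25 minutes after the window start.
        System.out.println(firingTimes(0L, 3_600_000L, 1_500_000L, 600_000L));
    }
}
```

In this example the record arrives at minute 25, so the first intermediate output is at minute 30, followed by minutes 40 and 50, and the last output is at the window end.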
In another specific implementation of this step, if the time attribute in the data processing task to which the SQL statement belongs indicates that the window time field type is event time, after the data enters the window operation, a timer is registered according to the trigger and the data time, where the data time is time included in the data;
The timer is triggered one periodic output interval after the start time of the window. When the timer time is reached, a callback function is triggered and an intermediate result of the data processing is output; the timer is then triggered again after the periodic output interval, and this step is repeated until the trigger time of the timer exceeds the end time of the window.
In the processing method of the SQL statement provided in this embodiment, in the big data processing process, by adding the window offset and the periodic output interval to the SQL statement in the task, in the processing process of the SQL statement, the start time of the window for data processing is determined according to the window offset, the trigger for the periodic output result is determined according to the periodic output interval, and after the data enters the window operation, the intermediate result for data processing is output according to the start time of the window and the trigger.
Fig. 2 is a schematic flow diagram of a second embodiment of a processing method for an SQL statement provided in the present invention, and as shown in fig. 2, on the basis of the foregoing embodiment, this embodiment provides an implementation scheme of a specific processing method for an SQL statement, which includes the following steps:
S201: Acquire the input SQL statement, which includes a time field, a window size, a window offset and a periodic output interval.
S202: Parse the SQL statement to obtain the time field, window size, window offset and periodic output interval.
In this step, the parsing of the SQL statement makes it possible to correctly identify two parameters (window offset and periodic output interval) newly added in the window function TUMBLE in SQL.
In one specific implementation, the SQL functions available to users are as follows:
Rolling window functions:
TUMBLE (time field, window size)
TUMBLE (time field, window size, window offset)
TUMBLE (time field, window size, window offset, periodic output interval)
The native rolling window supports only two parameters (time field and window size). While remaining compatible with the native form, this scheme introduces two further parameters: the window offset and the periodic output interval. The rolling window can be used in any of the three forms above.
The window offset may change the start time, end time of the window, and if there is a periodic output interval, the window intermediate result output may be sent downstream periodically by the interval size.
Sliding window functions:
HOP (time field, sliding interval, window size)
HOP (time field, sliding interval, window size, window offset)
HOP (time field, sliding interval, window size, window offset, periodic output interval)
The sliding window is implemented on the same principle as the rolling window.
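How the optional trailing parameters might be recognised after parsing can be sketched as follows. This is a hypothetical illustration, not the parser of the described system or of Flink: it assumes the argument list has already been extracted from the SQL text, that durations are given as plain millisecond numbers, that the sliding forms are invoked as HOP, and that an omitted offset or output interval defaults to 0. The class WindowCallSpec and its fields are invented for the example.

```java
import java.util.Arrays;
import java.util.List;

final class WindowCallSpec {
    final String function;        // "TUMBLE" or "HOP"
    final String timeField;
    final long slideMs;           // equals sizeMs for TUMBLE
    final long sizeMs;
    final long offsetMs;          // 0 when the window offset is omitted
    final long outputIntervalMs;  // 0 when no periodic output is requested

    private WindowCallSpec(String function, String timeField, long slideMs,
                           long sizeMs, long offsetMs, long outputIntervalMs) {
        this.function = function;
        this.timeField = timeField;
        this.slideMs = slideMs;
        this.sizeMs = sizeMs;
        this.offsetMs = offsetMs;
        this.outputIntervalMs = outputIntervalMs;
    }

    // Maps the positional arguments of a window function call onto named parameters.
    static WindowCallSpec fromArgs(String function, List<String> args) {
        boolean hop = "HOP".equalsIgnoreCase(function);
        if (!hop && !"TUMBLE".equalsIgnoreCase(function)) {
            throw new IllegalArgumentException("unsupported window function: " + function);
        }
        int required = hop ? 3 : 2; // native parameter count
        if (args.size() < required || args.size() > required + 2) {
            throw new IllegalArgumentException("wrong number of arguments for " + function);
        }
        long size = Long.parseLong(args.get(required - 1));
        long slide = hop ? Long.parseLong(args.get(1)) : size;
        long offset = args.size() > required ? Long.parseLong(args.get(required)) : 0L;
        long interval = args.size() > required + 1 ? Long.parseLong(args.get(required + 1)) : 0L;
        return new WindowCallSpec(function.toUpperCase(), args.get(0), slide, size, offset, interval);
    }

    public static void main(String[] args) {
        WindowCallSpec spec = fromArgs("TUMBLE", Arrays.asList("ts", "86400000", "28800000", "3600000"));
        System.out.println(spec.function + " size=" + spec.sizeMs
                + " offset=" + spec.offsetMs + " interval=" + spec.outputIntervalMs);
    }
}
```

Because the two new parameters are strictly trailing and optional, calls written in the native two- or three-parameter form parse exactly as before, which is the compatibility property the description asks for.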
S203: Adjust the start time of the window according to the window offset.
S204: Determine a trigger for periodically outputting results according to the periodic output interval.
In these two steps, after the window offset and the periodic output interval are obtained, the adjusted start time and end time of the window can be derived from the time field, the window size and the window offset.
The trigger determined from the periodic output interval is implemented as follows:
Processing Time: Under processing-time semantics, periodic output of window intermediate results is achieved by registering a Timer. When the Timer fires, a callback function is triggered, the result is output, and a new Timer is registered at the same time. The interval between Timers is fixed and equals the periodic output interval set in the window function.
Event Time: Under event-time semantics, unlike Processing Time, firing is determined by the event time of the data being processed, where the data designates a field of a time type as its event time. A Timer is also registered, but the output of the window intermediate result is triggered only when the time contained in the data is greater than or equal to the Timer time. Output is driven entirely by events and is independent of the system time.
S205: Bind, in the window optimization rule, the time field that triggers the output of a window result.
S206: Output the intermediate result of the window.
In the above steps, a window optimization rule is configured in advance so that a time field is output together with each intermediate result; the intermediate result and its corresponding time are thus obtained at the same time.
In the manner provided by this scheme, the user can set a window offset to customize the start time of the time window, and can set a periodic output interval to customize the output interval. The intermediate aggregation result of the window is output during processing, so the intermediate result state can be followed in real time.
Fig. 3 is a schematic flow diagram of a third embodiment of a processing method of an SQL statement provided in the present invention, and as shown in fig. 3, on the basis of the foregoing embodiment, the present embodiment provides a more specific implementation step of the processing method of the SQL statement:
S301: Window function (time field, window size, window offset, periodic output interval).
S302: Parse the parameters of the window function in the SQL statement and, if all four parameters are present, obtain their values.
In this scheme, the window offset is realized by introducing a function parameter at the SQL layer, configuring the corresponding offset parameter for the specific window during SQL parsing, and finally having the window assigner in the window operator allocate a window to each record. Window allocation is implemented by the function below. Introducing the offset changes the start time of the window; because the window size is fixed, the end time follows once the start time is known.
For example:
public static long getWindowStartWithOffset(long timestamp, long offset, long windowSize) {
    return timestamp - (timestamp - offset + windowSize) % windowSize;
}
Here, timestamp is the time in the data, that is, the value of the time field set in the window function; offset is the window offset value set in the window function; and windowSize is the window size set in the window function.
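As a worked check of the function above, the following sketch applies it to the one-hour window with a 5-minute offset used as an example earlier in this description, and derives the end time as start plus window size. The wrapper class, the windowFor and utcMillis helpers, and the use of UTC for the wall-clock values are assumptions made for illustration; the sketch also assumes non-negative timestamps and an offset smaller in magnitude than the window size.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;

final class OffsetWindowExample {

    static long getWindowStartWithOffset(long timestamp, long offset, long windowSize) {
        return timestamp - (timestamp - offset + windowSize) % windowSize;
    }

    // Returns {start, end} of the window containing `timestamp`; end = start + windowSize.
    static long[] windowFor(long timestamp, long offset, long windowSize) {
        long start = getWindowStartWithOffset(timestamp, offset, windowSize);
        return new long[] {start, start + windowSize};
    }

    // Epoch milliseconds of a date-time interpreted in UTC (hypothetical helper).
    static long utcMillis(String dateTime) {
        return LocalDateTime.parse(dateTime).toInstant(ZoneOffset.UTC).toEpochMilli();
    }

    public static void main(String[] args) {
        long hour = 60L * 60 * 1000;
        long fiveMinutes = 5L * 60 * 1000;
        long ts = utcMillis("2020-01-01T12:30:00");

        long[] plain = windowFor(ts, 0L, hour);
        long[] shifted = windowFor(ts, fiveMinutes, hour);

        // Window boundaries in minutes past 12:00, without and with the offset.
        long noon = utcMillis("2020-01-01T12:00:00");
        System.out.println((plain[0] - noon) / 60000 + " to " + (plain[1] - noon) / 60000);     // prints 0 to 60
        System.out.println((shifted[0] - noon) / 60000 + " to " + (shifted[1] - noon) / 60000); // prints 5 to 65
    }
}
```

Adding windowSize inside the modulo keeps the left operand non-negative when the offset is larger than the timestamp's remainder, which is why the formula also works for a negative offset of smaller magnitude than the window size.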
S303: Determine, according to the window time attribute, whether the window uses processing time or event time.
S304: Event time.
In the solution of the invention, the event time is the time at which each event occurs on its production facility. This time is typically embedded in the records before entering the Flink, and the event timestamp can be extracted from each record. At event time, the progression of time depends on the data, not any clock. The event time program must specify how the event time watermark is generated, which is a mechanism to represent the progress of the event time.
S305: Processing time.
In the solution of the invention, processing time is the system time of the machine performing the corresponding operation. When a stream processing program runs on processing time, all time-based operations (such as time windows) use the system clock of the machine running the corresponding operator. An hourly processing-time window includes all records that arrive at a particular operator between two consecutive full hours of the system clock. For example, if an application starts running at 9:15 AM, the first hourly processing-time window includes events processed between 9:15 AM and 10:00 AM, the next window includes events processed between 10:00 AM and 11:00 AM, and so on.
S306: a trigger is created that periodically outputs the result.
S307: the data enters a windowing operation.
S308: after the window is allocated for the data, the corresponding timer is registered using the trigger.
S309: Event time: register the timer according to the data time.
S310: Processing time: register the timer according to the system time.
S311: When the timer expires, the callback is invoked automatically and the window intermediate result is output.
As shown in fig. 3, during the processing of the SQL statement and the actual data processing, the trigger registers Timers according to the following algorithm:
(1) Case of processing time (Processing Time):
Assume that the window to which a piece of data belongs has start time start and end time end, that the system time corresponding to the piece of data is timestamp, and that the periodic output interval is incrementInterval. Two Timers are registered initially.
The trigger time of the first Timer is start + incrementInterval * (1 + (timestamp - start) / incrementInterval), where the division is integer division; the trigger time of the second Timer is end.
The trigger time of the second Timer is the end of the window, i.e., the time at which the window produces its final output. The trigger time of the first Timer is the first integer multiple of incrementInterval, counted from start, after the system time at which the current data entered the window operation. When the first Timer fires, the callback function is invoked, which outputs the intermediate result and at the same time registers a new Timer, so that intermediate results are output periodically. If the trigger time of the new Timer would exceed the end time end of the window, the new Timer is not registered.
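The processing-time registration rule can be simulated without a streaming engine. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, times are plain integers in milliseconds, a set stands in for the timer service so that a Timer registered twice for the same time fires once, and a first Timer that would fall beyond the window end is skipped (the text does not cover that edge case).

```python
def first_timer_time(start, timestamp, increment_interval):
    """First multiple of increment_interval (counted from start) after timestamp."""
    return start + increment_interval * (1 + (timestamp - start) // increment_interval)


def processing_time_firings(start, end, timestamp, increment_interval):
    """Simulate the Timers registered for one window and return their firing times."""
    timers = {end}  # second Timer: final output at the end of the window
    first = first_timer_time(start, timestamp, increment_interval)
    if first <= end:  # assumption: a first Timer past the window end is skipped
        timers.add(first)
    fired = []
    while timers:
        now = min(timers)  # the earliest pending Timer expires next
        timers.remove(now)
        fired.append(now)  # callback: output the intermediate or final result
        next_time = now + increment_interval
        if next_time <= end:  # never register a Timer beyond the window end
            timers.add(next_time)
    return fired


MINUTE_MS = 60 * 1000
# One-hour window starting at 0, data arriving at minute 15, output every 10 minutes.
firings = processing_time_firings(0, 60 * MINUTE_MS, 15 * MINUTE_MS, 10 * MINUTE_MS)
```

In this example the first Timer lands on minute 20 rather than minute 25, because trigger times are aligned to multiples of the interval from the window start, not to the arrival time of the data.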
(2) Case of event time (Event Time):
With Event Time, two Timers are likewise registered at the beginning: the time of the first Timer is start + incrementInterval, and the time of the second Timer is end. When a Timer expires, the callback function is invoked automatically; its behavior is the same as for processing time, but the algorithm that computes the time of the new Timer is different. The trigger time of the new Timer is start + incrementInterval * ((ctx.getCurrentWatermark() - start) / incrementInterval), and the new Timer is registered only if this time does not exceed the end time end. In Flink, the output of event-time window results is driven by the watermark: when the watermark time exceeds the time of a Timer, the Timer fires, the callback function is executed, and the corresponding intermediate result is output.
This avoids registering unnecessary Timers by ensuring that a registered Timer is not earlier than the current watermark time.
In a specific implementation of the scheme, the behavior is therefore as follows: with Processing Time, the callback is invoked as soon as the timer time is reached; with Event Time, the firing of the timer depends on the watermark, and the timer calls back only when the time represented by the watermark exceeds the timer time. The watermark is a concept of the Flink computing engine and is mainly used for windows under event time. After each piece of data reaches the Flink engine, the time is extracted from the data and stored in the watermark, and the watermark is sent downstream at regular intervals. When the watermark passes through a window, the output of the window result is triggered if the time represented by the watermark exceeds the end time of the window or the time of a timer registered by the incremental window.
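The watermark-driven behavior can be sketched as a small simulation. This is a hedged illustration, not the patent's trigger: the function name `event_time_firings` is hypothetical, times are plain integers, and the watermark values are made up. The watermark-aligned time is computed as described in the text; taking the later of that value and the fired Timer plus one interval is an assumption added here so that the simulation always advances.

```python
def event_time_firings(start, end, increment_interval, watermarks):
    """Return the Timer times that fire as the given watermarks arrive in order."""
    # Initial registration: first periodic output and the final output at window end.
    timers = {start + increment_interval, end}
    fired = []
    for watermark in watermarks:
        # A Timer fires only once the watermark has passed its time.
        while timers and min(timers) < watermark:
            timer = min(timers)
            timers.remove(timer)
            fired.append(timer)  # callback: output the intermediate or final result
            # Watermark-aligned multiple of the interval, counted from start.
            aligned = start + increment_interval * (
                (watermark - start) // increment_interval
            )
            # Assumption: never re-register at or before the Timer that just fired.
            next_time = max(timer + increment_interval, aligned)
            if next_time <= end:  # no Timer beyond the window end
                timers.add(next_time)
    return fired


# Window [0, 60) with an output interval of 10; watermarks jump from 35 to 61.
firings = event_time_firings(0, 60, 10, [5, 12, 35, 61])
```

Because the last watermark jumps from 35 to 61, the Timer that fires at 40 re-registers directly at 60, so no separate Timer for 50 is created. This is the kind of unnecessary Timer the text aims to avoid.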
In a specific implementation of this scheme, the window function TUMBLE(time field, window size) is changed to TUMBLE(time field, window size, window offset). For example, with a window size of 24 hours and a time field named rowtime, TUMBLE(rowtime, INTERVAL '24' HOUR) is changed to TUMBLE(rowtime, INTERVAL '24' HOUR, INTERVAL '-8' HOUR). This solves the window statistics error. For a day-level window, i.e., a window size of 24 hours, 48 hours, and so on, if the data time is 2020-01-21 06:00:00, the window to which the data is assigned without an offset is (2020-01-20 08:00:00, 2020-01-21 08:00:00), whereas the data actually belongs to the window (2020-01-21 00:00:00, 2020-01-22 00:00:00). Introducing the offset corrects the window, so that the data is assigned to the correct window (2020-01-21 00:00:00, 2020-01-22 00:00:00).
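The effect of the offset in this example can be reproduced with modulo arithmetic on epoch milliseconds. The sketch below rests on assumptions that the text does not state: that the data time is local time in UTC+8 (suggested by the -8 hour offset), that window starts are aligned to the epoch, and that the start is computed with the modulo formula shown. The names `window_bounds` and `to_local` are hypothetical.

```python
from datetime import datetime, timedelta, timezone

HOUR_MS = 60 * 60 * 1000
UTC8 = timezone(timedelta(hours=8))  # assumed local time zone of the data


def window_bounds(timestamp_ms, window_size_ms, offset_ms=0):
    """Return (start, end) of the tumbling window containing timestamp_ms."""
    start = timestamp_ms - (timestamp_ms - offset_ms + window_size_ms) % window_size_ms
    return start, start + window_size_ms


def to_local(ms):
    """Format epoch milliseconds as local (UTC+8) wall-clock time."""
    return datetime.fromtimestamp(ms / 1000, tz=UTC8).strftime("%Y-%m-%d %H:%M:%S")


data_time = datetime(2020, 1, 21, 6, 0, 0, tzinfo=UTC8)
ts = int(data_time.timestamp() * 1000)

without_offset = tuple(to_local(t) for t in window_bounds(ts, 24 * HOUR_MS))
with_offset = tuple(
    to_local(t) for t in window_bounds(ts, 24 * HOUR_MS, -8 * HOUR_MS)
)
```

Under these assumptions, `without_offset` is the window from 2020-01-20 08:00:00 to 2020-01-21 08:00:00, and `with_offset` is the window from 2020-01-21 00:00:00 to 2020-01-22 00:00:00, matching the example in the text.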
Overall, the core of the technical solution provided by this scheme is as follows: a window offset and a periodic output interval are introduced; periodic output of intermediate window results is implemented for the two time semantics, Event Time and Processing Time, which requires implementing a corresponding Trigger; the SQL-level API is bound to the underlying SQL optimization layer; and the two newly introduced parameters, the periodic output interval (incrementInterval) and the offset (window offset), are correctly parsed and recognized. This solves the prior-art problems that users cannot define the start time of a window and cannot follow intermediate results in real time, and it also avoids errors in the aggregation results of the data within a window.
Fig. 4 is a schematic structural diagram of a first embodiment of a processing apparatus for an SQL statement provided in the present invention. As shown in fig. 4, the processing apparatus 10 for an SQL statement includes:
an acquisition module 11, configured to acquire an SQL statement to be processed, wherein the SQL statement comprises a window offset and a periodic output interval;
a processing module 12, configured to determine a start time of a window for data processing according to the window offset;
the processing module 12 is further configured to determine a trigger of a periodic output result according to the periodic output interval;
and the output module 13 is configured to output an intermediate result of data processing according to the start time of the window and the trigger after the data enters the window operation.
On the basis of the above embodiment, the output module 13 is further configured to:
the time field is output while outputting the intermediate result of the data processing each time.
In a specific implementation of the foregoing embodiment, the acquisition module 11 is further configured to:
and acquiring a time attribute in the data processing task to which the SQL statement belongs, wherein the time attribute is used for indicating that the type of the window time field is processing time or event time.
Optionally, the time attribute is used to indicate that the type of the window time field is processing time, and the output module 13 is specifically configured to:
after the data enters the window operation, registering a timer according to the trigger and the system time, wherein the system time is the system time when the data enters the window operation;
triggering said timer at a first integer multiple of said periodic output interval after said system time;
and, when the timer's time is reached, triggering a callback function, outputting an intermediate result of data processing, triggering the timer again after the periodic output interval, and repeating this step until the trigger time of the timer exceeds the end time of the window.
Optionally, the time attribute is used to indicate that the type of the window time field is event time, and the output module 13 is specifically configured to:
after the data enters the window operation, registering a timer according to the trigger and the data time, wherein the data time is the time included in the data;
triggering the timer once the periodic output interval has elapsed after the start of the window;
and, when the timer's time is reached, triggering a callback function, outputting an intermediate result of data processing, triggering the timer again after the periodic output interval, and repeating this step until the trigger time of the timer exceeds the end time of the window.
Optionally, the SQL statement further includes a time field and a window size;
correspondingly, the processing module 12 is specifically configured to:
determining the start time of the window of data processing according to the time field and the window offset.
Optionally, the processing module 12 is further configured to:
and determining the end time of the window for data processing according to the start time and the window size.
Optionally, the acquisition module 11 is further configured to:
and receiving a data processing task sent by a WEB client, wherein the data processing task comprises the SQL statement and the time attribute.
The processing apparatus of the SQL statement provided in any of the above embodiments is configured to execute the technical solutions of any of the foregoing method embodiments; its implementation principle and technical effects are similar. A window offset and a periodic output interval are added to the SQL statement. During the processing of the SQL statement, the start time of the window for data processing is determined according to the window offset, and a trigger for periodically outputting results is determined according to the periodic output interval. After data enters the window operation, intermediate results of data processing are output according to the start time of the window and the trigger. Adjusting the start time of the window is thereby supported, and by periodically outputting intermediate results, changes in the data can be followed in real time.
Fig. 5 is a schematic structural diagram of an electronic device provided in the present invention. As shown in fig. 5, the present embodiment provides an electronic device 20 including:
a processor 21 and an interactive interface 23; and
a memory 22 for storing executable instructions of the processor;
wherein the processor 21 is configured to execute the processing method of the SQL statement provided by any of the foregoing method embodiments by executing the executable instructions.
Optionally, the memory 22 may be separate from the processor 21 or integrated with it.
When the memory 22 is a device independent from the processor 21, the electronic device 20 may further include:
a bus 24 for connecting the processor 21 and the memory 22.
The embodiment of the present invention further provides a storage medium, where a computer program is stored in the storage medium, and when the computer program is executed by a processor, the method for processing the SQL statement provided in any of the above method embodiments is implemented.
Embodiments of the present invention also provide a computer program product, which includes a computer program, and the computer program is stored in a storage medium. The computer program can be read by at least one processor from a readable storage medium, and the computer program can be executed by at least one processor to implement the processing method of the SQL statement provided by any of the foregoing method embodiments.
Those of ordinary skill in the art will understand that: all or a portion of the steps of implementing the above-described method embodiments may be performed by hardware associated with program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs steps comprising the method embodiments described above; and the aforementioned storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (12)

1. A method for processing SQL statements is characterized by comprising the following steps:
acquiring a Structured Query Language (SQL) statement to be processed, wherein the SQL statement comprises a window offset and a periodic output interval;
determining the starting time of a window for data processing according to the window offset;
determining a trigger of a periodic output result according to the periodic output interval;
and after the data enters the window operation, outputting an intermediate result of data processing according to the starting time of the window and the trigger.
2. The method of claim 1, further comprising:
the time field is output while outputting the intermediate result of the data processing each time.
3. The method according to claim 1 or 2, characterized in that the method further comprises:
and acquiring a time attribute in the data processing task to which the SQL statement belongs, wherein the time attribute is used for indicating that the type of the window time field is processing time or event time.
4. The method of claim 3, wherein the time attribute is used to indicate that the type of the window time field is processing time, and the outputting an intermediate result of data processing according to the start time of the time window and the trigger after the data enters the window operation comprises:
after the data enters the window operation, registering a timer according to the trigger and the system time, wherein the system time is the system time when the data enters the window operation;
triggering said timer at a first integer multiple of said periodic output interval after said system time;
and, when the timer's time is reached, triggering a callback function, outputting an intermediate result of data processing, triggering the timer again after the periodic output interval, and repeating this step until the trigger time of the timer exceeds the end time of the window.
5. The method according to claim 3, wherein the time attribute is used to indicate that the type of the window time field is event time, and the outputting an intermediate result of data processing according to the start time of the time window and the periodic output interval after the data enters the window operation comprises:
after the data enters the window operation, registering a timer according to the trigger and the data time, wherein the data time is the time included in the data;
triggering the timer once the periodic output interval has elapsed after the start of the window;
and, when the timer's time is reached, triggering a callback function, outputting an intermediate result of data processing, triggering the timer again after the periodic output interval, and repeating this step until the trigger time of the timer exceeds the end time of the window.
6. The method according to claim 1 or 2, wherein the SQL statement further comprises a time field, a window size;
correspondingly, the determining the start time of the window of the data processing according to the window offset includes:
determining the start time of the window of data processing according to the time field and the window offset.
7. The method of claim 6, further comprising:
and determining the end time of the window for data processing according to the start time and the window size.
8. The method of claim 3, further comprising:
and receiving a data processing task sent by a WEB client, wherein the data processing task comprises the SQL statement and the time attribute.
9. An apparatus for processing an SQL statement, comprising:
an acquisition module, configured to acquire an SQL statement to be processed, wherein the SQL statement comprises a window offset and a periodic output interval;
a processing module, configured to determine the start time of a window for data processing according to the window offset;
the processing module being further configured to determine a trigger of a periodic output result according to the periodic output interval;
and an output module, configured to output an intermediate result of data processing according to the start time of the window and the trigger after the data enters the window operation.
10. An electronic device, comprising:
a processor and an interactive interface; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the method of processing the SQL statement according to any one of claims 1 to 8 via execution of the executable instructions.
11. A storage medium on which a computer program is stored, the program implementing the processing method of the SQL statement according to any one of claims 1 to 8 when executed by a processor.
12. A computer program product comprising a computer program for implementing the method of processing the SQL statement according to any one of claims 1 to 8 when executed by a processor.
CN202110153387.4A 2021-02-04 2021-02-04 SQL statement processing method, device, medium and electronic equipment Pending CN113779061A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110153387.4A CN113779061A (en) 2021-02-04 2021-02-04 SQL statement processing method, device, medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110153387.4A CN113779061A (en) 2021-02-04 2021-02-04 SQL statement processing method, device, medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN113779061A true CN113779061A (en) 2021-12-10

Family

ID=78835562

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110153387.4A Pending CN113779061A (en) 2021-02-04 2021-02-04 SQL statement processing method, device, medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN113779061A (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101957832A (en) * 2009-07-16 2011-01-26 Sap股份公司 Unified window support for the flow of event data management
CN103309873A (en) * 2012-03-09 2013-09-18 阿里巴巴集团控股有限公司 Method and device for processing data, and system
CN104765765A (en) * 2015-02-15 2015-07-08 杭州邦盛金融信息技术有限公司 Moveable dynamic data rapid processing method based on time window
CN104881203A (en) * 2015-04-24 2015-09-02 青岛海信移动通信技术股份有限公司 Touch operation method and device in terminal
US20160125033A1 (en) * 2013-06-21 2016-05-05 Hitachi, Ltd. Stream data processing method with time adjustment
CN107992516A (en) * 2017-10-27 2018-05-04 平安科技(深圳)有限公司 Electronic device, the method for data query and storage medium
US20190130004A1 (en) * 2017-10-27 2019-05-02 Streamsimple, Inc. Streaming Microservices for Stream Processing Applications
US20190251196A1 (en) * 2018-02-09 2019-08-15 International Business Machines Corporation Transforming a scalar subquery
CN110622152A (en) * 2017-02-27 2019-12-27 分秒库公司 Scalable database system for querying time series data
CN111177178A (en) * 2019-12-03 2020-05-19 腾讯科技(深圳)有限公司 Data processing method and related equipment
US20210019309A1 (en) * 2019-07-16 2021-01-21 Thoughtspot, Inc. Mapping Natural Language To Queries Using A Query Grammar

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination