CN110209700B - Data stream association method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN110209700B
Authority
CN
China
Prior art keywords
data stream
association
analysis result
correlation
data
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910439751.6A
Other languages
Chinese (zh)
Other versions
CN110209700A (en
Inventor
蒋戈
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Application filed by Beijing QIYI Century Science and Technology Co Ltd
Priority to CN201910439751.6A
Publication of CN110209700A
Application granted
Publication of CN110209700B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24568 Data stream processing; Continuous queries
    • G06F16/25 Integrating or interfacing systems involving database management systems

Abstract

Embodiments of the invention provide a data stream association method and device, electronic equipment, and a storage medium. The method comprises: acquiring a data stream association instruction, wherein the data stream association instruction comprises an input source, an output source, and an association condition of each data stream to be associated; parsing the data stream association instruction to obtain data stream association information, wherein the data stream association information comprises the input source, the output source, and the association condition of each data stream to be associated; configuring a Flink SQL framework according to the input source, the output source, and the association condition of each data stream to be associated; and acquiring each data stream to be associated through the configured Flink SQL framework, and performing association analysis on each data stream to be associated according to the association condition configured in the Flink SQL framework, to obtain an association analysis result. Embodiments of the invention thereby associate real-time data streams using simple SQL.

Description

Data stream association method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of internet technologies, and in particular, to a data stream association method and apparatus, an electronic device, and a storage medium.
Background
With the rapid development of internet technology, more and more data streams are being generated. In the prior art, each data stream is processed by an independent thread. However, as the number of internet users grows, the generated data streams become related to one another, so real-time data streams need to be processed in association with each other to support further analysis by implementers.
Disclosure of Invention
Embodiments of the present invention provide a data stream association method and apparatus, an electronic device, and a storage medium, which associate real-time data streams using simple SQL. The specific technical solution is as follows:
In a first aspect, an embodiment of the present invention discloses a data stream association method, where the method includes:
acquiring a data stream association instruction, wherein the data stream association instruction comprises an input source, an output source, and an association condition of each data stream to be associated;
parsing the data stream association instruction to obtain data stream association information, wherein the data stream association information comprises the input source, the output source, and the association condition of each data stream to be associated;
configuring a Flink SQL framework according to the input source, the output source, and the association condition of each data stream to be associated;
and acquiring each data stream to be associated through the configured Flink SQL framework, and performing association analysis on each data stream to be associated according to the association condition configured in the Flink SQL framework, to obtain an association analysis result.
Optionally, the input source includes a storage address and an identifier of each data stream to be associated; the output source includes a storage location of the association analysis result; and the association condition includes an association field of each data stream to be associated.
Optionally, acquiring each data stream to be associated through the configured Flink SQL framework, and performing association analysis on each data stream to be associated according to the association condition configured in the Flink SQL framework to obtain an association analysis result, includes:
acquiring each data stream to be associated according to the storage address and the identifier of each data stream to be associated;
and performing association analysis on each data stream to be associated according to the association field of each data stream to be associated, to obtain an association analysis result containing every field of each data stream to be associated.
Optionally, the association condition further includes a preset field; acquiring each data stream to be associated through the configured Flink SQL framework, and performing association analysis on each data stream to be associated according to the association condition configured in the Flink SQL framework to obtain an association analysis result, includes:
acquiring each data stream to be associated according to the storage address and the identifier of each data stream to be associated;
and performing association analysis on each data stream to be associated according to the association field of each data stream to be associated and the preset field, to obtain an association analysis result that contains at least the preset field.
In a second aspect, an embodiment of the present invention discloses a data association apparatus, where the apparatus includes:
a data stream association instruction acquisition module, configured to acquire a data stream association instruction, wherein the data stream association instruction comprises an input source, an output source, and an association condition of each data stream to be associated;
a data stream association instruction parsing module, configured to parse the data stream association instruction to obtain data stream association information, wherein the data stream association information comprises the input source, the output source, and the association condition of each data stream to be associated;
a Flink SQL framework configuration module, configured to configure a Flink SQL framework according to the input source, the output source, and the association condition of each data stream to be associated;
and an association analysis result determining module, configured to acquire each data stream to be associated through the configured Flink SQL framework, and to perform association analysis on each data stream to be associated according to the association condition configured in the Flink SQL framework, to obtain an association analysis result.
Optionally, the input source includes a storage address and an identifier of each data stream to be associated; the output source includes a storage location of the association analysis result; and the association condition includes an association field of each data stream to be associated.
Optionally, the association analysis result determining module includes:
a first to-be-associated data stream acquisition submodule, configured to acquire each data stream to be associated according to the storage address and the identifier of each data stream to be associated;
and a first association analysis result determining submodule, configured to perform association analysis on each data stream to be associated according to the association field of each data stream to be associated, to obtain an association analysis result containing every field of each data stream to be associated.
Optionally, the association condition further includes a preset field; the association analysis result determining module includes:
a second to-be-associated data stream acquisition submodule, configured to acquire each data stream to be associated according to the storage address and the identifier of each data stream to be associated;
and a second association analysis result determining submodule, configured to perform association analysis on each data stream to be associated according to the association field of each data stream to be associated and the preset field, to obtain an association analysis result that contains at least the preset field.
Optionally, the apparatus further comprises:
an association analysis result storage module, configured to store the association analysis result to the output source.
In another aspect, an embodiment of the present invention discloses an electronic device, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory communicate with one another through the communication bus;
the memory is used for storing a computer program;
the processor is configured to implement the method steps of any of the above data stream association methods when executing the program stored in the memory.
In another aspect, an embodiment of the present invention discloses a computer-readable storage medium in which a computer program is stored; when the computer program is executed by a processor, the method steps of any one of the above data stream association methods are implemented.
In yet another aspect, an embodiment of the present invention discloses a computer program product containing instructions which, when run on a computer, implement the method steps of any one of the above data stream association methods.
Embodiments of the invention provide a data stream association method and device, electronic equipment, and a storage medium, which associate real-time data streams using simple SQL. Specifically, a data stream association instruction is acquired; the acquired instruction is parsed to obtain the input source, the output source, and the association condition of the data streams to be associated that it contains; a preset Flink SQL framework is configured with that input source, output source, and association condition; and each data stream to be associated is acquired through the configured Flink SQL framework and subjected to association analysis according to the association condition configured in the framework, to obtain an association analysis result. Because the preset Flink SQL framework is configured and the association analysis is carried out through the configured framework, the association of big data streams is achieved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.
FIG. 1 is a flowchart of a data stream association method according to an embodiment of the present invention;
FIG. 2 is a flowchart of a method for determining an association analysis result in a data stream association method according to an embodiment of the present invention;
FIG. 3 is a flowchart of a method for determining an association analysis result in a data stream association method according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a data association apparatus according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described below with reference to the drawings in the embodiments of the present invention.
In a first aspect, an embodiment of the present invention discloses a data stream association method, as shown in FIG. 1. FIG. 1 is a flowchart of a data stream association method according to an embodiment of the present invention, where the method includes:
S101, acquiring a data stream association instruction, wherein the data stream association instruction comprises an input source, an output source, and an association condition of each data stream to be associated.
In this step, a data stream association instruction is acquired. The data stream association instruction may be an instruction written by a user in a specified SQL-like (Structured Query Language) syntax. It comprises the input source, the output source, and the association condition of each data stream to be associated.
Optionally, the input source includes a storage address and an identifier of each data stream to be associated; the output source includes a storage location of the association analysis result; and the association condition includes an association field and an association time range of each data stream to be associated.
The storage address of a data stream to be associated may be an address in Kafka, a distributed publish-subscribe messaging system, and the identifier may be a uniquely identifying field of the data stream to be associated, such as its title, name, or number.
For example, an embodiment of the invention can associate a user click stream with an advertisement feature stream to obtain an association analysis result that supports better decisions in the next round of advertisement recommendation. The data stream association instruction includes the Kafka address and title of the user click stream and the Kafka address and title of the advertisement feature stream. It further includes the storage address of the association analysis result, for example a folder A newly created by the user in a local database. The association condition contained in the instruction is the user ID (identity) and the advertisement type. That is, an association between each user and each advertisement type is established according to the user ID, and each user's advertisement preference is determined, so that advertisements matching that preference can later be recommended to the user.
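The three parts of such an instruction can be pictured as a small data structure. The following is only a sketch: the class names, field names, and the Kafka address are hypothetical placeholders, and the patent does not prescribe any particular representation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class InputSource:
    kafka_address: str  # storage address of the stream (placeholder value below)
    title: str          # identifier of the stream


@dataclass(frozen=True)
class AssociationInstruction:
    inputs: List[InputSource]      # one entry per data stream to be associated
    output_location: str           # where the association analysis result is stored
    association_fields: List[str]  # fields the streams are associated on


# Hypothetical values for the click-stream / advertisement-feature example.
ad_instruction = AssociationInstruction(
    inputs=[
        InputSource("kafka-host:9092", "user_click_stream"),
        InputSource("kafka-host:9092", "ad_feature_stream"),
    ],
    output_location="A",
    association_fields=["user_id", "ad_type"],
)
```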
In addition, an embodiment of the invention can associate the IP (Internet Protocol) stream in a security cloud with the port stream to obtain an association analysis result, refining monitoring that was previously at the IP level to the port level. The data stream association instruction includes the Kafka address and title of the IP stream and the Kafka address and title of the port stream. It also includes the storage address of the association analysis result, for example a folder B newly created by the user in the local database. The association condition contained in the instruction is the user IP, with an association time range of 12:00-12:30. That is, according to the user IP, the network information association between each user IP and the ports within the period 12:00-12:30 is established, yielding an association analysis result for user IPs and ports that makes port information easier to monitor.
In addition, the embodiment of the present invention may also establish associations for three or more data streams to be associated, and the specific application scenarios of the associations are set by implementers, which is not limited herein.
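One way to picture associating three or more streams is to fold pairwise joins over the list of streams. The sketch below uses in-memory lists of dictionaries as stand-ins for streams; `join_two`, `join_many`, and the sample records are hypothetical, and this is not how Flink itself executes a multi-way join.

```python
from functools import reduce
from typing import Dict, List

Record = Dict[str, object]


def join_two(left: List[Record], right: List[Record], key: str) -> List[Record]:
    """Inner-join two record lists on `key`; right-hand fields win on name clashes."""
    index: Dict[object, List[Record]] = {}
    for record in right:
        index.setdefault(record[key], []).append(record)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def join_many(streams: List[List[Record]], key: str) -> List[Record]:
    """Associate any number (>= 2) of record lists by chaining pairwise joins."""
    if len(streams) < 2:
        raise ValueError("at least two streams are required")
    return reduce(lambda acc, stream: join_two(acc, stream, key), streams)


users = [{"user_id": 1, "name": "a"}, {"user_id": 2, "name": "b"}]
clicks = [{"user_id": 1, "page": "home"}]
ads = [{"user_id": 1, "ad_type": "sports"}]
joined = join_many([users, clicks, ads], "user_id")
```

Here user 2 drops out because it has no matching click record, which is the usual inner-join behavior.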
S102, analyzing the data stream association instruction to obtain data stream association information, wherein the data stream association information comprises an input source, an output source and an association condition of each data stream to be associated.
In this step, the data stream association instruction acquired in step S101 is parsed, for example by a parser, to obtain the parameters and processing logic contained in the instruction.
For example, parsing the data stream association instruction from S101 yields the input sources it contains: the Kafka address and title of the user click stream, and the Kafka address and title of the advertisement feature stream. The output source contained in the instruction is obtained as: the association analysis result is stored in folder A in the local database. The association condition contained in the instruction is obtained as: user ID (identity) and advertisement type.
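The parsing step can be illustrated with a toy parser. The patent does not disclose its instruction grammar, so the `ASSOCIATE ... ON ... INTO ...` syntax below is invented purely for illustration, as are the function name and the sample addresses.

```python
import re
from typing import Dict, List

# Hypothetical grammar (not from the patent):
#   ASSOCIATE <title>@<address>, ... ON <field>, ... INTO <location>
_PATTERN = re.compile(
    r"^\s*ASSOCIATE\s+(?P<inputs>.+?)\s+ON\s+(?P<fields>.+?)\s+INTO\s+(?P<output>\S+)\s*$",
    re.IGNORECASE,
)


def parse_instruction(text: str) -> Dict[str, object]:
    """Split an instruction into input sources, association fields, and output source."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError("not a valid association instruction")
    inputs: List[Dict[str, str]] = []
    for item in match.group("inputs").split(","):
        title, sep, address = item.strip().partition("@")
        if not sep or not title or not address:
            raise ValueError(f"bad input source: {item!r}")
        inputs.append({"title": title, "address": address})
    fields = [f.strip() for f in match.group("fields").split(",") if f.strip()]
    return {"inputs": inputs, "output": match.group("output"), "fields": fields}


info = parse_instruction(
    "ASSOCIATE user_click@kafka-a:9092, ad_feature@kafka-b:9092 "
    "ON user_id, ad_type INTO A"
)
```

The returned dictionary plays the role of the data stream association information: input sources, output source, and association condition.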
S103, configuring a Flink SQL framework according to the input source, the output source, and the association condition of each data stream to be associated.
After the input source, the output source, and the association condition of each data stream to be associated have been obtained, this step configures them into a previously established Flink SQL framework.
Flink SQL follows standard ANSI SQL (American National Standards Institute Structured Query Language) syntax. Some of its core features are: DDL (Data Definition Language), used to define data source tables and data structure tables; UDF (User-Defined Function) and UDTF (User-Defined Table-generating Function), which allow users to implement complex custom business requirements; and JOIN, including joins between data streams, joins between data streams and data tables, and window joins.
By using these features of Flink SQL, the embodiment of the invention establishes a general Flink SQL framework in advance, and this framework can recognize the data stream association logic written by a user. In this step, the Flink SQL framework recognizes the input source, the output source, and the association condition of each data stream to be associated that are contained in the input data stream association information, and configures itself according to that information.
The steps of pre-establishing the Flink SQL framework in the embodiment of the invention are as follows:
1. pre-establishing a general real-time stream computing task (based on Flink SQL), which covers parsing the input data sources, associating multiple data streams, and outputting data to a specified address;
2. constructing a task management platform (already implemented), on which a user can write, submit, and manage (stop and restart) SQL tasks; after writing the SQL, the user clicks submit, which starts the real-time stream computing task from step 1 so that it actually runs;
3. having the user fill in the parameters required by the stream task in the specified format, which are subsequently submitted to the stream computing task for parsing and execution.
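Configuring the framework can be thought of as turning the parsed association information into Flink SQL text. The sketch below only builds strings and does not talk to Flink. The `WITH` option names follow the Kafka SQL connector of recent Flink releases, which may differ from the Flink version the patent targeted; the column schemas, table names, and the use of the stream title as the topic name are assumptions.

```python
from typing import Dict, List


def source_ddl(title: str, address: str, columns: Dict[str, str]) -> str:
    """Build a CREATE TABLE statement for one Kafka-backed input stream.

    Values are interpolated without escaping, so this is unsuitable for untrusted input.
    """
    cols = ",\n  ".join(f"{name} {sql_type}" for name, sql_type in columns.items())
    return (
        f"CREATE TABLE {title} (\n  {cols}\n) WITH (\n"
        f"  'connector' = 'kafka',\n"
        f"  'topic' = '{title}',\n"  # assumption: topic name equals the stream title
        f"  'properties.bootstrap.servers' = '{address}',\n"
        f"  'format' = 'json'\n)"
    )


def join_query(left: str, right: str, fields: List[str], sink: str,
               select_list: List[str]) -> str:
    """Build an INSERT ... SELECT ... JOIN statement from the association fields."""
    condition = " AND ".join(f"{left}.{f} = {right}.{f}" for f in fields)
    columns = ", ".join(select_list)
    return f"INSERT INTO {sink} SELECT {columns} FROM {left} JOIN {right} ON {condition}"


ddl = source_ddl("user_click", "kafka-a:9092", {"user_id": "BIGINT", "ad_type": "STRING"})
query = join_query(
    "user_click", "ad_feature", ["user_id", "ad_type"], "result_a",
    ["user_click.user_id", "user_click.ad_type", "ad_feature.bid"],
)
```

In a real deployment the generated statements would be handed to a Flink table environment; a sink table for `result_a` would also need its own DDL, which is omitted here.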
S104, acquiring each data stream to be associated through the configured Flink SQL framework, and performing association analysis on each data stream to be associated according to the association condition configured in the Flink SQL framework, to obtain an association analysis result.
After configuration is complete, the Flink SQL framework serves as the technical basis: the data streams to be associated are acquired according to the configured association logic and are associated and analyzed according to the association condition configured in the framework, thereby obtaining an association analysis result.
In the data stream association method provided by the embodiment of the invention, a data stream association instruction is acquired; the acquired instruction is parsed to obtain the input source, the output source, and the association condition of the data streams to be associated that it contains; a preset Flink SQL framework is configured with that input source, output source, and association condition; and each data stream to be associated is acquired through the configured Flink SQL framework and subjected to association analysis according to the association condition configured in the framework, to obtain an association analysis result. Because the preset Flink SQL framework is configured and the association analysis is carried out through the configured framework, the association of big data streams is achieved.
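To give a feel for what associating live streams involves, the sketch below implements a symmetric hash join in plain Python: each arriving record is buffered by key and immediately matched against everything already seen on the other side. This is a generic textbook technique offered as an illustration, not a description of Flink's internals; the function name and sample events are hypothetical, and the unbounded buffers would need expiry in any real system.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Record = Dict[str, object]


def symmetric_hash_join(
    events: Iterable[Tuple[str, Record]], key: str
) -> Iterator[Tuple[Record, Record]]:
    """Yield (left, right) pairs as soon as both sides of a key have arrived.

    `events` interleaves records tagged "left" or "right", mimicking two live streams.
    """
    seen: Dict[str, Dict[object, List[Record]]] = {
        "left": defaultdict(list),
        "right": defaultdict(list),
    }
    for side, record in events:
        other = "right" if side == "left" else "left"
        seen[side][record[key]].append(record)  # buffer for future matches
        for match in seen[other].get(record[key], []):
            yield (record, match) if side == "left" else (match, record)


events = [
    ("left", {"user_id": 1, "clicked": "banner"}),
    ("right", {"user_id": 1, "ad_type": "sports"}),
    ("right", {"user_id": 2, "ad_type": "travel"}),
    ("left", {"user_id": 2, "clicked": "video"}),
]
pairs = list(symmetric_hash_join(events, "user_id"))
```

Each pair is emitted at the moment its second half arrives, regardless of which stream delivered it.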
However, the inventor found that when association and logic processing are implemented directly in Spark/Flink, new code must be written for each association task. For novices or developers unfamiliar with big data development, the learning cost is high and development takes a long time, so association tasks are processed inefficiently.
In the embodiment of the invention, the Flink SQL framework is written in advance to accept an SQL-like syntax, so a user only needs to write a data stream association instruction in SQL for each task. The Flink SQL framework is then configured with the data stream association information parsed from the instruction, and the data streams to be associated are finally associated and analyzed through the configured framework to obtain the association analysis result. This lowers the entry threshold for real-time big data tasks while completing the association of their data streams. It reduces the cost of learning big data computing frameworks such as Spark/Flink, shields the user from the influence of resources on task performance, and lets the user implement multi-stream association tasks quickly.
Optionally, in an embodiment of the data stream association method of the present invention, step S104 above, in which the data streams to be associated are acquired through the configured Flink SQL framework and are associated and analyzed according to the association condition configured in the framework to obtain an association analysis result, may be carried out as shown in FIG. 2. FIG. 2 is a flowchart of a method for determining an association analysis result in a data stream association method according to an embodiment of the present invention, where the method includes:
S201, acquiring each data stream to be associated according to the storage address and the identifier of each data stream to be associated.
In this step, each corresponding data stream to be associated may be acquired according to its storage address and identifier. For example, when the user click stream is associated with the advertisement feature stream, this step determines the storage location of each data stream to be associated from its Kafka address, and then obtains each data stream to be associated, together with its title, at that storage location.
S202, performing association analysis on each data stream to be associated according to the association field of each data stream to be associated, to obtain an association analysis result containing every field of each data stream to be associated.
For example, in the embodiment of the present invention, a user click stream is associated with an advertisement feature stream, where the association condition is the association fields of the data streams to be associated: user ID (identity) and advertisement type. That is, an association between each user and each advertisement type is established according to the user ID. In this step, every field of the user click stream and every field of the advertisement feature stream can be obtained, and an association analysis result is then established that relates each user ID to each advertisement type and contains every field of both streams. The association analysis result may be a data table.
In addition, in the embodiment of the present invention, the IP (Internet Protocol) stream in the security cloud may be associated with the port stream to obtain an association analysis result. Here the association condition is the association field of the data streams to be associated, namely the user IP; that is, the network information association between each user IP and the ports is established according to the user IP. In this step, the fields of the IP stream and the fields of the port stream can be obtained, and an association analysis result between each IP stream and the ports is then established that contains the fields of both streams.
Therefore, the embodiment of the invention can perform association analysis on the data streams to be associated according to the association logic configured by the user and obtain an association analysis result containing every field of each data stream to be associated. The result is thus more comprehensive, and the user can go on to analyze whichever data is of interest from it.
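A result that keeps every field of every input can be sketched as a join that copies the association fields once and namespaces the remaining fields by stream. The helper name, the stream prefixes, and the sample records below are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Record = Dict[str, object]


def join_all_fields(
    left: List[Record], right: List[Record], keys: Sequence[str],
    left_name: str, right_name: str,
) -> List[Record]:
    """Inner-join on `keys`, keeping every field; non-key fields get a stream prefix."""
    index: Dict[Tuple[object, ...], List[Record]] = {}
    for r in right:
        index.setdefault(tuple(r[k] for k in keys), []).append(r)
    result: List[Record] = []
    for l in left:
        for r in index.get(tuple(l[k] for k in keys), []):
            row: Record = {k: l[k] for k in keys}  # association fields appear once
            row.update({f"{left_name}.{f}": v for f, v in l.items() if f not in keys})
            row.update({f"{right_name}.{f}": v for f, v in r.items() if f not in keys})
            result.append(row)
    return result


clicks = [{"user_id": 1, "ad_type": "sports", "page": "home"}]
ads = [
    {"user_id": 1, "ad_type": "sports", "bid": 0.5},
    {"user_id": 2, "ad_type": "travel", "bid": 0.7},
]
rows = join_all_fields(clicks, ads, ["user_id", "ad_type"], "click", "ad")
```

Prefixing is one possible way to avoid name clashes when both streams carry a field with the same name; the patent does not say how such clashes are handled.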
Optionally, in an embodiment of the data stream association method of the present invention, the association condition in S104 further includes a preset field. In that case, acquiring the data streams to be associated through the configured Flink SQL framework and performing association analysis on them according to the association condition configured in the framework to obtain an association analysis result may be carried out as shown in FIG. 3. FIG. 3 is a flowchart of a method for determining an association analysis result in a data stream association method according to an embodiment of the present invention, where the method includes:
S301, acquiring each data stream to be associated according to the storage address and the identifier of each data stream to be associated.
In this step, each corresponding data stream to be associated may be acquired according to its storage address and identifier. For example, the storage address and the identifier in the embodiment of the present invention are, respectively, the Kafka address and the title of each data stream to be associated. This step determines the storage location of each data stream to be associated from its Kafka address, and then obtains each data stream to be associated, together with its title, at that storage location.
S302, performing association analysis on each data stream to be associated according to the association field of each data stream to be associated and the preset field, to obtain an association analysis result that contains at least the preset field.
For example, in the embodiment of the present invention, a user click stream is associated with an advertisement feature stream, where the association condition consists of the association fields and a preset field of each data stream to be associated. The association fields are the user ID (identity) and the advertisement type; the preset field is a preset time period, e.g., within 24 hours or within one week. That is, an association between each user and each advertisement type is established according to the user ID. In this step, an association analysis result between each user ID and each advertisement type within the preset time period may be established according to the user ID, the advertisement type, and the preset time period. The association analysis result may be a data table.
In addition, in the embodiment of the present invention, the IP (Internet Protocol) stream in the security cloud may be associated with the port stream to obtain an association analysis result. The association condition consists of the association field and a preset field of each data stream to be associated, where the association field is the user IP and the preset field is a preset time period, e.g., within 24 hours or within one week. That is, the network information association between each user IP and the ports within the preset time period is established according to the user IP. In this step, the association analysis result between each IP stream and the ports within the preset time period may be established according to the user IP and the port information.
Therefore, the embodiment of the invention can perform association analysis on the data streams to be associated according to the association logic configured by the user and, by means of the preset field set in the association condition, obtain an association analysis result for the indicators the user has specified. The result is thus more targeted, and the user can clearly identify the data currently to be analyzed.
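The effect of a preset time period can be sketched as a join that first restricts both streams to the window and then projects only the requested fields. The `ts` field name, the half-open interval, the choice of output fields, and the sample date are assumptions made for this illustration.

```python
from datetime import datetime
from typing import Dict, List, Sequence

Record = Dict[str, object]


def join_in_window(
    left: List[Record], right: List[Record], key: str,
    output_fields: Sequence[str], start: datetime, end: datetime,
) -> List[Record]:
    """Join records whose `ts` lies in [start, end) and keep only `output_fields`."""

    def in_window(record: Record) -> bool:
        return start <= record["ts"] < end

    index: Dict[object, List[Record]] = {}
    for r in filter(in_window, right):
        index.setdefault(r[key], []).append(r)
    rows: List[Record] = []
    for l in filter(in_window, left):
        for r in index.get(l[key], []):
            merged = {**r, **l}  # left-hand values win on name clashes
            rows.append({f: merged[f] for f in output_fields})
    return rows


day = datetime(2019, 5, 24)  # arbitrary example date
ip_stream = [
    {"ip": "10.0.0.1", "ts": day.replace(hour=12, minute=5), "bytes": 100},
    {"ip": "10.0.0.1", "ts": day.replace(hour=13, minute=0), "bytes": 50},
]
port_stream = [{"ip": "10.0.0.1", "ts": day.replace(hour=12, minute=10), "port": 443}]
rows = join_in_window(
    ip_stream, port_stream, "ip", ["ip", "port", "bytes"],
    start=day.replace(hour=12, minute=0), end=day.replace(hour=12, minute=30),
)
```

The 13:00 record falls outside the 12:00-12:30 window and therefore does not contribute to the result.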
Optionally, in an embodiment of the data stream association method of the present invention, after performing association analysis on each to-be-associated data stream through the configured Flink SQL framework to obtain an association analysis result, the method further includes:
saving the association analysis result to the output source.
For example, according to an embodiment of the present invention, a user click stream is associated with an advertisement feature stream, and the resulting association analysis result provides a decision basis for the next advertisement recommendation. If the output source in the data stream association information specifies that the storage address of the association analysis result is a newly created folder A in a local database, the association analysis result is stored in folder A.
In addition, according to an embodiment of the present invention, an IP (Internet Protocol) stream in a machine is associated with a port stream, which enables finer-grained monitoring of abnormal machine behavior (upgraded from the IP level to the port level). If the output source in the data stream association information specifies that the storage address of the association analysis result is a newly created folder B in a local database, the association analysis result is stored in folder B.
Therefore, the embodiment of the present invention can store the association analysis result, so that the user can conveniently retrieve it later.
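The saving step can be sketched as follows. The patent describes a folder in a local database; this sketch assumes, purely for illustration, that the output source is a directory on the local file system and that the result is written as a CSV file. The helper `save_result` and the file name are hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def save_result(rows, output_dir, name="association_result.csv"):
    """Write the association analysis result to `output_dir`; return the path."""
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)   # create the folder if needed
    path = out_dir / name
    fieldnames = list(rows[0].keys()) if rows else []
    with path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return path


# Hypothetical result rows and a temporary stand-in for "folder A".
rows = [{"user_id": "u1", "ad_type": "video", "feature": "sports"}]
folder_a = Path(tempfile.mkdtemp()) / "A"
saved = save_result(rows, folder_a)
```

Keeping the storage location as a parameter mirrors the role of the output source: the analysis code does not need to know where the result ends up.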
In a second aspect, an embodiment of the present invention discloses a data association apparatus, as shown in fig. 4. Fig. 4 is a schematic structural diagram of a data association apparatus according to an embodiment of the present invention, where the apparatus includes:
a data stream association instruction obtaining module 401, configured to obtain a data stream association instruction, where the data stream association instruction includes an input source, an output source, and an association condition of each data stream to be associated;
a data stream association instruction parsing module 402, configured to parse the data stream association instruction to obtain data stream association information, where the data stream association information includes an input source, an output source, and an association condition of each to-be-associated data stream;
a Flink SQL framework configuration module 403, configured to configure a Flink SQL framework according to the input source, the output source, and the association condition of each data stream to be associated;
an association analysis result determining module 404, configured to obtain each data stream to be associated through the configured Flink SQL framework, and perform association analysis on each data stream to be associated according to the association condition of the configured Flink SQL framework to obtain an association analysis result.
In the data stream association apparatus provided in the embodiment of the present invention, a data stream association instruction is obtained; the obtained data stream association instruction is parsed to obtain the input source, the output source and the association condition of each data stream to be associated contained in the instruction; a preset Flink SQL framework is configured according to the input source, the output source and the association condition of each data stream to be associated; and association analysis is performed on each data stream to be associated through the configured Flink SQL framework to obtain an association analysis result. In the embodiment of the present invention, the preset Flink SQL framework is configured and the data streams to be associated are analyzed through the configured framework to obtain the association analysis result, so that association between real-time data streams is achieved in a simple SQL manner.
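The flow through the modules (obtain the instruction, parse it, configure the framework) can be sketched as below. The patent does not specify the instruction format, so the JSON layout, the class `AssociationInfo`, the helpers `parse_instruction` and `build_sql`, the table names and the event-time column `ts` are all assumptions. The generated text follows the general shape of a Flink SQL interval join, but it is only a string here and has not been run against a Flink cluster.

```python
import json
from dataclasses import dataclass


@dataclass
class AssociationInfo:
    inputs: list   # one {"address": ..., "id": ...} entry per stream
    output: str    # storage location (here: a sink table name) of the result
    fields: list   # association fields present in both streams
    select: list   # columns to keep in the result
    hours: int     # association time range, in hours


def parse_instruction(text):
    """Parse a hypothetical JSON data stream association instruction."""
    raw = json.loads(text)
    cond = raw["condition"]
    return AssociationInfo(
        inputs=raw["inputs"],
        output=raw["output"],
        fields=cond["fields"],
        select=cond["select"],
        hours=cond["hours"],
    )


def build_sql(info):
    """Render an interval-join statement for exactly two input streams."""
    left, right = (src["id"] for src in info.inputs)
    on = " AND ".join(f"l.{name} = r.{name}" for name in info.fields)
    columns = ", ".join(info.select)
    return (
        f"INSERT INTO {info.output} "
        f"SELECT {columns} FROM {left} AS l JOIN {right} AS r ON {on} "
        f"AND l.ts BETWEEN r.ts - INTERVAL '{info.hours}' HOUR AND r.ts"
    )


# Hypothetical instruction; addresses and names are placeholders.
instruction = json.dumps({
    "inputs": [
        {"address": "kafka://broker/clicks", "id": "clicks"},
        {"address": "kafka://broker/ad_features", "id": "ad_features"},
    ],
    "output": "result_a",
    "condition": {
        "fields": ["user_id", "ad_type"],
        "select": ["l.user_id", "l.ad_type", "r.feature"],
        "hours": 24,
    },
})
sql = build_sql(parse_instruction(instruction))
```

Separating parsing from SQL generation mirrors the split between the parsing module 402 and the configuration module 403: the user supplies only sources, sink and conditions, and never writes the join statement directly.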
Optionally, in an embodiment of the data stream association apparatus of the present invention, the input source includes a storage address and an identifier of each data stream to be associated; the output source includes a storage location of the association analysis result; and the association condition includes an association field and an association time range of each data stream to be associated.
Optionally, in an embodiment of the data stream association apparatus of the present invention, the association analysis result determining module 404 includes:
the first to-be-associated data stream acquisition submodule is used for acquiring each to-be-associated data stream according to the storage address and the identification of each to-be-associated data stream;
and the first association analysis result determining submodule is used for performing association analysis on each data stream to be associated according to the association field of each data stream to be associated to obtain an association analysis result containing each field in each data stream to be associated.
Optionally, in an embodiment of the data stream association apparatus of the present invention, the association condition further includes a preset field; the association analysis result determination module 404 includes:
the second to-be-associated data stream acquisition submodule is used for acquiring each to-be-associated data stream according to the storage address and the identification of each to-be-associated data stream;
and the second association analysis result determining submodule is used for performing association analysis on each data stream to be associated according to the association field of each data stream to be associated and the preset field to obtain an association analysis result at least comprising the preset field.
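The difference between the first and second determining submodules (a result containing every field versus a result containing at least the preset fields) can be sketched with a single join helper. The function `join_on`, its parameters and the sample rows are hypothetical and stand in for what the configured Flink SQL framework would do.

```python
def join_on(left_rows, right_rows, key, preset_fields=None):
    """Join two row lists on `key`; optionally project onto `preset_fields`."""
    by_key = {}
    for row in right_rows:
        by_key.setdefault(row[key], []).append(row)

    joined = []
    for left in left_rows:
        for right in by_key.get(left[key], []):
            merged = {**right, **left}           # every field of both streams
            if preset_fields is not None:
                wanted = [key, *preset_fields]   # association field plus preset fields
                merged = {name: merged[name] for name in wanted}
            joined.append(merged)
    return joined


# Hypothetical sample rows.
clicks = [{"user_id": "u1", "page": "home", "clicks": 3}]
ads = [{"user_id": "u1", "ad_type": "video", "bid": 0.5}]

full = join_on(clicks, ads, "user_id")                              # first submodule
narrow = join_on(clicks, ads, "user_id", preset_fields=["ad_type"])  # second submodule
```

With no preset fields the result keeps all columns of both streams; with preset fields it keeps only the columns the user asked for, which is what makes the result more targeted.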
In another aspect, an embodiment of the present invention discloses an electronic device, as shown in fig. 5. Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention, including a processor 501, a communication interface 502, a memory 503 and a communication bus 504, where the processor 501, the communication interface 502 and the memory 503 complete communication with each other through the communication bus 504;
the memory 503 is used for storing computer programs;
the processor 501 is configured to implement the following method steps when executing the program stored in the memory 503:
acquiring a data stream association instruction, wherein the data stream association instruction comprises an input source, an output source and an association condition of each data stream to be associated;
parsing the data stream association instruction to obtain data stream association information, wherein the data stream association information comprises an input source, an output source and an association condition of each data stream to be associated;
configuring a Flink SQL framework according to the input source, the output source and the association condition of each data stream to be associated;
and acquiring each data stream to be associated through the configured Flink SQL framework, and performing association analysis on each data stream to be associated according to the association condition of the configured Flink SQL framework to obtain an association analysis result.
The communication bus 504 mentioned above for the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface 502 is used for communication between the above-described electronic apparatus and other apparatuses.
The Memory 503 may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory 503 may also be at least one storage device located remotely from the processor 501.
The processor 501 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In the electronic device provided by the embodiment of the present invention, a data stream association instruction is obtained; the obtained data stream association instruction is parsed to obtain the input source, the output source and the association condition of each data stream to be associated contained in the instruction; a preset Flink SQL framework is configured according to the input source, the output source and the association condition of each data stream to be associated; and association analysis is performed on each data stream to be associated through the configured Flink SQL framework to obtain an association analysis result. In the embodiment of the present invention, the preset Flink SQL framework is configured and the data streams to be associated are analyzed through the configured framework to obtain the association analysis result, so that association between real-time data streams is achieved in a simple SQL manner.
In another aspect, an embodiment of the present invention discloses a computer-readable storage medium in which a computer program is stored; when the computer program is executed by a processor, the method steps of any one of the above data stream association methods are implemented.
In the computer-readable storage medium provided in the embodiment of the present invention, a data stream association instruction is obtained; the obtained data stream association instruction is parsed to obtain the input source, the output source and the association condition of each data stream to be associated contained in the instruction; a preset Flink SQL framework is configured according to the input source, the output source and the association condition of each data stream to be associated; and each data stream to be associated is acquired through the configured Flink SQL framework, and association analysis is performed on each data stream to be associated according to the association condition of the configured Flink SQL framework to obtain an association analysis result. In the embodiment of the present invention, the preset Flink SQL framework is configured and the data streams to be associated are analyzed through the configured framework to obtain the association analysis result, so that association between real-time data streams is achieved in a simple SQL manner.
In yet another aspect, an embodiment of the present invention discloses a computer program product containing instructions for implementing the method steps of any one of the above data stream association methods when the computer program product runs on a computer.
In the computer program product containing instructions provided in the embodiment of the present invention, a data stream association instruction is obtained; the obtained data stream association instruction is parsed to obtain the input source, the output source and the association condition of each data stream to be associated contained in the instruction; a preset Flink SQL framework is configured according to the input source, the output source and the association condition of each data stream to be associated; and each data stream to be associated is acquired through the configured Flink SQL framework, and association analysis is performed on each data stream to be associated according to the association condition of the configured Flink SQL framework to obtain an association analysis result. In the embodiment of the present invention, the preset Flink SQL framework is configured and the data streams to be associated are analyzed through the configured framework to obtain the association analysis result, so that association between real-time data streams is achieved in a simple SQL manner.
The above embodiments may be implemented wholly or partially by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented wholly or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the present invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or a wireless manner (e.g., infrared, wireless, microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The embodiments in this specification are described in an interrelated manner; for identical or similar parts, the embodiments may refer to one another, and each embodiment focuses on its differences from the others. In particular, because the apparatus, electronic device and storage medium embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the corresponding description of the method embodiments.
The above description is merely a preferred embodiment of the present invention and is not intended to limit the scope of protection of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

1. A method for associating data streams, the method comprising:
acquiring a data stream association instruction, wherein the data stream association instruction comprises an input source, an output source and an association condition of each data stream to be associated; the association condition comprises an association time range of each data stream to be associated;
parsing the data stream association instruction to obtain data stream association information, wherein the data stream association information comprises an input source, an output source and an association condition of each data stream to be associated; the input source comprises a storage address and an identification of each data stream to be associated; the output source comprises a storage location of the association analysis result;
configuring a Flink SQL framework according to the input source, the output source and the association condition of each data stream to be associated;
and acquiring each data stream to be associated through the configured Flink SQL framework, and performing association analysis on each data stream to be associated according to the association condition of the configured Flink SQL framework to obtain an association analysis result.
2. The method according to claim 1, wherein the association condition further comprises an association field of each of the data streams to be associated.
3. The method according to claim 2, wherein the obtaining each data stream to be associated through the configured Flink SQL framework, and performing association analysis on each data stream to be associated according to the association condition of the configured Flink SQL framework to obtain an association analysis result comprises:
acquiring each data stream to be associated according to the storage address and the identification of each data stream to be associated;
and performing association analysis on each data stream to be associated according to the association field of each data stream to be associated to obtain an association analysis result containing each field in each data stream to be associated.
4. The method of claim 2, wherein the association condition further comprises a preset field; the acquiring each data stream to be associated through the configured Flink SQL framework, and performing association analysis on each data stream to be associated according to the association conditions of the configured Flink SQL framework to obtain an association analysis result, including:
acquiring each data stream to be associated according to the storage address and the identification of each data stream to be associated;
and performing association analysis on each data stream to be associated according to the association field of each data stream to be associated and the preset field to obtain an association analysis result at least comprising the preset field.
5. An apparatus for associating data, the apparatus comprising:
a data stream association instruction acquisition module, used for acquiring a data stream association instruction, wherein the data stream association instruction comprises an input source, an output source and an association condition of each data stream to be associated; the association condition comprises an association time range of each data stream to be associated;
a data stream association instruction parsing module, used for parsing the data stream association instruction to obtain data stream association information, wherein the data stream association information comprises an input source, an output source and an association condition of each data stream to be associated; the input source comprises a storage address and an identification of each data stream to be associated; the output source comprises a storage location of the association analysis result;
a Flink SQL framework configuration module, used for configuring a Flink SQL framework according to the input source, the output source and the association condition of each data stream to be associated;
and an association analysis result determining module, used for acquiring each data stream to be associated through the configured Flink SQL framework, and performing association analysis on each data stream to be associated according to the association condition of the configured Flink SQL framework to obtain an association analysis result.
6. The apparatus of claim 5, wherein the association condition further comprises an association field of each of the data streams to be associated.
7. The apparatus of claim 6, wherein the association analysis result determining module comprises:
the first to-be-associated data stream acquisition submodule is used for acquiring each to-be-associated data stream according to the storage address and the identification of each to-be-associated data stream;
and the first association analysis result determining submodule is used for performing association analysis on each data stream to be associated according to the association field of each data stream to be associated to obtain an association analysis result containing each field in each data stream to be associated.
8. The apparatus of claim 6, wherein the association condition further comprises a preset field; the association analysis result determining module comprises:
the second to-be-associated data stream acquisition submodule is used for acquiring each to-be-associated data stream according to the storage address and the identification of each to-be-associated data stream;
and the second association analysis result determining submodule is used for performing association analysis on each data stream to be associated according to the association field of each data stream to be associated and the preset field to obtain an association analysis result at least comprising the preset field.
9. An electronic device, comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory complete communication with each other through the communication bus;
the memory is used for storing a computer program;
the processor, when executing the program stored in the memory, implementing the method steps of any of claims 1-4.
10. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of claims 1 to 4.
CN201910439751.6A 2019-05-24 2019-05-24 Data stream association method and device, electronic equipment and storage medium Active CN110209700B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910439751.6A CN110209700B (en) 2019-05-24 2019-05-24 Data stream association method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110209700A CN110209700A (en) 2019-09-06
CN110209700B true CN110209700B (en) 2021-11-26

Family

ID=67788434

Country Status (1)

Country Link
CN (1) CN110209700B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant