CN116755708A - Data processing method, device, equipment and storage medium - Google Patents

Info

Publication number
CN116755708A
Authority
CN
China
Prior art keywords
data
processed
data processing
processing
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310801089.0A
Other languages
Chinese (zh)
Inventor
张志鹏
徐怡涵
王健帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC filed Critical Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202310801089.0A priority Critical patent/CN116755708A/en
Publication of CN116755708A publication Critical patent/CN116755708A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/70 Software maintenance or management
    • G06F8/71 Version control; Configuration management
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The present disclosure provides a data processing method, apparatus, device, and storage medium, which may be applied to the technical fields of computers, big data processing, and financial technology. The method comprises: in response to acquiring data to be processed, invoking a data processing toolkit by using a real-time computing platform; determining, from a plurality of data processing tools in the data processing toolkit, a target data processing tool that matches the data to be processed; and performing the following operations with the target data processing tool: acquiring historical data that matches the data to be processed; and processing the data to be processed based on the historical data to obtain target data.

Description

Data processing method, device, equipment and storage medium
Technical Field
The present disclosure relates to the technical fields of computers, big data processing and financial science and technology, and in particular, to a data processing method, apparatus, device, storage medium and program product.
Background
With the rapid development of the big data era, the industry places ever higher demands on the capabilities of data processing systems. Real-time data analysis in the financial field is one example: because market quotation information changes from moment to moment, the data processing system must be extremely efficient. For services such as monitoring, frequently changing monitoring requirements likewise pose a serious challenge to the data processing efficiency of the system.
In traditional monitoring data processing methods, a separate set of processing logic must be developed for each kind of monitoring data. The logic is difficult to adjust, and any adjustment takes effect only after recompilation and release, so such methods do not meet a large-scale data center's need for agile processing of heterogeneous, multi-type monitoring data.
Disclosure of Invention
In view of the foregoing, the present disclosure provides a data processing method, apparatus, device, storage medium, and program product.
In a first aspect, embodiments of the present disclosure provide a data processing method, including:
in response to the acquired data to be processed, invoking a data processing toolkit with the real-time computing platform;
determining a target data processing tool matched with the data to be processed from a plurality of data processing tools in the data processing tool package;
the following operations are performed by the target data processing tool: acquiring historical data matched with the data to be processed; and processing the data to be processed based on the historical data to obtain target data.
With reference to the first aspect, in one possible implementation manner, determining, from a plurality of data processing tools in the data processing tool package, a target data processing tool that matches the data to be processed includes:
determining the data type of the data to be processed; and determining the target data processing tool from the plurality of data processing tools based on the data type and a predetermined mapping table, wherein the predetermined mapping table is used for representing a mapping relation between the data processing tool and the data type.
With reference to the first aspect, in one possible implementation manner, processing the data to be processed based on the history data to obtain target data includes:
based on the historical data, carrying out data processing on the data to be processed to obtain a processing result; and obtaining the target data based on the processing result and the configuration data of the data to be processed.
With reference to the first aspect, in one possible implementation manner, the performing, based on the history data, data processing on the to-be-processed data to obtain a processing result includes:
determining a target field of the data to be processed;
determining the field type of the target field;
processing the target field according to a processing mode matched with the field type to obtain an initial processing result; and
and obtaining the processing result based on the initial processing result and the historical data.
With reference to the first aspect, in one possible implementation manner, the acquiring historical data matched with the data to be processed includes:
determining a storage space matched with the data to be processed from a plurality of storage spaces with different types; and acquiring the history data from the storage space by using a data calling mode matched with the storage space.
With reference to the first aspect, in one possible implementation manner, the method for processing data further includes:
in response to a data processing file having been generated, converting the data processing file by using a first compiler to obtain a conversion file; and generating, with a second compiler, the data processing toolkit based on the conversion file and associated parameters, wherein the associated parameters are parameters for adapting the conversion file to the real-time computing platform.
With reference to the first aspect, in one possible implementation manner, the method for processing data further includes:
in response to the data processing toolkit having been generated, starting a data processing task; and in response to the data processing task having been started, acquiring the data to be processed in real time.
With reference to the first aspect, in one possible implementation manner, the determining a data type of the data to be processed includes:
determining, for each of a plurality of data types, a predetermined field that matches the data type;
matching the predetermined field with the field in the data to be processed to obtain a matching result; and determining the data type of the data to be processed based on a plurality of matching results.
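The field-matching step above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the field names, the `PREDETERMINED_FIELDS` table, and the rule of choosing the type with the most matching fields are all assumptions made for illustration, since the disclosure does not specify how the plurality of matching results are combined.

```python
# Hypothetical predetermined fields for each data type (illustrative only).
PREDETERMINED_FIELDS = {
    "device": {"device_id", "temperature"},
    "network": {"src_ip", "dst_ip", "latency_ms"},
    "application": {"app_name", "status"},
}


def match_fields(record, fields):
    """Matching result: how many predetermined fields appear in the record."""
    return len(fields & set(record))


def determine_data_type(record):
    """Choose the data type with the best matching result (assumed rule)."""
    results = {
        data_type: match_fields(record, fields)
        for data_type, fields in PREDETERMINED_FIELDS.items()
    }
    best_type = max(results, key=results.get)
    # No predetermined field matched at all: the type is unknown.
    return best_type if results[best_type] > 0 else None
```

For instance, a record containing `src_ip` and `dst_ip` would be classified as the hypothetical `network` type, while a record with none of the listed fields yields `None`.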
With reference to the first aspect, in one possible implementation manner, the data calling manner includes at least one of the following:
the data calling mode of the query statement and the data calling mode of the built-in function.
In a second aspect, embodiments of the present disclosure provide a data processing apparatus, including:
the calling module is used for calling a data processing tool package by utilizing the real-time computing platform in response to the acquired data to be processed;
a determining module, configured to determine a target data processing tool that matches the data to be processed from a plurality of data processing tools in the data processing tool package;
an execution module, configured to perform the following operations by using the target data processing tool, the execution module comprising an acquisition sub-module, configured to acquire historical data that matches the data to be processed;
and a processing sub-module, configured to process the data to be processed based on the historical data to obtain target data.
With reference to the second aspect, in one possible implementation manner, the determining module includes:
the first determining submodule is used for determining the data type of the data to be processed;
and the second determining submodule is used for determining the target data processing tool from the plurality of data processing tools based on the data type and a preset mapping table, wherein the preset mapping table is used for representing the mapping relation between the data processing tool and the data type.
With reference to the second aspect, in one possible implementation manner, the processing sub-module includes:
the processing unit is used for carrying out data processing on the data to be processed based on the historical data to obtain a processing result;
and a generating unit configured to generate the target data based on the processing result and the configuration data of the data to be processed.
With reference to the second aspect, in one possible implementation manner, the processing unit includes:
a first determining subunit, configured to determine a target field of the data to be processed;
a second determining subunit, configured to determine a field type of the target field;
a third processing subunit, configured to process the target field according to a processing mode matched with the field type to obtain an initial processing result;
and the generation subunit is used for generating the processing result based on the initial processing result and the historical data.
With reference to the second aspect, in one possible implementation manner, the acquiring submodule includes:
a first determining unit, configured to determine a storage space matched with the data to be processed from a plurality of storage spaces of different types;
an acquisition unit, configured to acquire the historical data from the storage space by using a data calling mode matched with the storage space.
With reference to the second aspect, in a possible implementation manner, the data processing apparatus further includes:
the conversion module is used for responding to the generated data processing file, converting the generated data processing file by using a first compiler to obtain a generated conversion file;
and the generation module is used for generating the data processing tool package based on the generated conversion file and related parameters by using a second compiler, wherein the related parameters are parameters for adapting the generated conversion file to the real-time computing platform.
With reference to the second aspect, in a possible implementation manner, the data processing apparatus further includes:
the starting module is used for responding to the generated data processing tool package and starting a data processing task;
and the acquisition module is used for responding to the starting of the data processing task and acquiring the data to be processed in real time.
With reference to the second aspect, in one possible implementation manner, the first determining sub-module includes:
a second determining unit configured to determine, for each of a plurality of data types, a predetermined field that matches the data type;
a matching unit, configured to match the predetermined field with the fields in the data to be processed to obtain a matching result;
and a third determining unit configured to determine a data type of the data to be processed based on a plurality of the matching results.
With reference to the second aspect, in one possible implementation manner, the data calling manner includes at least one of the following:
the data calling mode of the query statement and the data calling mode of the built-in function.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including: one or more processors; and a memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method described above.
In a fourth aspect, embodiments of the present disclosure provide a computer-readable storage medium having stored thereon executable instructions that, when executed by a processor, cause the processor to perform the above-described method.
In a fifth aspect, embodiments of the present disclosure provide a computer program product comprising a computer program which, when executed by a processor, implements the above method.
According to the data processing method, apparatus, device, medium and program product provided by the present disclosure, in response to acquiring data to be processed, a real-time computing platform invokes a data processing toolkit and a target data processing tool is determined from the plurality of data processing tools in the toolkit; the target data processing tool then acquires historical data matched with the data to be processed and processes the data to be processed based on the historical data to obtain target data. Because the toolkit provides a plurality of data processing tools, multiple types of data to be processed can be handled without separately developing a processing algorithm for each type of monitoring data, which gives higher flexibility and generality. In addition, because the target data processing tool processes the data to be processed together with the historical data, the completeness and efficiency of data processing are improved.
Drawings
The foregoing and other objects, features and advantages of the disclosure will be more apparent from the following description of embodiments of the disclosure with reference to the accompanying drawings, in which:
FIG. 1 schematically illustrates an application scenario diagram of a data processing method, apparatus, device, medium and program product according to an embodiment of the present disclosure;
FIG. 2 schematically illustrates a flow chart of a data processing method according to an embodiment of the disclosure;
FIG. 3 schematically illustrates a system architecture diagram of a data processing method according to an embodiment of the present disclosure;
FIG. 4 schematically illustrates a block diagram of a data processing apparatus according to an embodiment of the present disclosure;
FIG. 5 schematically illustrates a block diagram of an electronic device adapted to implement a data processing method according to an embodiment of the disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is only exemplary and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the present disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. In addition, in the following description, descriptions of well-known structures and techniques are omitted so as not to unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and/or the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It should be noted that the terms used herein should be construed to have meanings consistent with the context of the present specification and should not be construed in an idealized or overly formal manner.
Where an expression such as "at least one of A, B and C, etc." is used, it should generally be interpreted in accordance with the meaning commonly understood by those skilled in the art (e.g., "a system having at least one of A, B and C" shall include, but not be limited to, a system having A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B and C together, etc.).
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, disclosure and application of the data involved (including, but not limited to, personal information of users) all comply with the relevant laws and regulations, necessary security measures are taken, and public order and good customs are not violated.
In the related art, data processing platforms lack complete syntax and semantic support and struggle to meet the complex and diverse business requirements of practical applications; code is relatively difficult to develop, the demands on developers are relatively high, development efficiency is low, and the result lacks generality. In view of this, embodiments of the present disclosure provide a data processing method that can process multiple types of data to be processed by using a plurality of data processing tools, with high flexibility and strong generality. The method can be applied to enterprise data operation and maintenance systems in various fields and industries, such as operation and maintenance monitoring systems, which is not limited herein. Embodiments of the present disclosure are described in detail below with reference to the accompanying drawings.
Embodiments of the present disclosure provide a data processing method comprising: in response to acquiring data to be processed, invoking a data processing toolkit by using a real-time computing platform; determining, from a plurality of data processing tools in the data processing toolkit, a target data processing tool that matches the data to be processed; and performing the following operations with the target data processing tool: acquiring historical data that matches the data to be processed; and processing the data to be processed based on the historical data to obtain target data.
Fig. 1 schematically illustrates an application scenario diagram of a data processing method, apparatus, device, medium and program product according to an embodiment of the present disclosure.
As shown in fig. 1, an application scenario 100 according to this embodiment may include a first terminal device 101, a second terminal device 102, a third terminal device 103, a network 104, and a server 105. The network 104 is a medium used to provide a communication link between the first terminal device 101, the second terminal device 102, the third terminal device 103, and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables, among others.
The user may interact with the server 105 through the network 104 using at least one of the first terminal device 101, the second terminal device 102, the third terminal device 103, to receive or send messages, etc. Various communication client applications, such as a shopping class application, a web browser application, a search class application, an instant messaging tool, a mailbox client, social platform software, etc. (by way of example only) may be installed on the first terminal device 101, the second terminal device 102, and the third terminal device 103.
The first terminal device 101, the second terminal device 102, the third terminal device 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.
The server 105 may be a server providing various services, such as a background management server (by way of example only) providing support for websites browsed by the user using the first terminal device 101, the second terminal device 102, and the third terminal device 103. The background management server may analyze and process the received data such as the user request, and feed back the processing result (e.g., the web page, information, or data obtained or generated according to the user request) to the terminal device.
It should be noted that the data processing method provided in the embodiments of the present disclosure may be generally performed by the server 105. Accordingly, the data processing apparatus provided by the embodiments of the present disclosure may be generally provided in the server 105. The data processing method provided by the embodiments of the present disclosure may also be performed by a server or a server cluster that is different from the server 105 and is capable of communicating with the first terminal device 101, the second terminal device 102, the third terminal device 103, and/or the server 105. Accordingly, the data processing apparatus provided by the embodiments of the present disclosure may also be provided in a server or a server cluster that is different from the server 105 and is capable of communicating with the first terminal device 101, the second terminal device 102, the third terminal device 103, and/or the server 105.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
The data processing method of the disclosed embodiment will be described in detail below with reference to fig. 2 to 5 based on the scenario described in fig. 1.
Fig. 2 schematically illustrates a flow chart of a data processing method according to an embodiment of the present disclosure.
As shown in fig. 2, the data processing method of this embodiment includes operations S210 to S240.
In operation S210, in response to acquiring the data to be processed, a data processing toolkit is invoked by using the real-time computing platform.
In some possible embodiments, the data to be processed may be streaming data of different fields, different batches, different types, and in particular, in this disclosure, the data to be processed may be expressed as: monitor data 1, monitor data 2, monitor data 3, …, monitor data n.
The real-time computing platform may include Flink. Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. It is designed to run in all common cluster environments, to perform computations at in-memory speed and at any scale, and it supports both real-time and offline computation of data. The data processing toolkit is a compiled jar package that the real-time computing platform can execute, where a jar package is a platform-independent file format that combines multiple files into one file.
In some possible embodiments, the real-time computing platform may further include Spark Streaming and Storm, and the specific real-time computing platform type is not limited herein.
According to embodiments of the present disclosure, a data processing toolkit is invoked with a real-time computing platform, which may include a plurality of data processing tools.
In operation S220, a target data processing tool that matches the data to be processed is determined from among a plurality of data processing tools in the data processing tool package.
In some possible embodiments, a data processing tool may be a data processing rule generated from a policy file written in a domain-specific language (DSL). For example, the data processing rules are generated based on DSL policy file 1, DSL policy file 2, …, DSL policy file n.
A DSL (Domain Specific Language) is a computer programming language of limited expressiveness that targets a particular domain or problem. Specifically, in the present disclosure, the DSL is designed with a JavaScript-like syntax and supports reading and writing databases through directly embedded SQL statements, as well as reading Redis directly through built-in functions. DSL syntax features include, but are not limited to, weak typing, string processing, numeric computation, logical computation, directly embedded SQL, and Redis cache reads and writes.
According to the embodiment of the disclosure, a plurality of data processing tools are utilized to process various types of data to be processed, so that the universality is high.
In operation S230, the following operation is performed using the target data processing tool: acquiring historical data matched with the data to be processed.
In some possible embodiments, the historical data may be the user's monitoring data from a past period of time, and acquiring it may include reading data from a database or cached data from Redis. Specifically, in the present disclosure, this may be, for example, reading data for a period of time from a relational database, reading data for a period of time from a time-series database, or reading cached data for a period of time from Redis. Whether the historical data matches the data to be processed is judged from the matching result obtained by comparing the fields in the historical data with each field in the data to be processed.
In operation S240, the data to be processed is processed based on the history data to obtain target data.
In some possible embodiments, processing the data to be processed based on the historical data may include matching the fields in the historical data against each field in the data to be processed and, if they match, processing the data to be processed by using the corresponding DSL policy file to obtain the target data. The data processing logic is defined in advance in the DSL policy file.
According to the embodiments of the present disclosure, in response to acquiring data to be processed, a real-time computing platform invokes a data processing toolkit and a target data processing tool is determined from the plurality of data processing tools in the toolkit; the target data processing tool then acquires historical data matched with the data to be processed and processes the data to be processed based on the historical data to obtain target data. Because the toolkit provides a plurality of data processing tools, multiple types of data to be processed can be handled without separately developing a processing algorithm for each type of monitoring data, which gives higher flexibility and generality. In addition, because the target data processing tool processes the data to be processed together with the historical data, the completeness and efficiency of data processing are improved.
According to an embodiment of the present disclosure, determining a target data processing tool that matches data to be processed from a plurality of data processing tools in a data processing tool package, includes: determining the data type of the data to be processed; and determining a target data processing tool from the plurality of data processing tools based on the data type and the predetermined mapping table.
According to an embodiment of the present disclosure, a predetermined mapping table is used to characterize the mapping relationship between the data processing tool and the data type.
In some possible embodiments, the data types of the data to be processed include, but are not limited to, a device data type, a network data type, an application data type, a host data type, a middleware data type, and the like. The target data processing tool may be a data processing rule determined based on the target data type and the predetermined mapping table. Specifically, in the present disclosure, it may be, for example, a target device data processing rule determined based on the device data type and the predetermined mapping table.
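As a rough illustration of the mapping-table lookup, the following Python sketch maps data types to processing functions. The rule functions, the `MAPPING_TABLE` name, and the error handling are hypothetical stand-ins; in the disclosure the tools are rules generated from DSL policy files and packaged for the real-time computing platform.

```python
def process_device(record):
    """Hypothetical device data processing rule."""
    return {"rule": "device", **record}


def process_network(record):
    """Hypothetical network data processing rule."""
    return {"rule": "network", **record}


# Predetermined mapping table: data type -> data processing tool (assumed shape).
MAPPING_TABLE = {
    "device": process_device,
    "network": process_network,
}


def select_tool(data_type):
    """Return the target data processing tool registered for the data type."""
    try:
        return MAPPING_TABLE[data_type]
    except KeyError:
        raise ValueError(f"no data processing tool for type {data_type!r}") from None
```

Keeping the mapping in a table rather than in branching code means that supporting a new data type only requires registering one more entry, which is in the spirit of the flexibility the disclosure aims for.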
According to an embodiment of the present disclosure, processing data to be processed based on historical data, obtaining target data includes: based on the historical data, carrying out data processing on the data to be processed to obtain a processing result; and generating target data based on the processing result and the configuration data of the data to be processed.
In some possible embodiments, the historical data in the database, or the historical cached data in Redis, for a certain period of time is read together with the configuration data of the data to be processed. The data to be processed is then handled according to whether the configuration data matches information in the historical data: if it matches, the data to be processed is merged with the historical data; otherwise, target data (new alarm data) is generated and stored in the database. The configuration data of the data to be processed may be third-party configuration data in the real-time monitoring data, that is, sub-data of third-party data other than the data to be processed.
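The merge-or-create decision described above might look like the following Python sketch. A plain dictionary stands in for the database or Redis, and the `alarm_key` configuration entry used for matching is an assumption; the disclosure does not state which configuration information is compared with the historical data.

```python
def build_target_data(pending, config, history):
    """Merge pending data into matching history, or create new alarm data.

    `history` is a dict standing in for the database; `config["alarm_key"]`
    is a hypothetical matching key. Returns (target_data, is_new_alarm).
    """
    key = config["alarm_key"]
    existing = history.get(key)
    if existing is not None:
        # Configuration matches history: merge, pending values take precedence.
        merged = {**existing, **pending}
        history[key] = merged
        return merged, False
    # No match: generate new alarm data and store it.
    new_alarm = dict(pending)
    history[key] = new_alarm
    return new_alarm, True
```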
According to the embodiment of the disclosure, based on the matching result, the data to be processed and the historical data are combined or new alarm data are generated, so that the enrichment and updating of the database are realized.
According to an embodiment of the present disclosure, performing data processing on the data to be processed based on the historical data to obtain the processing result includes: determining a target field of the data to be processed; determining a field type of the target field; processing the target field according to a processing mode matched with the field type to obtain an initial processing result; and generating the processing result based on the initial processing result and the historical data.
In some possible embodiments, the target field of the data to be processed may be any of the fields in the current monitoring data, such as field 1, field 2, …, field n. The field types include, but are not limited to, a numeric type, a date type, a character type, and the like. The processing mode matched with the field type may be processing logic that differs according to the field type, and this processing logic is set in the DSL policy file in advance.
In some possible embodiments, the initial processing result may be a matching result obtained by matching the field type of the target field with the field types existing in the database. The processing result may be a result generated by preset processing logic based on the initial processing result and the historical data. The generated processing result is stored in a database or Redis. Specifically, in this disclosure, for example, for the "unresponsive" target field in monitoring data 1, the initial reading result is that the field occurred 1 time with a duration of more than 10 s. This is combined with historical monitoring data 1, in which the "unresponsive" field occurred 2 times with a duration of more than 20 s. The generated processing result is that the "unresponsive" field has occurred 3 times with an accumulated duration of more than 30 s, and this processing result is updated to the database or Redis.
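The field-type dispatch and the accumulation with historical data can be sketched as follows. The handler table, the normalization steps, and the `count`/`duration_s` keys are assumptions chosen for illustration; the disclosure only states that the processing logic per field type is preset in the DSL policy file.

```python
from datetime import datetime
from typing import Any, Callable, Dict

def field_type(value: Any) -> str:
    """Classify a raw field value as 'numeric', 'date', or 'character'."""
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return "numeric"
    if isinstance(value, str):
        try:
            datetime.fromisoformat(value)
            return "date"
        except ValueError:
            return "character"
    return "character"

# Processing mode per field type (hypothetical normalization steps).
HANDLERS: Dict[str, Callable[[Any], Any]] = {
    "numeric": float,
    "date": datetime.fromisoformat,
    "character": lambda v: str(v).strip().lower(),
}

def initial_result(value: Any) -> Any:
    """Process a field with the handler that matches its type."""
    return HANDLERS[field_type(value)](value)

def combine(initial: Dict[str, int], historical: Dict[str, int]) -> Dict[str, int]:
    """Accumulate occurrence count and duration (lower bounds, in seconds)."""
    return {key: initial[key] + historical[key] for key in ("count", "duration_s")}

# The "unresponsive" example: 1 occurrence over 10 s now, 2 over 20 s in history.
current = {"count": 1, "duration_s": 10}
history = {"count": 2, "duration_s": 20}
processing_result = combine(current, history)
```

Here `processing_result` holds a count of 3 and a duration lower bound of 30 s, which mirrors the example in the text.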
According to an embodiment of the present disclosure, acquiring historical data that matches data to be processed includes: determining a storage space matched with the data to be processed from a plurality of storage spaces with different types; and acquiring historical data from the storage space by utilizing a data calling mode matched with the storage space.
In some possible embodiments, the plurality of storage spaces of different types may be different types of databases, including but not limited to a relational database, a time series database, or a directly readable Redis. The data calling mode may be a processing mode in which the historical data is called according to a keyword in a predetermined default function.
According to an embodiment of the disclosure, the method further includes: in response to a data processing file having been generated, converting the generated data processing file with a first compiler to obtain a generated conversion file; and generating, with a second compiler, the data processing toolkit based on the generated conversion file and associated parameters.
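As a rough sketch of choosing a data calling mode per storage space, the example below uses an in-memory SQLite database as a stand-in for a relational database and a dictionary as a stand-in for Redis. The table layout, key names, and the `CALL_MODES` table are hypothetical.

```python
import sqlite3
from typing import Callable, Dict, List

# Stand-ins: in-memory SQLite plays the relational store, a dict plays the cache.
relational = sqlite3.connect(":memory:")
relational.execute("CREATE TABLE history (key TEXT, count INTEGER)")
relational.execute("INSERT INTO history VALUES ('unresponsive', 2)")
cache: Dict[str, int] = {"timeout": 5}

def fetch_by_query(key: str) -> List[int]:
    """Calling mode 1: a parameterized query statement."""
    rows = relational.execute(
        "SELECT count FROM history WHERE key = ?", (key,)
    ).fetchall()
    return [row[0] for row in rows]

def fetch_by_builtin(key: str) -> List[int]:
    """Calling mode 2: a built-in lookup function keyed by a keyword."""
    return [cache[key]] if key in cache else []

# Each storage type is paired with the calling mode that suits it.
CALL_MODES: Dict[str, Callable[[str], List[int]]] = {
    "relational": fetch_by_query,
    "cache": fetch_by_builtin,
}

def get_history(storage_type: str, key: str) -> List[int]:
    """Fetch historical values from the given storage with its matching call mode."""
    return CALL_MODES[storage_type](key)
```

A call such as `get_history("relational", "unresponsive")` goes through the query statement, while `get_history("cache", "timeout")` goes through the key lookup.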
According to embodiments of the present disclosure, the associated parameters may be parameters for adapting the generated conversion file to the real-time computing platform.
In some possible embodiments, the first compiler may be a DSL compiler generated by configuring flex and bison files according to a DSL syntax specification. The flex file is used for lexical analysis, which breaks the input down into meaningful word blocks called tokens; the bison file is used for syntax analysis (parsing), which determines how the word blocks relate to each other, for example, by using a syntax tree representation. The generated conversion file may be the Java code obtained by converting a DSL policy file that is configured according to the DSL syntax specification. The second compiler is a Java compiler, for example the one provided by a development environment such as Eclipse, MyEclipse, NetBeans, or IntelliJ IDEA; the specific compiler type is not limited herein. Specifically, in the present disclosure, the Java compiler is used to package the generated Java code together with a built-in library and the libraries related to the real-time computing platform, forming a data processing toolkit executable by the real-time computing platform; the data processing toolkit may also be referred to as a jar package. The built-in library and the related libraries may be a lib package, i.e., a collection of libraries that may contain multiple jar packages.
According to the embodiment of the disclosure, the generated processing file is used to generate, by means of the DSL compiler and the Java compiler, a jar package executable by the real-time computing platform, which enriches the ways in which the data processing toolkit can be generated and improves the configuration flexibility of the data processing toolkit.
According to an embodiment of the present disclosure, the method further includes: starting a data processing task in response to the data processing toolkit having been generated; and acquiring the data to be processed in real time in response to the data processing task having been started.
In some possible embodiments, the user may perform corresponding processing on the DSL policy file based on the DSL rules, where the corresponding processing includes, but is not limited to, starting and stopping a policy, editing (adding, modifying, deleting), and the like. For example, the user may update the DSL rules according to the user's own service processing logic and save the updated rules; the save operation may trigger the enabling of a new policy, so that the DSL policy file is continuously updated.
According to the embodiment of the disclosure, allowing the user to update, modify, delete, and otherwise operate on the DSL policy file improves the intelligence of DSL processing policy updating.
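To illustrate the two analysis stages mentioned above, the following sketch tokenizes and parses a single rule of a made-up DSL. It is written in Python with the `re` module rather than flex and bison, and the rule syntax `WHEN <field> <op> <number> THEN <action>` is purely hypothetical; the actual DSL syntax specification is not given in the disclosure.

```python
import re
from typing import Dict, List, Tuple

# Token definitions for the hypothetical DSL; order matters (keywords before identifiers).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("KEYWORD", r"\b(?:WHEN|THEN)\b"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[<>]=?|=="),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(text: str) -> List[Tuple[str, str]]:
    """Lexical analysis: split the input into (kind, text) tokens."""
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

def parse_rule(tokens: List[Tuple[str, str]]) -> Dict[str, object]:
    """Syntax analysis: build a small tree from the token sequence."""
    kinds = [kind for kind, _ in tokens]
    expected = ["KEYWORD", "IDENT", "OP", "NUMBER", "KEYWORD", "IDENT"]
    if kinds != expected or tokens[0][1] != "WHEN" or tokens[4][1] != "THEN":
        raise SyntaxError("expected: WHEN <field> <op> <number> THEN <action>")
    return {
        "condition": {"field": tokens[1][1], "op": tokens[2][1], "value": int(tokens[3][1])},
        "action": tokens[5][1],
    }

rule_tokens = tokenize("WHEN unresponsive > 10 THEN alarm")
rule_tree = parse_rule(rule_tokens)
```

A code generator would then walk a tree like `rule_tree` to emit target-language source; that step is omitted here.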
According to an embodiment of the present disclosure, determining a data type of data to be processed includes: determining, for each of a plurality of data types, a predetermined field that matches the data type; matching the predetermined field with the field in the data to be processed to obtain a matching result; and determining the data type of the data to be processed based on the plurality of matching results.
In some possible embodiments, the monitoring data types include, but are not limited to, a device data type, a network data type, an application data type, a host data type, a middleware data type, and the like, each having predetermined fields matched with it. For example, the predetermined fields that match the network data type are field a, field b, and field c, and the fields in the data to be processed may be field 1, field 2, and field 3. If field 1, field 2, and field 3 in the data to be processed do not match the predetermined field a, field b, and field c of the network data type, matching continues against the predetermined field d, field e, and field f of the application data type; if they match, the data type of the data to be processed is the application data type. In the same way, the predetermined fields of each data type are matched in sequence against the fields in the data to be processed until a successful match occurs. It should be noted that if no successful match is obtained after all data types have been tried in sequence, a new DSL policy file is generated.
According to an embodiment of the present disclosure, the data calling mode includes at least one of: a data calling mode using a query statement and a data calling mode using a built-in function.
It should be noted that, unless an execution order between different operations is explicitly stated or is required by the technical implementation, the execution order of the multiple operations may vary, and multiple operations may also be executed simultaneously in the embodiments of the disclosure.
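The sequential matching can be sketched as follows. The field names are the placeholders used in the text, and treating a "match" as "all predetermined fields of a type are present in the data" is an assumption made for the example.

```python
from typing import Dict, Iterable, Optional, Set

# Predetermined fields per data type, checked in insertion order.
PREDETERMINED_FIELDS: Dict[str, Set[str]] = {
    "network": {"field_a", "field_b", "field_c"},
    "application": {"field_d", "field_e", "field_f"},
}

def determine_data_type(record_fields: Iterable[str]) -> Optional[str]:
    """Return the first data type whose predetermined fields all appear in the record."""
    present = set(record_fields)
    for data_type, required in PREDETERMINED_FIELDS.items():
        if required <= present:
            return data_type
    return None  # no type matched; the text generates a new DSL policy file here

app_type = determine_data_type(["field_d", "field_e", "field_f", "extra"])
unknown_type = determine_data_type(["field_x"])
```

Returning `None` leaves it to the caller to decide what to do when no type matches.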
Fig. 3 schematically illustrates a system architecture diagram of a data processing method according to an embodiment of the present disclosure.
According to an embodiment of the present disclosure, as shown in fig. 3, a data processing system includes: policy management module 310, policy compilation module 320, data input module 330, data storage module 340, and data processing module 350.
The policy management module 310 is configured to generate DSL policy files, including DSL policy file 1, DSL policy file 2, …, and DSL policy file n. A user may generate a policy file according to the user's own service characteristics and requirements, and the DSL rule base is continuously updated by storing the newly generated policy files.
The policy compiling module 320 is configured to generate a DSL compiler, i.e., the first compiler, based on the DSL rules in combination with the flex file and the bison file, and to convert the DSL policy file into executable Java code; it then uses a Java compiler, i.e., the second compiler, to package the Java code together with the lib package formed from the built-in library and the libraries related to the real-time computing platform, forming a DSL policy jar package executable by the real-time computing platform.
The data input module 330 is configured to generate, according to data type, monitoring data 1, monitoring data 2, monitoring data 3, …, and monitoring data n from data of different fields, different types, and different batches, and to input each piece of data to be processed to the data processing module for data processing.
The data storage module 340 is configured to store data of different types and different time periods, and may be a database of any of several types, such as a relational database, a time series database, or a directly readable Redis.
The data processing module 350 is configured to perform data processing using the converted DSL policy jar package to obtain a processing result.
Based on the data processing method, the disclosure also provides a data processing device. The device will be described in detail below in connection with fig. 4.
Fig. 4 schematically shows a block diagram of a data processing apparatus according to an embodiment of the present disclosure.
As shown in fig. 4, the data processing apparatus 400 of this embodiment includes a calling module 410, a determining module 420, and an executing module 430.
The calling module 410 is configured to call a data processing toolkit with the real-time computing platform in response to the acquired data to be processed.
The determining module 420 is configured to determine a target data processing tool that matches the data to be processed from a plurality of data processing tools in the data processing toolkit.
The execution module 430 is configured to perform operations with the target data processing tool, and includes an acquisition sub-module and a processing sub-module. The acquisition sub-module is configured to acquire historical data matched with the data to be processed.
The processing sub-module is configured to process the data to be processed based on the historical data to obtain target data.
According to an embodiment of the present disclosure, the determining module includes a first determining sub-module and a second determining sub-module.
The first determining sub-module is configured to determine the data type of the data to be processed.
The second determining sub-module is configured to determine the target data processing tool from the plurality of data processing tools based on the data type and a predetermined mapping table, wherein the predetermined mapping table is used for characterizing a mapping relationship between the data processing tools and the data types.
According to an embodiment of the present disclosure, the processing sub-module includes a processing unit and a generating unit.
The processing unit is configured to perform data processing on the data to be processed based on the historical data to obtain a processing result.
The generating unit is configured to generate the target data based on the processing result and the configuration data of the data to be processed.
According to an embodiment of the present disclosure, the processing unit includes a first determining subunit, a second determining subunit, a third processing subunit, and a generating subunit.
The first determining subunit is configured to determine a target field of the data to be processed.
The second determining subunit is configured to determine a field type of the target field.
The third processing subunit is configured to process the target field according to the processing mode matched with the field type to obtain an initial processing result.
The generating subunit is configured to generate the processing result based on the initial processing result and the historical data.
According to an embodiment of the present disclosure, the acquisition sub-module includes a first determining unit and an acquisition unit.
The first determining unit is configured to determine a storage space matched with the data to be processed from a plurality of storage spaces of different types.
The acquisition unit is configured to acquire the historical data from the storage space by using a data calling mode matched with the storage space.
According to an embodiment of the present disclosure, the data processing apparatus further includes a conversion module and a generation module.
The conversion module is configured to, in response to a data processing file having been generated, convert the generated data processing file by using the first compiler to obtain a generated conversion file.
The generation module is configured to generate, by using the second compiler, the data processing toolkit based on the generated conversion file and associated parameters, wherein the associated parameters are parameters for adapting the generated conversion file to the real-time computing platform.
According to an embodiment of the present disclosure, the data processing apparatus further includes a starting module and an obtaining module.
The starting module is configured to start a data processing task in response to the data processing toolkit having been generated.
The obtaining module is configured to acquire the data to be processed in real time in response to the data processing task having been started.
According to an embodiment of the present disclosure, the first determining sub-module includes a second determining unit, a matching unit, and a third determining unit.
The second determining unit is configured to determine, for each of a plurality of data types, a predetermined field that matches the data type.
The matching unit is configured to match the predetermined field with the fields in the data to be processed to obtain a matching result.
The third determining unit is configured to determine the data type of the data to be processed based on the plurality of matching results.
According to an embodiment of the present disclosure, the data calling mode includes at least one of: a data calling mode using a query statement and a data calling mode using a built-in function.
According to embodiments of the present disclosure, any of the calling module 410, the determining module 420, and the execution module 430 may be combined and implemented in one module, or any one of them may be split into multiple modules. Alternatively, at least some of the functionality of one or more of these modules may be combined with at least some of the functionality of other modules and implemented in one module. At least one of the calling module 410, the determining module 420, and the execution module 430 may be implemented, at least in part, as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system-on-chip, a system-on-substrate, a system-on-package, or an Application Specific Integrated Circuit (ASIC), or may be implemented in hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or in any one of, or a suitable combination of, the three implementation manners of software, hardware, and firmware. Alternatively, at least one of the calling module 410, the determining module 420, and the execution module 430 may be at least partially implemented as a computer program module that, when executed, performs the corresponding functions.
Fig. 5 schematically illustrates a block diagram of an electronic device adapted to implement a data processing method according to an embodiment of the disclosure.
As shown in fig. 5, an electronic device 500 according to an embodiment of the present disclosure includes a processor 501 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 502 or a program loaded from a storage section 508 into a Random Access Memory (RAM) 503. The processor 501 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or an associated chipset and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), or the like. The processor 501 may also include on-board memory for caching purposes. The processor 501 may comprise a single processing unit or a plurality of processing units for performing different actions of the method flows according to embodiments of the disclosure.
In the RAM 503, various programs and data required for the operation of the electronic apparatus 500 are stored. The processor 501, ROM 502, and RAM 503 are connected to each other by a bus 504. The processor 501 performs various operations of the method flow according to the embodiments of the present disclosure by executing programs in the ROM 502 and/or the RAM 503. Note that the program may be stored in one or more memories other than the ROM 502 and the RAM 503. The processor 501 may also perform various operations of the method flow according to embodiments of the present disclosure by executing programs stored in the one or more memories.
According to an embodiment of the present disclosure, the electronic device 500 may also include an input/output (I/O) interface 505, the input/output (I/O) interface 505 also being connected to the bus 504. The electronic device 500 may also include one or more of the following components connected to an input/output (I/O) interface 505: an input section 506 including a keyboard, a mouse, and the like; an output portion 507 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker, and the like; a storage portion 508 including a hard disk and the like; and a communication section 509 including a network interface card such as a LAN card, a modem, or the like. The communication section 509 performs communication processing via a network such as the internet. The drive 510 is also connected to an input/output (I/O) interface 505 as needed. A removable medium 511 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 510 as needed so that a computer program read therefrom is mounted into the storage section 508 as needed.
The present disclosure also provides a computer-readable storage medium that may be embodied in the apparatus/device/system described in the above embodiments; or may exist alone without being assembled into the apparatus/device/system. The computer-readable storage medium carries one or more programs which, when executed, implement methods in accordance with embodiments of the present disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example, but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the present disclosure, the computer-readable storage medium may include ROM 502 and/or RAM 503 and/or one or more memories other than ROM 502 and RAM 503 described above.
Embodiments of the present disclosure also include a computer program product comprising a computer program containing program code for performing the methods shown in the flowcharts. When the computer program product runs on a computer system, the program code causes the computer system to carry out the data processing methods provided by the embodiments of the present disclosure.
The above-described functions defined in the system/apparatus of the embodiments of the present disclosure are performed when the computer program is executed by the processor 501. The systems, apparatus, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the disclosure.
In one embodiment, the computer program may be based on a tangible storage medium such as an optical storage device, a magnetic storage device, or the like. In another embodiment, the computer program may also be transmitted, distributed, and downloaded and installed in the form of a signal on a network medium, and/or installed from a removable medium 511 via the communication portion 509. The computer program may include program code that may be transmitted using any appropriate network medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
In such an embodiment, the computer program may be downloaded and installed from a network via the communication portion 509, and/or installed from the removable media 511. The above-described functions defined in the system of the embodiments of the present disclosure are performed when the computer program is executed by the processor 501. The systems, devices, apparatus, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the disclosure.
According to embodiments of the present disclosure, program code for executing the computer programs provided by embodiments of the present disclosure may be written in any combination of one or more programming languages; in particular, such computer programs may be implemented in high-level procedural and/or object-oriented programming languages, and/or in assembly/machine languages. The programming languages include, but are not limited to, Java, C++, Python, C, or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., via the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that the features recited in the various embodiments of the disclosure and/or in the claims may be combined and/or incorporated in a variety of ways, even if such combinations or incorporations are not explicitly recited in the disclosure. In particular, the features recited in the various embodiments of the present disclosure and/or in the claims may be variously combined and/or incorporated without departing from the spirit and teachings of the present disclosure. All such combinations and/or incorporations fall within the scope of the present disclosure.
The embodiments of the present disclosure are described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described above separately, this does not mean that the measures in the embodiments cannot be used advantageously in combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be made by those skilled in the art without departing from the scope of the disclosure, and such alternatives and modifications are intended to fall within the scope of the disclosure.

Claims (13)

1. A data processing method, comprising:
in response to acquired data to be processed, invoking a data processing toolkit with a real-time computing platform;
determining a target data processing tool matched with the data to be processed from a plurality of data processing tools in the data processing toolkit;
performing the following operations with the target data processing tool:
acquiring historical data matched with the data to be processed; and
processing the data to be processed based on the historical data to obtain target data.
2. The method of claim 1, wherein the determining a target data processing tool matched with the data to be processed from a plurality of data processing tools in the data processing toolkit comprises:
determining the data type of the data to be processed; and
determining the target data processing tool from the plurality of data processing tools based on the data type and a predetermined mapping table, wherein the predetermined mapping table is used to characterize a mapping relationship between the data processing tools and the data types.
3. The method of claim 1, wherein the processing the data to be processed based on the historical data to obtain target data comprises:
performing data processing on the data to be processed based on the historical data to obtain a processing result; and
generating the target data based on the processing result and the configuration data of the data to be processed.
4. The method of claim 3, wherein the performing data processing on the data to be processed based on the historical data to obtain a processing result comprises:
determining a target field of the data to be processed;
determining a field type of the target field;
processing the target field according to a processing mode matched with the field type to obtain an initial processing result; and
generating the processing result based on the initial processing result and the historical data.
5. The method of claim 1, wherein the obtaining historical data that matches the data to be processed comprises:
determining a storage space matched with the data to be processed from a plurality of storage spaces with different types; and
acquiring the historical data from the storage space by using a data calling mode matched with the storage space.
6. The method of claim 1, further comprising:
in response to a data processing file having been generated, converting the generated data processing file by using a first compiler to obtain a generated conversion file; and
generating, with a second compiler, the data processing toolkit based on the generated conversion file and associated parameters, wherein the associated parameters are parameters for adapting the generated conversion file to the real-time computing platform.
7. The method of claim 1, further comprising:
in response to the data processing toolkit having been generated, starting a data processing task; and
in response to the data processing task having been started, acquiring the data to be processed in real time.
8. The method of claim 2, wherein the determining the data type of the data to be processed comprises:
determining, for each of a plurality of data types, a predetermined field that matches the data type;
matching the predetermined field with the fields in the data to be processed to obtain a matching result; and
determining the data type of the data to be processed based on a plurality of matching results.
9. The method of claim 5, wherein the data calling mode comprises at least one of:
a data calling mode using a query statement and a data calling mode using a built-in function.
10. A data processing apparatus comprising:
a calling module, configured to call a data processing toolkit by using a real-time computing platform in response to acquired data to be processed;
a determining module, configured to determine a target data processing tool matched with the data to be processed from a plurality of data processing tools in the data processing toolkit; and
an execution module, configured to perform operations with the target data processing tool, the execution module comprising:
an acquisition sub-module, configured to acquire historical data matched with the data to be processed; and
a processing sub-module, configured to process the data to be processed based on the historical data to obtain target data.
11. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-9.
12. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method according to any of claims 1 to 9.
13. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 9.
CN202310801089.0A 2023-06-30 2023-06-30 Data processing method, device, equipment and storage medium Pending CN116755708A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310801089.0A CN116755708A (en) 2023-06-30 2023-06-30 Data processing method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310801089.0A CN116755708A (en) 2023-06-30 2023-06-30 Data processing method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116755708A true CN116755708A (en) 2023-09-15

Family

ID=87956966

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310801089.0A Pending CN116755708A (en) 2023-06-30 2023-06-30 Data processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116755708A (en)

Similar Documents

Publication Publication Date Title
US10318595B2 (en) Analytics based on pipes programming model
CN107506256B (en) Method and device for monitoring crash data
CN110795455A (en) Dependency relationship analysis method, electronic device, computer device and readable storage medium
CN109783562B (en) Service processing method and device
US8407713B2 (en) Infrastructure of data summarization including light programs and helper steps
CN115599386A (en) Code generation method, device, equipment and storage medium
CN108959294B (en) Method and device for accessing search engine
CN116560661A (en) Code optimization method, device, equipment and storage medium
CN116594683A (en) Code annotation information generation method, device, equipment and storage medium
CN116414855A (en) Information processing method and device, electronic equipment and computer readable storage medium
CN116755708A (en) Data processing method, device, equipment and storage medium
CN110806967A (en) Unit testing method and device
CN113392311A (en) Field searching method, field searching device, electronic equipment and storage medium
CN113419740A (en) Program data stream analysis method and device, electronic device and readable storage medium
CN114461909A (en) Information processing method, information processing apparatus, electronic device, and storage medium
CN114090514A (en) Log retrieval method and device for distributed system
CN113032256A (en) Automatic test method, device, computer system and readable storage medium
CN112650502A (en) Batch processing task processing method and device, computer equipment and storage medium
CN115563183B (en) Query method, query device and program product
CN116382703B (en) Software package generation method, code development method and device, electronic equipment and medium
CN114268558B (en) Method, device, equipment and medium for generating monitoring graph
CN116820566A (en) Data processing method, device, electronic equipment and storage medium
CN116610296A (en) Program call chain statistical method, apparatus, device, medium and program product
CN116821159A (en) Data processing method, device, equipment, medium and product
CN113535153A (en) Method, device, equipment and medium for encoding custom label

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination