CN112182036A

CN112182036A - Data sending and writing method and device, electronic equipment and readable storage medium

Info

Publication number: CN112182036A
Application number: CN202010968620.XA
Authority: CN
Inventors: 林伟泽; 孙藜; 金山城; 侯武庆
Original assignee: China Citic Bank Corp Ltd
Current assignee: China Citic Bank Corp Ltd
Priority date: 2020-09-15
Filing date: 2020-09-15
Publication date: 2021-01-05

Abstract

The embodiments of the application provide a data sending and writing method, a data sending and writing device, electronic equipment and a readable storage medium. The method comprises the following steps: obtaining serialized data from Kafka; determining a schema corresponding to the identification information carried by the serialized data; parsing the serialized data based on the schema to obtain source data; and determining whether an ES model corresponding to the source data exists, and if so, writing the source data into the ES model after format conversion. Based on this scheme, serialized data can be obtained from Kafka in real time, the corresponding schema is determined from the identification information, the serialized data is parsed based on the schema to obtain the source data, and the source data is format-converted and then written into the corresponding ES model according to the mapping rule. This keeps the data format consistent from acquisition through processing and enables real-time parsing and conversion of service data.

Description

Data sending and writing method and device, electronic equipment and readable storage medium

Technical Field

The present application relates to the field of data processing technologies, and in particular, to a method and an apparatus for sending and writing data, an electronic device, and a readable storage medium.

Background

Business processes in a financial system are complex, and each business node system generates a huge volume of business data. This data needs to be acquired promptly for processing and analysis in order to support operations.

At present, service data is mostly acquired periodically in batches for processing. This approach has poor timeliness and cannot meet operational requirements. In addition, the data structures of the data collected in each system may differ, which makes the service data inconvenient to process, analyze, and use.

Disclosure of Invention

The present application aims to solve at least one of the above technical drawbacks. The technical scheme adopted by the application is as follows:

in a first aspect, an embodiment of the present application provides a data sending method, where the method includes:

acquiring binlog data from a database of a monitored system;

analyzing binlog data to obtain source data;

serializing the source data to obtain serialized data based on whether the schema matched with the source data exists or not, wherein the serialized data contains identification information of the schema;

publishing the serialized data to Kafka.

Optionally, serializing the source data based on whether there is a schema matching the source data, including:

determining whether a schema matched with the source data exists in the local cache;

if so, serializing the source data by using the schema;

and if not, registering the schema, and serializing the source data by using the schema.

Optionally, registering the schema includes:

determining whether a registered schema with the same name of the schema exists;

if not, registering the schema;

if yes, determining whether a data structure corresponding to the registered schema is consistent with a data structure of the source data, and if not, registering the schema; and if so, taking the registered schema as the schema.

In a second aspect, an embodiment of the present application provides a method for writing data, where the method includes:

obtaining serialized data from Kafka;

determining a schema corresponding to the identification information carried by the serialized data;

analyzing the serialized data based on the schema to obtain source data;

and determining whether an ES model corresponding to the source data exists, and if so, writing the source data into the ES model after format conversion.

Optionally, the writing method further includes:

determining whether a target data field exists in the source data, wherein the target data field does not exist in the ES model;

and if so, expanding the target data field of the ES model.

Optionally, the writing method further includes:

and if the ES model corresponding to the source data does not exist, determining the source data as abnormal data.

Optionally, the writing method further includes:

and determining whether an ES model corresponding to the abnormal data exists according to a preset retry strategy.

In a third aspect, an embodiment of the present application provides an apparatus for transmitting data, where the apparatus includes:

the data acquisition module is used for acquiring binlog data from a database of the monitored system;

the source data acquisition module is used for analyzing the binlog data to acquire source data;

the serialization module is used for serializing the source data to obtain serialized data based on whether the schema matched with the source data exists or not, and identification information of the schema exists in the serialized data;

and the data issuing module is used for issuing the serialized data to the Kafka.

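Optionally, the serialization module is specifically configured to:

determine whether a schema matched with the source data exists in the local cache;

if so, serialize the source data by using the schema;

and if not, register the schema, and serialize the source data by using the schema.

Optionally, when registering the schema, the serialization module is specifically configured to:

determine whether a registered schema with the same name as the schema exists;

if not, register the schema;

if yes, determine whether a data structure corresponding to the registered schema is consistent with a data structure of the source data, and if not, register the schema; and if so, take the registered schema as the schema.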

In a fourth aspect, an embodiment of the present application provides an apparatus for writing data, where the apparatus includes:

a data acquisition module for acquiring serialized data from Kafka;

the schema determining module is used for determining a schema corresponding to the identification information carried by the serialized data;

the data analysis module is used for analyzing the serialized data based on the schema to obtain source data;

and the data writing module is used for determining whether an ES model corresponding to the source data exists, and if so, writing the ES model after performing format conversion on the source data.

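Optionally, the apparatus further includes a field extension module, where the field extension module is configured to:

determine whether a target data field exists in the source data, wherein the target data field does not exist in the ES model;

and if so, extend the target data field of the ES model.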

Optionally, the apparatus further comprises:

and the abnormal data determining module is used for determining the source data as abnormal data when the ES model corresponding to the source data does not exist.

Optionally, the apparatus further comprises:

and the abnormal data retry module is used for determining whether an ES model corresponding to the abnormal data exists or not according to a preset retry strategy.

In a fifth aspect, an embodiment of the present application provides an electronic device, including: a processor and a memory;

a memory for storing operating instructions;

a processor configured to perform the method as shown in any implementation of the first aspect or any implementation of the second aspect of the present application by calling an operation instruction.

In a sixth aspect, embodiments of the present application provide a computer-readable storage medium on which a computer program is stored, which when executed by a processor, implements the method shown in any of the embodiments of the first aspect or any of the embodiments of the second aspect of the present application.

The technical scheme provided by the embodiment of the application has the following beneficial effects:

according to the scheme provided by the embodiment of the application, serialized data is obtained from Kafka, and a schema corresponding to identification information carried by the serialized data is determined; analyzing the serialized data based on the schema to obtain source data, determining whether an ES model corresponding to the source data exists, and if so, performing format conversion on the source data and writing the source data into the ES model. Based on the scheme, the serialized data can be obtained from the Kafka in real time, the corresponding schema is determined based on the identification information, the serialized data is analyzed based on the schema to obtain the source data, the source data is subjected to format conversion and then written into the corresponding ES model according to the mapping rule, the data format consistency of the data from the acquisition to the processing process is ensured, and the real-time analysis and conversion of the service data are realized.

Drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the description of the embodiments of the present application will be briefly described below.

Fig. 1 is a schematic flowchart of a data sending method according to an embodiment of the present application;

Fig. 2 is a schematic flowchart of a specific implementation of a data sending method according to an embodiment of the present application;

Fig. 3 is a schematic flowchart of a data writing method according to an embodiment of the present application;

Fig. 4 is a block diagram of the overall logic architecture for data collection, transmission, and writing provided by an embodiment of the present application;

Fig. 5 is a schematic structural diagram of a data sending apparatus according to an embodiment of the present application;

Fig. 6 is a schematic structural diagram of a data writing apparatus according to an embodiment of the present application;

Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary only for the purpose of explaining the present application and are not to be construed as limiting the present invention.

As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. As used herein, the term "and/or" includes any one of, and all combinations of, one or more of the associated listed items.

To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.

In the prior art, the collected source data is generally a JavaScript Object Notation (JSON) string whose structure is not fixed, so the data is inconvenient to parse and use, and changes in the data structure cannot be monitored during transmission.

In the prior art, service data is mostly acquired periodically in batches for processing, which has poor timeliness and cannot meet operational requirements.

The embodiments of the present application provide a method, an apparatus, an electronic device, and a readable storage medium for sending and writing data, which aim to solve at least one of the above technical problems in the prior art.

The following describes the technical solutions of the present application and how to solve the above technical problems with specific embodiments. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.

Fig. 1 shows a schematic flow diagram of a data transmission method provided in an embodiment of the present application, and as shown in fig. 1, the method mainly includes:

step S110: obtaining binlog (binary log) data from a database of a monitored system;

step S120: analyzing binlog data to obtain source data;

step S130: serializing the source data to obtain serialized data based on whether a schema (mode) matched with the source data exists or not, wherein identification information of the schema exists in the serialized data;

step S140: publishing the serialized data to Kafka.

In the embodiment of the application, data changes in the database of the monitored system can be monitored in real time using Canal, and the source data can be acquired by parsing the binlog.

In the embodiment of the application, after the source data is parsed, Avro serialization can be performed according to the schema matched with the source data to obtain the serialized data, and the serialized data is published to Kafka.

In the embodiment of the application, the identification information is used for identifying the schema, so that when deserializing the serialized data, the schema is obtained through the identification information, and the serialized data is analyzed to obtain the source data.
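As a minimal sketch of how identification information might travel with the serialized data, the following example prefixes each payload with a schema ID. The 4-byte big-endian header layout, the use of JSON as a stand-in for Avro encoding, and the names `frame` and `unframe` are assumptions made for illustration; the application does not specify them.

```python
import json
import struct
from typing import Dict, Tuple

# Hypothetical layout: a 4-byte big-endian schema ID precedes the payload.
HEADER = struct.Struct(">I")


def frame(schema_id: int, record: Dict[str, object]) -> bytes:
    """Prefix the encoded record with the ID of the schema used to encode it."""
    # JSON is only a stand-in here; the described method uses Avro.
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return HEADER.pack(schema_id) + payload


def unframe(message: bytes) -> Tuple[int, bytes]:
    """Split a message back into (schema ID, encoded payload)."""
    (schema_id,) = HEADER.unpack_from(message)
    return schema_id, message[HEADER.size:]


message = frame(7, {"id": 1, "amount": "12.50"})
schema_id, payload = unframe(message)
```

Carrying only an identifier keeps each message small, since the consumer can look up the full schema once and reuse it.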

In the embodiment of the application, the serialized data is issued to the Kafka, so that a data consumer can acquire the data in time, and the timeliness of data acquisition can be improved to meet the operation and use requirements.

In the method provided by the embodiment of the application, binlog data acquired from a database of a monitored system is analyzed to obtain source data, and the source data is serialized on the basis of schema matched with the source data, so that serialized data is obtained and sent to Kafka. Based on the scheme, the acquired data can be serialized and then issued to the Kafka, the schema used in the serialization is identified, the timeliness of the service data is favorably improved, a basis is provided for acquiring the serialized data from the Kafka, analyzing the serialized data by using the corresponding schema, and converting the source data into a corresponding data format, the service data is favorably processed and analyzed, and the service data is favorably used.

In an optional manner of the embodiment of the present application, serializing source data based on whether there is a schema matched with the source data includes:

determining whether a schema matched with the source data exists in the local cache;

if so, serializing the source data by using the schema;

and if not, registering the schema, and serializing the source data by using the schema.

In the embodiment of the application, when source data is serialized based on the schema, whether a schema matched with the source data exists in the local cache can be determined first; if it does, the cached schema can be used for serialization.

If no schema matched with the source data exists in the local cache, the schema matched with the source data can be registered, and the source data is serialized using the registered schema. In actual use, the registered schema can be stored in the local cache to reduce repeated registration of the schema.
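The cache-first lookup described above can be sketched as follows. This is an illustration under stated assumptions, not the application's implementation: the class `SchemaCache`, its `resolve` method, and the `register` callback are hypothetical names, and schemas are modeled as plain dictionaries.

```python
from typing import Callable, Dict, List

Schema = Dict[str, object]


class SchemaCache:
    """Hypothetical local cache keyed by schema name."""

    def __init__(self, register: Callable[[str, Schema], Schema]) -> None:
        self._register = register            # called only on a cache miss
        self._by_name: Dict[str, Schema] = {}

    def resolve(self, name: str, candidate: Schema) -> Schema:
        cached = self._by_name.get(name)
        if cached is not None and cached == candidate:
            return cached                     # cache hit: reuse without registering
        registered = self._register(name, candidate)
        self._by_name[name] = registered      # remember it to avoid re-registration
        return registered


calls: List[str] = []


def fake_register(name: str, schema: Schema) -> Schema:
    # Stand-in for a call to a schema registration center.
    calls.append(name)
    return schema


cache = SchemaCache(fake_register)
cache.resolve("bank_account", {"fields": ["id", "amount"]})
cache.resolve("bank_account", {"fields": ["id", "amount"]})
```

In this sketch the second `resolve` call is served from the cache, so the registration callback runs only once for an unchanged table structure.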

In an optional mode of the embodiment of the present application, registering the schema includes:

determining whether a registered schema with the same name as the schema exists;

if not, registering the schema;

if yes, determining whether a data structure corresponding to the registered schema is consistent with a data structure of the source data, and if not, registering the schema; and if so, taking the registered schema as the schema.

In the embodiment of the application, the current schema can be registered through the schema registration center. Specifically, it may be determined whether a registered schema with the same name as the current schema exists in the schema registration center. If not, the current schema may be registered. If so, it is determined whether the data structure corresponding to the registered schema is consistent with the data structure of the source data corresponding to the current schema: if consistent, the registered schema is used for serialization; if not, the current schema is registered.
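The registration decision can be illustrated with an in-memory stand-in for the schema registration center. The class `InMemoryRegistry`, the representation of a data structure as a tuple of field names, and the numbering of versions from 1 are all assumptions made for this sketch.

```python
from typing import Dict, List, Tuple

Fields = Tuple[str, ...]


class InMemoryRegistry:
    """Hypothetical stand-in for a schema registration center."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[Fields]] = {}

    def register(self, name: str, fields: Fields) -> Tuple[int, bool]:
        """Return (version, newly_registered) for the given schema."""
        versions = self._versions.setdefault(name, [])
        for index, existing in enumerate(versions):
            if existing == fields:
                return index + 1, False   # same name and structure: reuse it
        versions.append(fields)           # unknown name, or structure differs
        return len(versions), True


registry = InMemoryRegistry()
first = registry.register("bank_account", ("id", "amount"))
again = registry.register("bank_account", ("id", "amount"))
changed = registry.register("bank_account", ("id", "amount", "currency"))
```

Here `first` is a new registration, `again` reuses the registered schema because the structure is consistent, and `changed` is registered as a further version because the structure differs.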

Fig. 2 is a flowchart illustrating a specific implementation of the data sending method according to the embodiment of the present application. In Fig. 2, the target MySQL is the database of the monitored system from which changed data is collected. Canal collects the data; that is, the source data is obtained by acquiring and parsing the binlog. Information such as the library name, table name, and operation type is then obtained; this is the information required when matching the schema. The schema used this time is obtained; that is, the schema matching the format of the source data. The data is filled into the schema; that is, the source data is filled into the schema. Avro encoding is performed; that is, the source data filled into the schema is serialized. Finally, the data is sent to Kafka; that is, the serialized data is published to Kafka.
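The step of obtaining the library name, table name, and operation type might look like the sketch below. The dictionary layout of the change event (keys `database`, `table`, `type`, `data`) is an assumption for illustration and is not taken from the application or presented as Canal's API; `Change` and `parse_change` are hypothetical names.

```python
from typing import Dict, List, NamedTuple


class Change(NamedTuple):
    database: str
    table: str
    operation: str
    rows: List[Dict[str, object]]


def parse_change(event: Dict[str, object]) -> Change:
    """Pull out the fields needed to match a schema from one change event."""
    return Change(
        database=str(event["database"]),
        table=str(event["table"]),
        operation=str(event["type"]).upper(),   # normalize e.g. "insert" to "INSERT"
        rows=list(event.get("data") or []),     # changed rows, possibly empty
    )


change = parse_change(
    {
        "database": "bank",
        "table": "account",
        "type": "insert",
        "data": [{"id": 1, "amount": "12.50"}],
    }
)
```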

One table corresponds to one schema, and the schema name follows the naming rule "library name_table name"; that is, the data table structures of the various source data and their schemas are named by library name and table name. All columns in the data are filled into newSchema, and the schema name is used to query whether an oldSchema exists for the table. In other words, when matching a schema to the source data, all fields of the source data are filled into a schema, the schema is named from the source data, and the cache is searched by that name. If a schema with that name exists in the cache, the hash codes of oldSchema and newSchema are compared; if they are the same, the schema in the cache is used to fill the source data.

If no schema with the same name as the schema corresponding to the source data exists in the cache, or a schema with the same name but a different hash value exists in the cache, the schema can be registered with the schema registry. The schema registry is first queried for a version whose structure is the same as that of newSchema; that is, the registry is asked for version information of a schema with the same structure as the schema corresponding to the source data. If such version information exists, the registered existSchema is acquired through the version; that is, the schema matched with the source data is obtained through the version information. If the schema registry has no such version information, the schema corresponding to the source data is registered. After the schema registry completes the registration, the source data may be filled into the schema.
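A small sketch of the naming rule and the hash comparison follows. The use of SHA-256 over a canonical JSON form is an assumption chosen for the example (the application only says that hash values are compared), and `schema_name`, `build_schema`, and `fingerprint` are hypothetical helper names.

```python
import hashlib
import json
from typing import Dict, List


def schema_name(database: str, table: str) -> str:
    """Apply the 'library name_table name' naming rule."""
    return f"{database}_{table}"


def build_schema(database: str, table: str, columns: List[str]) -> Dict[str, object]:
    """Build a candidate schema from every column in the row."""
    return {"name": schema_name(database, table), "fields": list(columns)}


def fingerprint(schema: Dict[str, object]) -> str:
    """Stable hash of the schema structure, standing in for a hash code."""
    canonical = json.dumps(schema, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


old_schema = build_schema("bank", "account", ["id", "amount"])
new_schema = build_schema("bank", "account", ["id", "amount"])
wider_schema = build_schema("bank", "account", ["id", "amount", "currency"])
```

Equal fingerprints for `old_schema` and `new_schema` correspond to the case where the cached schema is reused; the differing fingerprint of `wider_schema` corresponds to the case where registration is needed.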

Fig. 3 shows a schematic flowchart of a data writing method provided in an embodiment of the present application, and as shown in fig. 3, the method mainly includes:

step S210: obtaining serialized data from Kafka;

step S220: determining a schema corresponding to the identification information carried by the serialized data;

step S230: analyzing the serialized data based on the schema to obtain source data;

step S240: determining whether an ES (Elasticsearch) model corresponding to the source data exists, and if so, writing the source data into the ES model after format conversion.

In the embodiment of the application, serialized data can be obtained from Kafka, and corresponding schema is obtained through the identification information, so that deserialization of the serialized data is realized based on the schema, and source data is obtained.

In the embodiment of the application, the identification information is used for identifying the schema, and the schema matched with the source data can be obtained according to the corresponding relationship between the identification information and the schema, so as to realize the analysis of the source data.
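The lookup from identification information to schema can be sketched as below. As assumptions for illustration, the message carries a 4-byte big-endian schema ID, the payload is a JSON array of values without field names (a stand-in for Avro's schema-dependent binary encoding), and `SCHEMAS_BY_ID` and `parse` are hypothetical names.

```python
import json
import struct
from typing import Dict, List

# Hypothetical correspondence between identification information and schemas.
SCHEMAS_BY_ID: Dict[int, List[str]] = {7: ["id", "amount"]}


def parse(message: bytes) -> Dict[str, object]:
    """Recover source data using the schema named by the message's ID prefix."""
    (schema_id,) = struct.unpack_from(">I", message)
    fields = SCHEMAS_BY_ID[schema_id]      # raises KeyError if the ID is unknown
    values = json.loads(message[4:])       # stand-in for Avro decoding
    if len(values) != len(fields):
        raise ValueError("payload does not match schema")
    return dict(zip(fields, values))


message = struct.pack(">I", 7) + json.dumps([1, "12.50"]).encode("utf-8")
source_data = parse(message)
```

Because the payload holds values only, it cannot be interpreted without the schema, which is why the identification information must accompany the serialized data.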

In practical use, deserialization of the serialized data can be implemented based on KafkaDeserializationSchema.

In the embodiment of the application, the serialized data can be acquired from the Kafka, and the serialized data is analyzed in time to obtain the source data, so that a data consumer can acquire the source data in time, and the timeliness of data acquisition can be improved to meet the operation and use requirements.

In the embodiment of the application, after the source data is acquired, if an ES model corresponding to the source data exists, the source data may be written into the ES model after format conversion.

When the source data is written into the ES model, Flink stream windows combined with watermarks can be used to ensure the time order of the data within each window, and the source data is written into the ES model in batches.
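The idea of windowing with watermarks can be illustrated without Flink by a small simulation. This is a simplified sketch, not Flink's API: `WindowedBatcher` is a hypothetical class, windows are fixed-size and non-overlapping, the watermark is taken as the largest event time seen minus an allowed out-of-order bound, and events that arrive after their window has closed are not handled.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Event = Tuple[int, str]  # (event time in ms, payload)


class WindowedBatcher:
    def __init__(self, window_ms: int, max_out_of_order_ms: int) -> None:
        self.window_ms = window_ms
        self.max_out_of_order_ms = max_out_of_order_ms
        self.max_seen = 0
        self.windows: Dict[int, List[Event]] = defaultdict(list)

    def add(self, event: Event) -> List[List[Event]]:
        """Buffer one event; return any windows the watermark has closed."""
        ts = event[0]
        start = ts - ts % self.window_ms            # start of the event's window
        self.windows[start].append(event)
        self.max_seen = max(self.max_seen, ts)
        watermark = self.max_seen - self.max_out_of_order_ms
        closed = sorted(s for s in self.windows if s + self.window_ms <= watermark)
        # Each closed window is emitted as one batch, ordered by event time.
        return [sorted(self.windows.pop(s)) for s in closed]


batcher = WindowedBatcher(window_ms=1000, max_out_of_order_ms=200)
batches: List[List[Event]] = []
for event in [(100, "a"), (900, "c"), (500, "b"), (1100, "d"), (1300, "e")]:
    batches.extend(batcher.add(event))
```

In this run the first window is emitted only once the watermark passes its end, so the out-of-order event at 500 ms is still sorted into place before the batch is written.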

Because the source data is written into the ES model after being subjected to format conversion, the data formats are unified, the data can be obtained from the ES model for processing and analysis, and the use of the data is facilitated.

The method provided by the embodiment of the application obtains the serialized data from the Kafka and determines the schema corresponding to the identification information carried by the serialized data; analyzing the serialized data based on the schema to obtain source data, determining whether an ES model corresponding to the source data exists, and if so, performing format conversion on the source data and writing the source data into the ES model. Based on the scheme, the serialized data can be obtained from the Kafka in real time, the corresponding schema is determined based on the identification information, the serialized data is analyzed based on the schema to obtain the source data, the source data is subjected to format conversion and then written into the corresponding ES model according to the mapping rule, the data format consistency of the data from the acquisition to the processing process is ensured, and the real-time analysis and conversion of the service data are realized.

In an optional manner of the embodiment of the present application, the method further includes:

determining whether a target data field exists in the source data, wherein the target data field does not exist in the ES model;

and if so, expanding the target data field of the ES model.

In the embodiment of the application, when the source data includes a target data field that the ES model does not include, the ES model may be extended with the target data field.
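A minimal sketch of field extension follows, modeling the ES model as a dictionary from field name to field type. The function `extend_model` and the rule for guessing a type (`long` for integers, `keyword` otherwise) are simplifying assumptions for illustration only.

```python
from typing import Dict, List


def extend_model(model_fields: Dict[str, str], source: Dict[str, object]) -> List[str]:
    """Add fields present in the source data but missing from the model.

    Returns the names of the fields that were added.
    """
    added = []
    for name, value in source.items():
        if name not in model_fields:
            # Simplified type guess, for illustration only.
            model_fields[name] = "long" if isinstance(value, int) else "keyword"
            added.append(name)
    return added


model = {"id": "long", "amount": "keyword"}
added = extend_model(model, {"id": 1, "amount": "12.50", "currency": "CNY"})
```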

In the embodiment of the application, the source data and the ES model are pre-configured with a corresponding relationship, and if the ES model corresponding to the source data does not exist, the source data can be determined to be abnormal data. Abnormal data cannot be written directly to the ES model.
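The pre-configured correspondence and the abnormal-data decision can be sketched as a lookup followed by a field mapping. The table `ES_MODELS`, the index and field names in it, and the function `route` are hypothetical; the application does not describe the mapping rule in this detail.

```python
from typing import Dict, Optional, Tuple

# Hypothetical pre-configured correspondence: schema name -> (index, field mapping).
ES_MODELS: Dict[str, Tuple[str, Dict[str, str]]] = {
    "bank_account": ("account_index", {"id": "account_id", "amount": "balance"}),
}


def route(
    schema_name: str, source: Dict[str, object]
) -> Tuple[str, Optional[str], Dict[str, object]]:
    """Convert and route one record, or flag it as abnormal."""
    model = ES_MODELS.get(schema_name)
    if model is None:
        return "abnormal", None, source       # no ES model: do not write
    index, mapping = model
    # Format conversion: rename source fields to the model's field names.
    document = {mapping[k]: v for k, v in source.items() if k in mapping}
    return "write", index, document


ok = route("bank_account", {"id": 1, "amount": "12.50"})
bad = route("bank_loan", {"id": 9})
```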

In this embodiment of the application, a retry strategy may be preset for the abnormal data. For example, after each preset duration, whether an ES model corresponding to the abnormal data exists is determined again; if the abnormal data still cannot be written into the ES model after the number of retries exceeds a specified number, the abnormal data may be defined as dirty data.
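One possible shape for such a retry strategy is sketched below. The function `retry_abnormal`, its parameters, and the returned status strings are assumptions for illustration; the waiting step is passed in as a callback so that the example runs without real delays.

```python
from typing import Callable, Dict, List


def retry_abnormal(
    record: Dict[str, object],
    model_exists: Callable[[Dict[str, object]], bool],
    max_retries: int,
    wait: Callable[[], None],
) -> str:
    """Re-check for an ES model up to max_retries times.

    Returns "writable" once a model is found, otherwise "dirty".
    """
    for _ in range(max_retries):
        wait()                        # e.g. sleep for the preset duration
        if model_exists(record):
            return "writable"
    return "dirty"


attempts: List[int] = []


def appears_on_third_check(_record: Dict[str, object]) -> bool:
    # Simulates an ES model that is configured after two failed checks.
    attempts.append(1)
    return len(attempts) >= 3


status = retry_abnormal({"table": "bank_account"}, appears_on_third_check, 5, lambda: None)
never = retry_abnormal({"table": "bank_loan"}, lambda r: False, 2, lambda: None)
```

In production code the `wait` callback would typically sleep or reschedule the record; here it does nothing so the behavior can be checked immediately.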

In the embodiment of the application, a data management system can be deployed to realize data acquisition and writing into the ES model, and the change of the data structure can be monitored and the current data structure of the data can be displayed.

Fig. 4 is a diagram illustrating an overall logical architecture of data collection, transmission, and writing provided by an embodiment of the present application.

A data source system includes a plurality of business subsystems that generate source data during business processes. The data acquisition and sending module subscribes to and monitors the source data provided by the data source system; it corresponds to the data sending method provided by the embodiment of the application. The source data is serialized and then published to the Kafka acquisition bus. The Flink data parsing and writing module acquires the serialized data stream by subscribing to the Kafka bus, deserializes it, and writes the data that matches an ES model under the metadata rules into an ES index; it corresponds to the data writing method provided by the embodiment of the application. After the data has been processed by the data writing method in the embodiment of the application, the batch processing module provides secondary extraction of data from the ES library, and the oas-app-application data service module can acquire real-time data from the ES library to provide data services externally.

Based on the same principle as the method shown in fig. 1, fig. 5 shows a schematic structural diagram of a data transmitting apparatus provided in an embodiment of the present application, and as shown in fig. 5, the data transmitting apparatus 30 may include:

the data acquisition module 310 is configured to obtain binlog data from a database of the monitored system;

a source data obtaining module 320, configured to analyze binlog data to obtain source data;

the serialization module 330 is configured to serialize the source data to obtain serialized data based on whether a schema matched with the source data exists, where identification information of the schema exists in the serialized data;

and a data publishing module 340, configured to publish the serialized data to Kafka.

In the apparatus provided in the embodiment of the present application, binlog data obtained from a database of a monitored system is analyzed to obtain source data, and the source data is serialized based on schema matched with the source data, so that serialized data is obtained and sent to Kafka. Based on the scheme, the acquired data can be serialized and then issued to the Kafka, the schema used in the serialization is identified, the timeliness of the service data is favorably improved, a basis is provided for acquiring the serialized data from the Kafka, analyzing the serialized data by using the corresponding schema, and converting the source data into a corresponding data format, the service data is favorably processed and analyzed, and the service data is favorably used.

Optionally, the serialization module is specifically configured to:

determine whether a schema matched with the source data exists in the local cache;

if so, serialize the source data by using the schema;

and if not, register the schema, and serialize the source data by using the schema.

Optionally, when registering the schema, the serialization module is specifically configured to:

determine whether version information corresponding to the schema exists;

if yes, register the schema through the version information;

and if not, create version information corresponding to the schema, and register the schema through the version information.

It is to be understood that the above modules of the data transmission apparatus in the present embodiment have functions of implementing the corresponding steps of the data transmission method in the embodiment shown in fig. 1. The function can be realized by hardware, and can also be realized by executing corresponding software by hardware. The hardware or software includes one or more modules corresponding to the functions described above. The modules can be software and/or hardware, and each module can be implemented independently or by integrating a plurality of modules. For the functional description of each module of the above data sending apparatus, reference may be specifically made to the corresponding description of the data sending method in the embodiment shown in fig. 1, and details are not repeated here.

Based on the same principle as the method shown in fig. 3, fig. 6 shows a schematic structural diagram of a data writing device provided by an embodiment of the present application, and as shown in fig. 6, the data writing device 40 may include:

a data acquisition module 410 for acquiring serialized data from Kafka;

the schema determining module 420 is configured to determine a schema corresponding to the identification information carried by the serialized data;

the data analysis module 430 is configured to analyze the serialized data based on the schema to obtain source data;

the data writing module 440 is configured to determine whether an ES model corresponding to the source data exists, and if so, perform format conversion on the source data and write the source data into the ES model.

The device provided by the embodiment of the application acquires the serialized data from the Kafka and determines the schema corresponding to the identification information carried by the serialized data; analyzing the serialized data based on the schema to obtain source data, determining whether an ES model corresponding to the source data exists, and if so, performing format conversion on the source data and writing the source data into the ES model. Based on the scheme, the serialized data can be obtained from the Kafka in real time, the corresponding schema is determined based on the identification information, the serialized data is analyzed based on the schema to obtain the source data, the source data is subjected to format conversion and then written into the corresponding ES model according to the mapping rule, the data format consistency of the data from the acquisition to the processing process is ensured, and the real-time analysis and conversion of the service data are realized.

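Optionally, the apparatus further includes a field extension module, where the field extension module is configured to:

determine whether a target data field exists in the source data, wherein the target data field does not exist in the ES model;

and if so, extend the target data field of the ES model.

Optionally, the apparatus further comprises:

an abnormal data determining module, configured to determine the source data as abnormal data when the ES model corresponding to the source data does not exist.

Optionally, the apparatus further comprises:

an abnormal data retry module, configured to determine whether an ES model corresponding to the abnormal data exists according to a preset retry strategy.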

It is to be understood that the above modules of the data writing apparatus in the present embodiment have functions of implementing the corresponding steps of the data writing method in the embodiment shown in fig. 3. The function can be realized by hardware, and can also be realized by executing corresponding software by hardware. The hardware or software includes one or more modules corresponding to the functions described above. The modules can be software and/or hardware, and each module can be implemented independently or by integrating a plurality of modules. For the functional description of each module of the data writing device, reference may be specifically made to the corresponding description of the data writing method in the embodiment shown in fig. 3, and details are not repeated here.

The embodiment of the application provides an electronic device, which comprises a processor and a memory;

a memory for storing operating instructions;

and the processor is used for executing the method provided by any embodiment of the application by calling the operation instruction.

As an example, fig. 7 shows a schematic structural diagram of an electronic device to which an embodiment of the present application is applicable. As shown in fig. 7, the electronic device 2000 includes a processor 2001 and a memory 2003, where the processor 2001 is coupled to the memory 2003, for example via a bus 2002. Optionally, the electronic device 2000 may also include a transceiver 2004. It should be noted that, in practical applications, the number of transceivers 2004 is not limited to one, and the structure of the electronic device 2000 does not constitute a limitation on the embodiments of the present application.

The processor 2001 is applied in the embodiments of the present application to implement the methods shown in the above method embodiments. The transceiver 2004 may include a receiver and a transmitter, and is applied in the embodiments of the present application to enable the electronic device to communicate with other devices.

The processor 2001 may be a CPU (Central Processing Unit), a general-purpose processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with the disclosure of the present application. The processor 2001 may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.

Bus 2002 may include a path that conveys information between the aforementioned components. The bus 2002 may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus 2002 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in FIG. 7, but this is not intended to represent only one bus or type of bus.

The memory 2003 may be a ROM (Read Only Memory) or another type of static storage device that can store static information and instructions, a RAM (Random Access Memory) or another type of dynamic storage device that can store information and instructions, an EEPROM (Electrically Erasable Programmable Read Only Memory), a CD-ROM (Compact Disc Read Only Memory) or other optical disc storage (including compact disc, laser disc, optical disc, digital versatile disc, Blu-ray disc, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to these.

Optionally, the memory 2003 is used for storing application program code for performing the scheme of the present application, and execution of that code is controlled by the processor 2001. The processor 2001 is used to execute the application program code stored in the memory 2003 to implement the methods provided in any of the embodiments of the present application.

The electronic device provided by the embodiment of the application is applicable to any embodiment of the method, and is not described herein again.

Compared with the prior art, the electronic device provided by the embodiment of the application acquires serialized data from Kafka and determines the schema corresponding to the identification information carried by the serialized data; it then analyzes the serialized data based on the schema to obtain source data, determines whether an ES model corresponding to the source data exists, and, if so, performs format conversion on the source data and writes the converted data into the ES model. With this scheme, serialized data can be obtained from Kafka in real time, the corresponding schema is determined from the identification information, the serialized data is analyzed based on that schema to obtain the source data, and the source data is format-converted and then written into the corresponding ES model according to the mapping rule. This keeps the data format consistent from acquisition through processing and enables real-time analysis and conversion of the service data.

The present application provides a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the computer program implements the method shown in the above method embodiments.

The computer-readable storage medium provided in the embodiments of the present application is applicable to any of the embodiments of the foregoing method, and is not described herein again.

Compared with the prior art, the computer-readable storage medium provided by the embodiment of the application stores a program that, when executed, acquires serialized data from Kafka and determines the schema corresponding to the identification information carried by the serialized data; it then analyzes the serialized data based on the schema to obtain source data, determines whether an ES model corresponding to the source data exists, and, if so, performs format conversion on the source data and writes the converted data into the ES model. With this scheme, serialized data can be obtained from Kafka in real time, the corresponding schema is determined from the identification information, the serialized data is analyzed based on that schema to obtain the source data, and the source data is format-converted and then written into the corresponding ES model according to the mapping rule. This keeps the data format consistent from acquisition through processing and enables real-time analysis and conversion of the service data.

It should be understood that, although the steps in the flowcharts of the figures are shown in the order indicated by the arrows, the steps are not necessarily performed in that order. Unless explicitly stated herein, there is no strict restriction on the order in which the steps are performed, and they may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

The foregoing describes only some embodiments of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the protection scope of the present invention.

Claims

1. A method for transmitting data, comprising:

acquiring binary log binlog data from a database of a monitored system;

analyzing the binlog data to obtain source data;

serializing the source data, based on whether a schema matching the source data exists, to obtain serialized data, wherein the serialized data contains identification information of the schema;

publishing the serialized data to Kafka.
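As a non-limiting illustration of the serializing and publishing steps of claim 1, the sketch below prefixes a payload with the schema's identification information. The 4-byte big-endian identifier, the JSON payload, and the helper names `serialize` and `publish` are assumptions for illustration only; `publish` appends to a list rather than calling a real Kafka producer.

```python
import json
import struct


def serialize(source: dict, schema_id: int, fields: list) -> bytes:
    """Encode values in schema field order, prefixed with the schema id."""
    values = [source[name] for name in fields]
    payload = json.dumps(values, separators=(",", ":")).encode("utf-8")
    return struct.pack(">I", schema_id) + payload


def publish(topic: str, message: bytes, log: list) -> None:
    """Hypothetical stand-in for a Kafka producer send."""
    log.append((topic, message))


sent = []
msg = serialize({"id": 1, "amount": 2.5}, schema_id=7, fields=["id", "amount"])
publish("binlog-orders", msg, sent)
```

Carrying only the identifier, rather than the whole schema, keeps each message small while still letting a consumer look up the schema it needs.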

2. The method of claim 1, wherein serializing the source data based on whether there is a schema that matches the source data comprises:

determining whether a schema matching the source data exists in a local cache;

if so, serializing the source data by using the schema;

3. The method of claim 2, wherein registering the schema comprises:

if not, registering the schema;

if yes, determining whether a data structure corresponding to the registered schema is consistent with the data structure of the source data, and if not, registering the schema; and if so, taking the registered schema as the schema.
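One possible reading of the cache check and registration logic of claims 2 and 3 is sketched below. It assumes, purely for illustration, that a schema "matches" when it belongs to the same table and has the same set of field names; the class `SchemaResolver`, its attributes, and the integer identifiers are hypothetical and do not come from the application.

```python
from itertools import count


class SchemaResolver:
    """Resolve a schema id from a local cache, falling back to a registry."""

    def __init__(self):
        self._ids = count(1)
        self.cache = {}     # table -> (schema id, field names), local cache
        self.registry = {}  # table -> (schema id, field names), registered

    def _register(self, table, structure):
        entry = (next(self._ids), structure)
        self.registry[table] = entry
        return entry

    def resolve(self, table, source):
        structure = tuple(sorted(source))
        cached = self.cache.get(table)
        if cached is not None and cached[1] == structure:
            return cached[0]  # matching schema found in the local cache
        registered = self.registry.get(table)
        if registered is None or registered[1] != structure:
            # Nothing registered, or the registered structure is
            # inconsistent with the source data: register a schema.
            registered = self._register(table, structure)
        self.cache[table] = registered
        return registered[0]


resolver = SchemaResolver()
first = resolver.resolve("orders", {"id": 1, "amount": 2})
again = resolver.resolve("orders", {"id": 5, "amount": 9})
changed = resolver.resolve("orders", {"id": 1, "amount": 2, "currency": "CNY"})
```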

4. A method for writing data, comprising:

obtaining serialized data from Kafka;

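determining a schema corresponding to the identification information carried by the serialized data;

analyzing the serialized data based on the schema to obtain source data;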

and determining whether a data copy ES model corresponding to the source data exists, if so, performing format conversion on the source data and writing the source data into the ES model.

5. The method of claim 4, further comprising:

determining whether a target data field is present in the source data, the target data field not being present in the ES model;

and if so, expanding the target data field of the ES model.
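The field expansion of claim 5 might look like the following sketch, in which an ES model is represented as a plain dictionary from field name to field type. The helpers `infer_type` and `expand_model` and the type inference rules are assumptions for illustration; a real deployment would update the index mapping in the ES cluster instead.

```python
def infer_type(value) -> str:
    """Pick an ES field type for a value (assumed, simplified rules)."""
    if isinstance(value, bool):  # checked first because bool is an int subclass
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    return "keyword"


def expand_model(model: dict, source: dict) -> list:
    """Add every source field missing from the model; return the added names."""
    added = []
    for name, value in source.items():
        if name not in model:  # a target data field: in source, not in model
            model[name] = infer_type(value)
            added.append(name)
    return added


model = {"id": "long"}
added = expand_model(model, {"id": 1, "currency": "CNY", "amount": 2.5})
```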

6. The method according to claim 4 or 5, characterized in that the method further comprises:

7. The method of claim 6, further comprising:

and determining whether an ES model corresponding to the abnormal data exists or not according to a preset retry strategy.
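A preset retry strategy such as the one referred to in claim 7 could, for example, be a bounded number of attempts with a fixed delay. The sketch below assumes that "abnormal data" means source data for which no ES model was found on the first attempt; that reading, the function name `retry_lookup`, and its parameters are all assumptions for illustration.

```python
import time


def retry_lookup(model_exists, record, max_attempts=3, delay_seconds=0.0,
                 sleep=time.sleep):
    """Re-check for an ES model up to max_attempts times.

    Returns the 1-based attempt number on which a model was found,
    or None if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        if model_exists(record):
            return attempt
        if attempt < max_attempts:
            sleep(delay_seconds)  # fixed delay between attempts
    return None


calls = []


def exists_on_second_try(record):
    """Hypothetical check that succeeds from the second call onward."""
    calls.append(record)
    return len(calls) >= 2


found_on = retry_lookup(exists_on_second_try, {"id": 1})
never = retry_lookup(lambda record: False, {"id": 2}, max_attempts=2)
```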

8. An apparatus for transmitting data, comprising:

the serialization module is used for serializing the source data to obtain serialized data based on whether the schema matched with the source data exists or not, wherein the serialized data contains the identification information of the schema;

9. An apparatus for writing data, comprising:

a data acquisition module for acquiring serialized data from Kafka;

10. An electronic device comprising a processor and a memory;

the memory is used for storing operation instructions;

the processor is used for executing the method of any one of claims 1-7 by calling the operation instruction.

11. A computer-readable storage medium, characterized in that the storage medium has stored thereon a computer program which, when being executed by a processor, carries out the method of any one of claims 1-7.