CN109308170B - Data processing method and device - Google Patents

Publication number
CN109308170B
Authority
CN
China
Prior art keywords
data
transmitted
preset
disk
partition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811054274.3A
Other languages
Chinese (zh)
Other versions
CN109308170A (en)
Inventor
林皓
曲金羽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mixin (Beijing) Digital Technology Co.,Ltd.
Original Assignee
Beijing Beixinyuan Information Security Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Beixinyuan Information Security Technology Co ltd
Priority to CN201811054274.3A
Publication of CN109308170A
Application granted
Publication of CN109308170B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/0644 Management of space entities, e.g. partitions, extents, pools
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/065 Replication mechanisms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device
    • G06F3/0674 Disk device
    • G06F3/0676 Magnetic disk device

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the invention provides a data processing method and device. The method comprises: acquiring the logical log type of data to be transmitted; copying the data to be transmitted to a target partition corresponding to the logical log type according to a preset mapping relation, where the preset mapping relation is a correspondence between preset logical log types and preset partitions; and monitoring data information of the data to be transmitted in the target partition and, if the data information is judged to reach a preset condition, writing the data to be transmitted to a disk so that a receiver of the data can read it from the disk. The device performs the above method. The method and device provided by the embodiment of the invention can effectively process massive data of different types so that the data is stored persistently, which suits big data application scenarios.

Description

Data processing method and device
Technical Field
The embodiment of the invention relates to the technical field of data processing, in particular to a data processing method and device.
Background
With the rapid development of big data technology, processing massive data of different types has become very important.
In the prior art, data is stored and forwarded through memory, and both a persistence mechanism and a necessary delay mechanism are lacking. As a result, functions such as delayed batch processing of data and data failure recovery cannot be realized, which greatly limits the application and management of big data.
Therefore, how to avoid the above defects and effectively process massive data of different types, so that the data can be stored persistently to suit big data application scenarios, has become a problem to be solved urgently.
Disclosure of Invention
To solve the problems in the prior art, embodiments of the present invention provide a data processing method and apparatus.
In a first aspect, an embodiment of the present invention provides a data processing method, where the method includes:
acquiring a logical log type of data to be transmitted;
copying the data to be transmitted to a target partition corresponding to the logical log type according to a preset mapping relation; the preset mapping relation is a correspondence between preset logical log types and preset partitions;
and monitoring data information of the data to be transmitted in the target partition, and if the data information is judged to reach a preset condition, writing the data to be transmitted to a disk so that a receiver of the data to be transmitted can read the data to be transmitted from the disk.
In a second aspect, an embodiment of the present invention provides a data processing apparatus, where the apparatus includes:
the acquisition unit is used for acquiring the logic log type of the data to be transmitted;
the copying unit is used for copying the data to be transmitted to a target partition corresponding to the type of the logic log according to a preset mapping relation; the preset mapping relation is a corresponding relation between a preset logic log type and a preset partition;
and the writing unit is used for monitoring the data information of the data to be transmitted in the target partition, and if the data information is judged to reach a preset condition, writing the data to be transmitted into a disk so that a receiver of the data to be transmitted can read the data to be transmitted from the disk.
In a third aspect, an embodiment of the present invention provides an electronic device, including: a processor, a memory, and a bus, wherein,
the processor and the memory are communicated with each other through the bus;
the memory stores program instructions executable by the processor, the processor invoking the program instructions to perform a method comprising:
acquiring a logic log type of data to be transmitted;
copying the data to be transmitted to a target partition corresponding to the logic log type according to a preset mapping relation; the preset mapping relation is a corresponding relation between a preset logic log type and a preset partition;
and monitoring data information of the data to be transmitted in the target partition, and if the data information is judged to reach a preset condition, writing the data to be transmitted into a disk so that a receiver of the data to be transmitted can read the data to be transmitted from the disk.
In a fourth aspect, an embodiment of the present invention provides a non-transitory computer-readable storage medium, including:
the non-transitory computer readable storage medium stores computer instructions that cause the computer to perform a method comprising:
acquiring a logic log type of data to be transmitted;
copying the data to be transmitted to a target partition corresponding to the logic log type according to a preset mapping relation; the preset mapping relation is a corresponding relation between a preset logic log type and a preset partition;
and monitoring data information of the data to be transmitted in the target partition, and if the data information is judged to reach a preset condition, writing the data to be transmitted into a disk so that a receiver of the data to be transmitted can read the data to be transmitted from the disk.
According to the data processing method and device provided by the embodiment of the invention, the data to be transmitted is copied to the target partition corresponding to the logical log type, the data information of the data to be transmitted in the target partition is monitored, and the data is written to the disk once the preset condition is met. Massive data of different types can thus be processed effectively and stored persistently, which suits big data application scenarios.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a flow chart of a data processing method according to an embodiment of the present invention;
FIG. 2 is a flow chart of a data processing method according to another embodiment of the present invention;
FIG. 3 is a diagram illustrating the relationship between topics, partitions, and segmented files according to an embodiment of the present invention;
FIG. 4 is a flow chart illustrating a data processing method according to another embodiment of the present invention;
FIG. 5 is a block diagram of a data processing apparatus according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a schematic flow diagram of a data processing method according to an embodiment of the present invention, and as shown in fig. 1, a data processing method according to an embodiment of the present invention includes the following steps:
S101: acquiring the logical log type of the data to be transmitted.
Specifically, the device obtains the logical log type of the data to be transmitted. Fig. 2 is a schematic flow chart of a data processing method according to another embodiment of the present invention. As shown in fig. 2, the device may be the proxy server in fig. 2, which is not specifically limited. The data to be transmitted may be multi-source data, and the sources may be RDBMS, LOG, MONGODB, Cassandra, and the like in fig. 2, which is not specifically limited. The logical log types can be understood by analogy with the log types of Informix, which include no log, no buffer log, buffer log, and ANSI log, as follows:
no log: no logging is performed, so transactions are not supported in this mode (a database can be switched temporarily to the no log state to avoid a long transaction when a large transaction is processed).
no buffer log: log records are not written to disk on every operation; instead they are written immediately after each transaction completes.
buffer log: log records are typically written to disk only after the buffer is full.
ANSI log: handled similarly to the no buffer log.
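The four log types above differ mainly in when log records reach the disk. The following is a minimal sketch of that distinction, not Informix's actual implementation; the names `LogMode` and `should_flush` are hypothetical and introduced only for illustration.

```python
from enum import Enum


class LogMode(Enum):
    """Log types named in the text (member names are illustrative)."""
    NO_LOG = "no log"
    NO_BUFFER_LOG = "no buffer log"
    BUFFER_LOG = "buffer log"
    ANSI_LOG = "ANSI log"  # assumed to be the fourth type listed above


def should_flush(mode: LogMode, transaction_done: bool, buffer_full: bool) -> bool:
    """Return True when log records should be written to disk."""
    if mode is LogMode.NO_LOG:
        return False  # nothing is logged, so nothing is ever flushed
    if mode in (LogMode.NO_BUFFER_LOG, LogMode.ANSI_LOG):
        return transaction_done  # flush right after each transaction completes
    return buffer_full  # BUFFER_LOG: flush only once the buffer fills up
```

Under this reading, a completed transaction triggers a flush in the no buffer log and ANSI log modes, while the buffer log mode waits for the buffer to fill.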
The contents of fig. 2 are explained as follows:
Connect provides a service centered on data pipelines: a tool for scalable, reliable streaming of data between different systems. In the Connect model, a Source imports data into the data pipeline and a Sink exports data from it, which implements data reading and writing and simplifies much complex management work. Connect focuses on copying data, not on transforming it.
As shown in fig. 2, the Sources on the left side are responsible for reading data from other heterogeneous systems and importing the data into the proxy server; the Sinks on the right side write the data in the proxy server to other systems.
The core part of Connect is the Connector, which can be viewed as a logical job for copying data between the proxy server and other systems, for example an upstream system copying data to the proxy server, or data being copied from the proxy server to a downstream system.
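The Source and Sink roles can be sketched as two small copy loops around the proxy server. This is an illustrative sketch only: `ProxyServer`, `run_source`, and `run_sink` are hypothetical names, and the in-memory list stands in for the proxy server's storage.

```python
from typing import Callable, Iterable, List

Record = dict  # hypothetical record representation


class ProxyServer:
    """In-memory stand-in for the proxy server's storage."""

    def __init__(self) -> None:
        self.records: List[Record] = []

    def append(self, record: Record) -> None:
        self.records.append(record)


def run_source(read: Callable[[], Iterable[Record]], proxy: ProxyServer) -> int:
    """Source side: read from an upstream system and import into the proxy."""
    count = 0
    for record in read():
        proxy.append(record)  # copied as-is; no transformation is applied
        count += 1
    return count


def run_sink(proxy: ProxyServer, write: Callable[[Record], None]) -> int:
    """Sink side: export every record held by the proxy to a downstream system."""
    for record in proxy.records:
        write(record)
    return len(proxy.records)
```

Both loops only move records, which mirrors the statement that Connect focuses on copying data rather than transforming it.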
S102: copying the data to be transmitted to a target partition corresponding to the logical log type according to a preset mapping relation; the preset mapping relation is a correspondence between preset logical log types and preset partitions.
Specifically, the device copies the data to be transmitted to the target partition corresponding to the logical log type according to the preset mapping relation, which is a correspondence between preset logical log types and preset partitions. In Connect, the lightweight storage structure based on the Partition storage unit mechanism has a simple storage layout. Users can define Topics themselves according to the actual business requirements of the data. Each Partition of a Topic corresponds to one logical log, a logical log may consist of a group of segment files, and the segment files may all be the same size. Fig. 3 is a schematic diagram of the relationship among topics, partitions, and segment files according to an embodiment of the present invention. As shown in fig. 3, the segment files are numbered 0 to 12 and arranged in order of writing time; that is, the segment file numbered 0 was written earlier than the segment file numbered 1, and so on. Further, the data to be transmitted may be copied to the last target segment file in the target partition. Fig. 3 shows a scenario in which data is written to one Topic with 3 partitions; if the logical log type of the data to be transmitted corresponds to Partition 0 in fig. 3, the data to be transmitted is copied to the target partition, Partition 0.
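Step S102 can be sketched as a dictionary lookup followed by an append to the newest segment file. The sketch below is an assumption-laden illustration: the mapping values, the `SEGMENT_CAPACITY` limit, and the rule of starting a new segment when the last one is full are hypothetical and are not specified by the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List

SEGMENT_CAPACITY = 3  # hypothetical: number of records per segment file


@dataclass
class Partition:
    # Segments are kept in writing-time order; the last one is the newest.
    segments: List[List[bytes]] = field(default_factory=lambda: [[]])

    def append(self, record: bytes) -> int:
        """Append to the last segment, starting a new segment when it is full."""
        if len(self.segments[-1]) >= SEGMENT_CAPACITY:
            self.segments.append([])
        self.segments[-1].append(record)
        return len(self.segments) - 1  # index of the segment written to


# Hypothetical preset mapping between logical log types and partitions.
LOG_TYPE_TO_PARTITION: Dict[str, int] = {
    "no log": 0,
    "buffer log": 1,
    "no buffer log": 2,
}


def copy_to_target_partition(
    partitions: Dict[int, Partition], log_type: str, record: bytes
) -> int:
    """Look up the target partition for the log type and copy the record to it."""
    target = LOG_TYPE_TO_PARTITION[log_type]
    partitions[target].append(record)
    return target
```

Because writes always go to the last segment, earlier segments are never modified, which keeps the writing-time order described for fig. 3.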
S103: monitoring data information of the data to be transmitted in the target partition, and if the data information is judged to reach a preset condition, writing the data to be transmitted to a disk so that a receiver of the data to be transmitted can read the data to be transmitted from the disk.
Specifically, the device monitors the data information of the data to be transmitted in the target partition and, if the data information is judged to reach the preset condition, writes the data to be transmitted to the disk so that the receiver can read it from the disk. The receiver of the data to be transmitted may be a system to the right of the proxy server in fig. 2. The data information may include a data volume or a data transmission duration. Accordingly, the data to be transmitted is written to the disk if the data volume is judged to reach a preset data volume or, alternatively, if the data transmission duration is judged to reach a preset duration. The preset data volume and the preset duration can be set independently according to actual conditions.
The data copied to the proxy server in Connect is persisted to the disk. By exploiting sequential disk reads and writes and the proxy server's zero-copy technology, high data throughput is ensured, and copying can be backtracked when a data copy goes wrong. It should be noted that data can be copied to other systems by a Connect Sink only after it has been stored on the disk. Unlike existing data pipelines, this data pipeline caches and persists data through the proxy server. This ensures that the pipeline does not come under excessive pressure when data is blocked upstream or downstream of it, and it also meets the requirement of processing data in batches at a fixed time every day, that is, it provides delayed batch processing capability.
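The two preset conditions of step S103 can be sketched as a small buffered writer that flushes when either threshold is reached. This is a sketch under stated assumptions: the class name `BufferedPartitionWriter`, its parameters, and the reading of "data transmission duration" as the time since the first buffered record are all hypothetical.

```python
import os
import tempfile
import time


class BufferedPartitionWriter:
    """Buffers records in memory and writes them to disk on a preset condition."""

    def __init__(self, path, max_bytes, max_seconds, clock=time.monotonic):
        self.path = path
        self.max_bytes = max_bytes      # preset data volume
        self.max_seconds = max_seconds  # preset duration
        self.clock = clock              # injectable so the timing can be tested
        self.buffer = []
        self.buffered_bytes = 0
        self.first_write_at = None

    def add(self, record: bytes) -> bool:
        """Buffer a record; return True if this call flushed the buffer to disk."""
        if self.first_write_at is None:
            self.first_write_at = self.clock()
        self.buffer.append(record)
        self.buffered_bytes += len(record)
        waited = self.clock() - self.first_write_at
        if self.buffered_bytes >= self.max_bytes or waited >= self.max_seconds:
            self.flush()
            return True
        return False

    def flush(self) -> None:
        # Append sequentially and force the bytes to disk before clearing.
        with open(self.path, "ab") as f:
            f.write(b"".join(self.buffer))
            f.flush()
            os.fsync(f.fileno())
        self.buffer.clear()
        self.buffered_bytes = 0
        self.first_write_at = None


# Usage: flush after 8 buffered bytes or one hour, whichever comes first.
demo_path = os.path.join(tempfile.mkdtemp(), "partition0.log")
writer = BufferedPartitionWriter(demo_path, max_bytes=8, max_seconds=3600)
first = writer.add(b"abcd")   # below both thresholds: stays in memory
second = writer.add(b"efgh")  # data volume threshold reached: written to disk
```

In this sketch the duration is only checked when a record arrives; a real monitor would presumably also check it on a timer so that an idle partition is still flushed.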
The method may further comprise: monitoring the data information of the data to be transmitted in the last target segment file, and then repeating the step described above. That is, if the data information is judged to reach the preset condition (the data volume reaches the preset data volume or, alternatively, the data transmission duration reaches the preset duration), the data to be transmitted is written to the disk so that the receiver can read it from the disk and the data is stored durably on the disk.
The method may further comprise:
acquiring a target topic type corresponding to the target partition according to a preset relation; the preset relation is a correspondence between preset partitions and preset topic types. Fig. 4 is a schematic flowchart of a data processing method according to another embodiment of the present invention. As shown in fig. 4, the preset Partition corresponding to Topic1 is Partition3, the preset Partition corresponding to Topic2 is Partition5, and the preset Partition corresponding to Topic3 is Partition7. If Partition3 in fig. 4 is the target partition, the target topic type corresponding to Partition3 is Topic1.
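The preset relation between partitions and topic types amounts to a lookup table. A minimal sketch using the values from fig. 4 follows; the names `PARTITION_TO_TOPIC` and `topic_for` are hypothetical.

```python
from typing import Dict

# Preset relation taken from the fig. 4 example in the text.
PARTITION_TO_TOPIC: Dict[str, str] = {
    "Partition3": "Topic1",
    "Partition5": "Topic2",
    "Partition7": "Topic3",
}


def topic_for(partition: str) -> str:
    """Return the preset topic type for a target partition."""
    try:
        return PARTITION_TO_TOPIC[partition]
    except KeyError:
        # A partition without a preset topic is a configuration error here.
        raise KeyError(f"no preset topic type for {partition}") from None
```

Raising on an unknown partition is a design choice of this sketch; the text does not say how a partition without a preset topic type should be handled.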
The method further comprises the following steps:
acquiring a plurality of task processing threads, wherein each task processing thread can copy data to be transmitted to a plurality of target partitions; and executing the task processing threads in parallel. For example, task processing thread 1 (corresponding to Task1 in fig. 4) may copy data 1 to be transmitted to Partition1 and data 2 to be transmitted to Partition2; task processing thread 2 (corresponding to Task2 in fig. 4) may copy data 3 to be transmitted to Partition3 and data 4 to be transmitted to Partition4; and task processing thread 1 and task processing thread 2 may be executed in parallel.
Another aspect of the Connect data model is that a Connector has many tasks. Each partition can run in its own task, or multiple partitions can be placed in one task and executed by one thread, so that the logical and physical partitioning of resources match more closely. As shown in fig. 4, when a data stream arrives from upstream of the proxy server, the corresponding proxy server Connector is defined, and the data stream is then decomposed into a set of tasks and pushed to different Topics of the proxy server.
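The Task1 and Task2 example above can be sketched with a thread pool in which each task copies its data to several partitions. This is an illustrative sketch: `run_task`, the in-memory `partitions` dictionary, and the per-partition locks are assumptions, not part of the described system.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock
from typing import Dict, List, Tuple

# In-memory stand-ins for Partition1 to Partition4.
partitions: Dict[str, List[str]] = {f"Partition{i}": [] for i in range(1, 5)}
locks: Dict[str, Lock] = {name: Lock() for name in partitions}


def run_task(assignments: List[Tuple[str, str]]) -> int:
    """One task processing thread: copy each data item to its target partition."""
    for data, partition in assignments:
        with locks[partition]:  # guard against two tasks sharing a partition
            partitions[partition].append(data)
    return len(assignments)


task1 = [("data1", "Partition1"), ("data2", "Partition2")]
task2 = [("data3", "Partition3"), ("data4", "Partition4")]

# Execute both task processing threads in parallel.
with ThreadPoolExecutor(max_workers=2) as pool:
    copied = list(pool.map(run_task, [task1, task2]))
```

Here each task owns disjoint partitions, so the locks are never contended; they are kept only so the sketch stays safe if the assignment changes.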
In a first approach, the proxy server performs scheduling and management based on the traditional Zookeeper: it detects the dynamic addition or removal of data Sources and Sinks and tracks the consumed offset in each Partition.
In a second approach, information about the Partitions related to each Source and Sink, including the offset in each Partition, copy records, and other information, is recorded in the proxy server of Connect. This avoids frequent access to Zookeeper, which could overload Zookeeper or bring it down and thereby affect users.
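The second approach, keeping offsets inside the proxy server, can be sketched as a small keyed store. The class `OffsetStore` and its methods are hypothetical; a real proxy server would presumably persist this state rather than hold it in a dictionary.

```python
from typing import Dict, Tuple


class OffsetStore:
    """Consumed offsets recorded in the proxy, keyed by (sink, partition)."""

    def __init__(self) -> None:
        self._offsets: Dict[Tuple[str, int], int] = {}

    def commit(self, sink: str, partition: int, offset: int) -> None:
        """Record the last offset the sink has copied from the partition.

        Committing a smaller offset is allowed, which lets a sink backtrack
        and copy again after a failed copy.
        """
        self._offsets[(sink, partition)] = offset

    def next_offset(self, sink: str, partition: int) -> int:
        """Return the offset the sink should read next (0 if none committed)."""
        return self._offsets.get((sink, partition), -1) + 1
```

Since every lookup is local to the proxy server, no external coordination service is contacted on the read path, which is the point of the second approach.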
According to the data processing method provided by the embodiment of the invention, the data to be transmitted is copied to the target partition corresponding to the logical log type, the data information of the data to be transmitted in the target partition is monitored, and the data is written to the disk once the preset condition is met. Massive data of different types can thus be processed effectively and stored persistently, which suits big data application scenarios.
On the basis of the above embodiment, the data information includes a data volume or a data transmission duration; correspondingly, if it is determined that the data information reaches the preset condition, writing the data to be transmitted into a disk, including:
and if the data volume reaches the preset data volume, writing the data to be transmitted into a disk.
Specifically, if the device judges that the data volume reaches the preset data volume, the device writes the data to be transmitted into a disk. Reference may be made to the above embodiments, which are not described in detail.
Or, alternatively,
and if the data transmission duration reaches the preset duration, writing the data to be transmitted into a disk.
Specifically, if the device judges that the data transmission duration reaches the preset duration, the device writes the data to be transmitted into a disk. Reference may be made to the above embodiments, which are not described in detail.
According to the data processing method provided by the embodiment of the invention, the data to be transmitted is written into the disk by taking the data volume or the data transmission duration as the preset condition, so that the mass data of different types can be further effectively processed, and the data can be stored persistently to adapt to the application scene of the big data.
On the basis of the embodiment, each preset partition is also divided into a plurality of preset segmented files which are arranged according to the writing time sequence; correspondingly, the copying the data to be transmitted to the target partition corresponding to the type of the logical log includes:
and copying the data to be transmitted to the last target segment file in the target partition.
Specifically, the device copies the data to be transmitted to the last target segment file in the target partition. Reference may be made to the above embodiments, which are not described in detail.
According to the data processing method provided by the embodiment of the invention, the data to be transmitted is copied to the last target segmented file in the target partition, so that the storage resource is effectively utilized.
On the basis of the above embodiment, the method further includes:
and monitoring the data information of the data to be transmitted in the last target segmented file.
Specifically, the device monitors the data information of the data to be transmitted in the last target segment file. Reference may be made to the above embodiments, which are not described in detail.
According to the data processing method provided by the embodiment of the invention, mass data of different types can be further effectively processed by monitoring the data information of the data to be transmitted in the last target segmented file, so that the data can be stored persistently to adapt to the application scene of big data.
On the basis of the above embodiment, the method further includes:
acquiring a target topic type corresponding to the target partition according to a preset relation; the preset relation is a correspondence between preset partitions and preset topic types.
Specifically, the device acquires the target topic type corresponding to the target partition according to the preset relation, which is a correspondence between preset partitions and preset topic types. Reference may be made to the above embodiments, which are not described in detail again.
According to the data processing method provided by the embodiment of the invention, acquiring the target topic type makes it convenient to classify the target partitions, which facilitates further management.
On the basis of the above embodiment, the method further includes:
and acquiring a plurality of task processing threads, wherein each task processing thread can copy the data to be transmitted to a plurality of target partitions respectively and simultaneously.
Specifically, the device acquires a plurality of task processing threads, and each task processing thread can copy the data to be transmitted to a plurality of target partitions respectively and simultaneously. Reference may be made to the above embodiments, which are not described in detail.
And executing the task processing threads in parallel.
Specifically, the device executes the plurality of task processing threads in parallel. Reference may be made to the above embodiments, which are not described in detail.
According to the data processing method provided by the embodiment of the invention, each task processing thread copies data respectively and simultaneously, and a plurality of task processing threads are executed in parallel, so that the data processing speed can be increased.
On the basis of the above embodiment, the data to be transmitted is multi-source data.
Specifically, the data to be transmitted in the device is multi-source data. Reference may be made to the above embodiments, which are not described in detail.
According to the data processing method provided by the embodiment of the invention, by treating the data to be transmitted as multi-source data, massive data of different types can be further processed effectively, so that the data is stored persistently to suit big data application scenarios.
Fig. 5 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention, and as shown in fig. 5, an embodiment of the present invention provides a data processing apparatus, including an obtaining unit 501, a copying unit 502, and a writing unit 503, where:
the obtaining unit 501 is configured to obtain a logic log type of data to be transmitted; the copying unit 502 is configured to copy the data to be transmitted to a target partition corresponding to the type of the logical log according to a preset mapping relationship; the preset mapping relation is a corresponding relation between a preset logic log type and a preset partition; the writing unit 503 is configured to monitor data information of the data to be transmitted in the target partition, and if it is determined that the data information reaches a preset condition, write the data to be transmitted into a disk, so that a receiver of the data to be transmitted reads the data to be transmitted from the disk.
According to the data processing device provided by the embodiment of the invention, the data to be transmitted is copied to the target partition corresponding to the logical log type, the data information of the data to be transmitted in the target partition is monitored, and the data is written to the disk once the preset condition is met. Massive data of different types can thus be processed effectively and stored persistently, which suits big data application scenarios.
The data processing apparatus provided in the embodiment of the present invention may be specifically configured to execute the processing flows of the above method embodiments, and the functions of the data processing apparatus are not described herein again, and refer to the detailed description of the above method embodiments.
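The three units of fig. 5 can be sketched as three small classes composed into one apparatus. Everything below is a hypothetical illustration: the `log_type` record field, the record-count threshold, and the list used as a stand-in for the disk are assumptions made so the sketch stays self-contained.

```python
from typing import Dict, List


class ObtainingUnit:
    def obtain(self, record: dict) -> str:
        """Obtain the logical log type (assumed to be carried in the record)."""
        return record["log_type"]


class CopyingUnit:
    def __init__(self, mapping: Dict[str, int]) -> None:
        self.mapping = mapping  # preset log type to partition mapping
        self.partitions: Dict[int, List[dict]] = {p: [] for p in mapping.values()}

    def copy(self, log_type: str, record: dict) -> int:
        """Copy the record to the target partition and return its index."""
        target = self.mapping[log_type]
        self.partitions[target].append(record)
        return target


class WritingUnit:
    def __init__(self, preset_count: int) -> None:
        self.preset_count = preset_count  # preset data volume, in records
        self.disk: List[dict] = []        # stand-in for the disk

    def monitor_and_write(self, partition: List[dict]) -> bool:
        """Write the partition's records out once the preset volume is reached."""
        if len(partition) < self.preset_count:
            return False
        self.disk.extend(partition)
        partition.clear()
        return True


class DataProcessingApparatus:
    def __init__(self, mapping: Dict[str, int], preset_count: int) -> None:
        self.obtaining = ObtainingUnit()
        self.copying = CopyingUnit(mapping)
        self.writing = WritingUnit(preset_count)

    def process(self, record: dict) -> bool:
        """Run obtain, copy, and monitor/write; return True if a write occurred."""
        log_type = self.obtaining.obtain(record)
        target = self.copying.copy(log_type, record)
        return self.writing.monitor_and_write(self.copying.partitions[target])
```

Keeping the units separate mirrors the structure of the apparatus, so each unit could be replaced (for example, a writing unit backed by real files) without touching the others.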
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. As shown in fig. 6, the electronic device includes: a processor 601, a memory 602, and a bus 603;
the processor 601 and the memory 602 complete mutual communication through a bus 603;
the processor 601 is configured to call program instructions in the memory 602 to perform the methods provided by the above-mentioned method embodiments, for example, including: acquiring a logic log type of data to be transmitted; copying the data to be transmitted to a target partition corresponding to the logic log type according to a preset mapping relation; the preset mapping relation is a corresponding relation between a preset logic log type and a preset partition; and monitoring data information of the data to be transmitted in the target partition, and if the data information is judged to reach a preset condition, writing the data to be transmitted into a disk so that a receiver of the data to be transmitted can read the data to be transmitted from the disk.
The present embodiment discloses a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the method provided by the above-mentioned method embodiments, for example, comprising: acquiring a logic log type of data to be transmitted; copying the data to be transmitted to a target partition corresponding to the logic log type according to a preset mapping relation; the preset mapping relation is a corresponding relation between a preset logic log type and a preset partition; and monitoring data information of the data to be transmitted in the target partition, and if the data information is judged to reach a preset condition, writing the data to be transmitted into a disk so that a receiver of the data to be transmitted can read the data to be transmitted from the disk.
The present embodiment provides a non-transitory computer-readable storage medium storing computer instructions that cause a computer to perform the methods provided by the above method embodiments, for example, including: acquiring a logic log type of data to be transmitted; copying the data to be transmitted to a target partition corresponding to the logic log type according to a preset mapping relation; the preset mapping relation is a corresponding relation between a preset logic log type and a preset partition; and monitoring data information of the data to be transmitted in the target partition, and if the data information is judged to reach a preset condition, writing the data to be transmitted into a disk so that a receiver of the data to be transmitted can read the data to be transmitted from the disk.
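As an illustrative, non-limiting sketch, the steps recited above (acquire the logic log type, copy the data to the mapped partition, monitor the partition, write to disk when the preset condition is reached) might be arranged as follows. The names `PRESET_MAPPING`, `FLUSH_THRESHOLD_BYTES`, and `PartitionRouter`, the record layout, and the one-file-per-partition layout are all assumptions made for the example and are not taken from the embodiments.

```python
import os
from collections import defaultdict

# Hypothetical preset mapping relation: logic log type -> preset partition.
PRESET_MAPPING = {"login": "partition-0", "audit": "partition-1"}
# Hypothetical preset condition: flush once this many bytes are pending.
FLUSH_THRESHOLD_BYTES = 16


class PartitionRouter:
    def __init__(self, disk_dir):
        self.disk_dir = disk_dir
        self.buffers = defaultdict(list)  # partition -> pending payloads

    def _path(self, partition):
        return os.path.join(self.disk_dir, partition + ".log")

    def submit(self, record):
        """Copy one record to its target partition; return the partition name."""
        log_type = record["type"]             # acquire the logic log type
        partition = PRESET_MAPPING[log_type]  # look up the target partition
        self.buffers[partition].append(record["payload"])
        # Monitor the pending data and flush when the preset condition is met.
        pending = sum(len(p) for p in self.buffers[partition])
        if pending >= FLUSH_THRESHOLD_BYTES:
            self.flush(partition)
        return partition

    def flush(self, partition):
        """Write the pending payloads to disk, one per line, append-only."""
        with open(self._path(partition), "ab") as f:
            for payload in self.buffers[partition]:
                f.write(payload + b"\n")
        self.buffers[partition].clear()

    def read(self, partition):
        """Receiver side: read the flushed payloads back from disk."""
        if not os.path.exists(self._path(partition)):
            return []
        with open(self._path(partition), "rb") as f:
            return f.read().splitlines()
```

In this sketch the receiver sees nothing until the threshold is crossed, which mirrors the idea that data becomes readable only after it has been written to the disk.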
Those of ordinary skill in the art will understand that all or part of the steps of the method embodiments may be implemented by hardware controlled by program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the method embodiments. The aforementioned storage medium includes various media that can store program code, such as ROM, RAM, magnetic disks, or optical disks.
The above-described embodiments of the electronic device and the like are merely illustrative. The units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units; they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the embodiments of the present invention and do not limit them. Although the embodiments of the present invention have been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and that such modifications or substitutions do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (9)

1. A data processing method, comprising:
acquiring a logic log type of data to be transmitted;
copying the data to be transmitted to a target partition corresponding to the logic log type according to a preset mapping relation; the preset mapping relation is a corresponding relation between a preset logic log type and a preset partition;
monitoring data information of the data to be transmitted in the target partition, and writing the data to be transmitted into a disk by utilizing a disk sequential read-write technology and a zero copy technology of a proxy server if the data information is judged to reach a preset condition so that a receiver of the data to be transmitted can read the data to be transmitted from the disk;
acquiring a plurality of task processing threads, wherein each task processing thread can copy the data to be transmitted to a plurality of target partitions respectively and simultaneously;
and executing the task processing threads in parallel.
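By way of a hedged illustration only, the parallel execution of task processing threads recited in claim 1 could look like the sketch below. It uses Python's standard `concurrent.futures.ThreadPoolExecutor`; the `Partitions` class, the per-partition locks, and the integer partition indices in `PRESET_MAPPING` are assumptions introduced for the example, since the claim does not specify how concurrent copies are coordinated.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical preset mapping relation: logic log type -> partition index.
PRESET_MAPPING = {"login": 0, "audit": 1, "error": 2}


class Partitions:
    def __init__(self, count):
        self._data = [[] for _ in range(count)]
        # One lock per partition, so threads writing to different
        # partitions do not block each other.
        self._locks = [threading.Lock() for _ in range(count)]

    def copy_in(self, log_type, payload):
        index = PRESET_MAPPING[log_type]
        with self._locks[index]:
            self._data[index].append(payload)

    def snapshot(self, index):
        with self._locks[index]:
            return list(self._data[index])


def run_tasks(partitions, batches, workers=4):
    """Run one task per batch in parallel; return the number of records copied."""
    def task(batch):
        # A single task may copy data into several target partitions.
        for log_type, payload in batch:
            partitions.copy_in(log_type, payload)
        return len(batch)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(task, batches))
```

Per-partition locking is one possible design choice; it keeps each partition's contents consistent while still letting tasks that target different partitions proceed at the same time.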
2. The method of claim 1, wherein the data information comprises data amount or data transmission duration; correspondingly, if it is determined that the data information reaches the preset condition, writing the data to be transmitted into a disk, including:
if the data volume reaches the preset data volume, writing the data to be transmitted into a disk;
or
and if the data transmission duration reaches the preset duration, writing the data to be transmitted into a disk.
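The two alternative preset conditions of claim 2 can be sketched as a small monitor object. This is an assumption-laden example: the class name `FlushMonitor`, its parameters, and the reading of "data transmission duration" as the time elapsed since the first unflushed record arrived are illustrative choices, not definitions taken from the claims.

```python
import time


class FlushMonitor:
    def __init__(self, max_bytes, max_seconds, clock=time.monotonic):
        self.max_bytes = max_bytes      # hypothetical preset data volume
        self.max_seconds = max_seconds  # hypothetical preset duration
        self._clock = clock             # injectable so the logic can be tested
        self._bytes = 0
        self._started = None            # arrival time of the first pending record

    def record(self, size):
        """Account for `size` bytes copied into the target partition."""
        if self._started is None:
            self._started = self._clock()
        self._bytes += size

    def should_flush(self):
        """True if either the volume or the duration condition is reached."""
        if self._started is None:
            return False  # nothing is pending
        if self._bytes >= self.max_bytes:
            return True
        return self._clock() - self._started >= self.max_seconds

    def reset(self):
        """Call after the pending data has been written to disk."""
        self._bytes = 0
        self._started = None
```

The duration condition bounds how long a slow trickle of data can wait in the partition, while the volume condition bounds how much data can accumulate during a burst.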
3. The method according to claim 2, wherein each preset partition is further divided into a plurality of preset segment files arranged according to a writing time sequence; correspondingly, the copying the data to be transmitted to the target partition corresponding to the type of the logical log includes:
and copying the data to be transmitted to the last target segment file in the target partition.
4. The method of claim 3, further comprising:
and monitoring the data information of the data to be transmitted in the last target segmented file.
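The segment-file arrangement of claims 3 and 4 might be sketched as follows. The zero-padded base-offset file names are borrowed from the naming convention used by Apache Kafka and are an assumption here; the claims only require that segment files be arranged in write-time order. `SegmentedPartition` and `max_segment_bytes` are likewise hypothetical names.

```python
import os


class SegmentedPartition:
    """A partition stored as segment files ordered by write time."""

    def __init__(self, directory, max_segment_bytes):
        self.directory = directory
        self.max_segment_bytes = max_segment_bytes  # hypothetical roll-over limit
        self._segments = []    # segment file names, oldest first
        self._next_offset = 0  # number of records appended so far
        os.makedirs(directory, exist_ok=True)
        self._roll()

    def _path(self, name):
        return os.path.join(self.directory, name)

    def _roll(self):
        # Zero-padded base offset, so lexical order equals write order.
        name = "%020d.log" % self._next_offset
        open(self._path(name), "ab").close()
        self._segments.append(name)

    def segments(self):
        return list(self._segments)

    def last_segment_size(self):
        # The "data information" monitored on the last segment file.
        return os.path.getsize(self._path(self._segments[-1]))

    def append(self, payload):
        """Copy payload into the last segment; return that segment's name."""
        size = self.last_segment_size()
        if size > 0 and size + len(payload) > self.max_segment_bytes:
            self._roll()
        with open(self._path(self._segments[-1]), "ab") as f:
            f.write(payload)
        self._next_offset += 1
        return self._segments[-1]
```

Because new data only ever lands in the last segment file, monitoring can be limited to that one file, and older segments are never modified.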
5. The method of any of claims 1 to 4, further comprising:
acquiring a target theme type corresponding to the target partition according to a preset relation; and the preset relation is the corresponding relation between the preset partition and the preset theme type.
6. The method according to any one of claims 1 to 4, wherein the data to be transmitted is multi-source data.
7. A data processing apparatus, comprising:
the acquisition unit is used for acquiring the logic log type of the data to be transmitted;
the copying unit is used for copying the data to be transmitted to a target partition corresponding to the type of the logic log according to a preset mapping relation; the preset mapping relation is a corresponding relation between a preset logic log type and a preset partition;
the write-in unit is used for monitoring data information of the data to be transmitted in the target partition, and if the data information is judged to reach a preset condition, the data to be transmitted is written into a disk by utilizing a disk sequential read-write technology and a zero copy technology of a proxy server so that a receiver of the data to be transmitted can read the data to be transmitted from the disk;
the thread acquisition unit is used for acquiring a plurality of task processing threads, and each task processing thread can copy the data to be transmitted to a plurality of target partitions at the same time;
and the execution unit is used for executing the plurality of task processing threads in parallel.
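The write-in unit refers to a disk sequential read-write technology and a zero copy technology of a proxy server without detailing either. As a rough sketch under stated assumptions, sequential writing can be approximated by append-only file writes, and zero copy by Python's standard `socket.sendfile`, which uses the operating system's `sendfile` call where available and falls back to ordinary sends otherwise. Treating zero copy as the disk-to-socket path toward the receiver is an interpretation, and the function names below are hypothetical.

```python
import os
import socket


def append_sequentially(path, records):
    """Append records to the end of the file and force them to disk."""
    with open(path, "ab") as f:  # append-only, so writes land sequentially
        for record in records:
            f.write(record)
        f.flush()
        os.fsync(f.fileno())


def send_to_receiver(path, sock):
    """Send the whole file over a connected stream socket.

    Returns the number of bytes sent. Where os.sendfile is available the
    data is handed from the file to the socket by the kernel, without
    being copied through a user-space buffer.
    """
    with open(path, "rb") as f:
        return sock.sendfile(f)
```

A receiver connected to the other end of the socket then reads the bytes exactly as they were laid out on disk.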
8. An electronic device, comprising: a processor, a memory, and a bus, wherein,
the processor and the memory are communicated with each other through the bus;
the memory stores program instructions executable by the processor, the processor invoking the program instructions to perform the method of any of claims 1 to 6.
9. A non-transitory computer-readable storage medium storing computer instructions that cause a computer to perform the method of any one of claims 1 to 6.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811054274.3A CN109308170B (en) 2018-09-11 2018-09-11 Data processing method and device

Publications (2)

Publication Number Publication Date
CN109308170A CN109308170A (en) 2019-02-05
CN109308170B true CN109308170B (en) 2021-11-30

Family

ID=65224675

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811054274.3A Active CN109308170B (en) 2018-09-11 2018-09-11 Data processing method and device

Country Status (1)

Country Link
CN (1) CN109308170B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110851077A (en) * 2019-10-25 2020-02-28 中盈优创资讯科技有限公司 Logstash data processing device and method
CN111290720B (en) * 2020-03-13 2023-09-05 惠州市蓝微电子有限公司 Data printing method and device
CN112100414B (en) * 2020-09-11 2024-02-23 深圳力维智联技术有限公司 Data processing method, device, system and computer readable storage medium
CN112732175B (en) * 2020-12-25 2024-05-14 华录光存储研究院(大连)有限公司 Data transmission method and system
CN112799820B (en) * 2021-02-05 2024-06-11 拉卡拉支付股份有限公司 Data processing method, device, electronic equipment, storage medium and program product
CN113515353B (en) * 2021-06-04 2024-05-14 深圳奥哲网络科技有限公司 Long transaction processing method, system, electronic equipment and storage medium
CN114125081B (en) * 2021-10-27 2023-09-22 桂林长海发展有限责任公司 Method and device for processing received data and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101034364A (en) * 2007-04-02 2007-09-12 华为技术有限公司 Method, device and system for implementing RAM date backup
CN102790686A (en) * 2011-05-17 2012-11-21 浙江核新同花顺网络信息股份有限公司 Log data collecting method and system and log collecting server
CN106055630A (en) * 2016-05-27 2016-10-26 北京小米移动软件有限公司 Log storage method and device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110154023A1 (en) * 2009-12-21 2011-06-23 Smith Ned M Protected device management
CN102236672B (en) * 2010-05-06 2016-08-24 深圳市腾讯计算机系统有限公司 A kind of data lead-in method and device
US8782101B1 (en) * 2012-01-20 2014-07-15 Google Inc. Transferring data across different database platforms
CN105740413A (en) * 2016-01-29 2016-07-06 珠海全志科技股份有限公司 File movement method by FUSE on Linux platform
CN106354434B (en) * 2016-08-31 2019-07-23 中国人民大学 The storage method and system of daily record data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Analysis of Kafka's Log Storage; jewes; 《https://blog.csdn.net/jewes/article/details/42970799》; 20150121; pp. 1-2 *

Also Published As

Publication number Publication date
CN109308170A (en) 2019-02-05

Similar Documents

Publication Publication Date Title
CN109308170B (en) Data processing method and device
Narkhede et al. Kafka: the definitive guide: real-time data and stream processing at scale
EP3382551A1 (en) Distributed hardware tracing
US9424160B2 (en) Detection of data flow bottlenecks and disruptions based on operator timing profiles in a parallel processing environment
US8904225B2 (en) Stream data processing failure recovery method and device
US9037905B2 (en) Data processing failure recovery method, system and program
US9864772B2 (en) Log-shipping data replication with early log record fetching
WO2023040399A1 (en) Service persistence method and apparatus
CN106649000B (en) Fault recovery method of real-time processing engine and corresponding server
US20170277469A1 (en) Small storage volume management
US8972422B2 (en) Management of log data in a networked system
US10997057B2 (en) Debugging asynchronous functions
CN110019045B (en) Log floor method and device
CN116701352A (en) Database data migration method and system
CN110727700A (en) Method and system for integrating multi-source streaming data into transaction type streaming data
CN111078418A (en) Operation synchronization method and device, electronic equipment and computer readable storage medium
US20180309702A1 (en) Method and device for processing data after restart of node
CN111541747B (en) Data check point setting method and device
CN113934566A (en) Exception handling method and device and electronic equipment
CN109241027B (en) Data migration method, device, electronic equipment and computer readable storage medium
CN114116790A (en) Data processing method and device
US8838414B2 (en) Determining when to create a prediction based on deltas of metric values
CN115629918B (en) Data processing method, device, electronic equipment and storage medium
CN109561120A (en) Small documents backup method, systems and management server
CN117422556B (en) Derivative transaction system, device and computer medium based on replication state machine

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: 100195 Room 301, Floor 3, Building 103, No. 3 Minzhuang Road, Haidian District, Beijing

Patentee after: Mixin (Beijing) Digital Technology Co.,Ltd.

Address before: 100093 Room 301, Floor 3, Building 103, No. 3 Minzhuang Road, Haidian District, Beijing

Patentee before: BEIJING BEIXINYUAN INFORMATION SECURITY TECHNOLOGY CO.,LTD.
