CN110737708A - pipelined efficient data conversion processing method - Google Patents

Info

Publication number
CN110737708A
Authority
CN
China
Prior art keywords
data
conversion processing
data conversion
flow
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910873646.3A
Other languages
Chinese (zh)
Inventor
刘兴伟
成艳丽
刘博�
张宝玉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Wanwei Information Technology Co Ltd
Original Assignee
China Telecom Wanwei Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-09-17
Filing date
2019-09-17
Publication date
2020-01-31
Application filed by China Telecom Wanwei Information Technology Co Ltd
Priority to CN201910873646.3A
Publication of CN110737708A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/258 Data format conversion from or to a database
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3867 Concurrent instruction execution, e.g. pipeline or look ahead using instruction pipelines

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to the technical field of data conversion processing, and in particular to a pipelined high-efficiency data conversion processing method comprising the following steps: S1, decomposing a data conversion processing process into a plurality of data conversion processing flows, the flows being arranged in multiple stages and operating as a pipeline; S2, slicing the data before the data conversion processing is executed; and S3, passing the data slices produced in step S2 in sequence through the data conversion processing flows of step S1.

Description

pipelined efficient data conversion processing method
Technical Field
The invention relates to the technical field of data conversion processing, and in particular to a pipelined efficient data conversion processing method.
Background
In the prior art, the complex conversion processing of each kind of service data is generally configured as a series of data conversion processing flows, and the data passes in sequence through the first flow, the second flow, and so on, until the final flow has processed it. When the data volume is large, data that a flow has already processed cannot enter the next flow promptly; instead, the next flow can start only after the current flow has finished processing all of the data. As a result, resource utilization is low and data conversion processing is inefficient.
Disclosure of Invention
The invention aims to provide a pipelined efficient data conversion processing method that reduces the time data spends waiting, makes full use of resources, and effectively improves the efficiency of data conversion processing.
In order to solve the above technical problems, the pipelined efficient data conversion processing method of the present invention includes the following steps:
S1, decomposing a data conversion processing process into a plurality of data conversion processing flows, the flows being arranged in multiple stages and operating as a pipeline;
S2, slicing the data before the data conversion processing is executed;
S3, passing the data slices produced in step S2 in sequence through the data conversion processing flows of step S1.
In step S1, the more stages the data conversion processing is divided into, the higher the processing efficiency;
in step S2, the smaller the slice granularity, the higher the processing efficiency.
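The two efficiency statements above can be checked against an idealized timing model. The following is a sketch under stated assumptions, not part of the patent text: it assumes one worker per stage, perfectly balanced stages, a fixed amount of work per record, and no cost for handing a slice from one stage to the next. The function and parameter names are hypothetical.

```python
def sequential_time(total_records: int, work_per_record: float) -> float:
    """Time when the flows run one after another over the whole data set."""
    return total_records * work_per_record


def pipelined_time(total_records: int, slice_size: int,
                   num_stages: int, work_per_record: float) -> float:
    """Idealized makespan of a balanced pipeline with one worker per stage.

    Assumes total_records is a multiple of slice_size and that moving a
    slice between stages costs nothing.
    """
    num_slices = total_records // slice_size
    # The work on one slice is split evenly across the stages.
    stage_time = slice_size * work_per_record / num_stages
    # The first slice needs num_stages steps; each later slice adds one step.
    return (num_stages + num_slices - 1) * stage_time


if __name__ == "__main__":
    for stages, size in [(5, 1000), (5, 100), (5, 1), (10, 1)]:
        t = pipelined_time(1000, size, stages, 1.0)
        print(f"stages={stages:2d} slice_size={size:4d} time={t:.1f}")
```

Under these assumptions the total time falls as the slice size shrinks or the number of stages grows. In a real system every hand-off has a cost, so the gain levels off once slices become very small.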
Preferably, in step S1, the data conversion processing flows are dynamically configured by the data flow configuration center.
Preferably, in step S2, the data slicing rules are dynamically configured by the data slicing center.
Preferably, in step S1, at least one data processing unit is configured in each data conversion processing flow. The execution times of the data conversion processing flows are not equal: a flow with a long execution time lowers the efficiency of the whole pipeline, because flows with short execution times must wait for it to finish.
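The effect of unequal execution times can be made concrete with a small calculation. This is a sketch under an assumption that is not spelled out in the source: a flow with several data processing units handles that many slices concurrently, so its effective time per slice is its execution time divided by its unit count. The function names and the relative times in the example are hypothetical.

```python
import math


def bottleneck_time(stage_times, units):
    """Effective time per slice of the slowest flow, which paces the pipeline."""
    return max(t / u for t, u in zip(stage_times, units))


def balance_units(stage_times):
    """Give each flow enough units to keep up with the fastest flow."""
    fastest = min(stage_times)
    return [math.ceil(t / fastest) for t in stage_times]


if __name__ == "__main__":
    times = [1.0, 1.0, 3.0, 1.0, 1.0]  # hypothetical relative execution times
    print(bottleneck_time(times, [1] * len(times)))
    units = balance_units(times)
    print(units, bottleneck_time(times, units))
```

With one unit per flow the slow flow sets the pace for all the others; giving it proportionally more units brings its effective time back in line with the rest.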
The advantages of the invention are as follows. The method comprises: S1, decomposing a data conversion processing process into a plurality of data conversion processing flows, the flows being arranged in multiple stages and operating as a pipeline; S2, slicing the data before the data conversion processing is executed; and S3, passing the data slices produced in step S2 in sequence through the data conversion processing flows of step S1. By slicing the data and decomposing the process into pipelined flows, the invention achieves highly parallel processing during data conversion, which effectively improves data conversion efficiency; it is also simple to configure and reliable in operation.
Drawings
FIG. 1 is a diagram of the decomposition of the data conversion processing process into flows according to the invention;
FIG. 2 is a schematic diagram of the working process of the invention.
Detailed Description
As shown in FIG. 1 and FIG. 2, the pipelined efficient data conversion processing method of the present invention includes the following steps:
S1, decomposing a data conversion processing process into a plurality of data conversion processing flows, the flows being arranged in multiple stages and operating as a pipeline;
S2, slicing the data before the data conversion processing is executed;
S3, passing the data slices produced in step S2 in sequence through the data conversion processing flows of step S1.
Preferably, in step S1, the data conversion processing flows are dynamically configured by the data flow configuration center.
Preferably, in step S2, the data slicing rules are dynamically configured by the data slicing center.
Preferably, in step S1, at least one data processing unit is configured in each data conversion processing flow.
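One way to picture dynamic configuration is a pipeline assembled from a configuration document rather than hard-coded. The sketch below is purely illustrative: the source does not describe how the data flow configuration center or the data slicing center stores its settings, so the JSON layout, the registry, the key names (`flows`, `units`, `slice_size`) and the two sample flows are all assumptions.

```python
import json

# Hypothetical registry of available flows, keyed by name.
FLOW_REGISTRY = {}


def register_flow(name):
    """Decorator that makes a function selectable from configuration."""
    def decorator(func):
        FLOW_REGISTRY[name] = func
        return func
    return decorator


@register_flow("uppercase_name")
def uppercase_name(record):
    return {**record, "name": record["name"].upper()}


@register_flow("add_name_length")
def add_name_length(record):
    return {**record, "name_length": len(record["name"])}


def load_pipeline_config(config_text):
    """Turn a JSON document into a slice size and an ordered list of
    (flow function, number of data processing units) pairs."""
    config = json.loads(config_text)
    stages = [(FLOW_REGISTRY[entry["flow"]], entry.get("units", 1))
              for entry in config["flows"]]
    return {"slice_size": config.get("slice_size", 1), "stages": stages}


# Hypothetical configuration: order, unit counts and slice size are data.
EXAMPLE_CONFIG = """
{"slice_size": 1,
 "flows": [{"flow": "uppercase_name"},
           {"flow": "add_name_length", "units": 2}]}
"""
```

Because the order of flows, the unit counts and the slice size live in data, they can be changed without touching the flow implementations.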
In this embodiment, a personal information table is used as the original data, and the data conversion processing is performed with the pipelined efficient data conversion processing method.
Step S1: the data flow configuration center decomposes the data conversion processing process into 5 data conversion processing flows:
(1) duplicate records are removed, using the identity card number as the unique identifier;
(2) the sex field value, stored as 0/1, is converted to male/female;
(3) a new postcode field is added, and the matching postcode is looked up from the home address in the region code relation table in the database;
(4) a new age field is added, and the age is calculated from the identity card number;
(5) the converted data is written to a local file.
Analysis and testing show that the third flow must communicate with the database over the network, whereas the other flows are local in-memory operations with short execution times. Multiple data processing units are therefore allocated to the third flow. In this embodiment the execution time of the third flow is about 3 times the average execution time of the second, fourth and fifth flows, so 3 threads are allocated to it to convert data in parallel, which raises the processing speed of the third flow.
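The five flows can be sketched as small functions over one record each. This is a minimal illustration, not the patented implementation. The field names (`id_number`, `sex`, `home_address`, `postcode`, `age`) are hypothetical, the source does not say which of 0/1 means which sex, an in-memory dictionary stands in for the database table, and the age calculation assumes the 18-digit identity number layout in which characters 7 to 14 hold the birth date as YYYYMMDD.

```python
import csv
from datetime import date

# Assumption: the source does not say which code means which sex.
SEX_LABELS = {"0": "female", "1": "male"}


def make_deduplicator():
    """Flow 1: drop records whose identity card number was already seen."""
    seen = set()

    def dedupe(record):
        if record["id_number"] in seen:
            return None  # duplicate, removed from the stream
        seen.add(record["id_number"])
        return record

    return dedupe


def convert_sex(record):
    """Flow 2: replace the 0/1 code with a readable label."""
    return {**record, "sex": SEX_LABELS[str(record["sex"])]}


def make_postcode_adder(region_table):
    """Flow 3: look up the postcode for the region named in the address.

    region_table stands in for the database table; a real lookup would be
    a network query, which is why this flow is the slow one.
    """
    def add_postcode(record):
        address = record["home_address"]
        postcode = next((code for region, code in region_table.items()
                         if region in address), None)
        return {**record, "postcode": postcode}

    return add_postcode


def add_age(record, today=None):
    """Flow 4: derive the age from the birth date in the ID number."""
    today = today or date.today()
    digits = record["id_number"][6:14]  # YYYYMMDD in an 18-digit number
    born = date(int(digits[:4]), int(digits[4:6]), int(digits[6:8]))
    had_birthday = (today.month, today.day) >= (born.month, born.day)
    return {**record, "age": today.year - born.year - (0 if had_birthday else 1)}


def write_records(records, path):
    """Flow 5: write the converted records to a local CSV file."""
    fields = ["id_number", "sex", "home_address", "postcode", "age"]
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fields, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(records)
```

The deduplicator keeps state in a plain set, so this sketch expects it to run in a single thread; the stateless flows can safely run in several threads at once.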
Step S2: in this embodiment the original data is a database table, and a single record is the smallest divisible granularity. The data slicing rule is therefore configured through the data slicing processing center so that each record of the data to be processed forms one slice.
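Slicing at record granularity can be sketched as a generator that hands out fixed-size groups of records. The function name and the `slice_size` parameter are hypothetical; the default of 1 corresponds to one record per slice, and larger values are included only to show a coarser rule.

```python
from itertools import islice


def slice_records(records, slice_size=1):
    """Yield consecutive slices of at most slice_size records each."""
    if slice_size < 1:
        raise ValueError("slice_size must be at least 1")
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, slice_size))
        if not chunk:
            return  # source exhausted
        yield chunk


if __name__ == "__main__":
    rows = ["row-a", "row-b", "row-c"]
    for piece in slice_records(rows):
        print(piece)
```

Because the generator reads lazily, the first slice is available to the pipeline before the rest of the table has been read.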
Step S3: the data slices generated in step S2 enter the pipelined data conversion processing flows in sequence for execution.
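How slices move through the flows can be sketched with threads connected by queues. This is one possible arrangement under stated assumptions, not the implementation described in the source: each flow reads slices from an input queue and writes to the next flow's queue, a flow may have several worker threads, and a flow function may return None to drop a slice. The names `run_pipeline`, `_worker`, `_run_stage` and `_DONE` are hypothetical, and error handling is omitted.

```python
import queue
import threading

_DONE = object()  # end-of-stream marker passed along the queues


def _worker(func, in_q, out_q):
    while True:
        item = in_q.get()
        if item is _DONE:
            in_q.put(_DONE)  # let sibling workers see the marker too
            return
        result = func(item)
        if result is not None:
            out_q.put(result)


def _run_stage(func, units, in_q, out_q):
    """Run one flow with the given number of worker threads."""
    threads = [threading.Thread(target=_worker, args=(func, in_q, out_q))
               for _ in range(units)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    out_q.put(_DONE)  # all workers finished, so the next flow may stop


def run_pipeline(slices, stages):
    """Push slices through stages, a list of (function, units) pairs."""
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    coordinators = [
        threading.Thread(target=_run_stage,
                         args=(func, units, queues[i], queues[i + 1]))
        for i, (func, units) in enumerate(stages)
    ]
    for c in coordinators:
        c.start()
    for data_slice in slices:  # a slice enters as soon as it is produced
        queues[0].put(data_slice)
    queues[0].put(_DONE)
    for c in coordinators:
        c.join()
    results = []
    while True:
        item = queues[-1].get()
        if item is _DONE:
            return results
        results.append(item)


if __name__ == "__main__":
    stages = [(lambda s: [x + 1 for x in s], 1),
              (lambda s: [x * 2 for x in s], 3)]
    print(sorted(run_pipeline([[1], [2], [3], [4]], stages)))
```

A later flow starts work as soon as the first slice reaches its queue, rather than waiting for the earlier flow to finish all of the data. When a flow has more than one worker, slices may leave it in a different order than they entered.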
In this embodiment, the data flow configuration center and the data slicing processing center may be part of a data management center defined in more .
By slicing the data and decomposing the process into pipelined flows, the invention achieves highly parallel processing during data conversion, thereby effectively improving data conversion efficiency, with simple configuration and reliable operation.

Claims (4)

1. A pipelined high-efficiency data conversion processing method, characterized by comprising the following steps:
S1, decomposing a data conversion processing process into a plurality of data conversion processing flows, the flows being arranged in multiple stages and operating as a pipeline;
S2, slicing the data before the data conversion processing is executed;
S3, passing the data slices produced in step S2 in sequence through the data conversion processing flows of step S1.
2. The pipelined efficient data conversion processing method according to claim 1, wherein in step S1 the data conversion processing flows are dynamically configured by a data flow configuration center.
3. The pipelined efficient data conversion processing method according to claim 1, wherein in step S2 the data slicing rules are dynamically configured by the data slicing center.
4. The pipelined efficient data conversion processing method according to claim 1, wherein in step S1 at least one data processing unit is configured in each data conversion processing flow.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910873646.3A CN110737708A (en) 2019-09-17 2019-09-17 pipelined efficient data conversion processing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910873646.3A CN110737708A (en) 2019-09-17 2019-09-17 pipelined efficient data conversion processing method

Publications (1)

Publication Number Publication Date
CN110737708A (en) 2020-01-31

Family

ID=69267967

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910873646.3A Pending CN110737708A (en) 2019-09-17 2019-09-17 pipelined efficient data conversion processing method

Country Status (1)

Country Link
CN (1) CN110737708A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101046724A (en) * 2006-05-10 2007-10-03 华为技术有限公司 Disk interface processor and method of processing disk operation command
CN101226624A (en) * 2008-02-15 2008-07-23 上海申通轨道交通研究咨询有限公司 Staging specification processing system for orbital traffic ticket business data and method thereof
CN101295249A (en) * 2008-06-26 2008-10-29 腾讯科技(深圳)有限公司 Method and system for dynamic configuration management of software interface style
CN101969402A (en) * 2010-10-18 2011-02-09 浪潮集团山东通用软件有限公司 Data exchanging method based on parallel processing

Similar Documents

Publication Publication Date Title
CN104715073A (en) Association rule mining system based on improved Apriori algorithm
CN108334557B (en) Aggregated data analysis method and device, storage medium and electronic equipment
CN110019308A (en) Data query method, apparatus, equipment and storage medium
CN104391748A (en) Mapreduce computation process optimization method
CN106778079A (en) A kind of DNA sequence dna k mer frequency statistics methods based on MapReduce
CN105550253B (en) Method and device for acquiring type relationship
CN111652468A (en) Business process generation method and device, storage medium and computer equipment
CN105574032A (en) Rule matching operation method and device
CN105302915B (en) The high-performance data processing system calculated based on memory
CN106326005A (en) Automatic parameter tuning method for iterative MapReduce operation
CN110007955B (en) Compression method for decoding module code of instruction set simulator
CN110737708A (en) pipelined efficient data conversion processing method
CN110879753B (en) GPU acceleration performance optimization method and system based on automatic cluster resource management
CN109937413B (en) Processing method and system for massive crowd characteristic data
CN115733780B (en) Dynamic self-adaption method, system, equipment and medium based on flexible Ethernet
CN110176276B (en) Biological information analysis process management method and system
CN116680290A (en) Data query method and device based on data center
CN110825792A (en) High-concurrency distributed data retrieval method based on golang middleware coroutine mode
CN106970837B (en) Information processing method and electronic equipment
CN110825453B (en) Data processing method and device based on big data platform
CN105654106A (en) Decision tree generation method and system thereof
CN113283744A (en) Design and updating method for lightweight power consumption abnormal characteristic fingerprint database
CN113342550A (en) Data processing method, system, computing device and storage medium
CN107329846B (en) Big finger data comparison method based on big data technology
CN106815017B (en) Dynamic language performance analysis and display method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 2020-01-31