CN116610447A - Data processing method and device - Google Patents


Info

Publication number
CN116610447A
Authority
CN
China
Prior art keywords
data processing
downloading
download
data
subtasks
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310587107.XA
Other languages
Chinese (zh)
Inventor
张云业
刘启翔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jingdong Technology Information Technology Co Ltd
Original Assignee
Jingdong Technology Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jingdong Technology Information Technology Co Ltd filed Critical Jingdong Technology Information Technology Co Ltd
Priority to CN202310587107.XA priority Critical patent/CN116610447A/en
Publication of CN116610447A publication Critical patent/CN116610447A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/546 Message passing systems or structures, e.g. queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/5018 Thread allocation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data processing method and device, and relates to the field of computer technology. In one embodiment, the method comprises: in response to a data processing request, determining a download type according to the data processing request; obtaining at least one data processing subtask according to the download type, and sending the data processing subtask to a message distribution component; in response to a request from a download machine in an idle state, obtaining the data processing subtask from the message distribution component and distributing it to the download machine in the idle state; and, in response to completion of execution of all the data processing subtasks, processing the download files corresponding to all the data processing subtasks to generate a data processing result. With this embodiment, a download task can be executed by multiple processes and multiple threads, which improves data download speed and efficiency; because download machines in an idle state perform the downloads, utilization of the download machines in the application cluster improves; and the method has good extensibility.

Description

Data processing method and device
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a data processing method and apparatus.
Background
At present, data is downloaded from a relational database by a single process: the process downloads the data, writes it into a download file, and finally stores the download file in cloud storage, from which it can be downloaded.
In the course of implementing the present invention, the inventors found at least the following problems in the prior art:
because the download is executed by a single process, download speed is low, utilization of the download machines in the application cluster is low, extensibility is poor, and relational database downloads and big data downloads cannot both be supported.
Disclosure of Invention
In view of this, embodiments of the present invention provide a data processing method and apparatus that can execute a download task with multiple processes and multiple threads, improving data download speed and efficiency. Because data is downloaded by download machines in an idle state, utilization of the download machines in the application cluster improves; the scheme has good extensibility and supports both relational database downloads and big data downloads.
To achieve the above object, according to one aspect of an embodiment of the present invention, there is provided a data processing method.
A data processing method, comprising: in response to a data processing request, determining a download type according to the data processing request; obtaining at least one data processing subtask according to the download type, and sending the data processing subtask to a message distribution component; in response to a request from a download machine in an idle state, obtaining the data processing subtask from the message distribution component and distributing the data processing subtask to the download machine in the idle state, so that the download machine in the idle state downloads the corresponding download file; and, in response to completion of execution of all the data processing subtasks, processing the download files corresponding to all the data processing subtasks according to a preset processing rule to generate a data processing result.
Optionally, the processing rule includes a split-merge rule, and processing the download files corresponding to all the data processing subtasks according to the preset processing rule to generate the data processing result includes: merging the download files corresponding to all the data processing subtasks; and splitting the merged result according to a data splitting parameter in the split-merge rule to generate the data processing result.
Optionally, after processing the download files corresponding to all the data processing subtasks according to the preset processing rule to generate the data processing result, the method further includes: storing the data processing result in a database, and generating a storage address of the data processing result, so that the data processing result can be obtained through the storage address.
Optionally, in the case that the download type is a relational database data download, downloading the corresponding download file using the download machine in the idle state includes: downloading the download file, according to download parameters, using the download machine in the idle state, with multiple threads reading and a single thread writing.
Optionally, in the case that the download type is big data download, downloading the corresponding download file using the download machine in the idle state includes: calling a big data platform through the download machine in the idle state, assembling download parameters according to task information of the data processing subtask, and downloading the download file through the big data platform according to the download parameters.
Optionally, the relational database data download includes a relational database statistics download or a relational database mass data download, and obtaining at least one data processing subtask according to the download type includes: generating one data processing subtask in the case that the download type is the relational database statistics download or the big data download; and generating a plurality of data processing subtasks in the case that the download type is the relational database mass data download.
Optionally, the task information includes a system identifier and a page identifier, and assembling the download parameters according to the task information of the data processing subtask includes: querying, according to the system identifier and the page identifier, a database table containing the download parameters; and assembling the download parameters based on the database table, wherein the download parameters include a display field, a target data table, and a data date.
According to another aspect of an embodiment of the present invention, there is provided a data processing apparatus.
A data processing apparatus comprising: the download type determining module is used for responding to the data processing request and determining the download type according to the data processing request; the data processing subtask generating module is used for obtaining at least one data processing subtask according to the downloading type and sending the data processing subtask to the message distributing component; a data processing subtask distribution module, configured to obtain the data processing subtask from the message distribution component in response to a download machine request in an idle state, and distribute the data processing subtask to a download machine in the idle state, so as to download a corresponding download file by using the download machine in the idle state; and the data processing result generation module is used for responding to the completion of the execution of all the data processing subtasks, processing the downloaded files corresponding to all the data processing subtasks according to a preset processing rule and generating a data processing result.
Optionally, the processing rule includes a split-merge rule, and the data processing result generating module is further configured to: merging the downloaded files corresponding to all the data processing subtasks; and splitting the combination result according to the data splitting parameters in the splitting and combining rule to generate the data processing result.
Optionally, the apparatus further comprises a data processing result storage module configured to: store the data processing result in a database, and generate a storage address of the data processing result, so that the data processing result can be obtained through the storage address.
Optionally, in the case that the download type is a relational database data download, the data processing subtask distribution module is further configured to: download the download file, according to download parameters, using the download machine in the idle state, with multiple threads reading and a single thread writing.
Optionally, in the case that the download type is big data download, the data processing subtask distribution module is further configured to: and calling a big data platform through the downloading machine in the idle state, assembling downloading parameters according to the task information of the data processing subtasks, and downloading the downloaded file based on the big data platform according to the downloading parameters.
Optionally, the relational database data downloading includes relational database statistical information downloading or relational database mass data downloading, and the data processing subtask generating module is further configured to: generating a data processing subtask under the condition that the download type is the statistical information download of the relational database or the big data download; and generating a plurality of data processing subtasks under the condition that the downloading type is the mass data downloading of the relational database.
Optionally, the task information includes a system identifier and a page identifier, and the data processing subtask distribution module is further configured to: inquiring a database table comprising the downloading parameters according to the system identifier and the page identifier; and assembling the downloading parameters based on the database table, wherein the downloading parameters comprise a display field, a target data table and a data date.
According to yet another aspect of an embodiment of the present invention, an electronic device is provided.
An electronic device, comprising: one or more processors; and the memory is used for storing one or more programs, and when the one or more programs are executed by the one or more processors, the one or more processors are enabled to realize the data processing method provided by the embodiment of the invention.
According to yet another aspect of an embodiment of the present invention, a computer-readable medium is provided.
A computer readable medium having stored thereon a computer program which, when executed by a processor, implements a data processing method provided by an embodiment of the present invention.
One embodiment of the above invention has the following advantages or benefits: a download type is determined according to a data processing request in response to that request; at least one data processing subtask is obtained according to the download type and sent to a message distribution component; in response to a request from a download machine in an idle state, the data processing subtask is obtained from the message distribution component and distributed to that download machine, which downloads the corresponding download file; and, in response to completion of execution of all the data processing subtasks, the download files corresponding to all the data processing subtasks are processed according to a preset processing rule to generate a data processing result. With this technical scheme, download tasks can be executed by multiple processes and multiple threads, which improves data download speed and efficiency; downloading with download machines in an idle state improves utilization of the download machines in the application cluster; and the scheme has good extensibility and supports both relational database downloads and big data downloads.
Further effects of the optional implementations described above are explained below in connection with the specific embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram of the main steps of a data processing method according to one embodiment of the invention;
FIG. 2 is a flow diagram of a data processing method according to one embodiment of the invention;
FIG. 3 is a schematic diagram of the architecture of a database read operation module according to one embodiment of the invention;
FIG. 4 is a schematic diagram of the architecture of a file read/write operation module according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the architecture of a file storage module according to one embodiment of the invention;
FIG. 6 is a schematic architecture diagram of a task execution module according to one embodiment of the invention;
FIG. 7 is a schematic diagram of the main modules of a data processing apparatus according to one embodiment of the present invention;
FIG. 8 is an exemplary system architecture diagram in which embodiments of the present invention may be applied;
FIG. 9 is a schematic diagram of a computer system suitable for use in implementing an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention will now be described with reference to the accompanying drawings, in which various details of the embodiments of the present invention are included to facilitate understanding, and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 is a schematic diagram of main steps of a data processing method according to an embodiment of the present invention.
As shown in fig. 1, the data processing method according to an embodiment of the present invention mainly includes the following steps S101 to S104.
Step S101: in response to the data processing request, a download type is determined from the data processing request.
The download type may be a relational database data download or a big data download. A relational database data download may be a relational database statistics download or a relational database mass data download, and a big data download may be a big data historical data fragment download. The relational database data may be, for example, MySQL (a relational database management system) data or PostgreSQL (a free-software object-relational database management system) data.
Step S102: obtaining at least one data processing subtask according to the download type, and sending the data processing subtask to the message distribution component.
In one embodiment, obtaining at least one data processing subtask according to the download type may include: generating one data processing subtask in the case that the download type is the relational database statistics download or the big data download; and generating a plurality of data processing subtasks in the case that the download type is the relational database mass data download.
Specifically, the data processing request may further include task information such as a system identifier and a page identifier. When the download type is a relational database mass data download, a database table containing the download parameters is queried according to the system identifier and the page identifier, the number of data processing subtasks is calculated from the number of database tables (for example, the number of database tables may be an integer multiple of the number of data processing subtasks), and that many data processing subtasks are generated.
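The subtask planning described above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the names `DownloadType`, `Subtask`, `plan_subtasks`, and the `tables_per_subtask` parameter are hypothetical, and grouping a fixed number of tables into each subtask is only one way to make the table count a multiple of the subtask count.

```python
from dataclasses import dataclass
from enum import Enum


class DownloadType(Enum):
    RDB_STATISTICS = "rdb_statistics"  # relational database statistics download
    RDB_MASS_DATA = "rdb_mass_data"    # relational database mass data download
    BIG_DATA = "big_data"              # big data download


@dataclass(frozen=True)
class Subtask:
    batch_no: str   # hypothetical batch number shared by all subtasks of a request
    index: int
    tables: tuple


def plan_subtasks(batch_no, download_type, tables, tables_per_subtask=1):
    """Return the subtasks for one data processing request.

    Statistics and big data downloads get exactly one subtask; a mass data
    download gets one subtask per group of `tables_per_subtask` tables.
    """
    tables = list(tables)
    if download_type is not DownloadType.RDB_MASS_DATA:
        return [Subtask(batch_no, 0, tuple(tables))]
    if tables_per_subtask < 1 or not tables:
        raise ValueError("mass data download needs tables and a positive group size")
    groups = [tables[i:i + tables_per_subtask]
              for i in range(0, len(tables), tables_per_subtask)]
    return [Subtask(batch_no, i, tuple(g)) for i, g in enumerate(groups)]
```

With eight tables and `tables_per_subtask=2`, this sketch yields four subtasks, so the table count is an integer multiple of the subtask count, as in the example given in the text.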
Step S103: in response to a request from a download machine in an idle state, obtaining the data processing subtask from the message distribution component, and distributing the data processing subtask to the download machine in the idle state, so that the download machine in the idle state downloads the corresponding download file.
Specifically, the download application cluster periodically scans for unexecuted data processing subtasks. If any are found, a download machine in an idle state in the application cluster requests the message distribution component, by data processing subtask batch number, for a data processing subtask and its task information. A download machine in an idle state may be a download machine whose CPU (central processing unit) utilization is below a preset threshold.
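As a rough sketch of the idle-machine check, the following Python function pulls a subtask only when the machine's CPU utilization is below a threshold. Everything here is an assumption for illustration: `poll_once` is a hypothetical name, a standard-library `queue.Queue` stands in for the message distribution component, and the CPU reading is passed in as a callable because the text does not say how it is measured.

```python
import queue


def poll_once(task_queue, cpu_utilization, threshold=0.5):
    """Pull one subtask if this machine is idle; otherwise return None.

    `cpu_utilization` is a callable returning a value in [0, 1]; the
    machine counts as idle when that value is below `threshold`.
    """
    if cpu_utilization() >= threshold:
        return None  # busy: leave the subtask for another machine
    try:
        return task_queue.get_nowait()
    except queue.Empty:
        return None  # nothing to execute right now
```

A download machine would call this on a timer. Because a busy machine never touches the queue, pending subtasks naturally flow to whichever machines are idle.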
In one embodiment, in the case that the download type is a relational database data download, downloading the corresponding download file using the download machine in the idle state may include: downloading the download file, according to the download parameters, using the download machine in the idle state, with multiple threads reading and a single thread writing. The download parameters may be assembled according to the task information of the data processing subtask, which may include: querying, according to the system identifier and the page identifier, a database table containing the download parameters; and assembling the download parameters based on the database table, wherein the download parameters include a display field, a target data table, and a data date.
Specifically, the download parameters (the database table to be downloaded, the download fields, the field processing logic, and so on) are assembled from the subtask information; the database read operations are executed by multiple threads; the data write operation is executed asynchronously by a single thread through a task processing component (such as the Disruptor component) to produce the download file; and the download file is stored in cloud storage.
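The multi-thread read, single-thread write pattern can be sketched as follows. This is a simplified illustration, not the described implementation: a bounded `queue.Queue` is swapped in for the Disruptor ring buffer, `read_table` is a hypothetical callable standing in for the database query, and the output is written as CSV to any file-like object.

```python
import csv
import io
import queue
import threading
from concurrent.futures import ThreadPoolExecutor

_DONE = object()  # sentinel telling the writer that all readers have finished


def download(tables, read_table, out, reader_threads=4):
    """Read each table on a worker thread; write all rows on one thread."""
    rows = queue.Queue(maxsize=1000)  # bounded, so readers cannot outrun the writer
    written = 0

    def writer():
        nonlocal written
        w = csv.writer(out)
        while True:
            row = rows.get()
            if row is _DONE:
                return
            w.writerow(row)
            written += 1

    def reader(table):
        for row in read_table(table):
            rows.put(row)

    t = threading.Thread(target=writer)
    t.start()
    try:
        with ThreadPoolExecutor(max_workers=reader_threads) as pool:
            # list() waits for every reader and re-raises any reader error
            list(pool.map(reader, tables))
    finally:
        rows.put(_DONE)
        t.join()
    return written


# Usage with in-memory stand-ins for the database and the output file.
fake_db = {"t1": [(1, "a"), (2, "b")], "t2": [(3, "c")]}
buf = io.StringIO()
count = download(list(fake_db), lambda name: fake_db[name], buf, reader_threads=2)
```

Keeping a single writer means the output file needs no locking, while the readers, which spend most of their time waiting on the database, run in parallel. Row order across tables is not deterministic in this sketch.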
In one embodiment, in the case that the download type is big data download, downloading the corresponding download file using the download machine in the idle state may include: calling the big data platform through the download machine in the idle state, assembling the download parameters according to the task information of the data processing subtask, and downloading the download file through the big data platform according to the download parameters.
Specifically, the download machine in the idle state calls the big data platform to start the big data download task flow: the download parameters are assembled according to the task information of the data processing subtask; the big data platform is called to create and start a download calculation task; when the download calculation task ends, the big data platform is called to create and start a download push data task; and when the download push data task ends, the download file is stored in cloud storage.
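The two-stage flow above (calculation task, then push task, each polled until it ends) might look like the following Python sketch. The platform interface is entirely hypothetical: the text does not name the big data platform's RPC methods, so `create_compute_task`, `create_push_task`, and `is_finished` are invented for illustration, and `FakePlatform` is an in-memory stand-in.

```python
import time


def run_big_data_download(platform, params, poll_seconds=5.0, sleep=time.sleep):
    """Run the calculation task, then the push task, polling each until done."""
    compute_id = platform.create_compute_task(params)
    while not platform.is_finished(compute_id):
        sleep(poll_seconds)
    # The push task is created only after the calculation task has ended.
    push_id = platform.create_push_task(compute_id)
    while not platform.is_finished(push_id):
        sleep(poll_seconds)
    return push_id


class FakePlatform:
    """In-memory stand-in: each task reports finished on its third status check."""

    def __init__(self):
        self.calls = []
        self._checks = {}

    def create_compute_task(self, params):
        self.calls.append("compute")
        return "compute-1"

    def create_push_task(self, compute_id):
        self.calls.append("push")
        return "push-1"

    def is_finished(self, task_id):
        self._checks[task_id] = self._checks.get(task_id, 0) + 1
        return self._checks[task_id] >= 3
```

The sketch waits indefinitely, matching the description. A production version would probably add a timeout and failure handling, which the text does not cover.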
Step S104: in response to completion of execution of all the data processing subtasks, processing the download files corresponding to all the data processing subtasks according to a preset processing rule to generate a data processing result. The processing rule may include a split-merge rule.
In one embodiment, processing the download files corresponding to all the data processing subtasks according to the preset processing rule to generate the data processing result may include: merging the download files corresponding to all the data processing subtasks; and splitting the merged result according to the data splitting parameter in the split-merge rule to generate the data processing result.
Specifically, after all the data processing subtasks have finished, the download files are pulled directly from cloud storage, and merging and splitting for the corresponding file type are performed according to the configured download file type and the data splitting parameter. The data splitting parameter may be the maximum number of records in a single file, and splitting may depend on the file type and the record count. For example, if each file is limited to 500,000 records, the file is split when the merged record count exceeds 500,000.
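The merge-then-split step can be shown with a short Python sketch. It assumes each download file has already been parsed into a sequence of records. The function name `merge_and_split` and its parameter name are hypothetical, and the 500,000 default simply mirrors the example in the text.

```python
from itertools import chain, islice


def merge_and_split(files, max_rows_per_file=500_000):
    """Merge the records of all download files, then yield chunks of at most
    `max_rows_per_file` records, one chunk per output file."""
    if max_rows_per_file < 1:
        raise ValueError("max_rows_per_file must be >= 1")
    merged = chain.from_iterable(files)  # lazy merge, in subtask order
    while True:
        chunk = list(islice(merged, max_rows_per_file))
        if not chunk:
            return
        yield chunk
```

For instance, three subtask files holding 3, 2, and 2 records with a limit of 3 produce output files of 3, 3, and 1 records. When the merged total stays within the limit, a single output file results and no split occurs.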
In one embodiment, after processing the download files corresponding to all the data processing subtasks according to the preset processing rule to generate the data processing result, the method may further include: storing the data processing result in a database (such as cloud storage), and generating a storage address (such as a file download address) of the data processing result, so that the data processing result can be obtained through the storage address.
FIG. 2 is a flow diagram of a data processing method according to one embodiment of the invention.
As shown in FIG. 2, the system identifier and the page identifier are configured through the download interface, and the download file type, the maximum number of records per file, and the subtask splitting rule are selected according to the configuration, to generate a data processing request. In response to the data processing request, one or more data processing subtasks are created according to the download type and sent to the message distribution component. A download machine in an idle state periodically requests a data processing subtask from the message distribution component, downloads the download file through either a relational database download or a big data platform download according to the download type of the subtask, and stores the download file in cloud storage. When all the data processing subtasks have been executed, the download files corresponding to all the subtasks are pulled from cloud storage and merged and split to generate the data processing result; the data processing result is stored in cloud storage, and a storage address (namely, a file download address) of the data processing result is generated, so that the data processing result can be obtained through the storage address.
In one embodiment, the data processing method mainly comprises a configuration parameter assembling module, a database reading operation module, a file reading and writing (merging and splitting) module, a storage module, a task executing module, a historical data fragment downloading service flow module and the like.
The configuration parameter assembly module is mainly driven by underlying configuration, so the download strategy can be adjusted flexibly, download execution efficiency can be improved, and business requirements for downloading data can be met quickly. According to the system code (the system identifier), the download field (the asset type), and the page code (the page identifier), the module queries the field information to be displayed in the download and the field processing logic (for example, some fields are computed from other fields, and some are displayed as enumerations), the file type of the current download, the per-file maximum for the current download, and so on. It obtains the configured number of subtasks according to the system code, the asset type, the page code, the download type passed by the caller, and the number of data storage tables. When servers are added or the number of data storage tables changes, task allocation can be adjusted at any time, maximizing download efficiency while avoiding application restarts wherever possible.
FIG. 3 is a schematic diagram of the architecture of a database read operation module according to one embodiment of the invention.
As shown in FIG. 3, in one embodiment, the database read operation module covers querying the data display fields and their display configuration, and querying the specific data. To improve extensibility, unify parameter assembly across query flows, and increase code reuse, a single query interface serves external callers. Different queries are handled differently within the same sequence of steps: parameter verification, parameter assembly, data call, post-query processing, and processing of the query result according to the configured display information. Small independent modules, such as the data query function, the query parameter verification function, the query return value processing function, and the query parameter assembly function, can flexibly serve other functions.
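A unified query entry point that follows these steps could be sketched as below. This is a loose illustration only: `run_query`, the `system_code` and `page_code` keys, and the `fetch` callable are hypothetical names, and the post-processing shown (keeping only the configured display fields, in the configured order) is just one plausible reading of "processing the query result according to the configured display information".

```python
def run_query(request, fetch, display_fields):
    """Single query entry point: verify, assemble, call, post-process."""
    # 1. Parameter verification
    missing = [k for k in ("system_code", "page_code") if not request.get(k)]
    if missing:
        raise ValueError(f"missing parameters: {missing}")
    # 2. Parameter assembly
    params = {"system": request["system_code"], "page": request["page_code"]}
    # 3. Data call (`fetch` stands in for the actual database query)
    rows = fetch(params)
    # 4. Post-processing: keep only the configured display fields, in order
    return [{f: row.get(f) for f in display_fields} for row in rows]
```

Because each step is a small, separable piece, a different query type can replace one step (for example, a different `fetch`) while reusing the rest.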
FIG. 4 is a schematic diagram of the architecture of a file read/write operation module according to an embodiment of the present invention.
As shown in FIG. 4, in one embodiment, the file read/write operation module is divided into two functional modules: a file read/write and splitting module, and a file creation and deletion module. The file read/write and splitting module mainly implements reading and writing of different file types (such as Excel and CSV), closing of file streams, and merging and splitting of files; Excel operations are wrapped around the EasyExcel component. The file creation and deletion module mainly implements general file functions such as creating and deleting files and creating and deleting file directories.
FIG. 5 is a schematic diagram of an architecture of a file storage module according to one embodiment of the invention.
As shown in fig. 5, in one embodiment, the file storage module mainly encapsulates the functional operations of cloud storage file upload and download, SFTP (secure file transfer protocol) file upload and download.
FIG. 6 is a schematic architecture diagram of a task execution module according to one embodiment of the invention.
As shown in FIG. 6, in one embodiment, the task execution module mainly encapsulates operations such as adding or deleting a download task and some special queries.
In the download function of the ABS (asset-backed securities) system, the message distribution component, the Disruptor component, and the like are combined with the above functional modules, so that the ABS download function is implemented with multiple processes and multiple threads, which improves download speed and application cluster utilization.
In one embodiment, the historical data fragment download function module implements historical data fragment download mainly by means of the system's internal service flow component and the rich RPC (remote procedure call) interfaces provided by the big data platform, which largely eliminates the earlier manual steps of creating a historical data download task, running the data job, obtaining approval, and so on. The module works as follows. According to the parameters passed by the caller, it assembles the parameters required for the historical data fragment download: download display fields, download target data table, download date, download calculation task data table, and so on. It calls the calculation task RPC interface provided by the big data platform to create and start a download calculation data task, then periodically checks whether that task has completed and keeps waiting until it has. It then calls the push task RPC interface provided by the big data platform to create and start a download push data task, which pushes the data to a designated FTP file server, and again periodically checks whether the task has completed and keeps waiting until it has. Finally, it fetches the file from the FTP server, invokes the file split-merge strategy for the configured download file type to split the file into multiple files, packages and uploads them to cloud storage, and returns the cloud storage path to the caller.
Fig. 7 is a schematic diagram of main modules of a data processing apparatus according to an embodiment of the present invention.
As shown in fig. 7, a data processing apparatus 700 according to an embodiment of the present invention mainly includes: a download type determining module 701, a data processing subtask generating module 702, a data processing subtask distributing module 703 and a data processing result generating module 704.
The download type determining module 701 is configured to determine a download type according to a data processing request in response to the data processing request.
The data processing subtask generating module 702 is configured to obtain at least one data processing subtask according to the download type, and send the data processing subtask to the message distribution component.
The data processing subtask distribution module 703 is configured to obtain the data processing subtask from the message distribution component in response to a download machine request in an idle state, and distribute the data processing subtask to the download machine in the idle state, so as to download the corresponding download file using the download machine in the idle state.
The data processing result generating module 704 is configured to, in response to completion of execution of all the data processing subtasks, process the downloaded files corresponding to all the data processing subtasks according to a preset processing rule and generate a data processing result.
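The interplay of modules 702 to 704 can be illustrated with a small sketch in which idle download machines take subtasks from a shared queue. The assumptions are significant: an in-process `queue.Queue` stands in for the message distribution component, threads stand in for download machines, and `run_download_machines` and `download` are hypothetical names; a real deployment would use separate processes or hosts and a message broker.

```python
import queue
import threading
from typing import Callable, Dict, Hashable, Iterable, Tuple


def run_download_machines(subtasks: Iterable[Hashable],
                          download: Callable[[Hashable], str],
                          machine_count: int = 3) -> Dict[Hashable, Tuple[str, str]]:
    """Distribute subtasks to idle machines; return {subtask: (machine, output)}."""
    broker: "queue.Queue" = queue.Queue()  # stand-in for the message distribution component
    for subtask in subtasks:
        broker.put(subtask)

    results: Dict[Hashable, Tuple[str, str]] = {}
    lock = threading.Lock()

    def machine_loop(name: str) -> None:
        while True:
            try:
                # A machine that has nothing to do asks for the next subtask.
                subtask = broker.get_nowait()
            except queue.Empty:
                return  # no work left for this machine
            output = download(subtask)
            with lock:
                results[subtask] = (name, output)

    machines = [threading.Thread(target=machine_loop, args=(f"machine-{i}",))
                for i in range(machine_count)]
    for machine in machines:
        machine.start()
    for machine in machines:
        machine.join()
    # Every subtask has finished once all machines have returned; the
    # post-processing step (merge, split, store) would start here.
    return results
```

Because each machine only asks for work when it is free, faster machines naturally take more subtasks, which is one plausible way the utilization benefit described in the text could arise.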
In one embodiment, the processing rule may include a split-merge rule, and the data processing result generation module 704 is specifically configured to: merge the downloaded files corresponding to all the data processing subtasks; and split the merged result according to the data splitting parameters in the split-merge rule to generate the data processing result.
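A minimal sketch of the merge-then-split step follows. It assumes that each downloaded file is represented as a list of rows and that the data splitting parameter is a maximum row count per output file; the text does not say what the splitting parameters actually are, so `rows_per_file` is purely illustrative.

```python
from typing import Iterable, List


def merge_and_split(downloaded: Iterable[List[str]], rows_per_file: int) -> List[List[str]]:
    """Merge the rows of all subtask files, then split them into fixed-size files."""
    if rows_per_file <= 0:
        raise ValueError("rows_per_file must be positive")
    # Merge: concatenate the rows of every subtask file in subtask order.
    merged = [row for part in downloaded for row in part]
    # Split: cut the merged rows into chunks of at most rows_per_file rows.
    return [merged[start:start + rows_per_file]
            for start in range(0, len(merged), rows_per_file)]
```

Merging before splitting means the output file boundaries depend only on the splitting parameter, not on how the work happened to be divided into subtasks.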
In one embodiment, the apparatus may further include a data processing result saving module (not shown in the figure) configured to: store the data processing result in a database and generate a storage address of the data processing result, so that the data processing result can be obtained through the storage address.
In one embodiment, in the case where the download type is relational database data download, the data processing subtask distribution module 703 is specifically configured to: download the download file according to the download parameters using the download machine in the idle state, by multi-threaded reading and single-threaded writing.
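The multi-threaded reading and single-threaded writing pattern can be sketched as below. This is an assumption-laden illustration: `read_page` is a hypothetical callable that returns the rows of one page of a query result, and an in-process queue hands rows from the reader threads to the single writer. The earlier text mentions the Disruptor component for this kind of hand-off; a plain `queue.Queue` is swapped in here to keep the example self-contained.

```python
import io
import queue
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

_DONE = object()  # sentinel telling the writer that all readers have finished


def download_table(read_page: Callable[[int], List[str]], page_count: int,
                   out: io.TextIOBase, reader_threads: int = 4) -> int:
    """Read pages concurrently, write rows from one thread; return rows written."""
    rows_q: "queue.Queue" = queue.Queue()
    written = 0

    def writer() -> None:
        nonlocal written
        while True:
            rows = rows_q.get()
            if rows is _DONE:
                return
            for row in rows:
                out.write(row + "\n")  # only this thread touches the output file
                written += 1

    writer_thread = threading.Thread(target=writer)
    writer_thread.start()
    try:
        with ThreadPoolExecutor(max_workers=reader_threads) as pool:
            # list() waits for every page and re-raises any reader exception.
            list(pool.map(lambda page: rows_q.put(read_page(page)), range(page_count)))
    finally:
        rows_q.put(_DONE)
        writer_thread.join()
    return written
```

Keeping a single writer avoids interleaved or corrupted output without locking the file, while the slower database reads run in parallel. Note that rows arrive in completion order, so the output order across pages is not deterministic in this sketch.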
In one embodiment, in the case where the download type is big data download, the data processing subtask distribution module 703 is specifically configured to: call the big data platform through the download machine in the idle state, assemble the download parameters according to the task information of the data processing subtask, and download the download file through the big data platform according to the download parameters.
In one embodiment, the relational database data download may include relational database statistics download or relational database mass data download, and the data processing subtask generating module 702 is specifically configured to: generate one data processing subtask in the case where the download type is relational database statistics download or big data download; and generate a plurality of data processing subtasks in the case where the download type is relational database mass data download.
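The rule above (one subtask for statistics or big data downloads, several for mass data downloads) might look like the following sketch. The enum values, the `SubTask` fields, and the idea of splitting a mass data download into row ranges of `rows_per_subtask` rows are all assumptions made for illustration; the text does not state how a mass data download is divided.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class DownloadType(Enum):
    RDB_STATISTICS = "rdb_statistics"  # relational database statistics download
    RDB_MASS_DATA = "rdb_mass_data"    # relational database mass data download
    BIG_DATA = "big_data"              # big data download


@dataclass(frozen=True)
class SubTask:
    task_id: str
    index: int
    offset: int  # first row covered by this subtask (hypothetical field)
    limit: int   # number of rows covered by this subtask (hypothetical field)


def generate_subtasks(task_id: str, download_type: DownloadType,
                      total_rows: int, rows_per_subtask: int = 100_000) -> List[SubTask]:
    """Return one subtask, or several row-range subtasks for mass data downloads."""
    if download_type in (DownloadType.RDB_STATISTICS, DownloadType.BIG_DATA):
        return [SubTask(task_id, 0, 0, total_rows)]
    subtasks = [
        SubTask(task_id, index, offset, min(rows_per_subtask, total_rows - offset))
        for index, offset in enumerate(range(0, total_rows, rows_per_subtask))
    ]
    # Always return at least one subtask, even for an empty result set.
    return subtasks or [SubTask(task_id, 0, 0, 0)]
```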
In one embodiment, the task information may include a system identifier and a page identifier, and the data processing subtask distribution module 703 is specifically configured to: query a database table containing the download parameters according to the system identifier and the page identifier; and assemble the download parameters based on the database table, wherein the download parameters include a display field, a target data table, and a data date.
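As a rough illustration of this lookup, the sketch below queries a configuration table by system identifier and page identifier and assembles the three download parameters. The table name `download_config`, its columns, the comma-separated field list, and passing the data date in from the caller are all hypothetical choices; SQLite is used only so the example is self-contained.

```python
import sqlite3
from typing import Dict, List, Union


def assemble_download_params(conn: sqlite3.Connection, system_id: str,
                             page_id: str, data_date: str) -> Dict[str, Union[str, List[str]]]:
    """Look up the configured download parameters for one system page."""
    row = conn.execute(
        "SELECT display_fields, target_table FROM download_config "
        "WHERE system_id = ? AND page_id = ?",
        (system_id, page_id),
    ).fetchone()
    if row is None:
        raise LookupError(f"no download configuration for {system_id}/{page_id}")
    display_fields, target_table = row
    return {
        "display_fields": display_fields.split(","),  # fields shown in the download
        "target_table": target_table,                 # table the data is read from
        "data_date": data_date,                       # date of the data to download
    }
```

Driving the parameters from a table keyed by system and page means a new download page can be supported by adding a configuration row rather than changing code, which is consistent with the extensibility the text claims.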
In addition, the specific implementation of the data processing apparatus in the embodiments of the present invention has been described in detail in the above data processing method, and thus the description thereof will not be repeated here.
Fig. 8 illustrates an exemplary system architecture 800 in which a data processing method or data processing apparatus of an embodiment of the present invention may be applied.
As shown in fig. 8, a system architecture 800 may include terminal devices 801, 802, 803, a network 804, and a server 805. The network 804 serves as a medium for providing communication links between the terminal devices 801, 802, 803 and the server 805. The network 804 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
A user may interact with the server 805 through the network 804 using the terminal devices 801, 802, 803 to receive or send messages or the like. Various communication client applications such as data processing class applications, data download applications, search class applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only) may be installed on the terminal devices 801, 802, 803.
The terminal devices 801, 802, 803 may be a variety of electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.
The server 805 may be a server providing various services, such as a background management server (by way of example only) that supports data processing websites browsed by users using the terminal devices 801, 802, 803. The background management server may process received data such as a data processing request, for example by: determining the download type according to the data processing request; obtaining at least one data processing subtask according to the download type and sending the data processing subtask to a message distribution component; in response to a download machine request in an idle state, obtaining the data processing subtask from the message distribution component and distributing the data processing subtask to the download machine in the idle state, so as to download the corresponding download file using the download machine in the idle state; and, in response to completion of execution of all the data processing subtasks, processing the downloaded files corresponding to all the data processing subtasks according to a preset processing rule to generate a data processing result. The background management server may then feed back the processing result (for example, the data processing result, by way of example only) to the terminal device.
It should be noted that, the data processing method provided by the embodiment of the present invention is generally executed by the server 805, and accordingly, the data processing apparatus is generally disposed in the server 805.
It should be understood that the number of terminal devices, networks and servers in fig. 8 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 9, there is illustrated a schematic diagram of a computer system 900 suitable for use in implementing a terminal device or server in accordance with an embodiment of the present invention. The terminal device or server shown in fig. 9 is only an example, and should not impose any limitation on the functions and scope of use of the embodiments of the present invention.
As shown in fig. 9, the computer system 900 includes a Central Processing Unit (CPU) 901, which can execute various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 902 or a program loaded from a storage section 908 into a Random Access Memory (RAM) 903. In the RAM 903, various programs and data necessary for the operation of the system 900 are also stored. The CPU 901, ROM 902, and RAM 903 are connected to each other through a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904.
The following components are connected to the I/O interface 905: an input section 906 including a keyboard, a mouse, and the like; an output portion 907 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and a speaker; a storage portion 908 including a hard disk or the like; and a communication section 909 including a network interface card such as a LAN card, a modem, or the like. The communication section 909 performs communication processing via a network such as the internet. The drive 910 is also connected to the I/O interface 905 as needed. A removable medium 911 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is installed as needed on the drive 910 so that a computer program read out therefrom is installed into the storage section 908 as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such an embodiment, the computer program may be downloaded and installed from the network via the communication portion 909 and/or installed from the removable medium 911. The above-described functions defined in the system of the present invention are performed when the computer program is executed by a Central Processing Unit (CPU) 901.
The computer readable medium shown in the present invention may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules involved in the embodiments of the present invention may be implemented in software or in hardware. The described modules may also be provided in a processor, for example, as: the processor comprises a download type determining module, a data processing subtask generating module, a data processing subtask distributing module and a data processing result generating module. The names of these modules do not constitute a limitation on the module itself in some cases, and for example, the download type determining module may also be described as "a module for determining a download type from a data processing request in response to the data processing request".
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist alone without being assembled into the apparatus. The computer-readable medium carries one or more programs which, when executed by a device, cause the device to perform: in response to a data processing request, determining a download type according to the data processing request; obtaining at least one data processing subtask according to the download type, and sending the data processing subtask to a message distribution component; in response to a download machine request in an idle state, obtaining the data processing subtask from the message distribution component and distributing the data processing subtask to the download machine in the idle state, so as to download the corresponding download file using the download machine in the idle state; and, in response to completion of execution of all the data processing subtasks, processing the downloaded files corresponding to all the data processing subtasks according to a preset processing rule to generate a data processing result.
According to the technical solution of the embodiment of the present invention, a download type is determined according to a data processing request in response to the data processing request; at least one data processing subtask is obtained according to the download type and sent to a message distribution component; in response to a download machine request in an idle state, the data processing subtask is obtained from the message distribution component and distributed to the download machine in the idle state, so that the corresponding download file is downloaded using the download machine in the idle state; and, in response to completion of execution of all the data processing subtasks, the downloaded files corresponding to all the data processing subtasks are processed according to a preset processing rule to generate a data processing result. Download tasks can thus be executed with multiple processes and multiple threads, which improves data download speed and efficiency; downloading data with download machines in an idle state improves the utilization of the download machines in the application cluster; and the solution is readily extensible and is compatible with both relational database download and big data download.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives can occur depending upon design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (10)

1. A method of data processing, comprising:
responding to a data processing request, and determining a downloading type according to the data processing request;
according to the downloading type, obtaining at least one data processing subtask, and sending the data processing subtask to a message distribution component;
responding to a downloading machine request in an idle state to acquire the data processing subtasks from the message distribution component, and distributing the data processing subtasks to the downloading machine in the idle state so as to download corresponding downloading files by using the downloading machine in the idle state;
and responding to the completion of the execution of all the data processing subtasks, and processing the downloaded files corresponding to all the data processing subtasks according to a preset processing rule to generate a data processing result.
2. The method according to claim 1, wherein the processing rule includes a split-merge rule, and the processing the downloaded files corresponding to all the data processing subtasks according to the preset processing rule to generate a data processing result includes:
merging the downloaded files corresponding to all the data processing subtasks;
and splitting the merged result according to the data splitting parameters in the split-merge rule to generate the data processing result.
3. The method according to claim 1, wherein, after the processing of the downloaded files corresponding to all the data processing subtasks according to the preset processing rule to generate the data processing result, the method further includes:
and storing the data processing result into a database, and generating a storage address of the data processing result so as to acquire the data processing result through the storage address.
4. The method of claim 1, wherein in the event that the download type is a relational database data download, the downloading the corresponding download file using the idle state download machine comprises:
and downloading the download file according to download parameters, using the download machine in the idle state, by multi-threaded reading and single-threaded writing.
5. The method of claim 1, wherein, in the case where the download type is big data download, the downloading the corresponding download file using the download machine in the idle state comprises:
and calling a big data platform through the downloading machine in the idle state, assembling downloading parameters according to the task information of the data processing subtasks, and downloading the downloaded file based on the big data platform according to the downloading parameters.
6. The method according to claim 4 or 5, wherein the relational database data download includes a relational database statistics download or a relational database mass data download, and the obtaining at least one data processing subtask according to the download type includes:
generating a data processing subtask under the condition that the download type is the statistical information download of the relational database or the big data download;
and generating a plurality of data processing subtasks under the condition that the downloading type is the mass data downloading of the relational database.
7. The method according to claim 4 or 5, wherein the task information includes a system identifier and a page identifier, and the assembling download parameters according to the task information of the data processing subtasks includes:
inquiring a database table comprising the downloading parameters according to the system identifier and the page identifier;
and assembling the downloading parameters based on the database table, wherein the downloading parameters comprise a display field, a target data table and a data date.
8. A data processing apparatus, comprising:
the download type determining module is used for responding to the data processing request and determining the download type according to the data processing request;
the data processing subtask generating module is used for obtaining at least one data processing subtask according to the downloading type and sending the data processing subtask to the message distributing component;
a data processing subtask distribution module, configured to obtain the data processing subtask from the message distribution component in response to a download machine request in an idle state, and distribute the data processing subtask to a download machine in the idle state, so as to download a corresponding download file by using the download machine in the idle state;
and the data processing result generation module is used for responding to the completion of the execution of all the data processing subtasks, processing the downloaded files corresponding to all the data processing subtasks according to a preset processing rule and generating a data processing result.
9. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-7.
10. A computer readable medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the method according to any of claims 1-7.
CN202310587107.XA 2023-05-23 2023-05-23 Data processing method and device Pending CN116610447A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310587107.XA CN116610447A (en) 2023-05-23 2023-05-23 Data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310587107.XA CN116610447A (en) 2023-05-23 2023-05-23 Data processing method and device

Publications (1)

Publication Number Publication Date
CN116610447A true CN116610447A (en) 2023-08-18

Family

ID=87674196

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310587107.XA Pending CN116610447A (en) 2023-05-23 2023-05-23 Data processing method and device

Country Status (1)

Country Link
CN (1) CN116610447A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination