CN112241872A - Distributed data calculation analysis method, device, equipment and storage medium - Google Patents

Distributed data calculation analysis method, device, equipment and storage medium

Info

Publication number
CN112241872A
Authority
CN
China
Prior art keywords
analysis
data
calculation
analysis result
analyzed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011087359.9A
Other languages
Chinese (zh)
Inventor
黄培
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Zhongyan Network Technology Co ltd
Original Assignee
Shanghai Zhongyan Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Zhongyan Network Technology Co ltd
Priority to CN202011087359.9A
Publication of CN112241872A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q10/103 Workflow collaboration or project management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Physics & Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Economics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application discloses a distributed data calculation analysis method, device, equipment and storage medium. The method comprises: obtaining an item to be analyzed and generating a corresponding analysis model; acquiring data to be calculated and dividing it into a plurality of data blocks; establishing a calculation analysis task and using it to perform calculation analysis on each data block to obtain a corresponding temporary analysis result; and merging the temporary analysis results to obtain a final data analysis result. The method is built in Python and works at multiple levels: the items to be analyzed are segmented first, then the data of each item is segmented, and after each data block is analyzed the temporary analysis results are merged into the final data analysis result. Only a Python application environment needs to be deployed, so deployment is simple and data integration is easy. This addresses the technical problem in the related art that analysis and calculation on massive data mostly depend on a framework that is complex to build and deploy and does not integrate data easily.

Description

Distributed data calculation analysis method, device, equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a distributed data computation analysis method, apparatus, device, and storage medium.
Background
Distributed data processing is the processing of data using distributed computing techniques. With the rapid expansion of data volume, the data volume faced by internet companies has reached PB level, and traditional centralized data processing has become increasingly unable to meet market demand. A distributed network is composed of several intercommunicating computers, each having its own processor and storage device, and the huge computing tasks originally concentrated on a single node are load-balanced and assigned to the computers in the distributed network for parallel processing.
At present, several distributed data analysis frameworks exist on the market, such as Spark, Dash, Loky and Celery, but each has drawbacks in specific scenarios. For example, Spark is troublesome to deploy; Dash is powerful, but its distinctive underlying structure makes data integration inconvenient; Loky has little documentation, is relatively low-level, and has a high barrier to use; and Celery requires embedding code and has low performance.
In the prior art, analysis and calculation on massive data therefore mostly depend on a framework that is complex to build and deploy and does not lend itself to data integration.
No effective solution to this problem has been proposed so far.
Disclosure of Invention
The main purpose of the application is to provide a distributed data calculation analysis method, device, equipment and storage medium, so as to solve the problem in the related art that analysis and calculation on massive data mostly depend on a framework that is complex to build and deploy and does not lend itself to data integration.
In order to achieve the above object, in a first aspect, the present application provides a distributed data computation analysis method.
The method according to the application comprises the following steps:
acquiring an item to be analyzed, and generating a corresponding analysis model according to the type of the item to be analyzed;
acquiring data to be calculated of an item to be analyzed;
dividing the data to be calculated into a plurality of data blocks according to a preset division rule corresponding to the analysis model;
establishing a calculation analysis task, and performing calculation analysis on each data block by using the calculation analysis task to obtain a corresponding temporary analysis result;
and combining each temporary analysis result to obtain a final data analysis result.
In one possible implementation manner of the present application, the preset segmentation rule is: the data to be calculated is divided evenly into a plurality of data blocks, and the number of data blocks is the same as the number of idle processes of the system.
In one possible implementation of the present application, the number of computational analysis tasks is the same as the number of data blocks.
In a possible implementation manner of the present application, performing calculation analysis on each data block by using the calculation analysis task to obtain a corresponding temporary analysis result specifically includes: performing calculation analysis on the data blocks in parallel by using the respective calculation analysis tasks to obtain the corresponding temporary analysis results.
In a possible implementation manner of the present application, merging each temporary analysis result to obtain a final data analysis result includes:
merging the temporary analysis results according to a merge calculation framework corresponding to the analysis model, generating a final result file, and obtaining the final data analysis result.
In one possible implementation of the present application, the method supports stand-alone applications and multi-machine distributed applications.
In a second aspect, the present application further provides a distributed data computation analysis apparatus, including:
an acquisition module, configured to acquire an item to be analyzed and the data to be calculated of the item to be analyzed;
a processing module, configured to generate a corresponding analysis model according to the type of the item to be analyzed;
to divide the data to be calculated into a plurality of data blocks according to a preset division rule corresponding to the analysis model;
and to establish a calculation analysis task and perform calculation analysis on each data block by using the calculation analysis task to obtain a corresponding temporary analysis result;
and a merging output module, configured to merge each temporary analysis result to obtain a final data analysis result.
In one possible implementation manner of the present application, the processing module includes:
a parallel processing unit, configured to perform calculation analysis on the data blocks in parallel by using the respective calculation analysis tasks to obtain the corresponding temporary analysis results.
In one possible implementation manner of the present application, the merging output module is specifically configured to:
merge the temporary analysis results according to a merge calculation framework corresponding to the analysis model, generate a final result file, and obtain the final data analysis result.
In a third aspect, the present application further provides a distributed data computing and analyzing electronic device, where the electronic device includes:
one or more processors;
a memory; and
one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the processor to implement the distributed data computational analysis method of any of the first aspects.
In a fourth aspect, the present application further provides a computer-readable storage medium having a computer program stored thereon, the computer program being loaded by a processor to perform the steps of the distributed data computation analysis method of any one of the first aspect.
In the embodiments of the application, the distributed data calculation analysis method is built in Python and works at multiple levels: the items to be analyzed are segmented first, then the data to be calculated of each item is segmented, each data block is analyzed, and the temporary analysis results are merged to obtain the final data analysis result. This solves the technical problem in the related art that analysis and calculation on massive data mostly depend on a framework that is complex to build and deploy and does not lend itself to data integration.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, serve to provide a further understanding of the application and to enable other features, objects, and advantages of the application to be more apparent. The drawings and their description illustrate the embodiments of the invention and do not limit it. In the drawings:
FIG. 1 is a schematic flow chart diagram illustrating an embodiment of a distributed data computation analysis method according to an embodiment of the present application;
FIG. 2 is a schematic structural diagram of an embodiment of a distributed data computation analysis apparatus according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of an embodiment of a distributed data computation analysis electronic device according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only partial embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It should be understood that the data so used may be interchanged under appropriate circumstances such that embodiments of the application described herein may be used. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In this application, the terms "upper", "lower", "left", "right", "front", "rear", "top", "bottom", "inner", "outer", "middle", "vertical", "horizontal", "lateral", "longitudinal", and the like indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings. These terms are used primarily to better describe the present application and its embodiments, and are not used to limit the indicated devices, elements or components to a particular orientation or to be constructed and operated in a particular orientation.
Moreover, some of the above terms may be used to indicate other meanings besides the orientation or positional relationship, for example, the term "on" may also be used to indicate some kind of attachment or connection relationship in some cases. The specific meaning of these terms in this application will be understood by those of ordinary skill in the art as appropriate.
In addition, the term "plurality" means two or more.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
First, an embodiment of the present application provides a distributed data calculation and analysis method. The method is executed by a distributed data calculation and analysis device running on a processor, and includes: acquiring an item to be analyzed, and generating a corresponding analysis model according to the type of the item to be analyzed; acquiring data to be calculated of the item to be analyzed; dividing the data to be calculated into a plurality of data blocks according to a preset division rule corresponding to the analysis model; establishing a calculation analysis task, and performing calculation analysis on each data block by using the calculation analysis task to obtain a corresponding temporary analysis result; and merging the temporary analysis results to obtain a final data analysis result.
Referring to fig. 1, fig. 1 is a schematic flowchart illustrating an embodiment of a distributed data calculation and analysis method according to an embodiment of the present application, where the distributed data calculation and analysis method includes:
101. Acquiring the item to be analyzed, and generating a corresponding analysis model according to the type of the item to be analyzed.
In this embodiment, suppose the item to be analyzed is a survey project whose data to be calculated comprises 5 million records; the corresponding analysis model is then a model that computes the statistical result for each question of the survey.
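Step 101 can be pictured as a lookup from item type to analysis model. The following is a minimal sketch under assumptions that are not in the application: the registry `ANALYSIS_MODELS`, the decorator `register_model`, the type name `"survey"` and the record layout (one dict of question to answer per response) are all hypothetical names chosen for illustration.

```python
from collections import Counter
from typing import Callable, Dict, List

# Hypothetical registry: item type -> analysis model.
# An "analysis model" here is simply a function applied to one data block.
ANALYSIS_MODELS: Dict[str, Callable[[List[dict]], Counter]] = {}


def register_model(item_type: str):
    """Decorator that registers an analysis model under an item type."""
    def decorator(func):
        ANALYSIS_MODELS[item_type] = func
        return func
    return decorator


@register_model("survey")
def survey_model(block: List[dict]) -> Counter:
    """Count how often each (question, answer) pair occurs in a block."""
    counts: Counter = Counter()
    for response in block:
        for question, answer in response.items():
            counts[(question, answer)] += 1
    return counts


def generate_analysis_model(item_type: str) -> Callable[[List[dict]], Counter]:
    """Return the analysis model registered for the given item type."""
    try:
        return ANALYSIS_MODELS[item_type]
    except KeyError:
        raise ValueError(f"no analysis model registered for {item_type!r}") from None
```

Keeping the model a plain function of one block is a design choice of this sketch; it lets the same function later serve as the per-block calculation analysis task.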
102. Acquiring data to be calculated of the item to be analyzed.
In this embodiment, the data to be calculated of the item to be analyzed is stored centrally in a database and is obtained by retrieving it from the database.
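As an illustration of step 102, the sketch below reads the data of one item from a database using Python's standard `sqlite3` module. The application does not name a database product or a schema, so the table `responses`, its columns, and the helper `make_demo_database` are assumptions made only for this example.

```python
import sqlite3


def load_data_to_calculate(conn: sqlite3.Connection, item_id: int):
    """Return all stored (question, answer) rows for one item, in insertion order."""
    cursor = conn.execute(
        "SELECT question, answer FROM responses WHERE item_id = ? ORDER BY rowid",
        (item_id,),
    )
    return cursor.fetchall()


def make_demo_database() -> sqlite3.Connection:
    """Build a small in-memory database (hypothetical schema) for demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE responses (item_id INTEGER, question TEXT, answer TEXT)"
    )
    conn.executemany(
        "INSERT INTO responses VALUES (?, ?, ?)",
        [(1, "q1", "yes"), (1, "q1", "no"), (2, "q1", "yes")],
    )
    conn.commit()
    return conn
```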
103. Dividing the data to be calculated into a plurality of data blocks according to a preset division rule corresponding to the analysis model.
In this embodiment, the preset segmentation rule is: the data to be calculated is divided evenly into a plurality of data blocks, and the number of data blocks is the same as the number of idle processes of the system. For example, if the system has 500 idle processes, the 5 million records are divided evenly into 500 data blocks of 10,000 records each.
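The even split can be sketched in a few lines. The function name `split_evenly` is hypothetical, and the handling of a remainder (the first blocks receive one extra record) is an assumption, since the application only describes the case where the division is exact.

```python
def split_evenly(records, num_blocks):
    """Divide `records` into `num_blocks` contiguous blocks of near-equal size."""
    if num_blocks < 1:
        raise ValueError("num_blocks must be at least 1")
    base, extra = divmod(len(records), num_blocks)
    blocks = []
    start = 0
    for index in range(num_blocks):
        # Assumption: spread any remainder over the first `extra` blocks.
        size = base + (1 if index < extra else 0)
        blocks.append(records[start:start + size])
        start += size
    return blocks
```

With the figures from the example, 5,000,000 records and 500 blocks give `divmod(5_000_000, 500) == (10_000, 0)`, so every block holds exactly 10,000 records.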
104. Establishing a calculation analysis task, and performing calculation analysis on each data block by using the calculation analysis task to obtain a corresponding temporary analysis result.
In this embodiment, the number of calculation analysis tasks created equals the number of data blocks, and therefore also the number of idle system processes. Each data block corresponds to one calculation analysis task and to one idle system process, so the idle processes can invoke the tasks to analyze all data blocks in parallel and obtain the corresponding temporary analysis results; in this example, 500 temporary analysis results.
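One way to run one task per data block in parallel with only the Python standard library is `concurrent.futures`. This is a sketch, not the applicant's implementation: the task `count_answers` and the function `analyze_blocks` are hypothetical, and `os.cpu_count()` is used as a stand-in for "the number of idle processes of the system", which the application does not say how to measure.

```python
import os
from collections import Counter
from concurrent.futures import ProcessPoolExecutor


def count_answers(block):
    """Calculation analysis task: tally the values within one data block."""
    return Counter(block)


def analyze_blocks(blocks, task=count_answers, executor_cls=ProcessPoolExecutor):
    """Submit one task per block and return temporary results in block order."""
    if not blocks:
        return []
    # Assumption: cap the worker count at the CPU count of the machine.
    max_workers = min(len(blocks), os.cpu_count() or 1)
    with executor_cls(max_workers=max_workers) as pool:
        return list(pool.map(task, blocks))
```

When the default process pool is used, the task must be a module-level function so that it can be pickled, and on platforms that start workers with spawn the call should sit under an `if __name__ == "__main__":` guard. The `executor_cls` parameter is a convenience of this sketch that allows a thread pool to be substituted, for example in tests.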
105. Merging the temporary analysis results to obtain a final data analysis result.
In this embodiment, the 500 temporary analysis results are merged according to the merge calculation framework corresponding to the analysis model, producing a final result file that holds the final data analysis result. Afterwards, fast data analysis and data sharing only require loading the final data analysis result from the final result file.
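For a counting-style analysis model, merging amounts to summing the partial counts. The sketch below assumes that the temporary results are `collections.Counter` objects with string keys and that the final result file is JSON; the application specifies neither the merge calculation framework nor the file format, so both are illustrative choices.

```python
import json
from collections import Counter


def merge_temporary_results(temporary_results):
    """Sum per-block counts into one final analysis result."""
    final = Counter()
    for partial in temporary_results:
        final.update(partial)  # Counter.update adds counts rather than replacing them
    return final


def write_final_result(final, path):
    """Persist the final result as a JSON file (assumed format)."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(dict(final), handle, ensure_ascii=False, sort_keys=True)


def load_final_result(path):
    """Load a previously written final result without recomputing it."""
    with open(path, encoding="utf-8") as handle:
        return Counter(json.load(handle))
```

Later consumers call only `load_final_result`, which mirrors the point that sharing the analysis requires nothing more than reading the final result file.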
The method supports both single-machine and multi-machine distributed deployment and can perform real-time statistical analysis on billion-level item counts and billion-level data volumes. It is built in Python and works at multiple levels: the ten-million-level set of items is first segmented into individual items to be analyzed, the data to be calculated of each item is then divided evenly into a plurality of data blocks, and after each data block is analyzed the temporary analysis results are merged to obtain the final data analysis result of each item. Only a Python application environment needs to be deployed, so deployment is simple, data integration is easy, and no other configuration is required; concurrency can be increased simply by adding service processes, which makes the method easy to scale.
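The two levels of segmentation described above can be sketched as an outer loop over items and an inner split of each item's data. Everything here is illustrative: the function names are hypothetical, the per-block analysis is a simple count, and the blocks are processed sequentially to keep the example short, whereas the method as described analyzes them in parallel.

```python
from collections import Counter


def analyze_item(records, num_blocks):
    """Second level: split one item's data, analyze each block, merge the results."""
    # Block boundaries chosen so that block sizes differ by at most one record.
    bounds = [len(records) * i // num_blocks for i in range(num_blocks + 1)]
    blocks = [records[bounds[i]:bounds[i + 1]] for i in range(num_blocks)]
    temporary_results = [Counter(block) for block in blocks]
    final = Counter()
    for partial in temporary_results:
        final.update(partial)
    return final


def analyze_all_items(items, num_blocks):
    """First level: handle each item to be analyzed independently."""
    return {
        item_id: analyze_item(records, num_blocks)
        for item_id, records in items.items()
    }
```

Because items are independent of one another, the outer level is also a natural unit for distributing work across machines, although this sketch does not attempt that.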
In order to better implement the distributed data calculation and analysis method in the embodiment of the present application, on the basis of the distributed data calculation and analysis method, an embodiment of the present application further provides a distributed data calculation and analysis device, as shown in fig. 2, the distributed data calculation and analysis device 200 includes:
an obtaining module 201, configured to obtain an item to be analyzed and data to be calculated of the item to be analyzed;
a processing module 202, configured to generate a corresponding analysis model according to the type of the item to be analyzed;
to divide the data to be calculated into a plurality of data blocks according to a preset division rule corresponding to the analysis model;
and to establish a calculation analysis task and perform calculation analysis on each data block by using the calculation analysis task to obtain a corresponding temporary analysis result;
and a merging output module 203, configured to merge each temporary analysis result to obtain a final data analysis result.
In some embodiments of the present application, the processing module 202 comprises:
a parallel processing unit 2021, configured to perform calculation analysis on the data blocks in parallel by using the respective calculation analysis tasks to obtain the corresponding temporary analysis results.
In some embodiments of the present application, the merge output module 203 is specifically configured to:
merge the temporary analysis results according to a merge calculation framework corresponding to the analysis model, generate a final result file, and obtain the final data analysis result.
Specifically, for a specific process of implementing the functions of each module and unit in the device in the embodiment of the present application, reference may be made to the description of the distributed data calculation and analysis method in the corresponding embodiment of fig. 1, which is not described herein again in detail.
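The module structure of device 200 maps naturally onto a few small classes. The sketch below is one possible arrangement under stated assumptions: the class names, the in-memory `store` standing in for the database, and the sequential loop inside `ProcessingModule` (where a parallel processing unit would dispatch tasks to worker processes) are all hypothetical.

```python
from collections import Counter


class AcquisitionModule:
    """Acquires an item to be analyzed and its data to be calculated."""

    def __init__(self, store):
        # Hypothetical in-memory store: item_id -> (item_type, records).
        self._store = store

    def acquire(self, item_id):
        return self._store[item_id]


class ProcessingModule:
    """Selects the analysis model, splits the data, and analyzes each block."""

    def __init__(self, models, num_blocks):
        self._models = models          # item type -> function applied to one block
        self._num_blocks = num_blocks

    def process(self, item_type, records):
        model = self._models[item_type]
        n = self._num_blocks
        bounds = [len(records) * i // n for i in range(n + 1)]
        blocks = [records[bounds[i]:bounds[i + 1]] for i in range(n)]
        # Sequential for brevity; a parallel unit would run these concurrently.
        return [model(block) for block in blocks]


class MergeOutputModule:
    """Merges temporary analysis results into the final result."""

    def merge(self, temporary_results):
        final = Counter()
        for partial in temporary_results:
            final.update(partial)
        return final


class AnalysisDevice:
    """Wires the three modules together in the order of the method steps."""

    def __init__(self, acquisition, processing, merge_output):
        self.acquisition = acquisition
        self.processing = processing
        self.merge_output = merge_output

    def run(self, item_id):
        item_type, records = self.acquisition.acquire(item_id)
        temporary_results = self.processing.process(item_type, records)
        return self.merge_output.merge(temporary_results)
```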
The embodiment of the present application further provides a distributed data calculation and analysis electronic device, which integrates any one of the distributed data calculation and analysis apparatuses provided in the embodiment of the present application, and the electronic device includes:
one or more processors;
a memory; and
one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the processor to perform the steps of the distributed data calculation analysis method in any of the above embodiments of the distributed data calculation analysis method.
The distributed data calculation and analysis electronic device provided by the embodiment of the application integrates any one of the distributed data calculation and analysis devices provided by the embodiment of the application. As shown in fig. 3, it shows a schematic structural diagram of an electronic device according to an embodiment of the present application, specifically:
the electronic device may include components such as a processor 301 of one or more processing cores, memory 302 of one or more computer-readable storage media, a power supply 303, and an input unit 304. Those skilled in the art will appreciate that the electronic device configuration shown in fig. 3 does not constitute a limitation of the electronic device and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components. Wherein:
The processor 301 is the control center of the electronic device. It connects the various parts of the electronic device through various interfaces and lines, and performs the functions of the electronic device and processes data by running or executing software programs and/or modules stored in the memory 302 and calling data stored in the memory 302, thereby monitoring the electronic device as a whole. Optionally, the processor 301 may include one or more processing cores; the processor 301 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, etc. The general-purpose processor may be a microprocessor or any conventional processor. Preferably, the processor 301 may integrate an application processor, which mainly handles the operating system, user interfaces, application programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor need not be integrated into the processor 301.
The memory 302 may be used to store software programs and modules, and the processor 301 executes various functional applications and data processing by operating the software programs and modules stored in the memory 302. The memory 302 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data created according to the use of the server, and the like. Further, the memory 302 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, the memory 302 may also include a memory controller to provide the processor 301 with access to the memory 302.
The electronic device further comprises a power supply 303 for supplying power to each component, and preferably, the power supply 303 can be logically connected with the processor 301 through a power management system, so that functions of charging, discharging, power consumption management and the like can be managed through the power management system. The power supply 303 may also include any component of one or more dc or ac power sources, recharging systems, power failure detection circuitry, power converters or inverters, power status indicators, and the like.
The electronic device may further include an input unit 304, and the input unit 304 may be used to receive input numeric or character information and generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control.
Although not shown, the server may further include a display unit and the like, which will not be described in detail herein. Specifically, in this embodiment, the processor 301 in the electronic device loads the executable file corresponding to the process of one or more application programs into the memory 302 according to the following instructions, and the processor 301 runs the application programs stored in the memory 302, thereby implementing various functions as follows:
acquiring an item to be analyzed, and generating a corresponding analysis model according to the type of the item to be analyzed;
acquiring data to be calculated of an item to be analyzed;
dividing the data to be calculated into a plurality of data blocks according to a preset division rule corresponding to the analysis model;
establishing a calculation analysis task, and performing calculation analysis on each data block by using the calculation analysis task to obtain a corresponding temporary analysis result;
and combining each temporary analysis result to obtain a final data analysis result.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the distributed data calculation and analysis apparatus, the electronic device and the corresponding units thereof described above may refer to the description of the distributed data calculation and analysis method in the embodiment corresponding to fig. 1, and are not described herein again in detail.
It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be performed by instructions or by related hardware controlled by the instructions, which may be stored in a computer readable storage medium and loaded and executed by the processor 301.
To this end, an embodiment of the present application provides a computer-readable storage medium, which may include: Read-Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like. A computer program is stored on the storage medium and is loaded by the processor to execute the steps of any one of the distributed data calculation and analysis methods provided by the embodiments of the present application. For example, the computer program may be loaded by a processor to perform the steps of:
acquiring an item to be analyzed, and generating a corresponding analysis model according to the type of the item to be analyzed;
acquiring data to be calculated of an item to be analyzed;
dividing the data to be calculated into a plurality of data blocks according to a preset division rule corresponding to the analysis model;
establishing a calculation analysis task, and performing calculation analysis on each data block by using the calculation analysis task to obtain a corresponding temporary analysis result;
and combining each temporary analysis result to obtain a final data analysis result.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (10)

1. A distributed data calculation analysis method, characterized in that the method is built in the Python language and comprises the following steps:
acquiring an item to be analyzed, and generating a corresponding analysis model according to the type of the item to be analyzed;
acquiring data to be calculated of the item to be analyzed;
dividing the data to be calculated into a plurality of data blocks according to a preset division rule corresponding to the analysis model;
establishing a calculation analysis task, and performing calculation analysis on each data block by using the calculation analysis task to obtain a corresponding temporary analysis result;
and combining each temporary analysis result to obtain a final data analysis result.
2. The method of claim 1, wherein the preset segmentation rule is: dividing the data to be calculated evenly into a plurality of data blocks, wherein the number of the data blocks is the same as the number of idle processes of the system.
3. The method of claim 1, wherein the number of computational analysis tasks is the same as the number of data blocks.
4. The method of claim 3, wherein performing calculation analysis on each of the data blocks by using the calculation analysis task to obtain a corresponding temporary analysis result specifically comprises: performing calculation analysis on the data blocks in parallel by using the respective calculation analysis tasks to obtain the corresponding temporary analysis results.
5. The method of claim 1, wherein said merging each of said temporary analysis results to obtain a final data analysis result comprises:
merging the temporary analysis results according to a merge calculation framework corresponding to the analysis model, generating a final result file, and obtaining the final data analysis result.
6. The method of claim 1, wherein the method supports stand-alone applications and multi-machine distributed applications.
7. A distributed data computation analysis apparatus, comprising:
an acquisition module, configured to acquire an item to be analyzed and data to be calculated of the item to be analyzed;
a processing module, configured to generate a corresponding analysis model according to the type of the item to be analyzed;
divide the data to be calculated into a plurality of data blocks according to a preset division rule corresponding to the analysis model;
and establish a calculation analysis task, and perform calculation analysis on each data block by using the calculation analysis task to obtain a corresponding temporary analysis result;
and a merging output module, configured to merge each temporary analysis result to obtain a final data analysis result.
8. The apparatus of claim 7, wherein the processing module comprises:
a parallel processing unit, configured to perform calculation analysis on the data blocks in parallel by using the calculation analysis tasks to obtain the corresponding temporary analysis results.
9. A distributed data computing analysis electronic device, comprising:
one or more processors;
a memory; and
one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors to implement the distributed data calculation analysis method of any one of claims 1 to 6.
10. A computer-readable storage medium having stored thereon a computer program which is loaded by a processor to perform the steps of the distributed data calculation analysis method of any one of claims 1 to 6.
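The link between the item type, its analysis model, and the model-specific merging described in claims 1 and 5 can be sketched as a simple lookup table. This is an illustrative assumption, not the claimed implementation: `AnalysisModel`, `MODELS`, the two example item types, and all function names are hypothetical, and the blocks are processed serially here for brevity.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class AnalysisModel:
    """Hypothetical bundle of the per-model analysis task and merge function."""
    analyze: Callable  # data block -> temporary analysis result
    merge: Callable    # list of temporary results -> final analysis result


def count_words(block):
    """Count word occurrences in a block of text lines."""
    return Counter(word for line in block for word in line.split())


def merge_counts(partials):
    """Add up the per-block word counts."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return dict(total)


def block_max(block):
    """Return the largest value in a non-empty block of numbers."""
    return max(block)


# Assumed mapping from item type to analysis model.
MODELS = {
    "word_frequency": AnalysisModel(analyze=count_words, merge=merge_counts),
    "maximum": AnalysisModel(analyze=block_max, merge=max),
}


def get_model(item_type):
    """Select the analysis model that corresponds to the item type."""
    try:
        return MODELS[item_type]
    except KeyError:
        raise ValueError(f"no analysis model for item type {item_type!r}") from None


def analyze_item(item_type, blocks):
    """Analyze each block with the model's task, then merge with the model's merge."""
    model = get_model(item_type)
    partials = [model.analyze(block) for block in blocks]
    return model.merge(partials)
```

Keeping the merge function next to the analysis task in one model object means a new item type only needs a new table entry, and the splitting and scheduling code stays unchanged.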

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011087359.9A CN112241872A (en) 2020-10-12 2020-10-12 Distributed data calculation analysis method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112241872A true CN112241872A (en) 2021-01-19

Family

ID=74168779

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011087359.9A Pending CN112241872A (en) 2020-10-12 2020-10-12 Distributed data calculation analysis method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112241872A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109815249A (en) * 2019-02-22 2019-05-28 苏州华必讯信息科技有限公司 The fast parallel extracting method of the large data files mapped based on memory
CN110489242A (en) * 2019-09-24 2019-11-22 深圳前海微众银行股份有限公司 Distributed data calculation method, device, terminal device and storage medium
CN110619464A (en) * 2019-09-12 2019-12-27 阿里巴巴集团控股有限公司 Data analysis method and device

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113704340A (en) * 2021-08-30 2021-11-26 远景智能国际私人投资有限公司 Data processing method, device, server and storage medium
CN113704340B (en) * 2021-08-30 2023-07-21 远景智能国际私人投资有限公司 Data processing method, device, server and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210119