CN111124685A - Big data processing method and device, electronic equipment and storage medium - Google Patents

Big data processing method and device, electronic equipment and storage medium

Info

Publication number
CN111124685A
CN111124685A
Authority
CN
China
Prior art keywords
data
sub
main process
processing
processes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911368393.0A
Other languages
Chinese (zh)
Inventor
徐成海
孙丰龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Digital China Health Technologies Co ltd
Original Assignee
Digital China Health Technologies Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Digital China Health Technologies Co ltd
Priority to CN201911368393.0A
Publication of CN111124685A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/54 Interprogram communication
    • G06F 9/546 Message passing systems or structures, e.g. queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a big data processing method and apparatus, an electronic device, and a storage medium, relating to the technical field of big data processing. The big data processing method is Python-based and comprises: obtaining a data processing task for a hierarchical data format (HDF) file; creating a corresponding main process and a plurality of sub-processes according to the data processing task; scheduling the plurality of sub-processes through the main process to respectively obtain sample data of the HDF data for concurrent processing; and outputting the processed data to a network training model.

Description

Big data processing method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a big data processing method and apparatus, an electronic device, and a storage medium.
Background
At present, when Python is used to train and run prediction with a deep learning model, data must be read from the file system into memory, processed online, and finally fed into the deep learning framework. Data access concerns the storage format of the data on the hard disk, i.e., the interface for accessing the data; online processing means that after the data is fetched into memory, some processing is performed before the data is sent to the deep learning network.
In the prior art, all training data is written into an HDF5 file, and multithreading is usually started at program run time to dynamically read data from the HDF5 file, process it, and then feed it to the deep learning network model. The advantage of this method is that HDF5 technology allows massive data to be read efficiently, the data can be compressed, and the operation is simple.
However, in the prior art, HDF5 does not support multi-process reading and writing but only multi-thread reading and writing, and when a complex data processing step is encountered, Python's global interpreter lock (GIL) becomes the bottleneck of the whole pipeline, resulting in the technical problem of low big-data processing efficiency.
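The GIL bottleneck described above can be illustrated with a minimal sketch (not from the patent): the same pure-Python CPU-bound function is run in several threads and in several processes. The worker count and the `fork` start method are assumptions for illustration; on a multi-core POSIX machine the process pool typically finishes far sooner than the threads, while both produce identical results.

```python
import multiprocessing as mp
import threading
import time

def busy_sum(n):
    # Pure-Python CPU-bound loop: under the GIL only one thread at a time
    # can execute this bytecode, so threads give no speed-up here.
    total = 0
    for i in range(n):
        total += i
    return total

N = 200_000
WORKERS = 4
EXPECTED = N * (N - 1) // 2

def run_in_threads():
    results = []
    def work():
        results.append(busy_sum(N))   # list.append is atomic under the GIL
    threads = [threading.Thread(target=work) for _ in range(WORKERS)]
    t0 = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - t0, results

def run_in_processes():
    # "fork" is assumed (POSIX); each child gets its own interpreter and GIL.
    ctx = mp.get_context("fork")
    t0 = time.perf_counter()
    with ctx.Pool(WORKERS) as pool:
        results = pool.map(busy_sum, [N] * WORKERS)
    return time.perf_counter() - t0, results

thread_time, thread_results = run_in_threads()
proc_time, proc_results = run_in_processes()
```

The timing difference (not asserted here, as it is machine-dependent) is what motivates the multi-process design of the method below.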
Disclosure of Invention
The application provides a big data processing method, a big data processing apparatus, an electronic device, and a storage medium, aiming to solve the technical problem in the prior art that HDF5 does not support multi-process reading and writing but only multi-thread reading and writing, so that when a complex data processing step is encountered, Python's global interpreter lock makes the whole pipeline a bottleneck and big data processing efficiency is low.
In order to achieve the above object, in a first aspect, an embodiment of the present application provides a big data processing method, which includes, based on python:
acquiring a data processing task of the HDF data of the hierarchical data file;
creating a corresponding main process and a plurality of sub processes according to the data processing task;
and scheduling a plurality of subprocesses through the main process to respectively acquire sample data of the HDF data for concurrent processing, and outputting the processed data to a network training model.
Optionally, the big data processing method further includes:
before the main process schedules the plurality of sub-processes to respectively acquire sample data of the HDF data for concurrent processing, generating a communication queue for communication between the main process and the plurality of sub-processes, wherein the communication queue comprises: a data request queue and a data output queue.
Optionally, before the main process schedules the plurality of sub-processes to respectively obtain sample data of the HDF data for concurrent processing and outputs the processed data to the network training model, the method comprises:
and storing the indexes of the sample data to be processed of each sub-process to a data request queue through the main process.
Optionally, scheduling, by the main process, the plurality of sub-processes to respectively obtain sample data of the HDF data for concurrent processing and outputting the processed data to the network training model comprises:
scheduling each subprocess to read the index from the data request queue through the main process;
and respectively reading sample data to be processed of the HDF data of the hierarchical data file according to the indexes through the sub-process, and simultaneously processing the sample data to be processed according to a preset data processing rule.
In a second aspect, the present application also provides a big data processing apparatus, which includes, based on python:
the acquisition module is used for acquiring a data processing task of the HDF data of the hierarchical data file;
the process module is used for creating a corresponding main process and a plurality of sub processes according to the data processing task;
and the concurrent processing module is used for scheduling the plurality of sub-processes through the main process to respectively acquire sample data of the HDF data for concurrent processing, and outputting the processed data to the network training model.
Optionally, the apparatus further comprises:
the communication module is used for generating a communication queue for communication between the main process and the plurality of sub-processes, wherein the communication queue comprises: a data request queue and a data output queue.
Optionally, the apparatus further comprises:
and the storage module is used for storing, through the main process, the index of the sample data to be processed of each sub-process to the data request queue before the plurality of sub-processes are scheduled through the main process to respectively obtain sample data of the HDF data for concurrent processing and the processed data is output to the network training model.
Optionally, the concurrent processing module includes:
the reading module is used for scheduling each subprocess through the main process to read the index from the data request queue and reading sample data from the HDF data of the hierarchical data file;
and the processing module is used for respectively reading the sample data to be processed of the HDF data of the hierarchical data file according to the indexes through each sub-process and simultaneously processing the sample data to be processed according to the preset data processing rule.
In a third aspect, the present application further provides an electronic device, which includes a memory and a processor, where the memory stores a computer program, and when the computer program in the memory is read and executed by the processor, the electronic device implements the method in the first aspect.
In a fourth aspect, the present application also proposes a computer-readable storage medium, on which a computer program is stored, which, when read and executed by a processor, implements the method of the first aspect.
Compared with the prior art, the method has the following beneficial effects:
the embodiments of the application provide a big data processing method and apparatus in which the main process schedules a plurality of sub-processes to respectively obtain sample data of the HDF data for concurrent processing, and the processed data is output to a network training model. Because the plurality of sub-processes run simultaneously and each process has its own lock, the processes do not interfere with each other at run time and do not need to compete for the global interpreter lock of the main process. This architecture can maximally utilize the resources of a multi-core CPU and improves the speed of data access and processing.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the application. The objectives and other advantages of the application may be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
In order to more clearly explain the technical solutions of the present application, the drawings needed for the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and that those skilled in the art can also derive other related drawings from these drawings without inventive effort.
Fig. 1 is a schematic flowchart illustrating a big data processing method according to an embodiment of the present application;
fig. 2 is a schematic diagram illustrating an architecture of a big data processing method according to an embodiment of the present application;
FIG. 3 is a schematic flow chart illustrating another big data processing method according to an embodiment of the present application;
FIG. 4 is a functional block diagram of a big data processing apparatus according to an embodiment of the present disclosure;
FIG. 5 is a block diagram of another embodiment of a big data processing apparatus;
FIG. 6 is a block diagram of another embodiment of a big data processing apparatus;
fig. 7 is a functional block diagram of a concurrency processing module according to an embodiment of the present application;
fig. 8 is a functional module schematic diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Fig. 1 is a schematic flow diagram of a big data processing method according to an embodiment of the present application, and fig. 2 is a schematic diagram of a big data processing architecture according to an embodiment of the present application, where as shown in fig. 1 and fig. 2, the big data processing method may be applied to a computer, an intelligent terminal, a server, and other devices, and the method according to the present embodiment is based on python, and includes:
and step S101, acquiring a data processing task of the HDF data of the hierarchical data file.
Specifically, when a deep learning model is trained and used for prediction with Python, data needs to be read from the file system into memory for online processing. Because HDF data can be compressed, the storage space of the file system is reduced; for example, medical image samples, geographic image samples, and life image samples awaiting processing can be stored as HDF5 data to save storage space. A data processing task for the HDF5 data is then obtained, for example resampling and label conversion of the image samples. It should be noted that, although HDF5 data is described in detail in this embodiment, the method is not limited thereto.
And step S102, creating a corresponding main process and a plurality of sub processes according to the data processing task.
Specifically, a corresponding main process and a plurality of sub processes can be created according to the configuration of the user and the data processing task.
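As a rough illustration of this step (a sketch, not the patent's code), the main program below creates several sub-processes with Python's `multiprocessing` module. The `fork` context and the worker count are assumptions, and the worker body is a placeholder for the real per-sub-process data handling.

```python
import multiprocessing as mp
import os

def worker(task_id):
    # Placeholder for the per-sub-process work of the method; each child
    # runs in its own interpreter with its own GIL.
    _ = task_id  # real code would fetch and process sample data here

ctx = mp.get_context("fork")            # assumption: POSIX host
n_sub = min(4, os.cpu_count() or 1)     # e.g. one sub-process per core
subs = [ctx.Process(target=worker, args=(i,)) for i in range(n_sub)]
for p in subs:
    p.start()
for p in subs:
    p.join()
exit_codes = [p.exitcode for p in subs]
```

In the method itself, the number of sub-processes would come from the user's configuration and the data processing task rather than a hard-coded constant.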
And step S103, scheduling a plurality of sub-processes through the main process to respectively acquire sample data of the HDF data for concurrent processing, and outputting the processed data to a network training model.
Specifically, although HDF5 can read and write the same file from multiple threads, Python threads cannot fully exploit the performance of a multi-core CPU because of the global interpreter lock. A multi-process mode is therefore adopted: the main process schedules a plurality of sub-processes to respectively acquire sample data of the HDF5 data. Because the N sub-processes run simultaneously and each process has its own lock, the processes do not interfere with each other at run time and do not need to compete for the global interpreter lock of the main process. The processed data is output to the network training model. This architecture can maximally utilize the resources of a multi-core CPU: there is no global-interpreter-lock limit between processes, and the processes do not need to compete for resources. For example, a server with 10 cores can start 10 processes, each running independently, so that CPU utilization reaches 100% and a true parallel effect is achieved.
The embodiment of the application provides a big data processing method which comprises: obtaining a data processing task of the HDF data of the hierarchical data file; creating a corresponding main process and a plurality of sub-processes according to the data processing task; scheduling the plurality of sub-processes through the main process to respectively obtain sample data of the HDF data for concurrent processing; and outputting the processed data to a network training model while the plurality of sub-processes run simultaneously.
Optionally, before the main process schedules the multiple sub-processes to respectively obtain sample data of the HDF data and perform concurrent processing, the method further includes:
generating a communication queue for communication between a main process and a plurality of sub-processes, wherein the communication queue comprises: a data request queue and a data output queue.
Specifically, after the main program is started, communication between processes is required, so it first generates the queues used for communication between the two kinds of processes: a data request queue and a data output queue. The data request queue stores the indexes of the data that the main process requests from the sub-processes: the main process puts the indexes of the required data into the queue, and each sub-process reads from the queue the indexes of the data it needs to process. Running in the space of the main process, the program on the one hand writes the indexes of the required data into the data request queue, and on the other hand takes the processed data out of the data output queue and sends it to the network model for training or inference.
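A minimal sketch of this two-queue scheme, under the assumptions of a POSIX host, three workers, and placeholder processing: the main process writes sample indices into a data request queue, and the sub-processes write results into a data output queue.

```python
import multiprocessing as mp

def worker(request_q, output_q):
    # Sub-process: read an index from the request queue, "process" the
    # sample it denotes, and put the result into the output queue.
    while True:
        idx = request_q.get()
        if idx is None:                  # sentinel from the main process
            break
        output_q.put((idx, idx * 10))    # placeholder processing

ctx = mp.get_context("fork")             # assumption: POSIX host
request_q = ctx.Queue()                  # main -> sub: sample indexes
output_q = ctx.Queue()                   # sub -> main: processed samples
procs = [ctx.Process(target=worker, args=(request_q, output_q))
         for _ in range(3)]
for p in procs:
    p.start()

indices = list(range(9))
for idx in indices:                      # main process writes indexes
    request_q.put(idx)
for _ in procs:
    request_q.put(None)                  # one sentinel per sub-process

collected = dict(output_q.get() for _ in indices)
for p in procs:
    p.join()
```

The `None` sentinel is one conventional way to shut the workers down; the patent text does not specify a termination mechanism, so this detail is an assumption.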
Optionally, before the main process schedules the plurality of sub-processes to respectively obtain sample data of the HDF data for concurrent processing and outputs the processed data to the network training model, the method comprises:
and storing the indexes of the sample data to be processed of each sub-process to a data request queue through the main process.
Specifically, in order to match the data processing task with each sub-process, the index of the sample data to be processed by each sub-process is stored in the data request queue through the main process. The index indicates the position of a sample in the HDF data, and each sub-process can read the corresponding sample data to be processed through this position.
Optionally, fig. 3 is a schematic flow diagram of another big data processing method provided in an embodiment of the present application. As shown in fig. 3, scheduling, by the main process, the plurality of sub-processes to respectively obtain sample data of the HDF data for concurrent processing and outputting the processed data to the network training model comprises:
step S201, each subprocess is scheduled by the main process to read indexes from the data request queue;
step S202, respectively reading sample data to be processed of the HDF data of the hierarchical data file according to the indexes through each sub-process, and simultaneously processing the sample data to be processed according to a preset data processing rule.
Specifically, each sub-process first opens the HDF5 file and then loops to read indexes from the data request queue. After reading an index, the sub-process first obtains the raw data from the HDF5 file and then passes it sequentially through a user-defined data processing pipeline. After all processing is completed, the sub-process puts the data into the data output queue, from which the processed data is output to the network training model, and then reads the next data index, repeating the loop.
It should be noted that, under the Linux operating system, a child process inherits the file descriptor table of its parent process, so the HDF5 file must not be opened in the parent process; otherwise the child processes cannot read the HDF5 file correctly. When a child process is started, only the HDF5 file path is passed in, and each process then opens the HDF5 file itself.
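The worker loop and the path-passing caveat can be sketched as follows. To keep the sketch self-contained, a pickled dict stands in for the HDF5 file; real code would open an HDF5 file with h5py or PyTables inside each sub-process. Passing only the path, looping over indexes, and the shutdown sentinel follow the text; the processing step is a placeholder.

```python
import multiprocessing as mp
import os
import pickle
import tempfile

# A pickled dict stands in for the HDF5 file so the sketch runs without
# h5py/PyTables; it maps a sample index to a small "array".
samples = {i: [i] * 4 for i in range(6)}
fd, path = tempfile.mkstemp(suffix=".pkl")
with os.fdopen(fd, "wb") as f:
    pickle.dump(samples, f)

def worker(data_path, request_q, output_q):
    # Per the Linux caveat in the text: only the *path* is passed in, and
    # each sub-process opens the file itself (never an inherited handle).
    with open(data_path, "rb") as f:
        data = pickle.load(f)
    while True:
        idx = request_q.get()
        if idx is None:                 # sentinel: shut down
            break
        raw = data[idx]                 # read the sample by index
        processed = sum(raw)            # placeholder processing pipeline
        output_q.put((idx, processed))

ctx = mp.get_context("fork")            # assumption: POSIX host
request_q, output_q = ctx.Queue(), ctx.Queue()
procs = [ctx.Process(target=worker, args=(path, request_q, output_q))
         for _ in range(2)]
for p in procs:
    p.start()
for idx in samples:                     # main process enqueues indexes
    request_q.put(idx)
for _ in procs:
    request_q.put(None)                 # one sentinel per worker
out = dict(output_q.get() for _ in samples)
for p in procs:
    p.join()
os.remove(path)
```

In the real method, the main process would feed `out` (the data output queue's contents) to the network model for training or inference.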
Application examples
For the above method, the present embodiment will be described by test examples as follows:
the testing hardware configuration may be an eosin X780-G30 server, CPU 40 core Intel (R) Xeon (R) Silver4114CPU @2.20GHz, 128G DDR4 memory, hard disk 16T SATA3(6Gbps), GPU 8 block NVIDIA PCI-ETesla V100.
The test environment was Ubuntu 16.04 with Python 3.6, the HDF5 read/write library tables (PyTables) 2.4, and numpy 1.16.
The test data consisted of kidney-cancer CT (computed tomography) slices and labels from 210 cases. The CT volumes and labels are three-dimensional numpy arrays (numpy.array), generally of size 512x512x100. A compression algorithm is used in the HDF5 file to reduce file-system storage; the final HDF5 file size is 18 GB.
The test procedure was to read all 210 samples of the HDF5 file in random order and then process them on the CPU.
CPU non-intensive processing: resample to an array of size (128, 128, 128), then convert the multi-valued label to a binary label (i.e., one additional pass).
CPU-intensive processing: first resample to an array of size (128, 128, 128), then traverse all array elements, and finally convert to a binary label.
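The two processing steps can be sketched in pure Python on a tiny volume (the actual test used numpy arrays of size 512x512x100 resampled to 128x128x128). `resample_nn` and `binarize` are illustrative names, and nearest-neighbour resampling is an assumption about the resampling method used.

```python
def resample_nn(vol, new_shape):
    # Nearest-neighbour resampling of a 3-D volume stored as nested lists;
    # a stand-in for the numpy/scipy resampling the test would use.
    dz, dy, dx = len(vol), len(vol[0]), len(vol[0][0])
    nz, ny, nx = new_shape
    return [[[vol[z * dz // nz][y * dy // ny][x * dx // nx]
              for x in range(nx)]
             for y in range(ny)]
            for z in range(nz)]

def binarize(label_vol):
    # Multi-valued label -> binary label: any non-zero class becomes 1.
    return [[[1 if v != 0 else 0 for v in row] for row in plane]
            for plane in label_vol]

label = [[[0, 1], [2, 0]],
         [[3, 0], [0, 4]]]          # tiny 2x2x2 multi-class label volume
small = resample_nn(label, (1, 2, 2))
binary = binarize(label)
```

The CPU-intensive variant corresponds to adding a full element-by-element traversal between the two steps, which is exactly the kind of pure-Python work that the GIL serializes across threads.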
In addition to the above tests, loading data from HDF5 was compared with loading sample files in npy format directly with the numpy library.
Both the multi-process and the multi-thread configurations use queues to reduce latency; the number of processes or threads is 10 and the queue size is 20.
The tests are divided into several groups of cases according to the actual configuration (thread vs. process, CPU intensive vs. non-intensive, etc.). Table 1 shows the test results of the big data processing method provided by this embodiment:
TABLE 1
No.  Processing mode               CPU non-intensive   CPU intensive
1    Single process (HDF5)         222 s               1610 s
2    Multi-thread (N=10, HDF5)     342 s               1584 s
3    Multi-process (N=10, HDF5)    56 s                221 s
4    Multi-process (N=10, npy)     889 s               ——
As can be seen from Table 1 above, multithreading offers no advantage in a Python environment; loading data with HDF5 is faster than the other methods; and combining multiple processes with HDF5 greatly improves processing speed.
Fig. 4 is a functional module schematic diagram of a big data processing apparatus according to an embodiment of the present application, please refer to fig. 4, it should be noted that the basic principle and the generated technical effect of the big data processing apparatus 300 according to the embodiment are the same as those of the corresponding method embodiment, and for brief description, the corresponding contents in the method embodiment may be referred to for the parts not mentioned in the embodiment. The big data processing apparatus 300 includes:
the acquisition module 310 is configured to acquire a data processing task of HDF data of the hierarchical data file;
a process module 320, configured to create a corresponding main process and multiple sub processes according to the data processing task;
and the concurrent processing module 330 is configured to schedule a plurality of sub-processes through the main process to respectively obtain sample data of the HDF data for concurrent processing, and output the processed data to the network training model.
The above-mentioned apparatus is used for executing the method provided by the foregoing embodiment, and the implementation principle and technical effect are similar, which are not described herein again.
Optionally, fig. 5 is a schematic functional module diagram of another big data processing apparatus according to an embodiment of the present application, please refer to fig. 5, in which the big data processing apparatus 300 further includes:
a communication module 340, configured to generate a communication queue for a main process and a plurality of sub-processes to communicate with each other, where the communication queue includes: a data request queue and a data output queue.
The above-mentioned apparatus is used for executing the method provided by the foregoing embodiment, and the implementation principle and technical effect are similar, which are not described herein again.
Optionally, fig. 6 is a schematic functional module diagram of another big data processing apparatus according to an embodiment of the present application, please refer to fig. 6, where the big data processing apparatus 300 further includes:
and the storage module 350 is configured to schedule the multiple sub-processes through the main process to respectively obtain sample data of the HDF data for concurrent processing, and store an index of the sample data to be processed of each sub-process to the data request queue through the main process before outputting the processed data to the network training model.
The above-mentioned apparatus is used for executing the method provided by the foregoing embodiment, and the implementation principle and technical effect are similar, which are not described herein again.
Optionally, fig. 7 is a functional module schematic diagram of a concurrency processing module according to an embodiment of the present application, please refer to fig. 7, where the concurrency processing module 330 includes:
the reading module 331, configured to schedule each sub-process to read an index from the data request queue through the main process;
and the processing module 332 is configured to read, by each sub-process, sample data to be processed of the HDF data of the hierarchical data file according to the index, and process the sample data to be processed simultaneously according to a preset data processing rule.
The above-mentioned apparatus is used for executing the method provided by the foregoing embodiment, and the implementation principle and technical effect are similar, which are not described herein again.
These modules may be one or more integrated circuits configured to implement the above methods, for example: one or more application-specific integrated circuits (ASICs), one or more digital signal processors (DSPs), or one or more field-programmable gate arrays (FPGAs). For another example, when one of the above modules is implemented by a processing element scheduling program code, the processing element may be a general-purpose processor, such as a central processing unit (CPU) or another processor capable of calling program code. These modules may also be integrated together and implemented in the form of a system-on-a-chip (SoC).
Fig. 8 is a functional module diagram of an electronic device according to an embodiment of the present application, please refer to fig. 8, the electronic device may include a processor 1001 and a memory 1002, and the processor 1001 may call a computer program stored in the memory 1002. When read and executed by the processor 1001, the computer program may implement the above-described method embodiments. The specific implementation and technical effects are similar, and are not described herein again.
Optionally, the present application further provides a computer-readable storage medium, on which a computer program is stored, and when the computer program is read and executed by a processor, the above method embodiments can be implemented.
In the several embodiments provided in the present application, it should be understood that the above-described apparatus embodiments are merely illustrative, and the disclosed apparatus and method may be implemented in other ways. For example, the division into units is only a logical functional division; in actual implementation there may be another division, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Likewise, each unit may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit can be realized in the form of hardware, or in the form of hardware plus a software functional unit.
It is noted that, in this document, relational terms such as "first" and "second" may be used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such a process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (10)

1. A big data processing method is characterized in that based on python, the method comprises the following steps:
acquiring a data processing task of the HDF data of the hierarchical data file;
creating a corresponding main process and a plurality of sub processes according to the data processing task;
and scheduling a plurality of sub-processes through the main process to respectively acquire sample data of the HDF data for concurrent processing, and outputting the processed data to a network training model.
2. The method according to claim 1, wherein before the main process schedules the plurality of sub-processes to respectively obtain sample data of the HDF data for concurrent processing, the method further comprises:
generating a communication queue for the main process and the plurality of sub-processes to communicate with each other, wherein the communication queue comprises: a data request queue and a data output queue.
3. The method according to claim 2, wherein, before the main process schedules the plurality of sub-processes to respectively acquire sample data of the HDF data for concurrent processing and outputs the processed data to the network training model, the method further comprises:
storing, by the main process, an index of the sample data to be processed by each sub-process into the data request queue.
4. The method according to claim 3, wherein the scheduling, by the main process, of the plurality of sub-processes to respectively acquire sample data of the HDF data for concurrent processing, and the outputting of the processed data to the network training model comprise:
scheduling, by the main process, each of the sub-processes to read the index from the data request queue;
and reading, by each sub-process, the sample data to be processed from the hierarchical data file (HDF) data according to the index, and concurrently processing the sample data to be processed according to a preset data processing rule.
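The scheme of claims 1–4 (a main process writing sample indices into a data request queue, sub-processes reading the corresponding HDF samples and processing them concurrently, and results returned through a data output queue) can be sketched with Python's standard multiprocessing module. The read_sample and preprocess functions below are hypothetical placeholders: a real implementation would read from an HDF file (e.g. via h5py dataset slicing) and apply the actual preset data processing rule.

```python
import multiprocessing as mp

def read_sample(index):
    # Placeholder for reading one sample from HDF data by index;
    # an actual implementation might use h5py dataset slicing.
    return [index, index + 1, index + 2]

def preprocess(sample):
    # Placeholder for the preset data processing rule.
    return [x * 2 for x in sample]

def worker(request_q, output_q):
    # Each sub-process pulls indices from the data request queue until it
    # sees the sentinel None, then reads, processes, and emits each sample.
    while True:
        idx = request_q.get()
        if idx is None:
            break
        output_q.put((idx, preprocess(read_sample(idx))))

def run(indices, n_workers=2):
    request_q = mp.Queue()   # main process -> sub-processes: sample indices
    output_q = mp.Queue()    # sub-processes -> main process: processed data
    procs = [mp.Process(target=worker, args=(request_q, output_q))
             for _ in range(n_workers)]
    for p in procs:
        p.start()
    for idx in indices:      # main process stores indices in the request queue
        request_q.put(idx)
    for _ in procs:
        request_q.put(None)  # one shutdown sentinel per sub-process
    results = dict(output_q.get() for _ in indices)
    for p in procs:
        p.join()
    return results           # processed data, ready to feed a training model

if __name__ == "__main__":
    print(run(range(4)))
```

In this sketch the main process never touches the HDF data itself; it only schedules indices, so the per-sample read and processing cost is spread across the sub-processes, which is the concurrency the claims describe.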
5. A big data processing apparatus based on Python, characterized by comprising:
an acquisition module, configured to acquire a data processing task for hierarchical data file (HDF) data;
a process module, configured to create a corresponding main process and a plurality of sub-processes according to the data processing task;
and a concurrent processing module, configured to schedule, through the main process, the plurality of sub-processes to respectively acquire sample data of the HDF data for concurrent processing, and to output the processed data to a network training model.
6. The apparatus of claim 5, further comprising:
a communication module, configured to generate a communication queue for mutual communication between the main process and the plurality of sub-processes, wherein the communication queue comprises a data request queue and a data output queue.
7. The apparatus of claim 6, further comprising:
a storage module, configured to store, through the main process, an index of the sample data to be processed by each sub-process into the data request queue before the main process schedules the plurality of sub-processes to respectively acquire sample data of the HDF data for concurrent processing and outputs the processed data to a network training model.
8. The apparatus of claim 7, wherein the concurrent processing module comprises:
a reading module, configured to schedule, through the main process, each of the sub-processes to read the index from the data request queue;
and a processing module, configured to read, through each sub-process, the sample data to be processed from the hierarchical data file (HDF) data according to the index, and to concurrently process the sample data to be processed according to a preset data processing rule.
9. An electronic device comprising a memory and a processor, wherein the memory has stored therein a computer program, and the processor is configured to execute the computer program to perform the steps of the big data processing method according to any of claims 1 to 4.
10. A storage medium, characterized in that the storage medium has stored thereon a computer program which, when executed by a processor, performs the steps of the big data processing method according to any of claims 1 to 4.
CN201911368393.0A 2019-12-26 2019-12-26 Big data processing method and device, electronic equipment and storage medium Pending CN111124685A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911368393.0A CN111124685A (en) 2019-12-26 2019-12-26 Big data processing method and device, electronic equipment and storage medium


Publications (1)

Publication Number Publication Date
CN111124685A true CN111124685A (en) 2020-05-08

Family

ID=70503185

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911368393.0A Pending CN111124685A (en) 2019-12-26 2019-12-26 Big data processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111124685A (en)


Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050039049A1 (en) * 2003-08-14 2005-02-17 International Business Machines Corporation Method and apparatus for a multiple concurrent writer file system
CN103744643A (en) * 2014-01-10 2014-04-23 浪潮(北京)电子信息产业有限公司 Method and device for structuring a plurality of nodes parallel under multithreaded program
CN104268229A (en) * 2014-09-26 2015-01-07 北京金山安全软件有限公司 Resource obtaining method and device based on multi-process browser
CN105337755A (en) * 2014-08-08 2016-02-17 阿里巴巴集团控股有限公司 Master-slave architecture server, service processing method thereof and service processing system thereof
CN109144741A (en) * 2017-06-13 2019-01-04 广东神马搜索科技有限公司 The method, apparatus and electronic equipment of interprocess communication
CN109922319A (en) * 2019-03-26 2019-06-21 重庆英卡电子有限公司 RTSP agreement multiple video strems Parallel preconditioning method based on multi-core CPU
CN110162452A (en) * 2019-04-30 2019-08-23 广州微算互联信息技术有限公司 A kind of analog detection method, device and storage medium for testing and control service
CN110413386A (en) * 2019-06-27 2019-11-05 深圳市富途网络科技有限公司 Multiprocessing method, apparatus, terminal device and computer readable storage medium


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Guo Xuebing: "Application of Python-based parallel programming technology in batch database ingestion of standard meteorological reports" *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112346879A (en) * 2020-11-06 2021-02-09 网易(杭州)网络有限公司 Process management method and device, computer equipment and storage medium
CN112346879B (en) * 2020-11-06 2023-08-11 网易(杭州)网络有限公司 Process management method, device, computer equipment and storage medium
CN113326139A (en) * 2021-06-28 2021-08-31 上海商汤科技开发有限公司 Task processing method, device, equipment and storage medium
CN113391909A (en) * 2021-06-28 2021-09-14 上海商汤科技开发有限公司 Process creation method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
US20200293360A1 (en) Techniques to manage virtual classes for statistical tests
Manolache et al. Schedulability analysis of applications with stochastic task execution times
CN111258744A (en) Task processing method based on heterogeneous computation and software and hardware framework system
CN111124685A (en) Big data processing method and device, electronic equipment and storage medium
US20080209436A1 (en) Automated testing of programs using race-detection and flipping
US20130061231A1 (en) Configurable computing architecture
CN111190741A (en) Scheduling method, device and storage medium based on deep learning node calculation
Gu et al. Improving execution concurrency of large-scale matrix multiplication on distributed data-parallel platforms
Hong et al. Hierarchical dataflow modeling of iterative applications
CN114924748A (en) Compiling method, device and equipment
CN113407343A (en) Service processing method, device and equipment based on resource allocation
CN116521350B (en) ETL scheduling method and device based on deep learning algorithm
US11531565B2 (en) Techniques to generate execution schedules from neural network computation graphs
KR20150101870A (en) Method and apparatus for avoiding bank conflict in memory
Elshazly et al. Accelerated execution via eager-release of dependencies in task-based workflows
CN113760497A (en) Scheduling task configuration method and device
CN114127681A (en) Method and apparatus for enabling autonomous acceleration of data flow AI applications
Batko et al. Actor model of Anemone functional language
Delestrac et al. Demystifying the TensorFlow Eager Execution of Deep Learning Inference on a CPU-GPU Tandem
Beach et al. Integrating acceleration devices using CometCloud
McDonagh et al. Applying semi-synchronised task farming to large-scale computer vision problems
US20230273818A1 (en) Highly parallel processing architecture with out-of-order resolution
CN117742928B (en) Algorithm component execution scheduling method for federal learning
WO2024109312A1 (en) Task scheduling execution method, and generation method and apparatus for task scheduling execution instruction
Friebe et al. Work-in-Progress: Validation of Probabilistic Timing Models of a Periodic Task with Interference-A Case Study

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200508
