WO2023124000A1 - Multi-concurrency data processing method and device - Google Patents

Publication number
WO2023124000A1
Authority
WO
WIPO (PCT)
Prior art keywords
processing
container
task
vehicle
vehicle work
Application number
PCT/CN2022/104711
Inventor
王振东
张伟德
朱军
刘坤鹏
郑朝友
段锐
孙建蕾
任思阳
葛绍亮
刘加银
Original Assignee
中国第一汽车股份有限公司
Application filed by 中国第一汽车股份有限公司
Publication of WO2023124000A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]

Definitions

  • The embodiments of the present application relate to the vehicle field, and in particular to a multi-concurrent data processing method and device.
  • Embodiments of the present application provide a multi-concurrent data processing method and device, so as to at least solve the technical problem in the related art that hardware resources cannot be fully utilized for high-concurrency task processing.
  • A multi-concurrent data processing method is provided, including: acquiring the road test data collected by a vehicle; decomposing the road test data to generate multiple vehicle work tasks to be processed in parallel, wherein a vehicle work task is a task of controlling the vehicle to work according to a specified control instruction; submitting the multiple vehicle work tasks to be processed in parallel to a container cluster manager, wherein the container cluster manager is used to orchestrate and schedule computing resources in a server cluster; and scheduling, through the container cluster manager, multiple processing containers to process the vehicle work tasks in parallel and generate a processing result, wherein each processing container has a set of computing resources used to process the vehicle work tasks.
  • The method further includes: the server cluster includes multiple servers, and a corresponding number of processing containers is obtained according to the amount of computing resources of the server cluster, the computing resources including at least processor resources and storage resources, wherein at least one master server and at least one slave server are deployed in the server cluster.
  • The container cluster manager is installed on the master server to monitor the working status of the slave servers.
  • The method further includes: creating a task processing program of the container cluster manager, wherein the task processing program is used to determine, according to preset parameters, the number of processing containers that need to be called; encapsulating the task processing program to build a task processing image; and constructing, based on the task processing image, the containers required for running the vehicle work tasks, wherein the containers include a management container for scheduling management and a processing container for running tasks.
  • Scheduling multiple processing containers through the container cluster manager to process each vehicle work task in parallel includes: receiving the multiple vehicle work tasks to be processed in parallel, and scheduling at least one management container and as many processing containers as there are vehicle work tasks; distributing, by each management container, the corresponding vehicle work tasks to designated processing containers; and starting the processing containers, each of which runs its assigned vehicle work task.
  • The method further includes: merging the sub-run results of each vehicle work task to generate the processing result; and storing the processing result in a predetermined database, wherein the database allows interactive queries.
  • A multi-concurrent data processing device is provided, including: an acquisition component configured to acquire road test data collected by a vehicle; a decomposition component configured to decompose the road test data and generate multiple vehicle work tasks to be processed in parallel, wherein a vehicle work task is a task of controlling the vehicle to work according to a specified control instruction; a submission component configured to submit the multiple vehicle work tasks to be processed in parallel to a container cluster manager, wherein the container cluster manager is used to orchestrate and schedule computing resources in a server cluster; and a processing component configured to schedule, through the container cluster manager, multiple processing containers to process the vehicle work tasks in parallel and generate a processing result, wherein each processing container has a set of computing resources used to process the vehicle work tasks.
  • The server cluster includes multiple servers, and a corresponding number of processing containers is obtained according to the amount of computing resources of the server cluster.
  • The computing resources include at least processor resources and storage resources, wherein at least one master server and at least one slave server are deployed in the server cluster.
  • The container cluster manager is installed on the master server to monitor the working status of the slave servers.
  • The device further includes: a merging component configured to merge the sub-run results of each vehicle work task and generate the processing result; and a storage component configured to store the processing result in a predetermined database, wherein the database allows interactive queries.
  • A non-volatile readable storage medium is also provided, which includes a stored program, wherein, when the program runs, the device where the non-volatile storage medium is located is controlled to execute the above multi-concurrent data processing method.
  • A processor is also provided.
  • The processor is configured to run a program, wherein the above multi-concurrent data processing method is executed when the program runs.
  • In the embodiments of the present application, the road test data collected by the vehicle is acquired; the road test data is decomposed to generate multiple vehicle work tasks to be processed in parallel, wherein a vehicle work task is a task of controlling the vehicle to work according to a specified control instruction; the multiple vehicle work tasks to be processed in parallel are submitted to the container cluster manager, wherein the container cluster manager is used to orchestrate and schedule computing resources in the server cluster; and multiple processing containers are scheduled through the container cluster manager to process the vehicle work tasks in parallel and generate a processing result, wherein each processing container has a set of computing resources used to process the vehicle work tasks.
  • That is, the present application decomposes the road test data to generate multiple vehicle work tasks to be processed in parallel, and uses the container cluster manager to schedule multiple processing containers to process those tasks in parallel, thereby solving the technical problem in the related art that hardware resources cannot be fully utilized for high-concurrency task processing.
  • Fig. 1 is a flow chart of a multi-concurrent data processing method according to an embodiment of the present application.
  • Fig. 2 is a schematic diagram of task scheduling according to an embodiment of the present application.
  • Fig. 3 is a schematic diagram of task decomposition according to an embodiment of the present application.
  • Fig. 4 is a schematic diagram of a multi-concurrent data processing device according to an embodiment of the present application.
  • Fig. 5 is a schematic structural diagram of a non-volatile storage medium according to an embodiment of the present application.
  • Fig. 6 is a schematic structural diagram of a processor according to an embodiment of the present application.
  • An embodiment of a multi-concurrent data processing method is provided. It should be noted that the steps shown in the flowcharts of the accompanying drawings can be executed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order.
  • Fig. 1 is a flowchart of a multi-concurrent data processing method according to an embodiment of the present application. As shown in Fig. 1, the method may include the following steps:
  • Step S102: acquire the road test data collected by the vehicle.
  • The road test data may be data collected by the vehicle during automated driving, such as the speed, acceleration and time of the automated driving process, or the performance of each module during automated driving.
  • The driving trajectory, driving speed, acceleration and performance parameters of each module of the test vehicle can be collected during automated driving through the camera, radar, sensors, controller and other components of the target vehicle.
  • Step S104: decompose the road test data to generate multiple vehicle work tasks to be processed in parallel, wherein a vehicle work task is a task of controlling the vehicle to work according to a specified control instruction.
  • A vehicle work task controls the vehicle to work according to a specified control instruction, for example whether to accelerate, decelerate or brake suddenly on a certain road section.
  • Decomposing the road test data may include labeling the road test data according to established rules.
  • The rules may cover whether there are traffic lights on the driving section, whether there are pedestrians, the number of pedestrians, and whether sudden braking occurs during driving.
  • The data processing program may be a cluster computing engine (Spark).
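  • The labeling rules described above can be illustrated with a minimal Python sketch. The `Frame` record, its field names, the label strings and the `SUDDEN_BRAKE_MPS2` threshold are all assumptions made for illustration; the source does not specify a data format or a numeric definition of sudden braking.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    """One hypothetical road test sample; the field names are illustrative."""
    has_traffic_light: bool
    pedestrian_count: int
    deceleration_mps2: float  # braking strength, larger means harder braking


# Hypothetical threshold for "sudden braking"; the source gives no number.
SUDDEN_BRAKE_MPS2 = 6.0


def label_frame(frame: Frame) -> List[str]:
    """Attach labels to one sample according to the screening rules."""
    labels: List[str] = []
    if frame.has_traffic_light:
        labels.append("traffic_light")
    if frame.pedestrian_count > 0:
        labels.append("pedestrians")
        labels.append(f"pedestrian_count={frame.pedestrian_count}")
    if frame.deceleration_mps2 >= SUDDEN_BRAKE_MPS2:
        labels.append("sudden_braking")
    return labels
```

  • A sample with no traffic light, no pedestrians and gentle braking receives an empty label list under this sketch, which is one way to screen out data that matches none of the rules.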
  • Step S106: submit the multiple vehicle work tasks to be processed in parallel to the container cluster manager, wherein the container cluster manager is used to orchestrate and schedule computing resources in the server cluster.
  • The container cluster manager may be a large-scale container cluster management tool (Kubernetes, K8S for short), also called a container orchestration engine or container orchestrator; it is an open source container orchestration engine used for task scheduling and management.
  • With the container cluster manager, automated deployment, large-scale scaling and application container management can be achieved, so that computing resources in the server cluster are better orchestrated and scheduled.
  • The multiple vehicle work tasks to be processed in parallel are submitted to the container cluster manager, and the container cluster manager schedules and manages the submitted tasks.
  • Step S108: schedule, through the container cluster manager, multiple processing containers to process the vehicle work tasks in parallel and generate a processing result, wherein each processing container has a set of computing resources used to process the vehicle work tasks.
  • The processing container may be a drive unit, represented by a driver, and is used to distribute tasks.
  • The computing resources may be execution units, represented by executors, and are used to process vehicle work tasks.
  • A server is selected as a node of the container cluster manager and the related environment is deployed on it; the multiple vehicle work tasks to be processed in parallel are submitted to the container cluster manager, and the container cluster manager schedules multiple drive units.
  • The number of drive units can be defined by the system, or can be a parameter specified by the user when submitting a task, such as two, three or four; it is not specifically limited here.
  • Each drive unit can start multiple execution units to complete the parallel processing of the multiple vehicle work tasks and generate processing results, which can be queried in the database.
  • In this embodiment, the road test data collected by the vehicle is acquired; the road test data is decomposed to generate multiple vehicle work tasks to be processed in parallel, wherein a vehicle work task is a task of controlling the vehicle to work according to a specified control instruction; the multiple vehicle work tasks to be processed in parallel are submitted to the container cluster manager, wherein the container cluster manager is used to orchestrate and schedule computing resources in the server cluster; and multiple processing containers are scheduled through the container cluster manager to process the vehicle work tasks in parallel and generate a processing result, wherein each processing container has a set of computing resources used to process the vehicle work tasks.
  • That is, the application decomposes the road test data to generate multiple vehicle work tasks to be processed in parallel, and uses the container cluster manager to schedule multiple processing containers to process those tasks in parallel, thereby solving the technical problem in the related art that hardware resources cannot be fully utilized for high-concurrency task processing.
  • The server cluster includes multiple servers, and a corresponding number of processing containers is obtained according to the amount of computing resources of the server cluster.
  • The servers in the server cluster may be multiple central processing units in the target vehicle, and the corresponding number of processing containers is obtained according to the amount of computing resources of the server cluster.
  • The data processing program is submitted to the container cluster manager, and the container cluster manager orchestrates and schedules the servers in the server cluster for resource management. At least one master server and at least one slave server are deployed in the server cluster, and the master server performs scheduling management of the slave servers, wherein the container cluster manager is installed on the master server to monitor the working status of the slave servers.
  • After the data processing program is submitted, the container cluster manager orchestrates and schedules the servers in the server cluster to manage resources, and generates processing containers and computing containers.
  • The processing container may be a drive unit, represented by a driver, and is used to distribute tasks.
  • The computing container may be an execution unit, represented by an executor, and is used to process vehicle work tasks.
  • The method further includes: creating a task processing program of the container cluster manager, wherein the task processing program is used to determine, according to preset parameters, the number of processing containers that need to be called; encapsulating the task processing program to build a task processing image; and constructing, based on the task processing image, the containers required for running the vehicle work tasks, wherein the containers include a management container for scheduling management and a processing container for running tasks.
  • For example, a task processing program of the container cluster manager is written that determines, from preset parameters, how many processing containers need to be called.
  • The data is screened according to the established rules and marked with specific labels; the task processing program is then packaged, and the image construction rules (Dockerfile) are written to instruct the system to build the image according to the specified steps.
  • Scheduling multiple processing containers through the container cluster manager to process each vehicle work task in parallel includes: receiving the multiple vehicle work tasks to be processed in parallel, and scheduling at least one management container and as many processing containers as there are vehicle work tasks; distributing, by each management container, the corresponding vehicle work tasks to designated processing containers; and starting the processing containers, each of which runs its assigned vehicle work task.
  • For example, multiple vehicle work tasks to be processed in parallel are received, at least one management container and as many processing containers as there are vehicle work tasks are scheduled, and each management container distributes the corresponding vehicle work tasks to designated processing containers.
  • The processing containers are then started, and each runs its assigned vehicle work task.
  • The method further includes: merging the sub-run results of each vehicle work task to generate the processing result; and storing the processing result in a predetermined database, wherein the database allows interactive queries.
  • The execution results of the split work tasks are combined to obtain a complete processing result, the processing result is stored in a predetermined database, and the data to be queried can then be selected from that database.
  • The predetermined database may be a MongoDB database, which is not specifically limited here.
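  • The distribution step described above, with one processing container per vehicle work task and at least one management container, can be sketched in Python as a simple assignment plan. The container names and the round-robin split of tasks between management containers are assumptions made for illustration; the source does not state how tasks are divided among management containers.

```python
from typing import Dict, List


def plan_containers(tasks: List[str], managers: int = 1) -> Dict[str, Dict[str, str]]:
    """Return {management container: {processing container: task}}.

    One processing container is planned per task; tasks are spread over
    the management containers round-robin (an illustrative choice).
    """
    if managers < 1:
        raise ValueError("at least one management container is required")
    plan: Dict[str, Dict[str, str]] = {f"manager-{m}": {} for m in range(managers)}
    for index, task in enumerate(tasks):
        manager = f"manager-{index % managers}"
        plan[manager][f"worker-{index}"] = task
    return plan
```

  • In this sketch the total number of planned processing containers always equals the number of tasks, which mirrors the one-to-one relationship stated in the text.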
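  • The merge-and-store step can be sketched as follows. The result layout (a `labels` list per sub-result) and the `InMemoryCollection` stand-in are hypothetical; `store_result` only assumes an object with an `insert_one` method, which is the shape of a pymongo collection, so that the sketch runs without a database.

```python
from typing import Any, Dict, Iterable, List


def merge_results(sub_results: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Combine the sub-run results of the split tasks into one result."""
    merged: Dict[str, Any] = {"task_count": 0, "labels": []}
    for sub in sub_results:
        merged["task_count"] += 1
        merged["labels"].extend(sub.get("labels", []))
    return merged


def store_result(collection: Any, result: Dict[str, Any]) -> None:
    """Store the merged result; `collection` must provide insert_one()."""
    collection.insert_one(result)


class InMemoryCollection:
    """Stand-in for a database collection, used only for this sketch."""

    def __init__(self) -> None:
        self.documents: List[Dict[str, Any]] = []

    def insert_one(self, document: Dict[str, Any]) -> None:
        self.documents.append(document)
```

  • With a real MongoDB deployment, a pymongo collection object could be passed to `store_result` in place of `InMemoryCollection`; connection details are deployment-specific and are not given in the source.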
  • The core of a self-driving car is the automated driving system composed of artificial intelligence, visual computing, radar and the global positioning system, so the quality of the automated driving system determines whether self-driving cars can be used in real life on a large scale.
  • All of its behaviors are in fact data-driven. Only by training the different modules in a timely manner with massive amounts of high-quality data can the automated driving system be continuously optimized. Using existing hardware to process and screen valuable data quickly is therefore crucial for the rapid iteration and development of the automated driving system.
  • The iteration speed of the algorithm depends on how quickly high-quality data is generated, so if the data collected in road tests cannot be processed in time, the optimization speed of the automated driving system is greatly affected. Existing processing logic cannot make full use of hardware resources: a single machine may have a dozen or twenty central processing units, but a matching number of tasks is not run in parallel, which wastes hardware resources.
  • In the related art, a high-concurrency data processing system and method has been proposed, which uses the data processing characteristics of each server to solve the problem of excessive server load caused by a large number of data requests.
  • A solution and system for highly concurrent reception of massive data has also been proposed, which allocates the processing capacity of each application server according to a load balancing strategy, thereby realizing highly concurrent sending and receiving of massive data and storing the processed message data, and solving the technical problem of inefficient data sending and receiving.
  • However, the scheduling unit of both methods is the server, so the number of servers must be greatly increased to achieve high-concurrency execution, and the technical problem that hardware resources cannot be fully utilized remains.
  • This application proposes a multi-concurrent automatic data processing system based on a large-scale container cluster management tool and a cluster computing engine.
  • The cluster computing engine uses an advanced Directed Acyclic Graph (DAG) scheduler, query optimizer and physical execution engine, and is described as up to 100 times faster than earlier data processing tools.
  • The large-scale container cluster management tool Kubernetes is an open source container orchestration engine that supports automated deployment, large-scale scaling and application container management.
  • This application decomposes a single task through the cluster computing engine, so that a single task creates a distributed data set that is operated on in parallel, and the results are then summarized.
  • It also uses the large-scale container cluster management tool to schedule the parallel execution of multiple tasks, so that existing hardware resources are fully used for highly concurrent processing of massive data. The data processing speed is then limited only by the total number of CPUs across the servers rather than by the number of servers, which solves the technical problem in the prior art that hardware resources cannot be fully utilized for high-concurrency task processing.
  • The multi-concurrent data processing method of this embodiment may include the following parts.
  • Part I: construction of the K8S cluster.
  • The first step is to select a server as the control node of the open source container orchestration engine and deploy the related environment on it.
  • The control (Master) node mainly serves as the management and control center of the cluster.
  • The second step is to use the other servers as workload (Node) nodes of the open source container orchestration engine and deploy the related environment on them.
  • The workload on a workload node is allocated by the control node; the workload node is mainly used to maintain and run containers and to provide the runtime environment for the container orchestration engine.
  • The third step is to test the operation of the container orchestration engine cluster to ensure that functions such as communication between nodes work normally.
  • Part II: writing the cluster computing engine processing program and packaging it as a container image.
  • The cluster computing engine processing program is mainly developed in the Python language and consists of two links:
  • Link 1: divide the tasks submitted by the user into blocks to create a distributed data set for parallel operation.
  • Link 2: the data processing program, which screens massive data according to established rules and attaches specific labels.
  • The main rules include whether there are traffic lights, whether there are pedestrians and how many, and whether emergency braking occurs.
  • The second step is to encapsulate the processing program of the cluster computing engine into an image, to facilitate scheduling and running by the container orchestration engine cluster.
  • Part III: submitting tasks for large-scale, high-concurrency data processing.
  • Step 1: set the relevant parameters for submitting tasks, mainly including: the number of resource allocation units to run (spark.executor.instances), where each added resource allocation unit currently requires one CPU resource, and the parallelism of a single task is the number of resource allocation units multiplied by the number of cores of each resource allocation unit; the base image used for running (spark.kubernetes.container.image), for which the image obtained in the second step of Part II can be used; and the local directory to be mounted at run time (for example spark.kubernetes.driver.volumes.hostPath.spark-local-dir-2.options.path and spark.kubernetes.driver.volumes.hostPath.spark-local-dir-2.mount.path). Because the original data is stored on the storage server and is not visible to the containers started at run time, the data must be made visible to the containers by mounting.
  • Step 2: submit the task for execution.
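  • As a hedged illustration of the submission parameters named above, the following Python sketch assembles a `spark-submit` argument list and computes the single-task parallelism as instances multiplied by cores. The property names come from the text, plus the standard Spark property `spark.executor.cores`; the master URL, image name, paths and application file used in the usage note are placeholders, not values from the source.

```python
from typing import Dict, List

# Volume property prefix taken from the text; "spark-local-dir-2" is the volume name.
VOLUME = "spark.kubernetes.driver.volumes.hostPath.spark-local-dir-2"


def build_conf(instances: int, cores: int, image: str,
               host_path: str, mount_path: str) -> Dict[str, str]:
    """Collect the Spark properties described in Step 1."""
    return {
        "spark.executor.instances": str(instances),
        "spark.executor.cores": str(cores),
        "spark.kubernetes.container.image": image,
        f"{VOLUME}.options.path": host_path,   # directory on the host
        f"{VOLUME}.mount.path": mount_path,    # where the container sees it
    }


def parallelism(conf: Dict[str, str]) -> int:
    """Parallelism of a single task: executor count times cores per executor."""
    return int(conf["spark.executor.instances"]) * int(conf["spark.executor.cores"])


def spark_submit_args(master: str, conf: Dict[str, str], app: str) -> List[str]:
    """Build the argument list for spark-submit in cluster deploy mode."""
    args = ["spark-submit", "--master", master, "--deploy-mode", "cluster"]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    args.append(app)
    return args
```

  • For example, `build_conf(5, 1, "registry.example/spark-labeler:latest", "/data/roadtest", "/mnt/roadtest")` gives a parallelism of 5, and the resulting list could be passed to `subprocess.run` with a master such as `k8s://https://example.invalid:6443`; all of these values are hypothetical. The text only lists driver volume properties; whether executor volumes were configured in the same way is not stated.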
  • FIG. 2 is a schematic diagram of task scheduling according to an embodiment of the present application.
  • Drive unit 201 and drive unit 202 are generated after the cluster computing engine task is submitted to the container orchestration engine. They are management units, mainly used to distribute tasks and to schedule and manage the generated execution units.
  • The drive unit 201 includes: execution unit 2011, execution unit 2012, execution unit 2013, execution unit 2014 and execution unit 2015.
  • The drive unit 202 includes: execution unit 2021, execution unit 2022, execution unit 2023, execution unit 2024 and execution unit 2025.
  • Execution units 2011 to 2015 and 2021 to 2025 are generated after the cluster computing engine task is submitted to the container orchestration engine. Each serves as a specific task execution unit and is in fact a collection of computing resources, that is, central processing unit cores and memory (cpu core, memory).
  • The container orchestration engine 203 mainly schedules and manages all server resources and submitted tasks.
  • The Spark processing program image 204 serves as the base image from which the containers required to run the task are built.
  • The MongoDB database 205 is used to save the data screening and processing results.
  • The specific steps are: after the user submits a task, drive units 201 and 202 and the execution units required for executing the task are created based on image 204. Drive units 201 and 202 are mainly responsible for scheduling and managing the execution units, and the execution units are mainly responsible for executing specific tasks. The number of execution units is determined by the parameters specified by the user when submitting the task; the container orchestration engine 203 schedules and manages drive units 201 and 202, and the final result is stored in database 205.
  • FIG. 3 is a schematic diagram of task decomposition according to an embodiment of the present application.
  • Task execution units 301, 302, 303 and 304 are formed after the collected data is decomposed according to the specified parameters.
  • The cluster computing engine tool 305 is responsible for decomposing the task of processing the collected vehicle data, scheduling the data processing program to process the data, and merging the execution results of the split tasks to obtain a complete result.
  • The vehicle collection data (100G) 306 is the original, unprocessed road test data.
  • The database 307 is used to store the results of the data screening process.
  • The vehicle collection data 306 is submitted as a pending task to the cluster computing engine tool 305, which decomposes it into parallel tasks to obtain a certain number of task execution units 301, 302, 303 and 304. During execution, these units exchange data with database 307 and store the data processing results in it.
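  • The flow of FIG. 3 (decompose, process in parallel, write results to a shared store) can be mimicked on one machine with Python's standard thread pool. This is only an analogy under stated assumptions: threads stand in for task execution units 301 to 304, a dictionary stands in for database 307, and the strided split and the `process` callback are illustrative choices, not the cluster computing engine's actual behavior.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List, Sequence


def run_pipeline(records: Sequence[Any], parts: int,
                 process: Callable[[Any], Any]) -> Dict[int, List[Any]]:
    """Split records into `parts` units, process them in parallel, collect results."""
    database: Dict[int, List[Any]] = {}   # stand-in for database 307
    lock = threading.Lock()
    # Stand-ins for task execution units 301-304: a simple strided split.
    units = [records[i::parts] for i in range(parts)]

    def run_unit(index: int) -> None:
        result = [process(record) for record in units[index]]
        with lock:                         # each unit stores its own result
            database[index] = result

    with ThreadPoolExecutor(max_workers=parts) as pool:
        # list() forces completion and re-raises any worker exception.
        list(pool.map(run_unit, range(parts)))
    return database
```

  • Calling `run_pipeline(list(range(8)), 4, lambda x: x * 2)` yields four result lists, one per unit, which together contain every processed record exactly once.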
  • The first step is to select a server as the control node of the container cluster management tool and install version v1.20.1 of the container cluster management tool on it.
  • During installation, the image source can be specified with --image-repository registry.cn-hangzhou.aliyuncs.com/google_containers, which sets the download address to a mirror in China to avoid instability when pulling from the original source; Docker must be installed beforehand.
  • The second step is to use the other servers as load nodes of the container cluster management tool, deploy the related environment on them, and then add the load nodes to the cluster. After installation, the status of the load nodes can be viewed on the control node, and network, deployment and other related tests can be performed.
  • The third step is to write the processing program for the cluster computing engine. Here the number of blocks is set to five; that is, after a task is submitted, the cluster computing engine automatically divides the data to be processed in the submitted task into five roughly equal parts.
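  • The idea of dividing the data into five roughly equal blocks can be shown with a small pure-Python sketch. The function name and the rule of giving the first blocks one extra item are assumptions for illustration; in PySpark, `SparkContext.parallelize(data, numSlices=5)` plays a comparable role, though its exact partition boundaries may differ from this sketch.

```python
from typing import Any, List, Sequence


def split_into_blocks(items: Sequence[Any], blocks: int = 5) -> List[Sequence[Any]]:
    """Split items into `blocks` contiguous parts whose sizes differ by at most one."""
    if blocks < 1:
        raise ValueError("blocks must be at least 1")
    base, extra = divmod(len(items), blocks)
    result: List[Sequence[Any]] = []
    start = 0
    for index in range(blocks):
        # The first `extra` blocks take one additional item.
        end = start + base + (1 if index < extra else 0)
        result.append(items[start:end])
        start = end
    return result
```

  • Splitting twelve items into five blocks this way gives block sizes of 3, 3, 2, 2 and 2, and concatenating the blocks restores the original order.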
  • The data screening rules mainly include whether there are traffic lights on the road section where the data is collected, whether there are pedestrians and how many, and whether sudden braking occurs during driving.
  • Next, the processing program of the cluster computing engine is encapsulated into an image: the relevant image construction rules are written, and the image is built according to the specified steps.
  • The entrypoint, that is, the command that is executed after the container starts, needs to be set as the default startup.
  • The fourth step is to submit the task for large-scale, high-concurrency data processing.
  • The parameters to set include: spark.executor.instances, the number of load containers to create after the task is submitted, here set to five; spark.kubernetes.container.image, the name of the image in which the Spark processing program is encapsulated; and spark.kubernetes.driver.volumes.hostPath.spark-local-dir-2.options.path and spark.kubernetes.driver.volumes.hostPath.spark-local-dir-2.mount.path, which specify the local directory to be mounted at run time. After these are set, the task is submitted.
  • spark.executor.instances that is, to create several load containers after the specified task is submitted, here specified as five
  • spark .kubernetes.container.image that is, the image name encapsulated by the spark handler
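To make the submission parameters concrete, the following Python sketch assembles a `spark-submit` command line that carries the four properties named above. It only builds the argument list and does not run anything. The helper name `build_spark_submit`, the master URL, the image name, the directories and the application path are hypothetical placeholders, not values from the application.

```python
from typing import Dict, List


def build_spark_submit(master_url: str, image: str, host_dir: str,
                       mount_dir: str, app: str, executors: int = 5) -> List[str]:
    """Assemble a spark-submit argument list for a Kubernetes-backed Spark job."""
    volume = "spark.kubernetes.driver.volumes.hostPath.spark-local-dir-2"
    conf: Dict[str, str] = {
        "spark.executor.instances": str(executors),   # number of processing containers
        "spark.kubernetes.container.image": image,     # image wrapping the processing program
        f"{volume}.options.path": host_dir,            # directory on the host
        f"{volume}.mount.path": mount_dir,             # where it appears in the container
    }
    cmd = ["spark-submit", "--master", master_url, "--deploy-mode", "cluster"]
    for key, value in conf.items():
        cmd += ["--conf", f"{key}={value}"]
    cmd.append(app)
    return cmd


# Hypothetical example values, for illustration only.
command = build_spark_submit(
    master_url="k8s://https://control-node.example:6443",
    image="example/drive-test-processor:latest",
    host_dir="/data/spark-local",
    mount_dir="/tmp/spark-local",
    app="local:///opt/app/process.py",
)
```

Keeping the properties in a dictionary makes it easy to vary the executor count per task without touching the rest of the command.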
  • the fifth step is to submit the tasks: select two folders, each containing 100G of drive test data, and submit one task for each folder. After submission, the container cluster management tool creates two management containers (spark driver), and each management container starts 5 processing containers (spark executor). At this point, a total of 200G of data is divided into ten 20G tasks that are executed in parallel, so the cluster parallelism is 10. After the operation is completed, the relevant results can be queried in the database.
  • this implementation has the following advantages. When tasks are executed in units of servers, concurrency is limited by the number of servers, and since a single server may have 40 central processing units, this wastes a large amount of resources. Combining the container orchestration engine with the cluster computing engine makes full use of server hardware resources and greatly improves the concurrency of task execution: the number of task execution units changes from the number of servers to the number of central processing units, which maximizes efficiency under limited resources and saves both cost and time. Using the container orchestration engine for task scheduling and management lets developers focus on program development instead of spending a lot of time on deploying and scaling containerized applications, and helps them manage the cluster simply and efficiently. Using the cluster computing engine for task decomposition relieves developers of the decomposition and scheduling logic and saves development time for the data processing program.
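The arithmetic of this example can be checked with a short Python sketch. The function name `plan` and its return layout are assumptions made for illustration; the figures (two folders of 100G, five processing containers per management container) come from the step above.

```python
from typing import Dict, List


def plan(folder_sizes_g: List[int], executors_per_driver: int = 5) -> Dict[str, object]:
    """Derive container counts and per-task data size for one submission per folder."""
    drivers = len(folder_sizes_g)                      # one management container per folder
    parallelism = drivers * executors_per_driver       # total processing containers
    task_sizes = [size / executors_per_driver for size in folder_sizes_g]
    return {
        "drivers": drivers,
        "parallelism": parallelism,
        "total_g": sum(folder_sizes_g),
        "task_size_g_per_folder": task_sizes,
    }


summary = plan([100, 100])
# drivers = 2, parallelism = 10, total_g = 200, 20G per task in each folder
```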
  • a multi-concurrent data processing device is also provided. It should be noted that the multi-concurrent data processing apparatus can be used to execute the above-mentioned multi-concurrent data processing method.
  • Fig. 4 is a schematic diagram of a multi-concurrent data processing apparatus according to an embodiment of the present application.
  • the multi-concurrent data processing apparatus 400 may include: an acquisition component 401, a decomposition component 402, a submission component 403 and a processing component 404.
  • the above acquisition component 401, decomposition component 402, submission component 403 and processing component 404 can run in a terminal as part of the device, and the functions realized by these components can be executed by a processor in the terminal; the terminal can be a smart phone (such as an Android phone or an iOS phone), a tablet computer, a palmtop computer, a mobile Internet device (Mobile Internet Devices, MID), a PAD or another terminal device.
  • the acquisition component 401 is configured to acquire the drive test data collected by the vehicle.
  • the decomposing component 402 is configured to decompose the drive test data to generate multiple vehicle work tasks to be processed in parallel, wherein the vehicle work tasks are tasks for controlling the vehicle to work according to specified control instructions.
  • the submitting component 403 is configured to submit multiple vehicle work tasks to be processed in parallel to the container cluster manager, wherein the container cluster manager is used to arrange and schedule computing resources in the server cluster.
  • the processing component 404 is configured to schedule a plurality of processing containers through the container cluster manager to perform parallel processing on each vehicle work task to be processed in parallel and generate a processing result, wherein each processing container has a set of computing resources, and the computing resources are used to process the vehicle work tasks.
  • the server cluster includes multiple servers, and the corresponding number of processing containers are obtained according to the number of computing resources of the server cluster.
  • the computing resources include at least: processor resources and storage resources, wherein the server cluster is deployed with at least one master server and at least one slave server, and the container cluster manager is installed on the master server to monitor the working status of the slave servers.
  • the device also includes: a creation component, configured to create a task processing program of the container cluster manager, wherein the task processing program is used to determine, according to preset parameters, the number of processing containers that need to be invoked; an encapsulation component, configured to encapsulate the task processing program and build a task processing image; and a construction component, configured to build, based on the task processing image, the containers required for running the vehicle work tasks.
  • the containers include: a management container for scheduling management and a processing container for running tasks.
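One way to read "a corresponding number of processing containers is obtained according to the quantity of computing resources" is as a simple capacity calculation over the slave servers. The sketch below is an assumption-laden illustration: the `Server` type, the per-container requirements and the memory figure are hypothetical, and only the figure of 40 central processing units per server appears in the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Server:
    """A slave server and the resources it offers (hypothetical schema)."""
    name: str
    cpus: int
    memory_gb: int


def containers_per_server(server: Server, cpus_per_container: int,
                          memory_gb_per_container: int) -> int:
    """A server can host as many containers as its scarcest resource allows."""
    return min(server.cpus // cpus_per_container,
               server.memory_gb // memory_gb_per_container)


def cluster_capacity(servers: List[Server], cpus_per_container: int = 1,
                     memory_gb_per_container: int = 4) -> int:
    """Total number of processing containers the slave servers can host."""
    return sum(containers_per_server(s, cpus_per_container, memory_gb_per_container)
               for s in servers)


slaves = [Server("slave-1", cpus=40, memory_gb=256),
          Server("slave-2", cpus=40, memory_gb=256)]
capacity = cluster_capacity(slaves)
```

Under these assumed requirements the processor count is the limiting resource, so two 40-CPU servers can host 80 containers rather than two server-sized tasks.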
  • the above creation component, packaging component and construction component can be run in the terminal as a part of the device, and the functions implemented by the above modules can be executed by the processor in the terminal.
  • the processing component includes: a receiving subcomponent, configured to receive a plurality of vehicle work tasks to be processed in parallel; a scheduling subcomponent, configured to schedule at least one management container and as many processing containers as there are vehicle work tasks; a distribution subcomponent, configured so that each management container distributes the corresponding vehicle work tasks to the designated processing containers; and a running subcomponent, configured to start the processing containers, each of which runs the vehicle work task assigned to it.
  • receiving subcomponent, scheduling subcomponent, distribution subcomponent and running subcomponent can run in the terminal as part of the device, and the functions implemented by the above modules can be executed by the processor in the terminal.
  • the device further includes: a merging component, configured to merge the sub-running results of each vehicle work task and generate a processing result; and a storage component, configured to store the processing result in a predetermined database, wherein the database is a database that allows interactive query.
  • the above merging component and storage component may run in the terminal as a part of the device, and the functions implemented by the above modules may be executed by the processor in the terminal.
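The flow through the components (decompose, process in parallel, merge, store) can be imitated on a single machine. The sketch below is a stand-in, not the claimed device: a thread pool plays the role of the processing containers scheduled by the container cluster manager, an in-memory SQLite database plays the role of the database that allows interactive query, and summing integers is a placeholder for the real per-task processing. All function names are hypothetical.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor
from typing import List


def decompose(records: List[int], num_tasks: int) -> List[List[int]]:
    """Decomposition: deal the records into num_tasks round-robin slices."""
    return [records[i::num_tasks] for i in range(num_tasks)]


def run_task(task: List[int]) -> int:
    """Placeholder for the work done inside one processing container."""
    return sum(task)


def process(records: List[int], num_tasks: int = 5) -> List[int]:
    """Run one worker per task in parallel and collect the sub-running results."""
    tasks = decompose(records, num_tasks)
    with ThreadPoolExecutor(max_workers=num_tasks) as pool:
        return list(pool.map(run_task, tasks))


def merge_and_store(sub_results: List[int], conn: sqlite3.Connection) -> int:
    """Merge the sub-running results and store the processing result."""
    merged = sum(sub_results)
    conn.execute("CREATE TABLE IF NOT EXISTS results (merged INTEGER)")
    conn.execute("INSERT INTO results VALUES (?)", (merged,))
    conn.commit()
    return merged


conn = sqlite3.connect(":memory:")
result = merge_and_store(process(list(range(100))), conn)
stored = conn.execute("SELECT merged FROM results").fetchone()
```

The number of workers equals the number of tasks, mirroring the scheduling subcomponent, which schedules as many processing containers as there are vehicle work tasks.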
  • the drive test data is decomposed to generate a plurality of vehicle work tasks to be processed in parallel, and the container cluster manager schedules multiple processing containers to process each of these vehicle work tasks in parallel, thereby solving the technical problem in related technologies that hardware resources cannot be fully utilized for high-concurrency task processing.
  • a non-volatile readable storage medium is also provided, wherein the non-volatile readable storage medium includes a stored program, and the program executes the multi-concurrent data processing method of any one of the embodiments of the present application.
  • Each functional module provided by the embodiments of the present application can be run in a multi-concurrent data processing device or a similar computing device, and can also be stored as a part of a non-volatile storage medium.
  • Fig. 5 is a schematic structural diagram of a non-volatile storage medium according to an embodiment of the present application.
  • a program product 50 according to an embodiment of the present application is described, on which a computer program is stored, and when the computer program is executed by a processor, the program code that implements the following steps:
  • the server cluster includes multiple servers, and a corresponding number of processing containers is obtained according to the quantity of computing resources of the server cluster; the computing resources include at least: processor resources and storage resources, wherein at least one master server and at least one slave server are deployed in the server cluster, and the container cluster manager is installed on the master server to monitor the working status of the slave servers.
  • the program code for implementing the following steps: creating a task processing program of the container cluster manager, wherein the task processing program is used to determine, according to preset parameters, the number of processing containers that need to be invoked; encapsulating the task processing program and building a task processing image; and building, based on the task processing image, the containers required for running the vehicle work tasks, wherein the containers include: a management container for scheduling management and a processing container for running tasks.
  • the program code that implements the following steps: receiving multiple vehicle work tasks to be processed in parallel, and scheduling at least one management container and as many processing containers as there are vehicle work tasks; each management container distributes the corresponding vehicle work tasks to the designated processing containers; the processing containers are started, and each processing container runs the vehicle work task assigned to it.
  • the program code that implements the following steps: after each processing container has run its assigned vehicle work task, merging the sub-running results of each vehicle work task to generate a processing result; and storing the processing result in a predetermined database, wherein the database is a database that allows interactive query.
  • the non-volatile storage medium may also be configured as program codes of various preferred or optional method steps provided by the multi-concurrent data processing method.
  • Non-volatile storage media may include a data signal carrying readable program code in baseband or as part of a carrier wave traveling as a data signal. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a non-volatile storage medium may send, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
  • the program code contained in the non-volatile storage medium can be transmitted by any appropriate medium, including but not limited to wireless, cable, optical cable, radio frequency, etc., or any suitable combination of the above.
  • a processor is provided.
  • Fig. 6 is a schematic structural diagram of a processor according to an embodiment of the present application. As shown in FIG. 6 , the processor 60 is configured to run a program, wherein the program executes the multi-concurrent data processing method described in the embodiment of the present application when running.
  • the above-mentioned processor 60 may execute the execution programs of the multi-concurrent data processing method.
  • the processor 60 may be configured to perform the following steps:
  • the processor 60 may also be configured to perform the following steps: the server cluster includes multiple servers, and a corresponding number of processing containers is obtained according to the quantity of computing resources of the server cluster; the computing resources include at least: processor resources and storage resources, wherein at least one master server and at least one slave server are deployed in the server cluster, and the container cluster manager is installed on the master server to monitor the working status of the slave servers.
  • the processor 60 may also be configured to perform the following steps: creating a task processing program of the container cluster manager, wherein the task processing program is used to determine, according to preset parameters, the number of processing containers that need to be invoked; encapsulating the task processing program to build a task processing image; and building, based on the task processing image, the containers required for running the vehicle work tasks, wherein the containers include: a management container for scheduling management and a processing container for running tasks.
  • the processor 60 may also be configured to perform the following steps: receiving a plurality of vehicle work tasks to be processed in parallel, and scheduling at least one management container and as many processing containers as there are vehicle work tasks; each management container distributes the corresponding vehicle work tasks to the designated processing containers; the processing containers are started, and each processing container runs the vehicle work task assigned to it.
  • the processor 60 may also be configured to perform the following steps: after each processing container has run its assigned vehicle work task, merging the sub-running results of each vehicle work task to generate a processing result; and storing the processing result in a predetermined database, wherein the database is a database that allows interactive query.
  • the above-mentioned processor 60 can execute various functional applications and data processing by running software programs and modules stored in the memory, that is, realize the above-mentioned multi-concurrent data processing method.
  • the disclosed technical content can be realized in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units may be a logical function division.
  • multiple units or components may be combined or may be integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of units or modules may be in electrical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.
  • if the integrated unit is realized in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the essence of the technical solution of this application or the part that contributes to the related technology or all or part of the technical solution can be embodied in the form of a software product, and the computer software product is stored in a storage medium.
  • a computer device which may be a personal computer, server or network device, etc.
  • the aforementioned storage media include various media that can store program code, such as a U disk, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), a mobile hard disk, a magnetic disk or an optical disc.
  • the solution provided by the embodiments of the present application can be applied in the multi-concurrent data processing process: the drive test data collected by the vehicle is obtained; the drive test data is decomposed to generate multiple vehicle work tasks to be processed in parallel, wherein a vehicle work task is a task of controlling the vehicle to work according to a specified control instruction; the multiple vehicle work tasks to be processed in parallel are submitted to the container cluster manager, wherein the container cluster manager is used for orchestrating and scheduling computing resources in the server cluster; and, through the container cluster manager, a plurality of processing containers are scheduled to perform parallel processing on each vehicle work task to be processed in parallel and generate a processing result, wherein each of the processing containers has a set of the computing resources, and the computing resources are used for processing the vehicle work tasks.
  • the above solution decomposes the drive test data to generate multiple vehicle work tasks to be processed in parallel and thereby makes full use of hardware resources to process the tasks in parallel, which solves the technical problem in related technologies that hardware resources cannot be fully utilized for high-concurrency task processing.

Abstract

Disclosed in embodiments of the present application are a multi-concurrency data processing method and device. The method comprises: obtaining road test data acquired by a vehicle; decomposing the road test data to generate a plurality of vehicle work tasks to be processed in parallel, the vehicle work tasks being tasks for controlling the vehicle to work according to specified control instructions; submitting the plurality of vehicle work tasks to be processed in parallel to a container cluster manager, the container cluster manager being used for orchestrating and scheduling computing resources in a server cluster; and scheduling a plurality of processing containers by means of the container cluster manager to respectively perform parallel processing on the vehicle work tasks to be processed in parallel so as to generate processing results, each processing container having a set of computing resources, and the computing resources being used for processing the vehicle work task.

Description

Multi-concurrent data processing method and device
This application claims the priority of the Chinese patent application with priority number 202111676678.8, entitled "Multi-Concurrent Data Processing Method and Device" and filed with the China Patent Office on December 31, 2021, the entire contents of which are incorporated into this application by reference.
Technical field
The embodiments of the present application relate to the vehicle field, and in particular to a multi-concurrent data processing method and device.
Background
At present, continuously optimizing an automatic driving system requires that its different modules be trained in a timely manner on massive amounts of high-quality data.
In related technologies, hardware resources cannot be expanded without limit, while the data collected by self-driving vehicles every day is growing rapidly: a single vehicle collects 700MB of data per second, and the data volume reaches the 10T level in one day. At the same time, the speed of algorithm iteration depends on how fast high-quality data is produced, so if the data collected in drive tests cannot be processed in time, the optimization speed of the automatic driving system is greatly affected. Existing processing logic, however, still cannot make full use of hardware resources for high-concurrency task processing.
No effective solution has yet been proposed for the problem in related technologies that hardware resources cannot be fully utilized for high-concurrency task processing.
Summary of the invention
Embodiments of the present application provide a multi-concurrent data processing method and device, so as to at least solve the technical problem in related technologies that hardware resources cannot be fully utilized for high-concurrency task processing.
According to one aspect of the embodiments of the present application, a multi-concurrent data processing method is provided, including: acquiring drive test data collected by a vehicle; decomposing the drive test data to generate multiple vehicle work tasks to be processed in parallel, wherein a vehicle work task is a task of controlling the vehicle to work according to specified control instructions; submitting the multiple vehicle work tasks to be processed in parallel to a container cluster manager, wherein the container cluster manager is used to orchestrate and schedule computing resources in a server cluster; and scheduling, through the container cluster manager, multiple processing containers to process each vehicle work task to be processed in parallel and generate a processing result, wherein each processing container has a set of computing resources, and the computing resources are used to process the vehicle work tasks.
Optionally, in the method, the server cluster includes multiple servers, and a corresponding number of processing containers is obtained by dividing the server cluster according to the quantity of its computing resources; the computing resources include at least: processor resources and storage resources, wherein the server cluster is deployed with at least one master server and at least one slave server, and the container cluster manager is installed on the master server and used to monitor the working status of the slave servers.
Optionally, a task processing program of the container cluster manager is created, wherein the task processing program is used to determine, according to preset parameters, the number of processing containers that need to be invoked; the task processing program is encapsulated to build a task processing image; and, based on the task processing image, the containers required for running the vehicle work tasks are built, wherein the containers include: a management container for scheduling management and a processing container for running tasks.
Optionally, scheduling multiple processing containers through the container cluster manager to process each vehicle work task to be processed in parallel includes: receiving multiple vehicle work tasks to be processed in parallel, and scheduling at least one management container and as many processing containers as there are vehicle work tasks; distributing, by each management container, the corresponding vehicle work tasks to the designated processing containers; and starting the processing containers, each of which runs the vehicle work task assigned to it.
Optionally, after each processing container has run its assigned vehicle work task, the method further includes: merging the sub-running results of each vehicle work task to generate a processing result; and storing the processing result in a predetermined database, wherein the database is a database that allows interactive query.
According to another aspect of the embodiments of the present application, a multi-concurrent data processing device is also provided, including: an acquisition component, configured to acquire drive test data collected by a vehicle; a decomposition component, configured to decompose the drive test data and generate multiple vehicle work tasks to be processed in parallel, wherein a vehicle work task is a task of controlling the vehicle to work according to specified control instructions; a submission component, configured to submit the multiple vehicle work tasks to be processed in parallel to a container cluster manager, wherein the container cluster manager is used to orchestrate and schedule computing resources in a server cluster; and a processing component, configured to schedule, through the container cluster manager, multiple processing containers to process each vehicle work task to be processed in parallel and generate a processing result, wherein each processing container has a set of computing resources, and the computing resources are used to process the vehicle work tasks.
Optionally, the server cluster includes multiple servers, and a corresponding number of processing containers is obtained by dividing the server cluster according to the quantity of its computing resources; the computing resources include at least: processor resources and storage resources, wherein the server cluster is deployed with at least one master server and at least one slave server, and the container cluster manager is installed on the master server and used to monitor the working status of the slave servers.
Optionally, the device further includes: a creation component, configured to create a task processing program of the container cluster manager, wherein the task processing program is used to determine, according to preset parameters, the number of processing containers that need to be invoked; an encapsulation component, configured to encapsulate the task processing program and build a task processing image; and a construction component, configured to build, based on the task processing image, the containers required for running the vehicle work tasks, wherein the containers include: a management container for scheduling management and a processing container for running tasks.
Optionally, the processing component includes: a receiving subcomponent, configured to receive multiple vehicle work tasks to be processed in parallel; a scheduling subcomponent, configured to schedule at least one management container and as many processing containers as there are vehicle work tasks; a distribution subcomponent, configured so that each management container distributes the corresponding vehicle work tasks to the designated processing containers; and a running subcomponent, configured to start the processing containers, each of which runs the vehicle work task assigned to it.
Optionally, the device further includes: a merging component, configured to merge the sub-running results of each vehicle work task and generate a processing result; and a storage component, configured to store the processing result in a predetermined database, wherein the database is a database that allows interactive query.
According to another aspect of the embodiments of the present invention, a non-volatile readable storage medium is also provided. The non-volatile storage medium includes a stored program, wherein, when the program runs, the device where the non-volatile storage medium is located is controlled to execute the above multi-concurrent data processing method.
According to another aspect of the embodiments of the present application, a processor is also provided. The processor is configured to run a program, wherein the above multi-concurrent data processing method is executed when the program runs.
In the embodiments of the present application, drive test data collected by a vehicle is acquired; the drive test data is decomposed to generate multiple vehicle work tasks to be processed in parallel, wherein a vehicle work task is a task of controlling the vehicle to work according to specified control instructions; the multiple vehicle work tasks to be processed in parallel are submitted to the container cluster manager, wherein the container cluster manager is used to orchestrate and schedule computing resources in the server cluster; and multiple processing containers are scheduled through the container cluster manager to process each vehicle work task to be processed in parallel and generate a processing result, wherein each processing container has a set of computing resources, and the computing resources are used to process the vehicle work tasks. That is to say, the present application decomposes the drive test data to generate multiple vehicle work tasks to be processed in parallel and schedules, through the container cluster manager, multiple processing containers to process each of these tasks in parallel, thereby solving the technical problem in related technologies that hardware resources cannot be fully utilized for high-concurrency task processing.
Brief description of the drawings
The drawings described here provide a further understanding of the embodiments of the present application and constitute a part of the present application. The schematic embodiments of the present application and their descriptions are used to explain the present application and do not constitute an improper limitation on it. In the drawings:
Fig. 1 is a flowchart of a multi-concurrent data processing method according to an embodiment of the present application;
Fig. 2 is a schematic diagram of task scheduling according to an embodiment of the present application;
Fig. 3 is a schematic diagram of task decomposition according to an embodiment of the present application;
Fig. 4 is a schematic diagram of a multi-concurrent data processing device according to an embodiment of the present application;
Fig. 5 is a schematic structural diagram of a non-volatile storage medium according to an embodiment of the present application;
Fig. 6 is a schematic structural diagram of a processor according to an embodiment of the present application.
Detailed description
To enable those skilled in the art to better understand the solution of the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments in the present application, without creative effort, shall fall within the scope of protection of the present application.
It should be noted that the terms "first", "second", and the like in the description, the claims, and the above drawings of the present application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data so used are interchangeable under appropriate circumstances, so that the embodiments of the present application described here can be implemented in orders other than those illustrated or described here. In addition, the terms "comprising" and "having", and any variations of them, are intended to cover a non-exclusive inclusion. For example, a process, method, system, product, or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
According to an embodiment of the present application, an embodiment of a multi-concurrent data processing method is provided. It should be noted that the steps shown in the flowcharts of the drawings may be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in a different order.
Fig. 1 is a flowchart of a multi-concurrent data processing method according to an embodiment of the present application. As shown in Fig. 1, the method may include the following steps:
Step S102: acquire road test data collected by the vehicle.
In the technical solution provided in step S102 of the present invention, the road test data may be data collected by the vehicle during automatic driving, such as the speed, acceleration, and time of the automatic driving process, or the performance of each module during automatic driving.
In this embodiment, the driving trajectory, driving speed, acceleration, and performance parameters of each module of the test vehicle during automatic driving may be collected through cameras, radar, and the sensors and controllers of the target vehicle.
Step S104: decompose the road test data to generate a plurality of vehicle work tasks to be processed in parallel, where a vehicle work task is a task of controlling the vehicle to operate according to a specified control instruction.
In the technical solution provided in step S104 of the present invention, a vehicle work task is a task of controlling the vehicle to operate according to a specified control instruction, for example whether to accelerate, decelerate, or brake hard on a certain road section. Decomposing the road test data may consist of labeling the road test data according to established rules. The rules may cover whether the road section where the data was collected has traffic lights, whether pedestrians are present and how many, and whether hard braking occurred during driving.
For example, the road test data collected by the vehicle is acquired; a data processing program decomposes the acquired road test data according to the established rules and marks it with specific labels; and the data processing program is then scheduled to process the labeled road test data, generating a plurality of vehicle work tasks to be processed in parallel. The data processing program may be a cluster computing engine (Spark).
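The rule-based labeling described above can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the `Frame` record, its field names, the tag strings, and the 4.0 m/s² hard-braking threshold are all assumptions introduced here for illustration only.

```python
from dataclasses import dataclass, field

# Assumed deceleration threshold (m/s^2) for flagging hard braking; hypothetical.
HARD_BRAKE_DECEL = 4.0


@dataclass
class Frame:
    """One hypothetical road-test sample; the field names are illustrative."""
    timestamp: float            # seconds
    speed_mps: float            # vehicle speed in m/s
    traffic_light_visible: bool
    pedestrian_count: int
    tags: list = field(default_factory=list)


def label_frames(frames):
    """Attach rule-based tags to each frame in place and return the frames."""
    prev = None
    for f in frames:
        if f.traffic_light_visible:
            f.tags.append("traffic_light")
        if f.pedestrian_count > 0:
            f.tags.append(f"pedestrians:{f.pedestrian_count}")
        if prev is not None:
            dt = f.timestamp - prev.timestamp
            # Deceleration between consecutive samples, guarded against dt <= 0.
            if dt > 0 and (prev.speed_mps - f.speed_mps) / dt >= HARD_BRAKE_DECEL:
                f.tags.append("hard_brake")
        prev = f
    return frames


frames = label_frames([
    Frame(0.0, 20.0, True, 0),
    Frame(1.0, 14.0, False, 2),
])
```

In a real deployment the same rules would presumably run inside the cluster computing engine rather than in a plain loop; the loop only makes the rule set concrete.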
Step S106: submit the plurality of vehicle work tasks to be processed in parallel to the container cluster manager, where the container cluster manager orchestrates and schedules the computing resources in the server cluster.
In the technical solution provided in step S106 of the present invention, the container cluster manager may be the large-scale container cluster management tool Kubernetes (K8S for short), also called a container orchestration engine or container orchestrator. It is an open-source container orchestration engine used to schedule and manage tasks. The container cluster manager enables automated deployment, large-scale scalability, and containerized application management, and therefore better orchestration and scheduling of the computing resources in the server cluster.
Optionally, the plurality of vehicle work tasks to be processed in parallel are submitted to the container cluster manager, which schedules and manages the submitted tasks.
Step S108: schedule, through the container cluster manager, a plurality of processing containers to respectively process each vehicle work task in parallel and generate a processing result, where each processing container has a set of the computing resources, and the computing resources are used to process vehicle work tasks.
In the technical solution provided in step S108 of the present invention, the processing container may be a driver unit, denoted driver, which distributes tasks. The computing resource may be an execution unit, denoted executor, which processes vehicle work tasks.
Optionally, one server is selected as the node of the container cluster manager and the relevant environment is deployed on it. The plurality of vehicle work tasks to be processed in parallel are submitted to the container cluster manager, and the container cluster manager schedules a plurality of driver units. Optionally, the number of driver units may be defined by the system or given as a parameter when the user submits a task; it may be two, three, four, and so on, and is not specifically limited here. A driver unit can start a plurality of execution units, which process the vehicle work tasks in parallel and generate a processing result. The generated processing result can be queried in a database.
As can be seen from the above, in the above embodiments of the present application, road test data collected by a vehicle is acquired; the road test data is decomposed to generate a plurality of vehicle work tasks to be processed in parallel, where a vehicle work task is a task of controlling the vehicle to operate according to a specified control instruction; the plurality of vehicle work tasks to be processed in parallel are submitted to a container cluster manager, which orchestrates and schedules the computing resources in a server cluster; and the container cluster manager schedules a plurality of processing containers to respectively process each vehicle work task in parallel and generate a processing result, where each processing container has a set of the computing resources used to process vehicle work tasks. In other words, the present application decomposes the road test data into a plurality of vehicle work tasks to be processed in parallel and uses the container cluster manager to schedule a plurality of processing containers that process those tasks in parallel, thereby solving the technical problem in the related art that hardware resources cannot be fully utilized for highly concurrent task processing.
The above method of this embodiment is further described below.
As an optional embodiment, the server cluster includes a plurality of servers, and a corresponding number of processing containers is obtained by dividing the computing resources of the server cluster according to their quantity. The computing resources include at least processor resources and storage resources. At least one master server and at least one slave server are deployed in the server cluster, and the container cluster manager is installed on the master server and monitors the working status of the slave servers.
In this embodiment, the server cluster includes a plurality of servers, where the servers may be a plurality of central processing units in the target vehicle, and a corresponding number of processing containers is obtained by dividing the computing resources of the server cluster according to their quantity.
Optionally, the data processing program is submitted to the container cluster manager, which orchestrates and schedules the servers in the server cluster to manage resources. At least one master server and at least one slave server are deployed in the server cluster, and the master server schedules and manages the slave servers. The container cluster manager is installed on the master server and monitors the working status of the slave servers.
For example, the data processing program is submitted to the container cluster manager, which orchestrates and schedules the servers in the server cluster to manage resources and generates processing containers and computing containers. Optionally, the processing container may be a driver unit, denoted driver, which distributes tasks. The computing container may be an execution unit, denoted executor, which processes vehicle work tasks.
As an optional embodiment, the method further includes: creating a task processing program for the container cluster manager, where the task processing program determines, according to preset parameters, the number of processing containers that need to be called; packaging the task processing program to build a task processing image; and building, based on the task processing image, the containers required to run the vehicle work tasks, where the containers include a management container for scheduling management and processing containers for running tasks.
In this embodiment, a task processing program of the container cluster manager is created, and the task processing program determines, according to preset parameters, the number of processing containers that need to be called. For example, a task processing program is written that determines the required number of processing containers from the preset parameters.
Optionally, in this embodiment, the data is screened according to the established rules and marked with specific labels, and the task processing program is packaged. The relevant image build rules (a Dockerfile) are then written to instruct the system to build the image according to the specified steps. A task processing image is built, and based on it the management container for scheduling management and the processing containers for running tasks, both of which are needed when the vehicle work tasks run, are built.
As an optional embodiment, scheduling a plurality of processing containers through the container cluster manager to respectively process each vehicle work task in parallel includes: on receiving the plurality of vehicle work tasks to be processed in parallel, scheduling at least one management container and as many processing containers as there are vehicle work tasks; distributing, by each management container, the corresponding vehicle work tasks to the designated processing containers; and starting the processing containers, each of which runs the vehicle work task assigned to it.
In this embodiment, a plurality of vehicle work tasks to be processed in parallel are received; at least one management container and as many processing containers as there are vehicle work tasks are scheduled; each management container distributes the corresponding vehicle work tasks to the designated processing containers; and the processing containers are started so that each runs the vehicle work task assigned to it.
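The dispatch pattern described above (one management role handing each task to its own processing worker) can be mimicked on a single machine with a thread pool. This is only an analogy under stated assumptions: threads stand in for processing containers, `run_task` is a hypothetical placeholder for the real workload, and no container cluster manager is involved.

```python
from concurrent.futures import ThreadPoolExecutor


def run_task(task):
    """Stand-in for the work one processing container would do (hypothetical)."""
    return {"task_id": task["id"], "records": len(task["data"])}


def dispatch(tasks):
    """Play the management-container role: give each task its own worker.

    One worker per task mirrors 'as many processing containers as there are
    vehicle work tasks'; max(1, ...) keeps the pool valid for an empty list.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(tasks))) as pool:
        # pool.map returns results in the same order as the submitted tasks.
        return list(pool.map(run_task, tasks))


results = dispatch([
    {"id": 1, "data": [10, 20]},
    {"id": 2, "data": []},
    {"id": 3, "data": [5]},
])
```

In the scheme the text describes, the container cluster manager rather than a local pool would place each worker on a server with free processor and storage resources.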
As an optional embodiment, after each processing container runs the vehicle work task assigned to it, the method further includes: merging the sub-run results of the vehicle work tasks to generate the processing result; and storing the processing result in a predetermined database, where the database allows interactive queries.
In this embodiment, the run results of the split work tasks are merged to obtain a complete processing result, the processing result is stored in a predetermined database, and the data to be queried can then be selected from that database. The predetermined database may be a MongoDB database, which is not specifically limited here.
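As a sketch of the merge step, assume each sub-task returns a dictionary of tag counts; the source does not specify the result format, so this shape is hypothetical. Merging then amounts to summing counts per tag. Writing the merged result to the database is deliberately left out.

```python
from collections import Counter


def merge_results(sub_results):
    """Sum per-tag counts from all sub-task results into one dictionary."""
    total = Counter()
    for partial in sub_results:
        total.update(partial)   # Counter.update adds counts instead of replacing
    return dict(total)


merged = merge_results([
    {"hard_brake": 1, "traffic_light": 2},
    {"hard_brake": 3},
    {},
])
```

The merged dictionary is the kind of document that could then be inserted into the predetermined database for interactive querying.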
In the embodiments of the present application, road test data collected by a vehicle is acquired; the road test data is decomposed to generate a plurality of vehicle work tasks to be processed in parallel, where a vehicle work task is a task of controlling the vehicle to operate according to a specified control instruction; the plurality of vehicle work tasks to be processed in parallel are submitted to a container cluster manager, which orchestrates and schedules the computing resources in a server cluster; and the container cluster manager schedules a plurality of processing containers to respectively process each vehicle work task in parallel and generate a processing result, where each processing container has a set of computing resources used to process vehicle work tasks. In other words, the present application decomposes the road test data into a plurality of vehicle work tasks to be processed in parallel and uses the container cluster manager to schedule a plurality of processing containers that process those tasks in parallel, thereby solving the technical problem in the related art that hardware resources cannot be fully utilized for highly concurrent task processing.
The technical solutions of the embodiments of the present application are illustrated below with reference to preferred implementations.
The core of a self-driving car is the automatic driving system, which combines artificial intelligence, visual computing, radar, and a global positioning system. The quality of the automatic driving system therefore determines whether self-driving cars can be applied to everyday life on a large scale. All behavior of an automatic driving system is ultimately data-driven: the system can be continuously optimized only by promptly training its modules on massive amounts of high-quality data. Faced with the massive data collected during automatic driving tests, making full use of existing hardware to process it quickly and to filter out the valuable data is therefore crucial to the rapid iteration and development of automatic driving systems.
The main problem facing an automatic driving data platform is how to use limited hardware resources to process rapidly accumulating test data in time. On the one hand, hardware resources cannot be expanded without limit. On the other hand, the data collected by self-driving cars every day is growing extremely fast: one vehicle collects 700 MB of data per second, and the volume reaches the order of 10 TB in a day. Meanwhile, the speed of algorithm iteration depends on how quickly high-quality data is produced, so failing to process the collected road test data in time greatly slows the optimization of the automatic driving system. Existing processing logic cannot make full use of hardware resources. For example, a single machine may have a dozen or twenty-odd central processing units, yet it does not run the same number of tasks in parallel, which wastes hardware resources.
To overcome the above problems, one related technology proposes a high-concurrency data processing system and method that uses the data processing characteristics of each server to relieve the excessive server load caused by a large number of data requests. Another proposes a solution and system for highly concurrent reception of massive data, which allocates the processing volume of each application server according to a load balancing strategy, thereby sending and receiving massive data with high concurrency and storing the processed message data, which addresses the low efficiency of data sending and receiving.
However, the scheduling unit in both of the above methods is the server. High-concurrency execution therefore requires a large increase in the number of servers, and the technical problem that hardware resources cannot be fully utilized remains.
The present application proposes a multi-concurrent automated data processing system based on a large-scale container cluster management tool and a cluster computing engine. The cluster computing engine uses a state-of-the-art directed acyclic graph (DAG) scheduler, a query optimizer, and a physical execution engine, and is described as 100 times faster than earlier data processing tools. The large-scale container cluster management tool Kubernetes is an open-source container orchestration engine that supports automated deployment, large-scale scalability, and containerized application management.
The present application uses the cluster computing engine to break down a single task, so that the single task can create a distributed data set that is operated on in parallel and can aggregate the results. At the same time, the large-scale container cluster management tool schedules multiple tasks to execute in parallel. Existing hardware resources can thus be fully utilized and massive data processed with high concurrency, so that data processing speed is limited only by the total number of central processing units across the servers rather than by the number of servers. This solves the technical problem in the prior art that hardware resources cannot be fully utilized for highly concurrent task processing. The multi-concurrent data processing method of this embodiment may include the following parts.
Part 1: building the K8S cluster.
Step 1: select one server as the control node of the open-source container orchestration engine and deploy the relevant environment on it. The control (Master) node mainly serves as the management and control center of the cluster.
Step 2: use the other servers as workload (Node) nodes of the open-source container orchestration engine and deploy the relevant environment on them. The workloads on the workload nodes are assigned by the control node. The workload nodes mainly maintain the running containers and provide the runtime environment of the container orchestration engine.
Step 3: test the operation of the container orchestration engine cluster to ensure that functions such as inter-node communication work normally.
Part 2: writing the cluster computing engine processing program and packaging it into a container.
Step 1: the cluster computing engine processing program is developed mainly in Python and consists of two stages:
Stage 1: partition the task submitted by the user into blocks to create a distributed data set that is operated on in parallel. The program can set how many pieces a single task is split into for execution, that is, the number of Tasks.
Stage 2: the data processing program screens the massive data according to the established rules and marks it with specific labels. The main rules include whether there are traffic lights, whether pedestrians are present and how many, and whether hard braking occurred.
Step 2: package the cluster computing engine processing program into an image so that the container orchestration engine cluster can schedule and run it.
Part 3: submitting tasks for large-scale, high-concurrency data processing.
Step 1: set the parameters for task submission, mainly the following. The number of resource allocation units to run (spark.executor.instances): at present each additional resource allocation unit is set to require one central processing unit, and the parallelism of a single task is the number of resource allocation units multiplied by the number of cores per resource allocation unit. The base image used for the run (spark.kubernetes.container.image): the image packaged in Step 2 of Part 2 can be used. The mounted directories: because the raw data is stored on a storage server and is not visible to the containers started at run time, the data must be made visible to the containers by mounting. Specifically, the local directory to be mounted during the run is specified (for example, spark.kubernetes.driver.volumes.hostPath.spark-local-dir-2.options.path and spark.kubernetes.driver.volumes.hostPath.spark-local-dir-2.mount.path), which makes the data visible to the container.
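As an illustration of how these parameters might be assembled, the sketch below builds a `spark-submit` argument list in Python. The configuration keys are the ones named above plus `spark.executor.cores`; every value (API server address, image name, host path, application path) is a hypothetical placeholder, and the command is only constructed, not executed.

```python
def build_spark_submit_args(master_url, app_path, conf):
    """Return the argument list for a cluster-mode spark-submit invocation."""
    args = ["spark-submit", "--master", master_url, "--deploy-mode", "cluster"]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    args.append(app_path)
    return args


# All values are hypothetical placeholders, not taken from the source.
VOLUME = "spark.kubernetes.driver.volumes.hostPath.spark-local-dir-2"
conf = {
    "spark.executor.instances": 5,      # number of resource allocation units
    "spark.executor.cores": 2,          # cores per resource allocation unit
    "spark.kubernetes.container.image": "registry.example.com/roadtest-spark:latest",
    f"{VOLUME}.options.path": "/data/roadtest",   # directory on the host
    f"{VOLUME}.mount.path": "/data/roadtest",     # where the container sees it
}

args = build_spark_submit_args(
    "k8s://https://192.0.2.10:6443",    # placeholder API server address
    "local:///opt/app/process.py",      # placeholder path inside the image
    conf,
)
print(" ".join(args))
```

Building the list programmatically keeps the long volume keys in one place; the list could be handed to `subprocess.run` on a machine where Spark is installed.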
Step 2: submit the task for execution.
In this embodiment, the following points should be noted. (1) Calculating single-task concurrency: the number of computing nodes (Executors) requested and the number of cores per computing node determine how many Tasks can execute in parallel at the same moment. For example, if a task is split into ten parts, ten Tasks are generated during computation. If the resource configuration is five computing nodes with two central processing units each, ten Tasks can run in parallel at once, so the single-task concurrency is 5*2=10. (2) Multi-task concurrency is the number of tasks multiplied by the single-task concurrency. For example, if the single-task concurrency is 10 and five tasks are submitted, the concurrency of the entire cluster is 5*10=50.
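The arithmetic above can be written out as a short Python sketch. The function names are illustrative, and capping the single-task concurrency at the number of Tasks is an assumption added here (a task split into fewer parts than there are core slots cannot keep every slot busy); the source only gives the case where the two are equal.

```python
def single_task_concurrency(executors, cores_per_executor, num_tasks):
    """Tasks of one submitted task that can run at the same moment."""
    slots = executors * cores_per_executor
    # Assumption: concurrency cannot exceed the number of Tasks available.
    return min(slots, num_tasks)


def cluster_concurrency(num_submitted, per_task_concurrency):
    """Total concurrency when several tasks are submitted together."""
    return num_submitted * per_task_concurrency


per_task = single_task_concurrency(executors=5, cores_per_executor=2, num_tasks=10)
total = cluster_concurrency(num_submitted=5, per_task_concurrency=per_task)
print(per_task, total)  # 10 50
```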
Each part of the technical solutions of the embodiments of the present invention is described in further detail below.
For the first part, see Fig. 2, which is a schematic diagram of task scheduling according to an embodiment of the present invention.
Driver unit 201 and driver unit 202 are generated after a cluster computing engine task is submitted to the container orchestration engine. They mainly distribute tasks and schedule and manage the execution units that are generated, and they are the management units. Driver unit 201 includes execution units 2011, 2012, 2013, 2014, and 2015; driver unit 202 includes execution units 2021, 2022, 2023, 2024, and 2025.
Execution units 2011, 2012, 2013, 2014, and 2015 and execution units 2021, 2022, 2023, 2024, and 2025 are generated after a cluster computing engine task is submitted to the container orchestration engine. They serve mainly as the units that execute specific tasks. Each is in fact a set of computing resources, that is, a set of central processing unit cores and memory (CPU cores, memory).
Container orchestration engine 203 mainly schedules and manages all server resources and submitted tasks.
Spark processing program image 204 mainly serves as the base image for building the containers required to run tasks.
MongoDB database 205 stores the results of data screening and processing.
The specific steps are as follows. After the user submits a task, 201, 202, and the execution units required to execute the task are created based on 204. 201 and 202 are mainly responsible for scheduling and managing the execution units, and the execution units are mainly responsible for executing specific tasks. The number of execution units is determined by the parameters the user specifies when submitting the task. 203 schedules and manages 201, and the final results are stored in 205.
For the second part, see Fig. 3, which is a schematic diagram of task decomposition according to an embodiment of the present invention.
Data processing program task 1 (25G) 301, data processing program task 2 (25G) 302, data processing program task 3 (25G) 303, and data processing program task 4 (25G) 304 are the execution modules obtained after the original task data is decomposed, that is, the task execution units formed by decomposing the vehicle-collected data according to the specified parameters.
Cluster computing engine tool 305 is responsible for decomposing the task of processing the vehicle-collected data, scheduling the data processing program to process the data, and merging the execution results of the split tasks into a complete result.
Vehicle-collected data (100G) 306 is the raw, unprocessed data collected in road tests.
Database 307 stores the results of data screening and processing.
The specific steps are as follows. 306 is submitted to 305 as a task to be processed; 305 decomposes 306 into parallel tasks, yielding a number of tasks 301, 302, 303, and 304; during execution, 301, 302, 303, and 304 exchange data with 307 and store the data processing results in 307.
The technical solutions of the embodiments of the present invention are described in further detail below with reference to a specific embodiment.
In the first step, one server is selected as the control node of the container cluster management tool, and version v1.20.1 of the container cluster management tool is installed on it. During installation, the image source can be specified with --image-repository=registry.cn-hangzhou.aliyuncs.com/google_containers, which points the download address at a mirror in China and avoids unstable pulls from the original upstream registry. Docker must be installed beforehand.
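As a hedged illustration of the first step, the sketch below assembles an installation command line. The text names only a "container cluster management tool"; the sketch assumes that this tool is Kubernetes initialised with kubeadm, whose init command accepts the --kubernetes-version and --image-repository flags. The helper name kubeadm_init_command is hypothetical, and the command is only printed, not executed.

```python
import shlex


def kubeadm_init_command(
    version="v1.20.1",
    image_repository="registry.cn-hangzhou.aliyuncs.com/google_containers",
):
    """Build the argument list for initialising the control node (assumes kubeadm)."""
    return [
        "kubeadm",
        "init",
        f"--kubernetes-version={version}",
        f"--image-repository={image_repository}",
    ]


# Print the command so it can be reviewed before being run on the control node.
print(shlex.join(kubeadm_init_command()))
```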
In the second step, the other servers serve as load nodes of the container cluster management tool. The relevant environment is deployed on them, and the load nodes are then added to the cluster. After installation, the status of the load nodes can be viewed on the control node, and network, deployment and other related tests can be carried out.
In the third step, the processing program for the cluster computing engine is written. For example, the number of blocks is set to five, so that after a task is submitted the cluster computing engine automatically divides the data to be processed into five roughly equal parts. The data screening rules mainly include whether the road section where the data was collected has traffic lights, whether pedestrians are present and how many, and whether hard braking occurred during driving.
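The screening rules and the five-way blocking can be sketched in plain Python. This is not the processing program of the embodiment, which runs on the cluster computing engine; the Segment record, its field names and the round-robin partition helper are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical summary of one road section of collected data."""
    segment_id: str
    has_traffic_light: bool
    pedestrian_count: int
    hard_brake: bool


def tag_segment(segment):
    """Apply the three screening rules and return the matching tags."""
    tags = []
    if segment.has_traffic_light:
        tags.append("traffic_light")
    if segment.pedestrian_count > 0:
        tags.append(f"pedestrians:{segment.pedestrian_count}")
    if segment.hard_brake:
        tags.append("hard_brake")
    return tags


def partition(items, num_blocks=5):
    """Split items round-robin into num_blocks roughly equal blocks."""
    return [items[i::num_blocks] for i in range(num_blocks)]


segments = [Segment(f"seg-{i}", i % 2 == 0, i % 3, i % 4 == 0) for i in range(12)]
for block_index, block in enumerate(partition(segments)):
    print(block_index, [(s.segment_id, tag_segment(s)) for s in block])
```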
Also in the third step, the cluster computing engine processing program is packaged into an image: the image build rules are written and the image is built according to the specified steps. An entrypoint (the command to be executed after the container starts) must be set as the default operation performed when the container is started, which raises the degree of automation and meets the needs of the running program.
In the fourth step, the parameters for submitting a task for large-scale, highly concurrent data processing are set. These include: spark.executor.instances, which specifies how many load containers are created after the task is submitted and is set to five here; spark.kubernetes.container.image, the name of the image into which the Spark processing program is packaged; and spark.kubernetes.driver.volumes.hostPath.spark-local-dir-2.options.path together with spark.kubernetes.driver.volumes.hostPath.spark-local-dir-2.mount.path, which specify the local directory mounted at run time. Once these are set, the task is submitted.
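The parameters listed in the fourth step can be assembled into a submission command as sketched below. The four configuration keys are those named in the text; the helper name spark_submit_command, the cluster deploy mode, and every value in the usage example (master URL, image name, application path, directories) are placeholders assumed for illustration.

```python
def spark_submit_command(master_url, image, app_path, host_dir, mount_dir, executors=5):
    """Build a spark-submit argument list from the parameters named in the fourth step."""
    volume_prefix = "spark.kubernetes.driver.volumes.hostPath.spark-local-dir-2"
    conf = {
        "spark.executor.instances": str(executors),
        "spark.kubernetes.container.image": image,
        f"{volume_prefix}.options.path": host_dir,
        f"{volume_prefix}.mount.path": mount_dir,
    }
    command = ["spark-submit", "--master", master_url, "--deploy-mode", "cluster"]
    for key, value in conf.items():
        # Each setting is passed as a separate --conf key=value pair.
        command += ["--conf", f"{key}={value}"]
    command.append(app_path)
    return command


# Placeholder values only; none of them come from the embodiment.
example = spark_submit_command(
    master_url="k8s://https://control-node.example:6443",
    image="example/roadtest-processor:latest",
    app_path="local:///opt/app/process.py",
    host_dir="/data/roadtest",
    mount_dir="/mnt/roadtest",
)
print(" ".join(example))
```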
In the fifth step, the tasks are submitted. Two folders are selected, each holding 100G of road test data, and one task is submitted per folder. After submission, the container cluster management tool creates two management containers (Spark drivers), and each management container starts five processing containers (Spark executors). The total of 200G of data is thus divided into ten 20G tasks executed in parallel, so the cluster parallelism is 10. After the run completes, the results can be queried in the database.
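The figures in the fifth step follow from simple arithmetic, which the short sketch below reproduces. The function name parallelism and the dictionary keys are assumptions chosen for readability.

```python
def parallelism(num_submissions, executors_per_driver, data_per_submission_gb):
    """Derive container counts and per-task data volume for a batch of submissions."""
    return {
        # One management container (driver) is created per submitted task.
        "drivers": num_submissions,
        # Each driver starts its own set of processing containers (executors).
        "executors": num_submissions * executors_per_driver,
        "gb_per_executor": data_per_submission_gb / executors_per_driver,
        "total_gb": num_submissions * data_per_submission_gb,
    }


# Two folders of 100G each, five executors per driver.
print(parallelism(2, 5, 100))
```

With these inputs the sketch yields 2 drivers, 10 executors, 20G per executor and 200G in total, in agreement with the text.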
As can be seen from the above, this embodiment has the following characteristics. When tasks are executed in units of servers, concurrency is limited by the number of servers, yet a single server may have 40 central processing units, which wastes a great deal of resources. Combining the container orchestration engine with the cluster computing engine makes full use of server hardware resources and greatly increases task execution concurrency: the unit of task execution changes from the number of servers to the number of central processing units, maximizing efficiency under limited resources and saving both cost and time. Using the container orchestration engine for task scheduling and management lets developers focus on program development rather than spending large amounts of time on deploying and scaling containerized applications, and helps them manage the cluster simply and efficiently. Using the cluster computing engine for task decomposition frees developers from the decomposition and scheduling logic, so that the time saved can go into developing the data processing program.
According to an embodiment of the present application, a multi-concurrent data processing device is also provided. It should be noted that the multi-concurrent data processing device can be used to execute the multi-concurrent data processing method described above.
FIG. 4 is a schematic diagram of a multi-concurrent data processing device according to an embodiment of the present application. As shown in FIG. 4, the multi-concurrent data processing device 400 may include an acquisition component 401, a decomposition component 402, a submission component 403 and a processing component 404.
It should be noted here that the acquisition component 401, the decomposition component 402, the submission component 403 and the processing component 404 may run in a terminal as part of the device, and the functions implemented by these components may be executed by a processor in the terminal. The terminal may be a smartphone (such as an Android phone or an iOS phone), a tablet computer, a palmtop computer, a mobile Internet device (MID), a PAD or another terminal device.
The acquisition component 401 is configured to acquire the road test data collected by the vehicle.
The decomposition component 402 is configured to decompose the road test data to generate a plurality of vehicle work tasks to be processed in parallel, wherein a vehicle work task is a task of controlling the vehicle to work according to a specified control instruction.
The submission component 403 is configured to submit the plurality of vehicle work tasks to be processed in parallel to a container cluster manager, wherein the container cluster manager is used to orchestrate and schedule the computing resources in a server cluster.
The processing component 404 is configured to schedule, through the container cluster manager, a plurality of processing containers to process each of the vehicle work tasks in parallel and generate a processing result, wherein each processing container has a set of the computing resources, and the computing resources are used to process the vehicle work tasks.
Optionally, the server cluster includes a plurality of servers, and a corresponding number of the processing containers is obtained by dividing according to the quantity of computing resources of the server cluster. The computing resources include at least processor resources and storage resources. The server cluster is deployed with at least one master server and at least one slave server, and the container cluster manager is installed on the master server and used to monitor the working status of the slave servers.
Optionally, the device further includes: a creation component, configured to create a task processing program of the container cluster manager, wherein the task processing program is used to determine, according to preset parameters, the number of processing containers to be invoked; a packaging component, configured to package the task processing program and build a task processing image; and a construction component, configured to build, based on the task processing image, the containers required for running the vehicle work tasks, wherein the containers include a management container for scheduling and management and a processing container for running tasks.
It should be noted here that the creation component, the packaging component and the construction component may run in a terminal as part of the device, and the functions implemented by these components may be executed by a processor in the terminal.
Optionally, the processing component includes: a receiving subcomponent, configured to receive the plurality of vehicle work tasks to be processed in parallel; a scheduling subcomponent, configured to schedule at least one management container and as many processing containers as there are vehicle work tasks; a distribution subcomponent, configured so that each management container distributes the corresponding vehicle work task to a designated processing container; and a running subcomponent, configured to start the processing containers, each of which runs the vehicle work task assigned to it.
It should be noted here that the receiving subcomponent, the scheduling subcomponent, the distribution subcomponent and the running subcomponent may run in a terminal as part of the device, and the functions implemented by these subcomponents may be executed by a processor in the terminal.
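The receive, schedule, distribute and run sequence of the processing component can be sketched on a single machine. The sketch uses a thread pool as a stand-in for processing containers, which is an assumption made purely for illustration; in the embodiment the workers are containers scheduled by the container cluster manager. The names run_task and dispatch and the task dictionary layout are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_task(task):
    """Stand-in for a processing container running one vehicle work task."""
    return {"task_id": task["task_id"], "records": len(task["payload"])}


def dispatch(tasks):
    """Start one worker per task, hand each task to a worker, and collect the results."""
    if not tasks:
        return []
    # As many workers as tasks, mirroring "as many processing containers as tasks".
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        # map() returns results in the same order as the submitted tasks.
        return list(pool.map(run_task, tasks))


tasks = [{"task_id": i, "payload": list(range(i * 10))} for i in range(1, 6)]
for result in dispatch(tasks):
    print(result)
```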
Optionally, the device further includes: a merging component, configured to merge the sub-run results of each vehicle work task to generate the processing result; and a storage component, configured to store the processing result in a predetermined database, wherein the database is a database that allows interactive queries.
It should be noted here that the merging component and the storage component may run in a terminal as part of the device, and the functions implemented by these components may be executed by a processor in the terminal.
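Merging sub-run results and storing them in a queryable database can be sketched as follows. The embodiment does not name a database product; the sketch assumes an in-memory SQLite database from the Python standard library, and the table name screening_result, its columns and the sample rows are hypothetical.

```python
import sqlite3


def merge_results(sub_results):
    """Concatenate the sub-run results of all tasks into one processing result."""
    merged = []
    for sub_result in sub_results:
        merged.extend(sub_result)
    return merged


def store_results(connection, rows):
    """Persist (segment_id, tag) rows so they can be queried interactively."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS screening_result (segment_id TEXT, tag TEXT)"
    )
    connection.executemany(
        "INSERT INTO screening_result (segment_id, tag) VALUES (?, ?)", rows
    )
    connection.commit()


connection = sqlite3.connect(":memory:")
sub_results = [
    [("seg-1", "traffic_light")],
    [("seg-2", "hard_brake"), ("seg-3", "traffic_light")],
]
store_results(connection, merge_results(sub_results))

# An interactive query against the stored processing result.
query = "SELECT segment_id FROM screening_result WHERE tag = ? ORDER BY segment_id"
print(connection.execute(query, ("traffic_light",)).fetchall())
```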
In the multi-concurrent data processing device of this embodiment, the road test data is decomposed to generate a plurality of vehicle work tasks to be processed in parallel, and a plurality of processing containers are scheduled through the container cluster manager to process each of the vehicle work tasks in parallel. This solves the technical problem in the related art that hardware resources cannot be fully utilized for highly concurrent task processing.
According to an embodiment of the present application, a non-volatile readable storage medium is also provided. The non-volatile readable storage medium includes a stored program, and the program executes the multi-concurrent data processing method described in any of the embodiments of the present application.
The functional modules provided by the embodiments of the present application may run in the multi-concurrent data processing device or a similar computing device, and may also be stored as part of a non-volatile storage medium.
FIG. 5 is a schematic structural diagram of a non-volatile storage medium according to an embodiment of the present application. As shown in FIG. 5, a program product 50 according to an embodiment of the present application is described, on which a computer program is stored. When executed by a processor, the computer program implements program code for the following steps:
acquiring the road test data collected by the vehicle; decomposing the road test data to generate a plurality of vehicle work tasks to be processed in parallel, wherein a vehicle work task is a task of controlling the vehicle to work according to a specified control instruction;
submitting the plurality of vehicle work tasks to be processed in parallel to a container cluster manager, wherein the container cluster manager is used to orchestrate and schedule the computing resources in a server cluster; and scheduling, through the container cluster manager, a plurality of processing containers to process each of the vehicle work tasks in parallel and generate a processing result, wherein each processing container has a set of computing resources, and the computing resources are used to process the vehicle work tasks.
Optionally, when executed by the processor, the computer program further implements program code for the following: the server cluster includes a plurality of servers, and a corresponding number of processing containers is obtained by dividing according to the quantity of computing resources of the server cluster; the computing resources include at least processor resources and storage resources; the server cluster is deployed with at least one master server and at least one slave server; and the container cluster manager is installed on the master server and used to monitor the working status of the slave servers.
Optionally, when executed by the processor, the computer program further implements program code for the following steps: creating a task processing program of the container cluster manager, wherein the task processing program is used to determine, according to preset parameters, the number of processing containers to be invoked; packaging the task processing program to build a task processing image; and building, based on the task processing image, the containers required for running the vehicle work tasks, wherein the containers include a management container for scheduling and management and a processing container for running tasks.
Optionally, when executed by the processor, the computer program further implements program code for the following steps: upon receiving the plurality of vehicle work tasks to be processed in parallel, scheduling at least one management container and as many processing containers as there are vehicle work tasks; distributing, by each management container, the corresponding vehicle work task to a designated processing container; and starting the processing containers, each of which runs the vehicle work task assigned to it.
Optionally, when executed by the processor, the computer program further implements program code for the following steps: after each processing container has run the vehicle work task assigned to it, merging the sub-run results of each vehicle work task to generate the processing result; and storing the processing result in a predetermined database, wherein the database is a database that allows interactive queries.
Optionally, in this embodiment, the non-volatile storage medium may also be configured to store program code for the various preferred or optional method steps provided by the multi-concurrent data processing method.
Optionally, for specific examples in this embodiment, reference may be made to the examples described in the foregoing embodiments, which are not repeated here.
The non-volatile storage medium may include a data signal that is propagated in baseband or as part of a carrier wave and that carries readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the foregoing. The non-volatile storage medium may send, propagate or transmit a program for use by, or in connection with, an instruction execution system, apparatus or device.
The program code contained in the non-volatile storage medium may be transmitted over any appropriate medium, including but not limited to wireless, wired, optical cable, radio frequency, or any suitable combination of the foregoing.
According to an embodiment of the present application, a processor is provided. FIG. 6 is a schematic structural diagram of a processor according to an embodiment of the present application. As shown in FIG. 6, the processor 60 is configured to run a program, wherein the program, when running, executes the multi-concurrent data processing method described in the embodiments of the present application.
In this embodiment of the invention, the processor 60 may execute a program that runs the multi-concurrent data processing method.
Optionally, in this embodiment, the processor 60 may be configured to perform the following steps:
acquiring the road test data collected by the vehicle; decomposing the road test data to generate a plurality of vehicle work tasks to be processed in parallel, wherein a vehicle work task is a task of controlling the vehicle to work according to a specified control instruction;
submitting the plurality of vehicle work tasks to be processed in parallel to a container cluster manager, wherein the container cluster manager is used to orchestrate and schedule the computing resources in a server cluster; and scheduling, through the container cluster manager, a plurality of processing containers to process each of the vehicle work tasks in parallel and generate a processing result, wherein each processing container has a set of computing resources, and the computing resources are used to process the vehicle work tasks.
Optionally, the processor 60 may be further configured for the following: the server cluster includes a plurality of servers, and a corresponding number of processing containers is obtained by dividing according to the quantity of computing resources of the server cluster; the computing resources include at least processor resources and storage resources; the server cluster is deployed with at least one master server and at least one slave server; and the container cluster manager is installed on the master server and used to monitor the working status of the slave servers.
Optionally, the processor 60 may be further configured to perform the following steps: creating a task processing program of the container cluster manager, wherein the task processing program is used to determine, according to preset parameters, the number of processing containers to be invoked; packaging the task processing program to build a task processing image; and building, based on the task processing image, the containers required for running the vehicle work tasks, wherein the containers include a management container for scheduling and management and a processing container for running tasks.
Optionally, the processor 60 may be further configured to perform the following steps: upon receiving the plurality of vehicle work tasks to be processed in parallel, scheduling at least one management container and as many processing containers as there are vehicle work tasks; distributing, by each management container, the corresponding vehicle work task to a designated processing container; and starting the processing containers, each of which runs the vehicle work task assigned to it.
Optionally, the processor 60 may be further configured to perform the following steps: after each processing container has run the vehicle work task assigned to it, merging the sub-run results of each vehicle work task to generate the processing result; and storing the processing result in a predetermined database, wherein the database is a database that allows interactive queries.
The processor 60 may execute various functional applications and data processing by running software programs and modules stored in a memory, thereby implementing the multi-concurrent data processing method described above.
The serial numbers of the above embodiments of the present application are for description only and do not indicate the relative merits of the embodiments.
In the above embodiments of the present application, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, reference may be made to the relevant descriptions of other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed technical content can be implemented in other ways. The device embodiments described above are only illustrative. For example, the division into units may be a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, units or modules, and may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the related art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to execute all or some of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk or an optical disc.
The above are only preferred embodiments of the present application. It should be pointed out that those of ordinary skill in the art may make several improvements and refinements without departing from the principles of the present application, and these improvements and refinements shall also be regarded as falling within the scope of protection of the present application.
Industrial Applicability
The solution provided by the embodiments of the present application can be applied to multi-concurrent data processing: acquiring the road test data collected by the vehicle; decomposing the road test data to generate a plurality of vehicle work tasks to be processed in parallel, wherein a vehicle work task is a task of controlling the vehicle to work according to a specified control instruction; submitting the plurality of vehicle work tasks to be processed in parallel to a container cluster manager, wherein the container cluster manager is used to orchestrate and schedule the computing resources in a server cluster; and scheduling, through the container cluster manager, a plurality of processing containers to process each of the vehicle work tasks in parallel and generate a processing result, wherein each processing container has a set of the computing resources, and the computing resources are used to process the vehicle work tasks. By decomposing the road test data into a plurality of vehicle work tasks to be processed in parallel, the solution makes full use of hardware resources to run highly concurrent tasks, and thereby solves the technical problem in the related art that hardware resources cannot be fully utilized for highly concurrent task processing.

Claims (12)

  1. A multi-concurrent data processing method, comprising:
    acquiring road test data collected by a vehicle;
    decomposing the road test data to generate a plurality of vehicle work tasks to be processed in parallel, wherein the vehicle work task is a task of controlling the vehicle to work according to a specified control instruction;
    submitting the plurality of vehicle work tasks to be processed in parallel to a container cluster manager, wherein the container cluster manager is used to orchestrate and schedule computing resources in a server cluster;
    scheduling, through the container cluster manager, a plurality of processing containers to process each of the vehicle work tasks to be processed in parallel and generate a processing result, wherein each of the processing containers has a set of the computing resources, and the computing resources are used to process the vehicle work tasks.
  2. The method according to claim 1, wherein the server cluster includes a plurality of servers, a corresponding number of the processing containers is obtained by dividing according to the quantity of computing resources of the server cluster, and the computing resources include at least processor resources and storage resources, wherein the server cluster is deployed with at least one master server and at least one slave server, and the container cluster manager is installed on the master server and used to monitor the working status of the slave server.
  3. The method according to claim 2, wherein the method further comprises:
    creating a task processing program of the container cluster manager, wherein the task processing program is used to determine, according to preset parameters, the number of the processing containers to be invoked;
    packaging the task processing program to build a task processing image;
    building, based on the task processing image, the containers required for running the vehicle work tasks, wherein the containers include a management container for scheduling and management and a processing container for running tasks.
  4. The method according to claim 3, wherein scheduling, by the container cluster manager, a plurality of processing containers to respectively process each of the vehicle work tasks to be processed in parallel comprises:
    Upon receiving the plurality of vehicle work tasks to be processed in parallel, scheduling at least one management container and a number of processing containers equal to the number of vehicle work tasks;
    Distributing, by each management container, the corresponding vehicle work task to a designated processing container;
    Starting the processing containers, each processing container running the vehicle work task assigned to it.
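The three steps above (schedule one processing container per task, distribute tasks, start the containers) can be mimicked in-process as a rough sketch. Threads stand in for containers here, and the class names and the record-counting workload are assumptions made for illustration.

```python
import threading


class ProcessingContainer:
    """In-process stand-in for a processing container."""

    def __init__(self, name):
        self.name = name
        self.task = None
        self.result = None

    def run(self):
        # Hypothetical workload: count the records in the assigned task.
        self.result = len(self.task)


class ManagementContainer:
    """Creates one processing container per task and assigns each task to it."""

    def dispatch(self, tasks):
        containers = [ProcessingContainer(f"worker-{i}") for i in range(len(tasks))]
        for container, task in zip(containers, tasks):
            container.task = task
        return containers


def run_all(tasks):
    """Dispatch the tasks, start every container, and wait for all of them to finish."""
    containers = ManagementContainer().dispatch(tasks)
    threads = [threading.Thread(target=c.run) for c in containers]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return {c.name: c.result for c in containers}
```

Each container writes only to its own `result` attribute, so no locking is needed in this simplified form.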
  5. The method according to claim 4, wherein, after each processing container runs the vehicle work task assigned to it, the method further comprises:
    Merging the sub-running results of each of the vehicle work tasks to generate the processing result;
    Storing the processing result in a predetermined database, wherein the database is a database that allows interactive query.
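A minimal sketch of merging sub-running results and storing them in a queryable database follows. SQLite from the Python standard library is used only as a convenient stand-in for "a database that allows interactive query"; the table layout, the metric names, and the additive merge rule are assumptions, not details from the application.

```python
import sqlite3


def merge_results(sub_results):
    """Combine per-task metric dictionaries by summing values that share a key."""
    merged = {}
    for sub in sub_results:
        for key, value in sub.items():
            merged[key] = merged.get(key, 0) + value
    return merged


def store_results(conn, merged):
    """Persist the merged result so it can be queried interactively with SQL."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results (metric TEXT PRIMARY KEY, value REAL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO results (metric, value) VALUES (?, ?)",
        list(merged.items()),
    )
    conn.commit()
```

Once stored, the result can be inspected with an ordinary query such as `SELECT metric, value FROM results`.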
  6. A multi-concurrent data processing device, comprising:
    An acquisition component, configured to acquire road test data collected by a vehicle;
    A decomposition component, configured to decompose the road test data to generate a plurality of vehicle work tasks to be processed in parallel, wherein each vehicle work task is a task for controlling the vehicle to work according to a specified control instruction;
    A submission component, configured to submit the plurality of vehicle work tasks to be processed in parallel to a container cluster manager, wherein the container cluster manager is used to orchestrate and schedule computing resources in a server cluster;
    A processing component, configured to schedule, by the container cluster manager, a plurality of processing containers to respectively process each of the vehicle work tasks to be processed in parallel, and to generate a processing result, wherein each processing container has a set of the computing resources, and the computing resources are used to process the vehicle work tasks.
  7. The device according to claim 6, wherein the server cluster comprises a plurality of servers, and a corresponding number of the processing containers is obtained by dividing according to the amount of computing resources of the server cluster, the computing resources comprising at least processor resources and storage resources, wherein at least one master server and at least one slave server are deployed in the server cluster, and the container cluster manager is installed on the master server and is used to monitor the working status of the slave server.
  8. The device according to claim 7, wherein the device further comprises:
    A creation component, configured to create a task processing program of the container cluster manager, wherein the task processing program is configured to determine, according to preset parameters, the number of processing containers that need to be invoked;
    An encapsulation component, configured to encapsulate the task processing program to construct a task processing image;
    A construction component, configured to construct, based on the task processing image, the containers required for running the vehicle work tasks, wherein the containers comprise: a management container for scheduling management and a processing container for running tasks.
  9. The device according to claim 8, wherein the processing component comprises:
    A receiving subcomponent, configured to receive the plurality of vehicle work tasks to be processed in parallel;
    A scheduling subcomponent, configured to schedule at least one management container and a number of processing containers equal to the number of vehicle work tasks;
    A distribution subcomponent, configured such that each management container distributes the corresponding vehicle work task to a designated processing container;
    A running subcomponent, configured to start the processing containers, each processing container running the vehicle work task assigned to it.
  10. The device according to claim 9, wherein the device further comprises:
    A merging component, configured to merge the sub-running results of each of the vehicle work tasks to generate the processing result;
    A storage component, configured to store the processing result in a predetermined database, wherein the database is a database that allows interactive query.
  11. A non-volatile storage medium, the non-volatile storage medium comprising a stored program, wherein, when the program runs, a device in which the non-volatile storage medium is located is controlled to execute the multi-concurrent data processing method according to any one of claims 1 to 5.
  12. A processor, the processor being configured to run a program, wherein the program, when running, executes the multi-concurrent data processing method according to any one of claims 1 to 5.
PCT/CN2022/104711 2021-12-31 2022-07-08 Multi-concurrency data processing method and device WO2023124000A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111676678.8 2021-12-31
CN202111676678.8A CN114327834A (en) 2021-12-31 2021-12-31 Multi-concurrent data processing method and device

Publications (1)

Publication Number Publication Date
WO2023124000A1 (en) 2023-07-06

Family

ID=81023792

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/104711 WO2023124000A1 (en) 2021-12-31 2022-07-08 Multi-concurrency data processing method and device

Country Status (2)

Country Link
CN (1) CN114327834A (en)
WO (1) WO2023124000A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114327834A (en) * 2021-12-31 2022-04-12 中国第一汽车股份有限公司 Multi-concurrent data processing method and device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9860569B1 (en) * 2015-10-05 2018-01-02 Amazon Technologies, Inc. Video file processing
CN110888722A (en) * 2019-11-15 2020-03-17 北京奇艺世纪科技有限公司 Task processing method and device, electronic equipment and computer readable storage medium
US20200301801A1 (en) * 2019-03-20 2020-09-24 Salesforce.Com, Inc. Content-sensitive container scheduling on clusters
CN111897622A (en) * 2020-06-10 2020-11-06 中国科学院计算机网络信息中心 High-throughput computing method and system based on container technology
CN112650556A (en) * 2020-12-25 2021-04-13 芜湖雄狮汽车科技有限公司 Multitask concurrent testing method and device for vehicle
CN114327834A (en) * 2021-12-31 2022-04-12 中国第一汽车股份有限公司 Multi-concurrent data processing method and device

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9860569B1 (en) * 2015-10-05 2018-01-02 Amazon Technologies, Inc. Video file processing
US20200301801A1 (en) * 2019-03-20 2020-09-24 Salesforce.Com, Inc. Content-sensitive container scheduling on clusters
CN110888722A (en) * 2019-11-15 2020-03-17 北京奇艺世纪科技有限公司 Task processing method and device, electronic equipment and computer readable storage medium
CN111897622A (en) * 2020-06-10 2020-11-06 中国科学院计算机网络信息中心 High-throughput computing method and system based on container technology
CN112650556A (en) * 2020-12-25 2021-04-13 芜湖雄狮汽车科技有限公司 Multitask concurrent testing method and device for vehicle
CN114327834A (en) * 2021-12-31 2022-04-12 中国第一汽车股份有限公司 Multi-concurrent data processing method and device

Also Published As

Publication number Publication date
CN114327834A (en) 2022-04-12

Similar Documents

Publication Publication Date Title
CN110704186B (en) Computing resource allocation method and device based on hybrid distribution architecture and storage medium
JP7106513B2 (en) Data processing methods and related products
WO2020108303A1 (en) Heterogeneous computing-based task processing method and software-hardware framework system
US9851989B2 (en) Methods and apparatus to manage virtual machines
Scolati et al. A containerized big data streaming architecture for edge cloud computing on clustered single-board devices
US7970892B2 (en) Tuning and optimizing distributed systems with declarative models
Wang et al. Metis: Learning to schedule long-running applications in shared container clusters at scale
CN112272234A (en) Platform management system and method for realizing edge cloud collaborative intelligent service
CN109800937A (en) Robot cluster dispatches system
CN113553190B (en) Computing cluster system, scheduling method, device and storage medium
WO2023124000A1 (en) Multi-concurrency data processing method and device
Lovas et al. Orchestrated platform for cyber-physical systems
De Benedetti et al. JarvSis: a distributed scheduler for IoT applications
Kijsipongse et al. A hybrid GPU cluster and volunteer computing platform for scalable deep learning
Bhattacharjee et al. Stratum: A bigdata-as-a-service for lifecycle management of iot analytics applications
Kim et al. RIDE: real-time massive image processing platform on distributed environment
Sojoodi et al. Ignite-GPU: a GPU-enabled in-memory computing architecture on clusters
CN111353609A (en) Machine learning system
Scolati et al. A containerized edge cloud architecture for data stream processing
CN110162381A (en) Proxy executing method in a kind of container
US20210397482A1 (en) Methods and systems for building predictive data models
Bocciarelli et al. A microservice-based approach for fine-grained simulation in MSaaS platforms.
CN111506407B (en) Resource management and job scheduling method and system combining Pull mode and Push mode
KR102064882B1 (en) Deep learning platform capable of resource management of graphic processing unit, and method for managing resource of graphic processing unit using the same
CN109583071B (en) Parallel optimization method and system based on cloud simulation

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application Ref document number: 22913289; Country of ref document: EP; Kind code of ref document: A1