WO2023193814A1 - Data processing method, apparatus, device and system for a converged system - Google Patents

Data processing method, apparatus, device and system for a converged system

Info

Publication number
WO2023193814A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
storage
memory
node
memory pool
Prior art date
Application number
PCT/CN2023/087164
Other languages
English (en)
French (fr)
Inventor
孙宏伟
李光成
刘华伟
游俊
崔文林
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN202210633694.7A (external priority: CN116932196A)
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Publication of WO2023193814A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]

Definitions

  • The present application relates to the field of data processing, and in particular to a data processing method, apparatus, device and system for a converged system.
  • Storage and computing are two key systems for distributed applications (such as big data and databases), which determine the overall performance and energy consumption of the system.
  • In a storage–compute convergence architecture, computing and storage are deployed on the same node. System performance is high, but the compute-to-storage ratio is fixed, and computing and storage cannot be scaled independently.
  • In a storage–compute separation architecture, the storage cluster and the computing cluster are connected through a network, enabling computing and storage to be scaled flexibly and independently on demand.
  • However, data must undergo various conversion operations, such as protocol and format conversions, on its way from the computing side to the storage side, resulting in long data processing times and high system energy consumption; system performance has become a bottleneck.
  • This application provides a data processing method, apparatus, device and system for a converged system, thereby shortening data processing time, increasing data transmission speed, and reducing system energy consumption.
  • A first aspect provides a data processing method for a converged system.
  • The converged system includes computing nodes and storage nodes. The computing nodes are connected to the storage nodes through a network, forming a storage–compute separation architecture.
  • The storage media of the computing nodes and the storage media of the storage nodes are uniformly addressed to form a global memory pool, that is, global memory shared by the computing nodes and the storage nodes.
  • A memory operation instruction is a technique that uses a memory interface to perform memory operations on memory.
  • Memory operations for the processing request data are performed on the global memory shared by the computing nodes and the storage nodes according to the memory operation instructions, avoiding the conversion operations data would otherwise undergo on its way from the computing side to the storage side.
  • the memory operation instructions include at least one of memory allocation, memory setting, memory copy, memory movement, memory release and memory comparison.
  • the embodiment of the present application does not limit the type of storage medium of the global memory pool.
  • the storage media of the global memory pool include memory, hard disk, memory server and storage-class-memory (SCM).
  • In some embodiments, performing memory operations for the processing request data on the global memory pool according to the memory operation instructions includes: reading the data to be processed from the global memory pool; using the memory operation instructions to process the data to be processed as indicated by the processing request data, obtaining processed data; and writing the processed data into the storage space indicated by a first address in the global memory pool. The storage space indicated by the first address is one of the storage space provided by the storage medium of a computing node and the storage space provided by the storage medium of a storage node.
  • the computing node may determine the storage space indicated by the first address according to the storage policy. For example, the computing node determines the storage space indicated by the first address according to the access characteristics of the application.
  • Example 1: Determining, based on user requirements and storage medium characteristics, to write the processed data into the storage space indicated by the first address in the global memory pool.
  • the storage medium characteristics include at least one of write latency, read latency, total storage capacity, available storage capacity, access speed, central processing unit CPU consumption, energy consumption ratio, and reliability. Therefore, reading and writing operations on the system based on user needs and system storage media characteristics not only enhances the user's control authority over the system, improves the user's system experience, but also expands the applicable application scenarios of the system.
  • Example 2: Determining, based on user requirements and storage medium characteristics, to write the processed data into the storage space, provided by a computing node, that is indicated by the first address in the global memory pool.
  • Example 3: Determining, based on user requirements and storage medium characteristics, to write the processed data into the storage space, provided by a storage node, that is indicated by the first address in the global memory pool. The processed data is thus stored in storage-side memory of the global memory pool, which improves the reliability and persistence of the data.
  • the method further includes: reading the processed data from the global memory pool according to the first address.
  • In some embodiments, reading the processed data from the global memory pool according to the first address includes: when the processed data needs to be persisted, the computing node reads the processed data from the global memory pool according to the first address and writes the processed data to a storage node.
  • When a computing node in the converged system needs to use the processed data, the computing node reads the processed data from the global memory pool according to the first address.
  • the method further includes: when reading the processed data from the global memory pool according to the first address, performing another memory operation on the global memory pool to process the requested data according to the memory operation instruction.
  • the method also includes: prefetching data from the storage node according to the memory operation instruction and storing it in the global memory pool. Therefore, the computing node can quickly obtain data and shorten the data processing time.
  • the method also includes: performing data memory operations between the global memory pool and the storage node based on memory operation instructions based on the hot and cold characteristics of the data.
  • Cold data refers to data that is accessed less frequently. Moving cold data from the global memory pool to a storage node releases storage space in the global memory pool and improves the utilization of that storage space.
  • Hot data refers to data that is accessed more frequently. Moving hot data from the storage node to the global memory pool enables the computing node to obtain the required data as quickly as possible, shortening the data processing time and reducing the computing resources occupied by frequent reading and writing of data.
  • A second aspect provides a data processing apparatus for a converged system; the apparatus includes modules for executing the data processing method of the converged system in the first aspect or any possible design of the first aspect.
  • A third aspect provides a computing device including at least one processor and a memory, the memory being used to store a set of computer instructions. When the processor, acting as the computing node in the first aspect or any possible implementation of the first aspect, executes the set of computer instructions, it performs the operational steps of the data processing method of the converged system in the first aspect or any possible implementation of the first aspect.
  • A computer-readable storage medium is provided, including computer software instructions; when the computer software instructions are run in a computing device, the computing device is caused to execute the operational steps of the method in the first aspect or any possible implementation of the first aspect.
  • A computer program product is provided; when the computer program product runs on a computer, it causes the computing device to perform the operational steps of the method described in the first aspect or any possible implementation of the first aspect.
  • Figure 1 is a schematic architectural diagram of a data processing system provided by an embodiment of the present application.
  • Figure 2 is a schematic diagram of distributed data processing provided by an embodiment of the present application.
  • Figure 3 is a schematic diagram of a deployment scenario of a global memory pool provided by an embodiment of the present application.
  • Figure 4 is a schematic diagram of a three-layer structure storage system provided by an embodiment of the present application.
  • Figure 5 is a schematic process diagram of a data processing method provided by an embodiment of the present application.
  • Figure 6 is a schematic process diagram of another data processing method provided by an embodiment of the present application.
  • Figure 7 is a schematic process diagram of Map task and Reduce task processing based on the global memory pool provided by the embodiment of the present application.
  • Figure 8 is a schematic structural diagram of a data processing device provided by an embodiment of the present application.
  • Figure 9 is a schematic structural diagram of a computing device provided by an embodiment of the present application.
  • Big data is a collection of data that cannot be captured, managed, and processed within a tolerable time frame using conventional software tools. Because the large amounts of data contained in big data are correlated, data analysis methods, models or tools are used to analyze big data, mine the data relationships within it, and use those relationships to make predictions or decisions. For example, analyzing a user's shopping-trend data makes it possible to recommend items the user may purchase, improving the shopping experience. Big data is therefore characterized by large data volume, fast data growth, diverse data types, and high utilization value.
  • A database is a computer software system that stores and manages data according to a data structure. A database can also be understood as a collection that a computer uses to store and manage large amounts of data, that is, an electronic filing cabinet. The data may include travel records, consumption records, web pages browsed, messages sent, images, music, sounds, and so on.
  • Figure 1 is a schematic architectural diagram of a data processing system provided by an embodiment of the present application.
  • the data processing system 100 includes a client 110, a computing cluster 120 and a storage cluster 130.
  • Storage cluster 130 includes at least two storage nodes 131.
  • a storage node 131 includes one or more controllers, network cards, and multiple hard disks.
  • Hard drives are used to store data.
  • the hard disk can be a magnetic disk or other type of storage medium, such as a solid state drive or a shingled magnetic recording hard drive.
  • the network card is used to communicate with the computing nodes 121 included in the computing cluster 120.
  • the controller is used to write data to the hard disk or read data from the hard disk according to the read/write data request sent by the computing node 121. In the process of reading and writing data, the controller needs to convert the address carried in the read/write data request into an address that the hard disk can recognize.
  • The client 110 communicates with the computing cluster 120 and the storage cluster 130 over a network 140. For example, the client 110 sends a service request to the computing cluster 120 through the network 140, requesting the computing cluster 120 to perform distributed processing of the service data contained in the service request.
  • the network 140 may refer to an enterprise's internal network (such as a Local Area Network (LAN)) or the Internet.
  • the data processing system 100 supports running big data, database, high-performance computing, artificial intelligence, distributed storage, cloud native and other applications. It can be understood that the business data in the embodiments of this application include data from applications such as big data, databases, high-performance computing, artificial intelligence (Artificial Intelligence, AI), distributed storage, and cloud native.
  • the storage cluster 130 stores and manages large amounts of data based on a distributed file system 132 and a distributed database 133 .
  • the client 110 is installed with a client program 111.
  • the client 110 runs the client program 111 to display a user interface (UI).
  • the user 150 operates the user interface to access the distributed file system 132 and the distributed database 133 to obtain data and instructions.
  • the computing cluster 120 processes data services.
  • the client 110 may refer to a computer connected to the network 140, and may also be called a workstation. Different clients can share resources on the network (such as computing resources, storage resources).
  • the computing cluster 120 includes at least two computing nodes 121, and the computing nodes 121 can communicate with each other.
  • the computing node 121 is a computing device, such as a server, a desktop computer, or a controller of a storage array.
  • the big data service submitted by the client 110 may be called a job.
  • a job can be divided into multiple tasks (tasks), and multiple computing nodes execute multiple tasks in parallel. When all tasks end, a job is marked as completed.
  • a task is generally a part of data or a staged processing process in a job. All tasks are scheduled to be completed in parallel or serially.
  • the computing cluster 120 performs distributed processing of big data services based on the map-reduce (MapReduce) model 134.
  • the MapReduce model is a distributed programming model, which decomposes big data business into mapping (map) tasks and reduction (reduce) tasks.
  • Multiple computing nodes 121 execute map tasks, collect processing results, and execute reduce tasks.
  • computing cluster 120 includes a control node 122 and at least two computing nodes 121.
  • The control node and the computing nodes can be independent physical devices. The control node may also be called a control device or a name node.
  • Compute nodes may be called computing devices or data nodes.
  • the control node 122 is used to manage the namespace of the distributed file system and the access of the client 110 to the distributed file system.
  • control node 122 indicates the computing node that executes the map task and the computing node that executes the reduce task.
  • A computing node executing a map task produces intermediate data (also known as map data or shuffle data) based on the MapReduce model 134 and stores the intermediate data in the global memory pool 170.
  • Computing nodes executing reduce tasks read intermediate data from the global memory pool 170 .
  • the storage cluster 130 may also process the data according to the MapReduce model 134 and then store the data.
  • The system administrator 160 can, through the client 110, call the application platform interface (API) 112, the command-line interface (CLI) 113, or the graphical user interface (GUI) to access the distributed file system 132 and the distributed database 133 and configure system information, such as the deployment information and storage policies of the global memory pool 170, which is formed by uniformly addressing the computing nodes 121 and the storage nodes 131 as provided by the embodiments of the present application.
  • Embodiments of the present application provide a storage–compute converged architecture and a data processing method based on memory operation instructions: a global memory pool is formed by uniformly addressing the storage media of the computing nodes and the storage media of the storage nodes, and memory operations for the processing request data are performed on the global memory pool according to the memory operation instructions. This not only solves the flexibility problem of scaling storage and computing independently, but also solves the performance problem of storage–compute separation.
  • the memory operation of processing request data on the global memory pool according to the memory operation instructions described in the embodiment of the present application includes read operations and write operations on the global memory pool.
  • A data processing system configured with a global memory pool may also be called a converged system.
  • Memory operation instructions are a technique that uses memory interfaces to perform memory operations on storage media. Instead of two different memory operations on the computing side and the storage side, computing and storage are integrated through a globally unified global memory pool on the basis of memory operation instructions: memory operations for the processing request data are performed, according to the memory operation instructions, on the global storage media shared by the computing nodes and the storage nodes. This effectively simplifies the data movement process between the computing side and the storage side and avoids the various conversion operations, such as protocol and format conversions, that data would otherwise undergo from the computing side to the storage side.
  • The storage–compute separation architecture based on the global memory pool not only ensures that computing and storage can be scaled flexibly on demand, but also enables fast read and write operations on the system, shortening end-to-end data processing time, increasing data transmission speed, reducing system energy consumption, and relieving system performance bottlenecks.
  • format conversion includes serialization and deserialization.
  • Serialization refers to the process of converting an object into a sequence of bytes.
  • Deserialization is the process of restoring a sequence of bytes into an object.
  • the global memory pool may include the storage medium of the computing node and the storage medium of the storage node in the data processing system.
  • the storage medium of the computing node includes at least one of a local storage medium within the computing node and an extended storage medium connected to the computing node.
  • the storage medium of the storage node includes at least one of a local storage medium within the storage node and an extended storage medium connected to the storage node.
  • the global memory pool includes local storage media within computing nodes and local storage media within storage nodes.
  • the global memory pool includes local storage media within the computing node, extended storage media connected to the computing node, and any one of local storage media within the storage node and extended storage media connected to the storage node.
  • the global memory pool includes local storage media within the computing node, extended storage media connected to the computing node, local storage media within the storage node, and extended storage media connected to the storage node.
  • The global memory pool 300 includes a storage medium 310 in each of the N computing nodes, an extended storage medium 320 connected to each of the N computing nodes, a storage medium 330 in each of the M storage nodes, and an extended storage medium 340 connected to each of the M storage nodes.
  • the storage capacity of the global memory pool may include part of the storage capacity in the storage medium of the computing node and part of the storage capacity in the storage medium of the storage node.
  • the global memory pool is a storage medium that can be accessed by uniformly addressed computing nodes and storage nodes.
  • the storage capacity of the global memory pool can be used by computing nodes or storage nodes through memory interfaces such as large memory, distributed data structures, data caches, and metadata. Compute nodes running applications can use these memory interfaces to perform memory operations on the global memory pool.
  • The global memory pool, constructed from the storage capacity of the storage media of the computing nodes and the storage nodes, provides a unified, northbound memory interface for the computing nodes, allowing a computing node to use the unified memory interface to write data into the storage space provided by a computing node or the storage space provided by a storage node, implementing the computation and storage of data based on memory operation instructions.
  • the above description takes the storage medium in the computing node and the storage medium in the storage node to construct a global memory pool as an example.
  • the deployment method of the global memory pool can be flexible and changeable, and is not limited in the embodiments of this application.
  • For example, the global memory pool is built from the storage media of the storage nodes to serve as a high-performance storage layer (Global Cache).
  • Alternatively, the global memory pool is built from the storage media of the computing nodes.
  • Building the global memory pool from the storage media of the storage nodes alone, or of the computing nodes alone, can reduce the occupation of storage resources on the storage side and provide a more flexible expansion solution.
  • the memory is a memory device used to store programs and various data.
  • Access speed refers to the data transfer speed when writing data or reading data to the memory. Access speed can also be called read and write speed. Memory can be divided into different levels based on storage capacity and access speed.
  • Figure 4 is a schematic diagram of a three-layer storage system provided by an embodiment of the present application. From the first layer to the third layer, storage capacity increases step by step, access speed decreases step by step, and cost decreases step by step.
  • the first level includes a register 411, a level 1 cache 412, a level 2 cache 413 and a level 3 cache 414 located in a central processing unit (CPU).
  • the second level contains memory that can serve as the main memory of the computer system.
  • For example, dynamic random access memory (DRAM) 421, double data rate synchronous dynamic random access memory (DDR SDRAM) 422, and storage-class memory (SCM) 423.
  • Main memory can be simply referred to as main memory or internal memory, which is the memory that exchanges information with the CPU.
  • the memory contained in the third level can be used as auxiliary memory for the computer system.
  • For example, network storage 431, solid state drives (SSD) 432, and hard disk drives (HDD) 433.
  • Auxiliary memory may also be called secondary storage or external storage.
  • Compared with main memory, external memory has larger storage capacity but slower access speed. It can be seen that the closer a memory is to the CPU, the smaller its capacity, the faster its access speed, the greater its bandwidth, and the lower its latency. Therefore, the memory contained in the third level stores data that is not frequently accessed by the CPU, improving data reliability.
  • the memory contained in the second level can be used as a cache device to store data frequently accessed by the CPU, significantly improving the access performance of the system.
  • the storage medium of the global memory pool provided by the embodiment of this application includes memory (such as DRAM), SSD, hard disk, memory server and SCM.
  • The global memory pool can also be organized by storage medium type: one type of storage medium is used to construct one memory pool, and different types of storage media construct different types of global memory pools, so that a computing node can select a storage medium according to the access characteristics of the application. This enhances the user's control over the system, improves the user's experience, and expands the applicable application scenarios of the system.
  • the DRAM in the computing node and the DRAM in the storage node are uniformly addressed to form a DRAM memory pool.
  • the DRAM memory pool is used in application scenarios that require high access performance, moderate data capacity, and no data persistence requirements.
  • the SCM in the computing node and the SCM in the storage node are uniformly addressed to form an SCM memory pool.
  • the SCM memory pool is used in application scenarios that are not sensitive to access performance, have large data capacity, and require data persistence.
  • the storage medium characteristics include at least one of write latency, read latency, total storage capacity, available storage capacity, access speed, CPU consumption, energy consumption ratio and reliability.
  • the write latency refers to the latency for the computing node 121 to write data to the storage medium.
  • the read latency refers to the latency for the computing node 121 to read data from the storage medium.
  • Total storage capacity refers to the total capacity of the storage medium available for storing data. Available storage capacity is the capacity remaining after subtracting the used capacity from the total capacity.
  • the access speed refers to the speed at which the computing node 121 performs read and write operations on the storage medium.
  • CPU consumption refers to the occupancy rate of the CPU of the computing node 121 when the computing node 121 writes data to the storage medium or reads data from the storage medium.
  • Energy consumption ratio refers to the energy consumed per unit time (such as electrical energy).
  • Reliability refers to the durability of data stored in a storage medium.
  • Figure 5 is a schematic flowchart of a data processing method of a converged system provided by an embodiment of the present application.
  • the computing cluster 120 includes a control node 122 and at least two computing nodes 121.
  • the storage cluster 130 includes at least two storage nodes 131.
  • the control node 122 is used to control the computing node 121 to perform data distributed processing.
  • The computing node 121 and the storage node 131 are configured with a global memory pool 170, which includes the storage medium of the computing node 121 and the storage medium of the storage node 131. As shown in Figure 5, the method includes the following steps.
  • Step 510 The computing node 121 obtains the processing request data.
  • the client 110 responds to the user operation and sends a service request for the data service to the control node 122 .
  • the control node 122 may receive the service request for the big data service sent by the client 110 through the local area network or the Internet.
  • a business request may include business identification and business data.
  • the service identifier is used to uniquely indicate a data service.
  • Business data may be data for distributed data processing by computing nodes or identification data indicating data to be processed.
  • Big data business includes data analysis business, data query business and data modification business, etc.
  • For example, a big data service may analyze customers' personal data and purchase-behavior data to build user portraits and classify customers, so that targeted or discounted products can be recommended to specific customers, improving customer satisfaction and stabilizing customer relationships.
  • big data business refers to analyzing the historical sales volume of a product to predict future sales volume, discovering the reasons for the decline in sales volume or the reasons for the increase in sales volume, and recommending constructive suggestions to increase sales volume.
  • Database operations may also refer to operations in which users operate the database user interface to submit database services, etc.
  • Database operations include database creation, deletion, modification and query, etc.
  • The computing node 121 can receive the service request sent by the control node 122 and convert the service request into processing request data that complies with the operation rules of the memory operation instructions, so that the computing node 121 can perform memory operations for the processing request data on the global memory pool according to the memory operation instructions.
  • Step 520 The computing node 121 performs memory operations on the global memory pool to process the requested data according to the memory operation instructions.
  • Memory manipulation instructions may also be called memory semantics or memory manipulation functions.
  • Memory operation instructions include at least one of memory allocation (malloc), memory set (memset), memory copy (memcpy), memory move (memmove), memory release (free), and memory comparison (memcmp).
  • Memory allocation is used to allocate a section of memory to support application running.
  • Memory set is used to set the data in the global memory pool to a given pattern, for example during initialization.
  • Memory copy is used to copy the data stored in the storage space indicated by the source address (source) to the storage space indicated by the destination address (destination).
  • Memory movement is used to copy the data stored in the storage space indicated by the source address (source) to the storage space indicated by the destination address (destination), and delete the data stored in the storage space indicated by the source address (source).
  • Memory comparison is used to compare whether the data stored in two storage spaces are equal.
  • Memory release is used to free the memory occupied by stored data, improving the utilization of system memory resources and thereby system performance.
  • The processing request data is used to indicate the operation to be performed on the data to be processed. For example, the processing request data indicates obtaining the first-quarter sales volume of product A.
  • the data to be processed can include sales for the entire year.
  • the specific operation process included in step 520 is as described in steps 521 to 523 below.
  • Step 521 The computing node 121 reads the data to be processed and the application data.
  • The data to be processed and the application data can be stored in storage spaces such as the global memory pool, the storage media of the storage nodes, or the storage media of the computing nodes.
  • The computing node 121 may read the data to be processed and the application data from the global memory pool, the storage medium of a storage node, or the storage medium of a computing node. For example, the computing node 121 reads the application data from local memory and reads the data to be processed from the global memory pool.
  • The data to be processed is the object on which the operation indicated by the processing request data is performed.
  • Application data includes application programs and application configuration data.
  • Step 522 The computing node 121 starts the application program according to the application data, and uses the memory operation instructions to process the data to be processed according to the processing request data to obtain processed data.
  • For example, the processing request data instructs that terabytes (TB) of data to be processed be sorted.
  • the computing node 121 starts the database according to the database application data, obtains the data to be processed, and sorts the data to be processed.
  • the computing node 121 may start the application program based on the application data before obtaining the processing request data, or may start the application program based on the application data after obtaining the processing request data.
  • Step 523 The computing node 121 writes the processed data into the storage space indicated by the first address in the global memory pool.
  • the computing node 121 automatically selects a storage medium for storing processed data from the global memory pool according to the storage policy.
  • The storage policy takes into account the access characteristics of the application and the characteristics of the storage media in the global memory pool.
  • the storage space indicated by the first address includes one of the storage space provided by the storage medium of the computing node 121 and the storage space provided by the storage medium of the storage node 131 .
  • the computing node 121 determines to write the processed data to the storage space indicated by the first address in the global memory pool based on user requirements and storage medium characteristics.
  • The user requirements are used to indicate requirements related to the storage medium characteristics. The processing request data may include the user requirements.
  • The storage medium characteristics include at least one of write latency, read latency, total storage capacity, available storage capacity, access speed, central processing unit (CPU) consumption, energy consumption ratio, and reliability.
  • computing node 121 is configured with storage media characteristics of multiple types of storage media.
  • If the user requirement indicates an access speed range or a specific access speed, the computing node 121 determines the storage medium that meets the user requirement from the global memory pool.
  • user requirements indicate the memory access speed.
  • the computing node 121 selects a storage medium that meets the memory access speed from the global memory pool, such as at least one of memory, DRAM, and SCM.
  • Example 1: The computing node 121 determines, based on the user requirements and the storage medium characteristics, to write the processed data into the storage space of the computing node indicated by the first address in the global memory pool.
  • The storage space of the computing node indicated by the first address meets the access speed indicated by the user requirement.
  • In this way, the processed data is stored in the local memory on the computing side of the global memory pool for local memory access, which effectively shortens the data processing time and increases the data transmission speed.
  • The computing node 121 is configured with an association between storage media and customer levels. The user requirement indicates a first customer level. The computing node 121 determines, from the association according to the first customer level, the storage medium associated with the first customer level, and selects that storage medium to store the processed data.
  • Example 2: The computing node 121 determines, based on the user requirements and the storage medium characteristics, to write the processed data into the storage space of the storage node indicated by the first address in the global memory pool. The storage space indicated by the first address meets the first customer level indicated by the user requirement.
  • When the computing node 121 selects a storage medium from the global memory pool to store the processed data, it dynamically selects one that meets the user requirements for storage medium characteristics such as access speed or reliability, thereby satisfying the performance and reliability requirements of the data processing scenario.
  • The computing node 121 is configured with priorities of multiple types of storage media determined based on the characteristics of those storage media, and determines the storage medium used to store the processed data based on the priorities of the multiple types of storage media indicated by the storage policy.
  • the priorities of multiple types of storage media may be determined based on the access speed of the storage media. For example, the access speed of memory is higher than the access speed of hard disk, and the access speed of hard disk is higher than the access speed of extended storage media.
  • the priority of multiple types of storage media can be determined based on the priority of the deployment mode. For example, the priority of local storage media is higher than that of extended storage media.
  • the priorities of multiple types of storage media may be comprehensively determined based on the characteristics of multiple types of storage media.
  • the priority of multiple types of storage media is determined based on the priority of the deployment mode (such as local storage media and extended storage media).
  • the priority of storage media in the same deployment mode can be determined based on the access speed of the storage media.
  • the computing node 121 can also select a storage medium with an available storage capacity greater than a threshold from multiple types of storage media that meet user needs as a storage medium to store the processed data.
  • Starting from the storage medium with the highest priority, the computing node 121 judges whether the available storage capacity of that medium is greater than the threshold. If it is, the highest-priority storage medium has spare space to store the processed data, and it is selected. If its available storage capacity is less than or equal to the threshold, the highest-priority medium has no spare space, so the computing node judges whether the available storage capacity of the medium with the second-highest priority is greater than the threshold, traversing the multiple types of storage media in order until the storage medium used to store the processed data is determined.
  • the storage strategy can be set according to business needs, scenario requirements or user needs.
  • the above description of the storage strategy is only an example.
  • the storage policy can also refer to data localization preference, that is, preferentially storing processed data to local storage media in the global memory pool.
  • the storage strategy may also refer to selecting performance priority, storage capacity priority, cost priority, etc. according to the application's trade-off between performance and cost.
  • the storage policy and configuration information of multiple types of storage media may be pre-configured.
  • The storage policy provided by the embodiments of this application can be applied to at least one of the applications supported by the above-mentioned data processing system 100, namely big data, databases, high-performance computing, artificial intelligence, distributed storage, and cloud native applications.
  • The computing node 121 may apply the storage policy when selecting a storage medium to store intermediate data while processing a big data service, a task within a big data service, or a system-wide task.
  • FIG. 6 is a schematic flowchart of another data processing method of the fusion system provided by an embodiment of the present application.
  • Step 610 The computing node 121 reads data from the global memory pool according to the first address and processes the data.
  • compute node 121 is a compute node that performs reduce tasks.
  • the computing node 121 reads the processed data required to execute the reduce task from the global memory pool according to the first address.
  • The computing node 121 may obtain the first address from the control node 122.
  • Step 620 The computing node 121 reads data from the global memory pool according to the first address and writes the data to the storage node 131.
  • the computing node 121 can read the relevant data from the global memory pool and write it to the storage node 131.
  • the storage node 131 writes the processed data into the storage medium of the storage node 131 according to the memory operation instructions.
  • the computing node 121 can also save data written to the storage node 131.
  • the computing node 121 may also perform step 630, that is, prefetching data from the storage node 131 according to the memory operation instructions and storing the data in the global memory pool. Therefore, the computing node 121 can obtain data as quickly as possible and shorten the end-to-end data processing time.
  • the computing node 121 can also perform step 640, that is, perform memory operations on the data between the global memory pool and the storage node based on the memory operation instructions according to the hot and cold characteristics of the data.
  • Hot data refers to data that is accessed frequently, such as online data. Cold data refers to data that is accessed infrequently, such as enterprise backup data, business and operation log data, call records, and statistical data. Hot data demands high access frequency and efficiency, so it is computed and deployed close to where it is used; cold data has low access frequency and relaxed efficiency requirements, so it can be deployed in a centralized manner.
  • the embodiment of this application constructs a global memory pool under a storage separation architecture.
  • the global memory pool integrates storage and computing, and implements logical fusion on the basis of physical separation.
  • Computing nodes and storage nodes use unified memory semantics for data exchange and sharing, thus avoiding repeated intermediate operations such as format conversion and protocol conversion between computing nodes and storage nodes, achieving very high performance and low energy consumption.
  • FIG. 7 it is a schematic diagram of a data processing process for executing map tasks and reduce tasks provided by an embodiment of the present application.
  • Initialization phase, step 1: Load the application data, that is, load the program configuration file and start the application.
  • Application data can be stored in non-volatile memory or in a global memory pool, and compute nodes load application data into the compute node's local memory.
  • Compute nodes use direct memory access (DMA) to load application data from non-volatile memory into local memory, or load the application data through a virtual file system (VFS) interface or a Portable Operating System Interface (POSIX) file system interface built on top of the memory hierarchy.
  • Step 2: Load the data to be processed.
  • the data to be processed can be stored in non-volatile memory.
  • the computing node loads the data to be processed into the local memory of the computing node.
  • the computing node obtains the data to be processed from the non-volatile memory pool and loads it into the local memory according to the memory operation instructions, eliminating the access overhead of the file system stack.
  • the computing node can also use Remote Direct Memory Access (RDMA) to obtain the data to be processed stored in the remote node, achieving efficient access and reducing the CPU overhead and network protocol stack overhead of remote data access.
  • Data processing stage, step 3: Use the memory operation instructions to process the data to be processed according to the processing request data to obtain processed data. For example, each computing node sorts its share of the data to obtain intermediate data. The computing node selects the storage medium for the processed data from the global memory pool based on the amount of processed data, for example storing it in HBM, DRAM or SCM. When the storage capacity in the global memory pool is insufficient, this can be resolved through remote memory expansion or memory-to-disk expansion, that is, the processed data is stored in other storage media. The destination node reads the intermediate data from the source node, or the source node writes the intermediate data to the destination node, to complete the data exchange (shuffle).
  • The source node can write the intermediate data into the global memory pool, and the destination node can then read the intermediate data from the global memory pool. Step 4: Merge the sorting results.
  • the computing device includes corresponding hardware structures and/or software modules that perform each function.
  • the units and method steps of each example described in conjunction with the embodiments disclosed in this application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a certain function is executed by hardware or computer software driving the hardware depends on the specific application scenarios and design constraints of the technical solution.
  • FIG. 8 is a schematic structural diagram of a possible data processing device provided by this embodiment. These data processing devices can be used to implement the functions of the computing devices or computing nodes in the above method embodiments, and therefore can also achieve the beneficial effects of the above method embodiments.
  • the data processing device may be the computing node 121 as shown in Figure 5, or may be a module (such as a chip) applied to the server.
  • the data processing device 800 includes a communication module 810 , a data processing module 820 and a storage module 830 .
  • the data processing device 800 is used to implement the functions of the computing node 121 in the method embodiment shown in FIG. 5 .
  • the communication module 810 is used to obtain processing request data.
  • The data processing module 820 is used to perform, according to the memory operation instructions, the memory operations on the global memory pool indicated by the processing request data. For example, the data processing module 820 is used to perform steps 510 and 520 in FIG. 5.
  • the storage module 830 is used to store memory operation instructions so that the data processing module 820 can obtain processed data.
  • the communication module 810 is used to execute S560 in FIG. 5 .
  • The data processing module 820 is specifically configured to read the data to be processed and the application data from the global memory pool, process the data to be processed using the memory operation instructions according to the processing request data to obtain processed data, and write the processed data into the storage space indicated by the first address in the global memory pool.
  • the data processing module 820 is also used to read processed data from the global memory pool according to the first address.
  • the data processing module 820 is also configured to prefetch data from the storage node according to the memory operation instruction and store it in the global memory pool.
  • the data processing module 820 is also configured to perform memory operations on data between the global memory pool and the storage node based on the memory operation instructions according to the hot and cold characteristics of the data.
  • the data processing device 800 in the embodiment of the present application can be implemented by an application-specific integrated circuit (ASIC) or a programmable logic device (PLD).
  • The above PLD can be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), etc.
  • the data processing device 800 and its respective modules can also be software modules.
  • The data processing device 800 may correspond to performing the methods described in the embodiments of this application; the above and other operations and/or functions of the various units in the data processing device 800 are respectively intended to implement the corresponding processes of the methods in FIG. 5 or FIG. 6, and are not repeated here for the sake of brevity.
  • FIG. 9 is a schematic structural diagram of a computing device 900 provided in this embodiment.
  • computing device 900 includes a processor 910, a bus 920, a memory 930, a communication interface 940, and a memory unit 950 (which may also be referred to as a main memory unit).
  • the processor 910, the memory 930, the memory unit 950 and the communication interface 940 are connected through a bus 920.
  • The processor 910 can be a CPU, or another general-purpose processor, a digital signal processor (DSP), an ASIC, an FPGA or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • a general-purpose processor can be a microprocessor or any conventional processor, etc.
  • The processor can also be a graphics processing unit (GPU), a neural network processing unit (NPU), a microprocessor, an ASIC, or one or more integrated circuits used to control the execution of the programs of this application.
  • the communication interface 940 is used to implement communication between the computing device 900 and external devices or devices.
  • For example, the communication interface 940 is used to obtain the processing request data, and the computing node 121 performs, according to the memory operation instructions, the above-mentioned memory operations on the global memory pool indicated by the processing request data.
  • Bus 920 may include a path for communicating information between the components described above, such as processor 910, memory unit 950, and storage 930.
  • the bus 920 may also include a power bus, a control bus, a status signal bus, etc.
  • the various buses are labeled bus 920 in the figure.
  • The bus 920 may be a Peripheral Component Interconnect Express (PCIe) bus, an extended industry standard architecture (EISA) bus, a unified bus (Ubus or UB), a compute express link (CXL) bus, a cache coherent interconnect for accelerators (CCIX) bus, etc.
  • the bus 920 can be divided into an address bus, a data bus, a control bus, etc.
  • computing device 900 may include multiple processors.
  • the processor may be a multi-CPU processor.
  • a processor here may refer to one or more devices, circuits, and/or computing units for processing data (eg, computer program instructions).
  • the computing device 900 is used to implement the functions of the computing node 121 shown in Figure 5
  • The processor 910 may perform, according to the memory operation instructions, the memory operations on the global memory pool indicated by the processing request data.
  • FIG. 9 only takes the computing device 900 including a processor 910 and a memory 930 as an example.
  • The processor 910 and the memory 930 each represent a type of device or component, and the number of each type can be determined based on service needs.
  • the memory unit 950 may correspond to the global memory pool used to store processed data and other information in the above method embodiments.
  • Memory unit 950 may be a pool of volatile or non-volatile memory, or may include both volatile and non-volatile memory.
  • Non-volatile memory can be read-only memory (ROM), programmable ROM (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM) or flash memory.
  • Volatile memory can be random access memory (RAM), which is used as an external cache.
  • Many forms of RAM are available, for example static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM) and direct rambus RAM (DR RAM).
  • The memory 930 may correspond to the storage medium used in the above method embodiments to store information such as computer instructions, memory operation instructions and storage policies, for example a magnetic disk such as a mechanical hard disk, or a solid-state drive.
  • the above-mentioned computing device 900 may be a general-purpose device or a special-purpose device.
  • computing device 900 may be an edge device (eg, a box carrying a chip with processing capabilities), or the like.
  • the computing device 900 may also be a server or other device with computing capabilities.
  • The computing device 900 may correspond to the data processing apparatus 800 in this embodiment and to the corresponding subject executing any method according to FIG. 5 or FIG. 6. The above and other operations and/or functions of each module of the data processing apparatus 800 are respectively intended to implement the corresponding processes of the methods in FIG. 5 or FIG. 6; for the sake of brevity, they are not described again here.
  • the method steps in this embodiment can be implemented by hardware or by a processor executing software instructions.
  • Software instructions can be composed of corresponding software modules, and the software modules can be stored in random access memory (RAM), flash memory, read-only memory (ROM), programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), registers, a hard disk, a removable hard disk, a CD-ROM, or any other form of storage medium well known in the art.
  • An exemplary storage medium is coupled to the processor such that the processor can read information from the storage medium and write information to the storage medium.
  • the storage medium can also be an integral part of the processor.
  • the processor and storage media may be located in an ASIC. Additionally, the ASIC can be located in a computing device. Of course, the processor and storage medium may also exist as discrete components in a computing device.
  • the computer program product includes one or more computer programs or instructions.
  • the computer may be a general purpose computer, a special purpose computer, a computer network, a network device, a user equipment, or other programmable device.
  • The computer program or instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer program or instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wired or wireless means.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server or data center that integrates one or more available media.
  • The available media may be magnetic media such as floppy disks, hard disks and magnetic tapes; optical media such as digital video discs (DVDs); or semiconductor media such as solid state drives (SSDs).

Abstract

Disclosed are a data processing method, apparatus, device and system for a fusion system, relating to the field of data. The fusion system includes computing nodes and storage nodes. The computing nodes are connected to the storage nodes through a network, forming a storage-compute separation architecture. The storage media of the computing nodes and the storage media of the storage nodes are uniformly addressed to form a global memory pool, that is, a global storage medium shared by the computing nodes and the storage nodes. When performing read and write operations on the system, processing request data is obtained, and the memory operations on the global memory pool indicated by the processing request data are performed according to memory operation instructions. Thus, the storage-compute separation architecture with a global memory pool not only ensures that computing and storage can each be expanded flexibly and on demand, but also enables fast read and write operations on the system, shortening the end-to-end data processing time, increasing the data transmission speed, reducing system energy consumption, and alleviating the system performance bottleneck.

Description

Data processing method, apparatus, device and system for a fusion system

This application claims priority to the Chinese patent application No. 202210369243.7, filed with the China National Intellectual Property Administration on April 8, 2022 and entitled "A Fusion System for Computing and Storage", and to the Chinese patent application No. 202210633694.7, filed on June 6, 2022 and entitled "Data Processing Method, Apparatus, Device and System for a Fusion System", both of which are incorporated herein by reference in their entirety.

Technical Field

This application relates to the field of data, and in particular to a data processing method, apparatus, device and system for a fusion system.

Background

Storage and computing are two key systems for distributed applications (such as big data and databases) and determine the overall performance and energy consumption of the system. Typically, computing and storage are deployed on the same node; system performance is high, but the ratio of storage to computing is fixed, and computing and storage cannot be expanded flexibly and separately. The storage-compute separation architecture subsequently emerged: storage clusters and computing clusters are connected through a network, allowing computing and storage to be expanded flexibly and on demand. However, data traveling from the computing side to the storage side must undergo multiple conversion operations, such as protocol and format conversion, resulting in long data processing times and high system energy consumption, so that system performance becomes a bottleneck.
Summary

This application provides a data processing method, apparatus, device and system for a fusion system, thereby shortening the data processing time, increasing the data transmission speed, and reducing system energy consumption.

In a first aspect, a data processing method for a fusion system is provided. The fusion system includes a computing node and a storage node. The computing node is connected to the storage node through a network, forming a storage-compute separation architecture. The storage medium of the computing node and the storage medium of the storage node are uniformly addressed to form a global memory pool, that is, global memory shared by the computing node and the storage node. When performing read and write operations on the system, processing request data is obtained, and the memory operations on the global memory pool indicated by the processing request data are performed according to memory operation instructions.

In this way, since memory operation instructions are a technique for performing memory operations on memory through a memory interface, performing the memory operations indicated by the processing request data on the global memory shared by the computing node and the storage node according to the memory operation instructions avoids the multiple conversion operations, such as protocol and format conversion, that data would otherwise undergo from the computing side to the storage side, effectively simplifying the process of moving data between the computing node and the storage node. Thus, the storage-compute separation architecture with a global memory pool not only ensures that computing and storage can each be expanded flexibly and on demand, but also enables fast read and write operations on the system, shortening the end-to-end data processing time, increasing the data transmission speed, reducing system energy consumption, and alleviating the system performance bottleneck.

The memory operation instructions include at least one of memory allocation, memory set, memory copy, memory move, memory release and memory comparison. The embodiments of this application do not limit the type of storage medium of the global memory pool. The storage media of the global memory pool include memory, hard disks, memory servers and storage-class memory (SCM). Introducing a global memory pool containing multiple types of storage media into the distributed processing system in this way gives the storage of data more possibilities; selecting a matching storage medium to store data enables fast read and write operations on the system, increases the data transmission speed, and shortens the end-to-end data processing time.
In a possible implementation, performing the memory operations on the global memory pool indicated by the processing request data according to the memory operation instructions includes: reading the data to be processed from the global memory pool; processing the data to be processed using the memory operation instructions according to the processing request data to obtain processed data; and writing the processed data into the storage space indicated by a first address in the global memory pool. The storage space indicated by the first address includes one of the storage space provided by the storage medium of the computing node and the storage space provided by the storage medium of the storage node.

The computing node may determine the storage space indicated by the first address according to a storage policy. For example, the computing node determines the storage space indicated by the first address according to the access characteristics of the application.

Example 1: it is determined, according to user requirements and storage medium characteristics, to write the processed data into the storage space indicated by the first address in the global memory pool. The storage medium characteristics include at least one of write latency, read latency, total storage capacity, available storage capacity, access speed, central processing unit (CPU) consumption, energy consumption ratio, and reliability. Performing read and write operations on the system based on the user requirements and the storage medium characteristics of the system thus strengthens the user's control over the system, improves the user's experience of the system, and extends the application scenarios to which the system is applicable.

Example 2: determining, according to the user requirements and the storage medium characteristics, to write the processed data into the storage space indicated by the first address in the global memory pool includes: determining, according to the user requirements and the storage medium characteristics, to write the processed data into the storage space of the computing node indicated by the first address in the global memory pool. The processed data is thereby stored in the local memory on the computing side of the global memory pool for local memory access, which effectively shortens the data processing time and increases the data transmission speed.

Example 3: determining, according to the user requirements and the storage medium characteristics, to write the processed data into the storage space indicated by the first address in the global memory pool includes: determining, according to the user requirements and the storage medium characteristics, to write the processed data into the storage space of the storage node indicated by the first address in the global memory pool. The processed data is thereby stored in the memory on the storage side of the global memory pool, improving the reliability and persistence of the data.

In another possible implementation, after the processed data is written into the storage space indicated by the first address in the global memory pool, the method further includes: reading the processed data from the global memory pool according to the first address.

For example, when the data stored in the global memory pool is flushed to disk, reading the processed data from the global memory pool according to the first address includes: when the processed data needs to be persisted, the computing node reads the processed data from the global memory pool according to the first address and writes the processed data to the storage node.

For another example, when a computing node in the fusion system uses the processed data, the computing node reads the processed data from the global memory pool according to the first address.

In another possible implementation, the method further includes: when the processed data is read from the global memory pool according to the first address, performing, according to the memory operation instructions, the memory operations on the global memory pool indicated by another piece of processing request data.

Thus, from the perspective of the application, the efficiency of data processing and the data transmission speed are improved; from the perspective of the hardware devices, the utilization of system resources is improved.

In another possible implementation, the method further includes: prefetching data from the storage node according to the memory operation instructions and storing the data in the global memory pool, so that the computing node can obtain data quickly, shortening the data processing time.

In another possible implementation, the method further includes: performing memory operations on data between the global memory pool and the storage node based on the memory operation instructions according to the hot and cold characteristics of the data. Cold data refers to data that is accessed infrequently. Moving cold data from the global memory pool to the storage node frees storage space in the global memory pool and improves the utilization of that storage space. Hot data refers to data that is accessed frequently. Moving hot data from the storage node to the global memory pool enables the computing node to obtain the required data as quickly as possible, shortening the data processing time and reducing the computing resources occupied by frequent data reads and writes.
In a second aspect, a data processing apparatus for a fusion system is provided, the apparatus including modules for performing the data processing method for a fusion system in the first aspect or any possible design of the first aspect.

In a third aspect, a computing device is provided, the computing device including at least one processor and a memory, the memory being used to store a set of computer instructions; when the processor, acting as the computing node in the first aspect or any possible implementation of the first aspect, executes the set of computer instructions, the operational steps of the data processing method for a fusion system in the first aspect or any possible implementation of the first aspect are performed.

In a fourth aspect, a computer-readable storage medium is provided, including computer software instructions; when the computer software instructions run in a computing device, the computing device is caused to perform the operational steps of the method described in the first aspect or any possible implementation of the first aspect.

In a fifth aspect, a computer program product is provided; when the computer program product runs on a computer, a computing device is caused to perform the operational steps of the method described in the first aspect or any possible implementation of the first aspect.

On the basis of the implementations provided in the above aspects, this application may further combine them to provide more implementations.
Brief Description of the Drawings

FIG. 1 is a schematic architectural diagram of a data processing system provided by an embodiment of this application;

FIG. 2 is a schematic diagram of distributed data processing provided by an embodiment of this application;

FIG. 3 is a schematic diagram of a deployment scenario of a global memory pool provided by an embodiment of this application;

FIG. 4 is a schematic diagram of a three-layer storage system provided by an embodiment of this application;

FIG. 5 is a schematic flowchart of a data processing method provided by an embodiment of this application;

FIG. 6 is a schematic flowchart of another data processing method provided by an embodiment of this application;

FIG. 7 is a schematic diagram of map task and reduce task processing based on a global memory pool provided by an embodiment of this application;

FIG. 8 is a schematic structural diagram of a data processing apparatus provided by an embodiment of this application;

FIG. 9 is a schematic structural diagram of a computing device provided by an embodiment of this application.
Detailed Description

With the development of services such as the Internet, the Internet of Things, network bandwidth, intelligent terminals and cloud computing, data types and data scales have grown at an unprecedented speed; applications such as big data, databases, and high performance computing (HPC) have emerged accordingly, and data has transformed from a single processing object into a fundamental resource.

Big data is a collection of data that cannot be captured, managed and processed with conventional software tools within a certain time range. Since the large amounts of data contained in big data are interrelated, big data is analyzed using data analysis methods, models or tools to mine the data relationships within it, and those relationships are used for prediction or decision-making. For example, analyzing users' shopping trend data and recommending to users items they are likely to buy improves the users' shopping experience. Big data is therefore characterized by a large data volume, rapid data growth, diverse data types, and high utilization value.

A database is a computer software system that stores and manages data according to a data structure. It can also be understood as a collection in a computer for storing and managing large amounts of data, that is, an electronic filing cabinet. The data may include travel records, consumption records, browsed web pages, sent messages, images, music, sounds, and so on.

Since the data volume of these applications is very large, a single computing node cannot meet the computing demand. A distributed data processing system is therefore typically used to process the data.
FIG. 1 is a schematic architectural diagram of a data processing system provided by an embodiment of this application. As shown in FIG. 1, the data processing system 100 includes a client 110, a computing cluster 120, and a storage cluster 130.

The storage cluster 130 includes at least two storage nodes 131. A storage node 131 includes one or more controllers, a network card, and multiple hard disks. The hard disks are used to store data. A hard disk may be a magnetic disk or another type of storage medium, such as a solid-state drive or a shingled magnetic recording hard disk. The network card is used to communicate with the computing nodes 121 included in the computing cluster 120. The controller is used to write data to a hard disk or read data from a hard disk according to a read/write data request sent by a computing node 121. In the process of reading and writing data, the controller needs to convert the address carried in the read/write data request into an address that the hard disk can recognize.

The client 110 communicates with the computing cluster 120 and the storage cluster 130 through a network 140. For example, the client 110 sends a service request to the computing cluster 120 through the network 140, requesting the computing cluster 120 to perform distributed processing on the service data contained in the service request. The network 140 may be an enterprise internal network (e.g., a local area network (LAN)) or the Internet.

The data processing system 100 supports running applications such as big data, databases, high performance computing, artificial intelligence, distributed storage, and cloud native applications. It can be understood that the service data in the embodiments of this application includes data of applications such as big data, databases, high performance computing, artificial intelligence (AI), distributed storage, and cloud native applications.

In some embodiments, the storage cluster 130 stores and manages large amounts of data based on a distributed file system 132 and a distributed database 133. A client program 111 is installed on the client 110; the client 110 runs the client program 111 to display a user interface (UI), and a user 150 operates the user interface to access the distributed file system 132 and the distributed database 133 to obtain data and to instruct the computing cluster 120 to process data services. The client 110 may be a computer connected to the network 140, also called a workstation. Different clients can share resources on the network (e.g., computing resources and storage resources).

The computing cluster 120 includes at least two computing nodes 121, which can communicate with each other. A computing node 121 is a computing device, such as a server, a desktop computer, or the controller of a storage array. For example, a big data service submitted by the client 110 may be called a job. A job can be split into multiple tasks, which are executed in parallel by multiple computing nodes; a job is complete when all of its tasks have finished. A task is generally the processing of a part of the data, or a stage of the processing, within a job; all tasks are scheduled and completed in parallel or serially. In some embodiments, the computing cluster 120 performs distributed processing of big data services based on a MapReduce model 134. The MapReduce model is a distributed programming model in which a big data service is decomposed into map tasks and reduce tasks; multiple computing nodes 121 execute the map tasks, and the processing results are collected to execute the reduce tasks. In some embodiments, as shown in FIG. 2, the computing cluster 120 includes a control node 122 and at least two computing nodes 121. The control node and the computing nodes may be independent physical devices, in which case the control node may also be called a control device or a name node, and a computing node may be called a computing device or a data node. The control node 122 is used to manage the namespace of the distributed file system and the client 110's access to the distributed file system. In addition, the control node 122 designates the computing nodes that execute map tasks and the computing nodes that execute reduce tasks. A computing node executes a map task based on the MapReduce model 134 to obtain intermediate data (also called map data or shuffle data) and stores the intermediate data in a global memory pool 170. A computing node executing a reduce task reads the intermediate data from the global memory pool 170. Optionally, when storing data, the storage cluster 130 may also process the data according to the MapReduce model 134 before storing it.

In other embodiments, a system administrator 160 may, through the client 110, invoke an application platform interface (API) 112, a command-line interface (CLI) 113, or a graphical user interface (GUI) to access the distributed file system 132 and the distributed database 133 to configure system information, for example the deployment information and storage policy of the uniformly addressed global memory pool 170 configured for the computing nodes 121 and the storage nodes 131 as provided by the embodiments of this application.
The embodiments of this application provide a storage-compute fusion architecture and a data processing method based on memory operation instructions, that is, a technique in which, based on a global memory pool formed by uniformly addressing the storage medium of the computing node and the storage medium of the storage node, the memory operations on the global memory pool indicated by the processing request data are performed according to the memory operation instructions. This solves both the flexibility problem of separately expanding integrated storage and computing and the performance problem of storage-compute separation. In the embodiments of this application, performing the memory operations on the global memory pool indicated by the processing request data according to the memory operation instructions includes read operations and write operations on the global memory pool. A data processing system configured with a global memory pool may also be called a fusion system. Memory operation instructions are a technique for performing memory operations on a storage medium through a memory interface. The two different kinds of memory operations on the computing side and the storage side are fused through a globally and uniformly addressed global memory pool based on memory operation instructions; that is, the memory operations indicated by the processing request data are performed, according to the memory operation instructions, on the global storage medium shared by the computing node and the storage node. This effectively simplifies the process of moving data between the computing node and the storage node and avoids the multiple conversion operations, such as protocol and format conversion, that data would otherwise undergo from the computing side to the storage side. Thus, the storage-compute separation architecture with a global memory pool not only ensures that computing and storage can each be expanded flexibly and on demand, but also enables fast read and write operations on the system, shortening the end-to-end data processing time, increasing the data transmission speed, reducing system energy consumption, and alleviating the system performance bottleneck.

For example, format conversion includes serialization and deserialization. Serialization refers to the process of converting an object into a byte sequence. Deserialization refers to the process of restoring a byte sequence into an object.
需要说明的是,本申请实施例提供的全局内存池可以包括数据处理系统中计算节点的存储介质和存储节点的存储介质。计算节点的存储介质包括计算节点内的本地存储介质和计算节点连接的扩展存储介质中至少一种。存储节点的存储介质包括存储节点内的本地存储介质和存储节点连接的扩展存储介质中至少一种。
例如,全局内存池包括计算节点内的本地存储介质和存储节点内的本地存储介质。
又如,全局内存池包括计算节点内的本地存储介质、计算节点连接的扩展存储介质,以及存储节点内的本地存储介质和存储节点连接的扩展存储介质中任意一种。
又如,全局内存池包括计算节点内的本地存储介质、计算节点连接的扩展存储介质、存储节点内的本地存储介质和存储节点连接的扩展存储介质。
示例地,如图3所示,为本申请实施例提供的一种全局内存池的部署场景示意图。全局内存池300包括N个计算节点中每个计算节点内的存储介质310、N个计算节点中每个计算节点连接的扩展存储介质320、M个存储节点中每个存储节点内的存储介质330和M个存储节点中每个存储节点连接的扩展存储介质340。
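上述统一编址的思路可以用如下C语言草图示意:将各节点(计算节点或存储节点)的存储介质按容量依次拼接到一个全局地址空间,访问时再将全局地址解析为节点及节点内偏移。其中的结构体与函数名(如pool_segment_t、pool_resolve)均为示例性假设,并非本申请实施例限定的实现。

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    int      node_id;   /* 提供该段存储介质的节点(计算节点或存储节点) */
    uint64_t base;      /* 该段在全局地址空间中的起始地址 */
    uint64_t size;      /* 该段的容量(字节) */
} pool_segment_t;

/* 全局内存池:由各节点存储介质的地址段顺序拼接而成 */
typedef struct {
    pool_segment_t segs[8];
    int            nsegs;
    uint64_t       total;
} global_pool_t;

/* 将一段节点本地介质加入全局内存池,返回其全局起始地址 */
uint64_t pool_add_segment(global_pool_t *p, int node_id, uint64_t size) {
    pool_segment_t *s = &p->segs[p->nsegs++];
    s->node_id = node_id;
    s->base    = p->total;   /* 统一编址:紧接在已有容量之后 */
    s->size    = size;
    p->total  += size;
    return s->base;
}

/* 地址解析:全局地址 -> (节点, 节点内偏移);地址越界时返回 -1 */
int pool_resolve(const global_pool_t *p, uint64_t gaddr, uint64_t *off) {
    for (int i = 0; i < p->nsegs; i++) {
        const pool_segment_t *s = &p->segs[i];
        if (gaddr >= s->base && gaddr < s->base + s->size) {
            *off = gaddr - s->base;
            return s->node_id;
        }
    }
    return -1;
}
```

按这一草图,计算节点与存储节点各自贡献的存储介质在逻辑上表现为一段连续的全局地址空间,与图3所示的部署方式对应。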
应理解,全局内存池的存储容量可以包括计算节点的存储介质中的部分存储容量和存储节点的存储介质中的部分存储容量。全局内存池是经过统一编址的、计算节点和存储节点均可以访问的存储介质。全局内存池的存储容量可以通过大内存、分布式数据结构、数据缓存、元数据等内存接口供计算节点或存储节点使用,计算节点运行的应用程序可以使用这些内存接口对全局内存池进行内存操作。如此,基于计算节点的存储介质和存储节点的存储介质构建的全局内存池北向提供了统一的内存接口供计算节点使用,使计算节点使用统一的内存接口将数据写入全局内存池中由计算节点提供的存储空间或由存储节点提供的存储空间,实现基于内存操作指令的数据的计算和存储。
上述是以计算节点内的存储介质和存储节点内的存储介质构建全局内存池为例进行说明。全局内存池的部署方式可以灵活多变,本申请实施例不予限定。例如,全局内存池由存储节点的存储介质构建。例如高性能的存储层(Global Cache)。又如,全局内存池由计算节点的存储介质构建。例如高性能的存储层。使用单独的存储节点的存储介质或计算节点的存储介质构建全局内存池可以减少存储侧的存储资源的占用,以及提供更灵活的扩展方案。
需要说明的是,存储器是用于存储程序和各种数据的记忆器件。存储器的容量越大,存取速度越慢。反之,存储器的容量越小,存取速度越快。存取速度是指对存储器写入数据或读取数据时的数据传输速度。存取速度也可以称为读写速度。依据存储容量和存取速度可以将存储器划分为不同层级。
示例地,图4为本申请实施例提供的一种三层结构的存储系统示意图。从第一层至第三层,存储容量逐级增加,存取速度逐级降低,成本逐级减少。如图4所示,第一层级包含位于中央处理器(central processing unit,CPU)内的寄存器411、一级缓存412、二级缓存413和三级缓存414。第二层级包含的存储器可以作为计算机系统的主存储器。例如,动态随机存取存储器(Dynamic Random Access Memory,DRAM)421,双倍数据速率同步动态随机存取存储器(double data rate SDRAM,DDR SDRAM)422,存储级内存(storage-class-memory,SCM)423。主存储器可以简称为主存或内存,即与CPU交换信息的存储器。第三层级包含的存储器可以作为计算机系统的辅助存储器。例如,网络存储器431,固态驱动器(Solid State Disk或Solid State Drive,SSD)432,硬盘驱动器(Hard Disk Drive,HDD)433。辅助存储器可以简称为辅存或外存。相对主存,外存的存储容量大,存取速度慢。可见,距离CPU越近的存储器,容量越小、存取速度越快、带宽越大、延迟越低。因此,第三层级包含的存储器存储CPU不经常访问的数据,提高数据的可靠性。第二层级包含的存储器可以作为缓存设备,用于存储CPU经常访问的数据,显著地改善系统的访问性能。
本申请实施例提供的全局内存池的存储介质包括内存(如:DRAM)、SSD、硬盘、内存服务器和SCM。
在一些实施例中,可以根据存储介质的类型设置全局内存池,即利用一种类型的存储介质构建一种内存池,不同类型的存储介质构建不同类型的全局内存池,使全局内存池应用于不同的场景,计算节点根据应用的访问特征选择存储介质,增强了用户对系统的控制权限,既提升了用户的系统体验,又扩展了系统适用的应用场景。例如,将计算节点中的DRAM和存储节点中的DRAM进行统一编址构成DRAM内存池,DRAM内存池用于对访问性能要求高、数据容量适中、无数据持久化诉求的应用场景。又如,将计算节点中的SCM和存储节点中的SCM进行统一编址构成SCM内存池,SCM内存池则用于对访问性能不敏感、数据容量大、对数据持久化有诉求的应用场景。
不同的存储介质具有不同的存储介质特征。存储介质特征包括写时延、读时延、总存储容量、可用存储容量、存取速度、CPU消耗、能耗比和可靠性中至少一个。写时延是指计算节点121将数据写入存储介质的时延。读时延是指计算节点121从存储介质读取数据的时延。存储容量是指存储介质的可存储数据的总存储容量。可用存储容量是指总存储容量减去已使用的存储容量的剩余存储容量。存取速度是指计算节点121对存储介质进行读写操作的速度。CPU消耗是指计算节点121向存储介质写入数据或从存储介质读取数据使用计算节点121的CPU的占用率。能耗比是指单位时间内所消耗的能量(如电能)。可靠性是指存储介质存储数据的持久程度。
下面将结合附图对本申请实施例提供的融合系统的数据处理方法的实施方式进行详细描述。
图5为本申请实施例提供的一种融合系统的数据处理方法的流程示意图。在这里以客户端110、计算集群120和存储集群130为例进行说明。计算集群120包括控制节点122和至少两个计算节点121,存储集群130包括至少两个存储节点131,控制节点122用于控制计算节点121执行数据分布式处理,计算节点121和存储节点131配置有全局内存池170,全局内存池170包括计算节点121的存储介质和存储节点131的存储介质。如图5所示,该方法包括以下步骤。
步骤510、计算节点121获取处理请求数据。
客户端110响应用户操作,向控制节点122发送数据业务的业务请求。控制节点122可以通过局域网或互联网接收客户端110发送的大数据业务的业务请求。业务请求可以包括业务标识和业务数据。业务标识用于唯一指示一个数据业务。业务数据可以是计算节点进行数据分布式处理的数据或指示待处理数据的标识数据。
用户操作可以是指用户操作大数据用户界面提交大数据业务的操作。大数据业务包括数据分析业务、数据查询业务和数据修改业务等。例如,大数据业务是指分析客户的个人数据和购买行为数据来描绘用户画像对客户进行分类,使得可以向特定客户推荐针对性的产品或优惠产品,提升客户满意度,稳固客户关系等。又如,大数据业务是指分析产品的历史销售量预测未来的销售量,发现销售量下降原因或销售量上升原因,推荐提升销售量的建设性建议。
用户操作还可以是指用户操作数据库用户界面提交数据库业务的操作等。数据库操作包括数据库创建、删除、修改和查询等。
计算节点121可以接收控制节点122发送的业务请求,将业务请求转换为符合内存操作指令操作规则的处理请求数据,以便于计算节点121根据内存操作指令对全局内存池进行处理请求数据的内存操作。
步骤520、计算节点121根据内存操作指令对全局内存池进行处理请求数据的内存操作。
内存操作指令也可以称为内存语义或内存操作函数。内存操作指令包括内存分配(malloc)、内存设置(memset)、内存复制(memcpy)、内存移动(memmove)、内存释放(memory release)和内存比较(memcmp)中至少一种。
内存分配用于支持应用程序运行分配一段内存。
内存设置用于设置全局内存池的数据模式,例如初始化。
内存复制用于将源地址(source)指示的存储空间存储的数据复制到目的地址(destination)指示的存储空间。
内存移动用于将源地址(source)指示的存储空间存储的数据复制到目的地址(destination)指示的存储空间,并删除源地址(source)指示的存储空间存储的数据。
内存比较用于比较两个存储空间存储的数据是否相等。
内存释放用于释放内存中存储的数据,以提高系统内存资源的利用率,进而提升系统性能。
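上述六种内存操作指令的语义与标准C内存接口大体对应,可以用如下草图演示(仅为示意:标准C的memmove并不会删除源地址存储的数据,与上文对内存移动的定义略有差异)。

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* 依次演示内存分配、内存设置、内存复制、内存移动、内存比较和内存释放;
 * 返回 0 表示内存比较的结果为相等,返回 -1 表示内存分配失败 */
int memory_ops_demo(void) {
    char *buf = malloc(16);              /* 内存分配(malloc) */
    if (buf == NULL) return -1;
    memset(buf, 0, 16);                  /* 内存设置(memset):初始化 */
    memcpy(buf, "abcdef", 6);            /* 内存复制(memcpy) */
    memmove(buf + 2, buf, 6);            /* 内存移动(memmove):源与目的可重叠 */
    int eq = memcmp(buf, "ababcdef", 8); /* 内存比较(memcmp) */
    free(buf);                           /* 内存释放 */
    return eq;
}
```

该草图中,memmove将buf起始的6字节"abcdef"移动到buf+2处,得到"ababcdef",memcmp据此返回0。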
处理请求数据用于指示对待处理数据进行的操作。例如,处理请求数据指示获取产品A的第一季度销售量。待处理数据可以包括全年销售量。步骤520所包括的具体操作过程如下步骤521至步骤523的阐述。
步骤521、计算节点121读取待处理数据和应用数据。
待处理数据和应用数据可以存储在全局内存池、存储节点的存储介质或计算节点的存储介质等存储空间中。计算节点121可以从全局内存池、存储节点的存储介质或计算节点的存储介质读取待处理数据和应用数据。例如,计算节点121从本地内存中读取应用数据,从全局内存池中读取待处理数据。待处理数据可以是根据处理请求数据的指示需要处理的对象。应用数据包括应用程序和应用配置数据。
步骤522、计算节点121根据应用数据启动应用程序,根据处理请求数据使用内存操作指令对待处理数据进行处理得到处理后数据。
例如,假设应用数据包括数据库应用程序和数据库配置数据,处理请求数据用于指示对太字节(Terabyte,TB)的待处理数据进行排序。计算节点121根据数据库应用数据启动数据库,获取待处理数据并对待处理数据进行排序。
需要说明的是,计算节点121可以在获取处理请求数据之前,根据应用数据启动应用程序,也可以在获取处理请求数据之后,根据应用数据启动应用程序。
如此,基于内存操作指令的数据处理可以应用于数据处理的全生命周期,使用内存操作指令进行数据处理,实现去输入/输出(Input/Output,IO)化,以及数据端到端的极速访问,结合存算融合的全局内存池,避免执行数据搬移中的格式和协议转换,有效地降低了端到端的数据处理时长。
步骤523、计算节点121将处理后数据写入全局内存池中第一地址指示的存储空间。
计算节点121根据存储策略自动地从全局内存池中选择用于存储处理后数据的存储介质。存储策略包括应用的访问特征和全局内存池中存储介质特征等。第一地址指示的存储空间包括计算节点121的存储介质提供的存储空间和存储节点131的存储介质提供的存储空间中一个。
在一些实施例中,计算节点121根据用户需求和存储介质特征确定将处理后数据写入全局内存池中第一地址指示的存储空间。用户需求用于指示与存储介质特征相关的需求。处理请求数据包括用户需求。存储介质特征包括写时延、读时延、总存储容量、可用存储容量、存取速度、中央处理器CPU消耗、能耗比和可靠性中至少一个。
例如,计算节点121配置有多种类型存储介质的存储介质特征。用户需求指示存取速度范围或者一个具体的存取速度,计算节点121从全局内存池中确定满足用户需求的存储介质。比如用户需求指示内存的存取速度。计算节点121从全局内存池中选取符合内存的存取速度的存储介质,比如内存、DRAM和SCM中至少一种。示例一,计算节点121根据用户需求和存储介质特征确定将处理后数据写入全局内存池中第一地址指示的计算节点的存储空间,第一地址指示的计算节点的存储空间满足用户需求指示的存取速度。从而将处理后数据存储到全局内存池中计算侧的本地内存,以便进行本地内存访问,有效地缩短了数据处理时长,提升了数据传输速度。
又如,计算节点121配置有存储介质和客户等级的关联关系。用户需求指示第一客户等级。计算节点121根据第一客户等级从关联关系中确定与第一客户等级关联的存储介质,将其用于存储处理后数据。示例二,计算节点121根据用户需求和存储介质特征确定将处理后数据写入全局内存池中第一地址指示的存储节点的存储空间,第一地址指示的存储节点的存储空间满足用户需求指示的第一客户等级。
如此,计算节点121从全局内存池中选择存储处理后数据的存储介质时,基于用户对存取速度或可靠性等存储介质特征的用户需求,动态地选取满足用户需求的存储介质,确保数据处理的性能和可靠性的场景需求。
在另一些实施例中,计算节点121配置有依据存储介质特征确定的多种类型存储介质的优先级,根据存储策略指示的多种类型存储介质的优先级,确定用于存储处理后数据的存储介质。其中,多种类型存储介质的优先级可以依据存储介质的存取速度确定,比如内存的存取速度高于硬盘的存取速度,硬盘的存取速度高于扩展存储介质的存取速度;也可以依据部署模式的优先级确定,比如本地存储介质的优先级高于扩展存储介质的优先级;还可以依据多种存储介质特征综合确定,例如先依据部署模式(如:本地存储介质和扩展存储介质)的优先级排序,对于同一种部署模式下的存储介质,再依据存取速度确定优先级。
可选地,计算节点121还可以从满足用户需求的多种类型存储介质中选取可用存储容量大于阈值的存储介质作为存储处理后数据的存储介质。
计算节点121根据多种类型存储介质的优先级,从最高优先级的存储介质开始,判断其可用存储容量是否大于阈值:若最高优先级的存储介质的可用存储容量大于阈值,表示该存储介质有多余的存储空间存储处理后数据,则选取该存储介质存储处理后数据;若其可用存储容量小于或等于阈值,表示该存储介质没有多余的存储空间存储处理后数据,则继续判断次高优先级的存储介质的可用存储容量是否大于阈值,依次遍历多种类型存储介质,最终从中确定存储处理后数据的存储介质。
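上述按优先级遍历并比较可用容量的选择过程可以用如下C语言草图示意(结构体与函数名均为示例性假设):

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    const char *name;   /* 介质名称,如 "DRAM"、"SCM"、"SSD" */
    uint64_t    avail;  /* 可用存储容量(字节) */
} medium_t;

/* media 按优先级从高到低排列;返回第一个可用存储容量大于阈值的
 * 介质下标,全部不满足时返回 -1 */
int select_medium(const medium_t *media, int n, uint64_t threshold) {
    for (int i = 0; i < n; i++) {
        if (media[i].avail > threshold)
            return i;   /* 最高优先级且有富余存储空间的介质 */
    }
    return -1;
}
```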
需要说明的是,在实际应用中,可以根据业务需求、场景需求或用户需求等自行设置存储策略,上述对存储策略的阐述只是举例说明。例如,存储策略还可以是指数据本地化偏好,即优先将处理后数据存储到全局内存池中本地存储介质。又如,存储策略还可以是指按照应用对性能和成本的权衡选择性能优先、存储容量优先、成本优先等。
在计算节点121使用存储策略选取存储处理后数据的存储介质之前,即执行步骤523之前,可以预先配置存储策略和多种类型存储介质的配置信息。
本申请实施例提供的存储策略可以适用于上述数据处理系统100支持的至少一个应用,即大数据、数据库、高性能计算、人工智能、分布式存储和云原生。例如计算节点121可以在处理大数据业务或处理大数据业务中任务或处理系统全局的任务中,选取用于存储中间数据的存储介质时使用存储策略。
需要说明的是,本申请实施例中计算节点121根据处理请求数据对全局内存池进行内存操作后,即将处理后数据写入全局内存池或从全局内存池读取数据后,表示本次处理请求结束。全局内存池提供了将数据异步写入到存储节点的持久化和容量层,以及从持久化和容量层预取和缓存的能力。随后,计算节点121或存储节点131还可以从全局内存池读取处理后数据。关于对全局内存池存储的数据的数据处理操作如下图6的阐述。图6为本申请实施例提供的另一种融合系统的数据处理方法的流程示意图。
步骤610、计算节点121根据第一地址从全局内存池读取数据,对数据进行处理。
在计算节点执行任务需要全局内存池存储的数据时,计算节点可以从全局内存池读取相关数据。例如,计算节点121是执行reduce任务的计算节点。计算节点121根据第一地址从全局内存池读取执行reduce任务所需的处理后数据。计算节点121可以从控制节点122获取第一地址。
步骤620、计算节点121根据第一地址从全局内存池读取数据,将数据写入存储节点131。
在全局内存池存储的数据需要持久化,即将全局内存池存储的数据搬移到存储节点时,计算节点121可以从全局内存池读取相关数据,写入存储节点131。存储节点131根据内存操作指令将处理后数据写入存储节点131的存储介质。可选地,计算节点121还可以保存写入存储节点131的数据。
在另一些实施例中,计算节点121还可以执行步骤630,即根据内存操作指令从存储节点131预取数据,存储到全局内存池。从而,以便于计算节点121能够尽快地获取到数据,缩短了端到端的数据处理时长。
计算节点121还可以执行步骤640,即根据数据冷热特性,基于内存操作指令在全局内存池与存储节点之间进行数据的内存操作。
热数据是指访问频次比较多的数据,如在线类数据。冷数据是指访问频次比较少的数据,如企业备份数据、业务与操作日志数据、话单与统计数据。热数据访问频次需求大、效率要求高,因此就近计算和部署;冷数据访问频次低、效率要求低,可以做集中化部署。
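依据数据冷热特性在全局内存池与存储节点之间迁移数据的策略,可以用如下C语言草图示意(阈值、结构体与函数名均为示例性假设):访问频次达到阈值的数据提升到全局内存池就近计算,相当于从存储节点预取和缓存;一个统计周期内保持低频访问的数据回落到存储节点的持久化和容量层。

```c
#include <assert.h>

typedef enum { TIER_STORAGE = 0, TIER_POOL = 1 } tier_t;

typedef struct {
    unsigned freq;   /* 当前统计周期内的访问次数 */
    tier_t   tier;   /* 当前所在层:全局内存池或存储节点 */
} data_item_t;

#define HOT_THRESHOLD 3   /* 示例性阈值 */

/* 每次访问累计频次;达到阈值的热数据提升到全局内存池 */
void on_access(data_item_t *d) {
    d->freq++;
    if (d->freq >= HOT_THRESHOLD)
        d->tier = TIER_POOL;
}

/* 统计周期结束:低频数据回落到存储节点,频次清零 */
void on_period_end(data_item_t *d) {
    if (d->freq < HOT_THRESHOLD)
        d->tier = TIER_STORAGE;
    d->freq = 0;
}
```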
本申请实施例在存算分离架构下构建了全局内存池,作为计算和存储统一的缓存层。全局内存池将存储和计算进行存算融合,在物理分离的基础上实现逻辑融合。在这一层中,计算节点和存储节点采用统一内存语义进行数据交换与共享,从而避免了计算节点和存储节点之间反复的格式转换和协议转换等中间操作,达到极致的高性能与低能耗。
下面举例说明基于融合系统的数据处理的过程。如图7所示,为本申请实施例提供的一种执行map任务和reduce任务的数据处理过程的示意图。
1)初始化阶段:①加载应用数据,即加载程序配置文件,启动应用程序。应用数据可以存储在非易失性内存或全局内存池中,计算节点将应用数据加载到计算节点的本地内存中。例如,计算节点采用直接内存访问(Direct Memory Access,DMA)方式将应用数据从非易失性内存加载到本地内存,或者通过在内存层级之上构建的虚拟文件系统(Virtual File System,VFS)或可移植操作系统接口(Portable Operating System Interface,POSIX)文件系统接口加载应用数据。②加载待处理数据,待处理数据可以存储在非易失性内存中,计算节点将待处理数据加载到计算节点的本地内存中。例如,计算节点根据内存操作指令从非易失性内存池中获取待处理数据加载到本地内存中,消除了文件系统栈的访问开销。计算节点还可以采用远端直接内存访问(Remote Direct Memory Access,RDMA)获取存储在远端节点的待处理数据,实现高效地访问,减少了远端数据访问的CPU开销和网络协议栈开销。
2)数据处理阶段:③根据处理请求数据使用内存操作指令对待处理数据进行处理得到处理后数据,例如每个计算节点对处理后数据进行排序得到中间数据。计算节点根据处理后数据的数据量,从全局内存池中选择存储处理后数据的存储介质,例如将处理后数据存储到HBM、DRAM或者SCM。当全局内存池中的存储容量不足时,可以通过远端内存扩展或者内存盘扩展解决,即将处理后数据存储到其他存储介质。目的节点从源节点读取中间数据或者由源节点将中间数据写入到目的节点完成数据交换(shuffle)。如果全局内存池是由独立的存储介质(如:计算节点的存储介质和存储节点的存储介质)构建,源节点可以将中间数据写入到全局内存池中,再由目的节点从全局内存池中读取中间数据。④将排序结果进行合并。
3)持久化阶段:⑤测试报告和合并排序结果写入到本地的非易失性内存或全局内存池的本地内存,由全局内存池组件异步写入到存储节点的存储空间。
4)清理阶段:⑥释放本地内存或全局内存池。
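上述数据处理阶段中经全局内存池完成shuffle的过程,可以用如下C语言草图示意:map侧将中间数据按内存语义写入全局内存池并返回"第一地址",reduce侧按该地址直接读取,无需序列化/反序列化。此处用进程内缓冲区模拟统一编址的全局内存池,函数名均为示例性假设。

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int g_pool[64];   /* 模拟统一编址的全局内存池 */

/* map 任务:将中间数据写入全局内存池,返回其"第一地址"(池内下标) */
size_t map_write(const int *vals, size_t n) {
    memcpy(g_pool, vals, n * sizeof(int));  /* 内存复制,无格式转换 */
    return 0;
}

/* reduce 任务:按第一地址从全局内存池读取中间数据并归约(此处为求和) */
int reduce_sum(size_t addr, size_t n) {
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += g_pool[addr + i];
    return sum;
}
```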
可以理解的是,为了实现上述实施例中的功能,计算设备包括了执行各个功能相应的硬件结构和/或软件模块。本领域技术人员应该很容易意识到,结合本申请中所公开的实施例描述的各示例的单元及方法步骤,本申请能够以硬件或硬件和计算机软件相结合的形式来实现。某个功能究竟以硬件还是计算机软件驱动硬件的方式来执行,取决于技术方案的特定应用场景和设计约束条件。
上文中结合图1至图7,详细描述了根据本实施例所提供的融合系统的数据处理方法,下面将结合图8,描述根据本实施例所提供的数据处理装置。
图8为本实施例提供的可能的数据处理装置的结构示意图。这些数据处理装置可以用于实现上述方法实施例中计算设备或计算节点的功能,因此也能实现上述方法实施例所具备的有益效果。在本实施例中,该数据处理装置可以是如图5所示的计算节点121,还可以是应用于服务器的模块(如芯片)。
如图8所示,数据处理装置800包括通信模块810、数据处理模块820和存储模块830。数据处理装置800用于实现上述图5中所示的方法实施例中计算节点121的功能。
通信模块810用于获取处理请求数据。
数据处理模块820,用于根据内存操作指令对全局内存池进行处理请求数据的内存操作。例如,数据处理模块820用于执行图5中步骤520。
存储模块830用于存储内存操作指令,以便于数据处理模块820获取处理后数据。例如,通信模块810用于执行图5中步骤510。
数据处理模块820具体用于从全局内存池读取待处理数据和应用数据;根据所述处理请求数据使用所述内存操作指令对所述待处理数据进行处理得到处理后数据,将所述处理后数据写入所述全局内存池中第一地址指示的存储空间。
数据处理模块820还用于根据第一地址从全局内存池读取处理后数据。
数据处理模块820还用于根据所述内存操作指令从所述存储节点预取数据,存储到所述全局内存池。
数据处理模块820还用于根据数据冷热特性,基于所述内存操作指令在所述全局内存池与所述存储节点之间进行数据的内存操作。
应理解的是,本申请实施例的数据处理装置800可以通过专用集成电路(application-specific integrated circuit,ASIC)实现,或可编程逻辑器件(programmable logic device,PLD)实现,上述PLD可以是复杂程序逻辑器件(complex programmable logical device,CPLD)、现场可编程门阵列(field-programmable gate array,FPGA)、通用阵列逻辑(generic array logic,GAL)或其任意组合。当通过软件实现图5或图6所示的数据处理方法时,数据处理装置800及其各个模块也可以为软件模块。
根据本申请实施例的数据处理装置800可对应于执行本申请实施例中描述的方法,并且数据处理装置800中的各个单元的上述和其它操作和/或功能分别为了实现图5或图6中的各个方法的相应流程,为了简洁,在此不再赘述。
图9为本实施例提供的一种计算设备900的结构示意图。如图所示,计算设备900包括处理器910、总线920、存储器930、通信接口940和内存单元950(也可以称为主存(main memory)单元)。处理器910、存储器930、内存单元950和通信接口940通过总线920相连。
应理解,在本实施例中,处理器910可以是CPU,该处理器910还可以是其他通用处理器、数字信号处理器(digital signal processing,DSP)、ASIC、FPGA或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者是任何常规的处理器等。
处理器还可以是图形处理器(graphics processing unit,GPU)、神经网络处理器(neural network processing unit,NPU)、微处理器、ASIC、或一个或多个用于控制本申请方案程序执行的集成电路。
通信接口940用于实现计算设备900与外部设备或器件的通信。在本实施例中,计算设备900用于实现图5所示的计算节点121的功能时,通信接口940用于获取处理请求数据,计算节点121根据内存操作指令对所述全局内存池进行所述处理请求数据的内存操作。
总线920可以包括一通路,用于在上述组件(如处理器910、内存单元950和存储器930)之间传送信息。总线920除包括数据总线之外,还可以包括电源总线、控制总线和状态信号总线等。但是为了清楚说明起见,在图中将各种总线都标为总线920。总线920可以是快捷外围部件互连标准(Peripheral Component Interconnect Express,PCIe)总线,或扩展工业标准结构(extended industry standard architecture,EISA)总线、统一总线(unified bus,Ubus或UB)、计算机快速链接(compute express link,CXL)、缓存一致互联协议(cache coherent interconnect for accelerators,CCIX)等。总线920可以分为地址总线、数据总线、控制总线等。
作为一个示例,计算设备900可以包括多个处理器。处理器可以是一个多核(multi-CPU)处理器。这里的处理器可以指一个或多个设备、电路、和/或用于处理数据(例如计算机程序指令)的计算单元。在本实施例中,计算设备900用于实现图5所示的计算节点121的功能 时,处理器910可以根据内存操作指令对所述全局内存池进行所述处理请求数据的内存操作。
值得说明的是,图9中仅以计算设备900包括1个处理器910和1个存储器930为例,此处,处理器910和存储器930分别用于指示一类器件或设备,具体实施例中,可以根据业务需求确定每种类型的器件或设备的数量。
内存单元950可以对应上述方法实施例中用于存储处理后数据等信息的全局内存池。内存单元950可以是易失性存储器池或非易失性存储器池,或可包括易失性和非易失性存储器两者。其中,非易失性存储器可以是只读存储器(read-only memory,ROM)、可编程只读存储器(programmable ROM,PROM)、可擦除可编程只读存储器(erasable PROM,EPROM)、电可擦除可编程只读存储器(electrically EPROM,EEPROM)或闪存。易失性存储器可以是随机存取存储器(random access memory,RAM),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的RAM可用,例如静态随机存取存储器(static RAM,SRAM)、动态随机存取存储器(DRAM)、同步动态随机存取存储器(synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(double data rate SDRAM,DDR SDRAM)、增强型同步动态随机存取存储器(enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(synchlink DRAM,SLDRAM)和直接内存总线随机存取存储器(direct rambus RAM,DR RAM)。
存储器930可以对应上述方法实施例中用于存储计算机指令、内存操作指令、存储策略等信息的存储介质,例如,磁盘,如机械硬盘或固态硬盘。
上述计算设备900可以是一个通用设备或者是一个专用设备。例如,计算设备900可以是边缘设备(例如,携带具有处理能力芯片的盒子)等。可选地,计算设备900也可以是服务器或其他具有计算能力的设备。
应理解,根据本实施例的计算设备900可对应于本实施例中的数据处理装置800,并可以对应于执行根据图5或图6中任一方法中的相应主体,并且数据处理装置800中的各个模块的上述和其它操作和/或功能分别为了实现图5或图6中的各个方法的相应流程,为了简洁,在此不再赘述。
本实施例中的方法步骤可以通过硬件的方式来实现,也可以由处理器执行软件指令的方式来实现。软件指令可以由相应的软件模块组成,软件模块可以被存放于随机存取存储器(random access memory,RAM)、闪存、只读存储器(read-only memory,ROM)、可编程只读存储器(programmable ROM,PROM)、可擦除可编程只读存储器(erasable PROM,EPROM)、电可擦除可编程只读存储器(electrically EPROM,EEPROM)、寄存器、硬盘、移动硬盘、CD-ROM或者本领域熟知的任何其它形式的存储介质中。一种示例性的存储介质耦合至处理器,从而使处理器能够从该存储介质读取信息,且可向该存储介质写入信息。当然,存储介质也可以是处理器的组成部分。处理器和存储介质可以位于ASIC中。另外,该ASIC可以位于计算设备中。当然,处理器和存储介质也可以作为分立组件存在于计算设备中。
在上述实施例中,可以全部或部分地通过软件、硬件、固件或者其任意组合来实现。当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。所述计算机程序产品包括一个或多个计算机程序或指令。在计算机上加载和执行所述计算机程序或指令时,全部或部分地执行本申请实施例所述的流程或功能。所述计算机可以是通用计算机、专用计算机、计算机网络、网络设备、用户设备或者其它可编程装置。所述计算机程序或指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质传输,例如,所述计算机程序或指令可以从一个网站站点、计算机、服务器或数据中心通过有线或无线方式向另一个网站站点、计算机、服务器或数据中心进行传输。所述计算机可读存储介质可以是计算机能够存取的任何可用介质或者是集成一个或多个可用介质的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质,例如,软盘、硬盘、磁带;也可以是光介质,例如,数字视频光盘(digital video disc,DVD);还可以是半导体介质,例如,固态硬盘(solid state drive,SSD)。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到各种等效的修改或替换,这些修改或替换都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以权利要求的保护范围为准。

Claims (22)

  1. 一种融合系统的数据处理方法,其特征在于,所述融合系统包括计算节点和存储节点,所述计算节点通过网络与所述存储节点连接,所述计算节点的存储介质和所述存储节点的存储介质经过统一编址构成全局内存池;所述方法包括:
    获取处理请求数据;
    根据内存操作指令对所述全局内存池进行所述处理请求数据的内存操作。
  2. 根据权利要求1所述的方法,其特征在于,内存操作指令包括内存分配、内存设置、内存复制、内存移动、内存释放和内存比较中至少一种。
  3. 根据权利要求2所述的方法,其特征在于,所述全局内存池的存储介质包括内存、内存服务器和存储级内存SCM。
  4. 根据权利要求1-3中任一项所述的方法,其特征在于,根据内存操作指令对所述全局内存池进行所述处理请求数据的内存操作,包括:
    从所述全局内存池读取待处理数据;
    根据所述处理请求数据使用所述内存操作指令对所述待处理数据进行处理得到处理后数据,将所述处理后数据写入所述全局内存池中第一地址指示的存储空间。
  5. 根据权利要求4所述的方法,其特征在于,将所述处理后数据写入所述全局内存池中第一地址指示的存储空间,包括:
    根据用户需求和存储介质特征确定将所述处理后数据写入所述全局内存池中第一地址指示的存储空间,所述存储介质特征包括写时延、读时延、总存储容量、可用存储容量、存取速度、中央处理器CPU消耗、能耗比和可靠性中至少一个。
  6. 根据权利要求4或5所述的方法,其特征在于,所述第一地址指示的存储空间包括所述计算节点的存储介质提供的存储空间和所述存储节点的存储介质提供的存储空间中一个。
  7. 根据权利要求4-6中任一项所述的方法,其特征在于,将所述处理后数据写入所述全局内存池中第一地址指示的存储空间之后,所述方法还包括:
    根据所述第一地址从所述全局内存池读取所述处理后数据。
  8. 根据权利要求7所述的方法,其特征在于,根据所述第一地址从所述全局内存池读取所述处理后数据,包括:
    根据所述第一地址从所述全局内存池读取所述处理后数据,将所述处理后数据写入所述存储节点。
  9. 根据权利要求1所述的方法,其特征在于,所述方法还包括:
    根据所述内存操作指令从所述存储节点预取数据,存储到所述全局内存池。
  10. 根据权利要求1所述的方法,其特征在于,所述方法还包括:
    根据数据冷热特性,基于所述内存操作指令在所述全局内存池与所述存储节点之间进行数据的内存操作。
  11. 一种融合系统的数据处理装置,其特征在于,所述融合系统包括计算节点和存储节点,所述计算节点通过网络与所述存储节点连接,所述计算节点的存储介质和所述存储节点的存储介质经过统一编址构成全局内存池;所述装置包括:
    通信模块,用于获取处理请求数据;
    数据处理模块,用于根据内存操作指令对所述全局内存池进行所述处理请求数据的内存操作。
  12. 根据权利要求11所述的装置,其特征在于,内存操作指令包括内存分配、内存设置、 内存复制、内存移动、内存释放和内存比较中至少一种。
  13. 根据权利要求12所述的装置,其特征在于,所述全局内存池的存储介质包括内存、内存服务器和存储级内存SCM。
  14. 根据权利要求11-13中任一项所述的装置,其特征在于,所述数据处理模块根据内存操作指令对所述全局内存池进行所述处理请求数据的内存操作时,具体用于:
    从所述全局内存池读取待处理数据;
    根据所述处理请求数据使用所述内存操作指令对所述待处理数据进行处理得到处理后数据,将所述处理后数据写入所述全局内存池中第一地址指示的存储空间。
  15. 根据权利要求14所述的装置,其特征在于,所述数据处理模块将所述处理后数据写入所述全局内存池中第一地址指示的存储空间时,具体用于:
    根据用户需求和存储介质特征确定将所述处理后数据写入所述全局内存池中第一地址指示的存储空间,所述存储介质特征包括写时延、读时延、总存储容量、可用存储容量、存取速度、中央处理器CPU消耗、能耗比和可靠性中至少一个。
  16. 根据权利要求14或15所述的装置,其特征在于,所述第一地址指示的存储空间包括所述计算节点的存储介质提供的存储空间和所述存储节点的存储介质提供的存储空间中一个。
  17. 根据权利要求14-16中任一项所述的装置,其特征在于,
    所述数据处理模块,还用于根据所述第一地址从所述全局内存池读取所述处理后数据。
  18. 根据权利要求17所述的装置,其特征在于,所述数据处理模块根据所述第一地址从所述全局内存池读取所述处理后数据时,具体用于:
    根据所述第一地址从所述全局内存池读取所述处理后数据,将所述处理后数据写入所述存储节点。
  19. 根据权利要求11所述的装置,其特征在于,
    所述数据处理模块,还用于根据所述内存操作指令从所述存储节点预取数据,存储到所述全局内存池。
  20. 根据权利要求11所述的装置,其特征在于,
    所述数据处理模块,还用于根据数据冷热特性,基于所述内存操作指令在所述全局内存池与所述存储节点之间进行数据的内存操作。
  21. 一种计算设备,其特征在于,所述计算设备包括存储器和至少一个处理器,所述存储器用于存储一组计算机指令;当所述处理器执行所述一组计算机指令时,执行上述权利要求1-10中任一项所述的方法的操作步骤。
  22. 一种系统,其特征在于,所述系统包括存储节点和计算节点,所述计算节点通过网络与所述存储节点连接,所述计算节点的存储介质和所述存储节点的存储介质经过统一编址构成全局内存池,所述计算节点用于执行上述权利要求1-10中任一项所述的方法的操作步骤。
PCT/CN2023/087164 2022-04-08 2023-04-08 融合系统的数据处理方法、装置、设备和系统 WO2023193814A1 (zh)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202210369243.7 2022-04-08
CN202210369243 2022-04-08
CN202210633694.7 2022-06-06
CN202210633694.7A CN116932196A (zh) 2022-04-08 2022-06-06 融合系统的数据处理方法、装置、设备和系统

Publications (1)

Publication Number Publication Date
WO2023193814A1 true WO2023193814A1 (zh) 2023-10-12

Family

ID=88244126

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/087164 WO2023193814A1 (zh) 2022-04-08 2023-04-08 融合系统的数据处理方法、装置、设备和系统

Country Status (1)

Country Link
WO (1) WO2023193814A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117791877A (zh) * 2024-02-23 2024-03-29 北京智芯微电子科技有限公司 配电物联网控制方法、装置、设备及介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101577716A (zh) * 2009-06-10 2009-11-11 中国科学院计算技术研究所 基于InfiniBand网络的分布式存储方法和系统
US20120233409A1 (en) * 2011-03-11 2012-09-13 Microsoft Corporation Managing shared memory used by compute nodes
CN110795206A (zh) * 2018-08-02 2020-02-14 阿里巴巴集团控股有限公司 用于促进集群级缓存和内存空间的系统和方法
CN111435943A (zh) * 2019-01-14 2020-07-21 阿里巴巴集团控股有限公司 数据处理方法、设备、系统及存储介质
CN113220693A (zh) * 2021-06-02 2021-08-06 北京字节跳动网络技术有限公司 计算存储分离系统及其数据访问方法、介质和电子设备
CN113568562A (zh) * 2020-04-28 2021-10-29 华为技术有限公司 一种存储系统、内存管理方法和管理节点



Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23784376

Country of ref document: EP

Kind code of ref document: A1