CN114296640A - Data driving method, apparatus, device and storage medium for accelerating computation - Google Patents

Data driving method, apparatus, device and storage medium for accelerating computation

Info

Publication number
CN114296640A
Authority
CN
China
Prior art keywords
data
message queue
reading
memory
position information
Prior art date
Legal status
Granted
Application number
CN202111520962.6A
Other languages
Chinese (zh)
Other versions
CN114296640B (en)
Inventor
孙忠祥
张闯
任智新
Current Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd filed Critical Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202111520962.6A
Publication of CN114296640A
Application granted
Publication of CN114296640B
Legal status: Active
Anticipated expiration

Classifications

    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The present application relates to a data driving method, apparatus, device and storage medium for accelerated computing. The driving method includes: obtaining at least one item of first data and a corresponding data write request, writing the first data into a memory in sequence, and storing the data write requests in a first message queue in sequence, where each data write request includes the location information of the first data; reading the first data and its location information, compiling the first data to obtain second data, and writing the second data into the memory; storing the location information of the second data in a second message queue; and, in response to a data read request, reading the location information of the second data from the second message queue and reading the second data from the memory according to that location information. In this way, the real-time performance of the calculation is improved and time loss is reduced.

Description

Data driving method, apparatus, device and storage medium for accelerating computation
Technical Field
The present invention relates to the field of data acceleration computing technology, and in particular, to a data driving method, apparatus, device, and storage medium for acceleration computing.
Background
Data acceleration computing obtains high-performance computing power economically and effectively, scales well, uses computing resources efficiently, and has become one of the research hotspots in distributed computing. At present, data to be accelerated is sent from the host to the acceleration end through the driving end for accelerated calculation, and the calculation result is then returned to the host. However, current driving-end designs have no transmit/receive interaction mechanism: when the host splits an oversized data set into blocks and issues acceleration requests block by block, each new request must wait until the previous accelerated calculation completes and its result is retrieved, which is time-consuming and hurts real-time performance.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a data driving method, apparatus, device and storage medium for accelerated computing that can mitigate the poor data-driving performance of current accelerated computing.
In one aspect, a data-driven method for accelerating computations is provided, the data-driven method for accelerating computations comprising:
establishing a first message queue and a second message queue at a driving end, acquiring at least one first data and a corresponding data writing request from a host end through the driving end, sequentially writing the first data into a memory, and sequentially storing the data writing request in the first message queue, wherein the data writing request comprises: location information of the first data;
reading the first message queue to obtain the first data and the corresponding position information thereof, so that the acceleration end compiles the first data to obtain second data, writes the second data into the memory, and writes the position information of the second data into a register;
reading the register through the driving end, acquiring the position information of the second data, and storing the position information of the second data in the second message queue;
and acquiring a data reading request from a host end through the driving end, reading the position information of the second data from the second message queue, and reading the second data from the memory according to the position information of the second data.
In one embodiment, the steps of obtaining at least one first data and a data write request corresponding to the first data, sequentially writing the first data into a memory, and sequentially storing the data write request in a first message queue include:
the driving end acquires at least one data writing request from the host end through an interface function;
the driving end obtains at least one first data through a writing function of an operating system, and sequentially writes the first data into the memory through a memory access writing function;
and sequentially storing the position information of the first data in the first message queue.
In one embodiment, the step of reading the first data and the corresponding location information thereof, compiling the first data, obtaining second data, and writing the second data into the memory includes:
sequentially reading the position information of the first data from the position information of the first data in the first message queue, wherein the position information of the first data comprises: a first data address, a first data length;
writing the first data address into a first data address register, and writing the first data length into a first data length register;
reading the first data address register and the first data length register to obtain the first data address and the first data length, so that an acceleration end reads the first data from the memory, performs accelerated compilation on the first data to obtain the second data, writes the second data into the memory, and writes the position information of the second data into a register, where the register includes: a second data address register and a second data length register.
In one embodiment, the step of storing the location information of the second data in the second message queue comprises:
reading a second data length register through the driving end to obtain a second data length;
and sequentially storing the second data length in the second message queue.
In one embodiment, the step of storing the location information of the second data in the second message queue further comprises:
reading a second data length register and a second data address register through the driving end to obtain a second data length and a second data address;
and sequentially storing the second data length and the second data address in the second message queue.
In one embodiment, the step of reading the location information of the second data from the second message queue according to the data read request, and the step of reading the second data from the memory according to the location information of the second data includes:
the drive end acquires the data reading request from the host end through an interface function;
reading the position information of the second data from the second message queue according to the data reading request;
and according to the position information of the second data, reading the second data from the memory through a memory access read function.
In one embodiment, the method further comprises the following steps:
recording a time node of starting acceleration calculation by an acceleration end as a first time parameter;
recording a time node for the acceleration end to finish acceleration calculation as a second time parameter;
and the driving end transmits the first time parameter and the second time parameter to a host end through an interface function.
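The time-parameter mechanism above can be sketched in plain C. All identifiers here (`acc_start`, `acc_done`, `elapsed_us`) are illustrative assumptions rather than the patent's actual interface, and a microsecond counter is assumed since the patent does not name a clock source.

```c
#include <assert.h>
#include <stdint.h>

/* Self-defined time-parameter registers (names are illustrative). */
static uint64_t first_time_param;   /* acceleration start node, microseconds */
static uint64_t second_time_param;  /* acceleration end node, microseconds   */

/* The acceleration end records the node at which acceleration starts. */
static void acc_start(uint64_t now_us)
{
    first_time_param = now_us;
}

/* The acceleration end records the node at which acceleration finishes. */
static void acc_done(uint64_t now_us)
{
    second_time_param = now_us;
}

/* The driving end reads both registers and can report the elapsed
 * time to the host end through an interface function. */
static uint64_t elapsed_us(void)
{
    return second_time_param - first_time_param;
}
```

The driving end would pass both raw parameters to the host, which can then derive the accelerated-calculation latency as their difference.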
In another aspect, there is provided a data driving apparatus for accelerating calculation, the data driving apparatus for accelerating calculation comprising:
the first message queue processing module is configured to establish a first message queue and a second message queue at a driving end, acquire at least one first data and a data write request corresponding to the first data from a host through the driving end, sequentially write the first data into a memory, and sequentially store the data write request in the first message queue, where the data write request includes: location information of the first data;
the first data processing module is used for reading the first message queue to obtain the first data and the corresponding position information thereof, so that the acceleration end compiles the first data to obtain second data, and writes the second data into the memory;
the second message queue processing module is used for reading the register through the driving end, acquiring the position information of the second data and storing the position information of the second data in the second message queue;
and the second data reading module is used for acquiring a data reading request from a host end through the driving end, reading the position information of the second data from the second message queue, and reading the second data from the memory according to the position information of the second data.
In yet another aspect, an apparatus is provided that includes a memory, a processor, and a program stored on the memory and executable on the processor, the processor implementing the following steps when executing the program:
establishing a first message queue and a second message queue at a driving end, acquiring at least one first data and a corresponding data writing request from a host end through the driving end, sequentially writing the first data into a memory, and sequentially storing the data writing request in the first message queue, wherein the data writing request comprises: location information of the first data;
reading the first message queue to obtain the first data and the corresponding position information thereof, so that the acceleration end compiles the first data to obtain second data, writes the second data into the memory, and writes the position information of the second data into a register;
reading the register through the driving end, acquiring the position information of the second data, and storing the position information of the second data in the second message queue;
and acquiring a data reading request from a host end through the driving end, reading the position information of the second data from the second message queue, and reading the second data from the memory according to the position information of the second data.
In yet another aspect, a readable storage medium is provided, on which a program is stored, which program, when executed by a processor, performs the steps of:
establishing a first message queue and a second message queue at a driving end, acquiring at least one first data and a corresponding data writing request from a host end through the driving end, sequentially writing the first data into a memory, and sequentially storing the data writing request in the first message queue, wherein the data writing request comprises: location information of the first data;
reading the first message queue to obtain the first data and the corresponding position information thereof, so that the acceleration end compiles the first data to obtain second data, writes the second data into the memory, and writes the position information of the second data into a register;
reading the register through the driving end, acquiring the position information of the second data, and storing the position information of the second data in the second message queue;
and acquiring a data reading request from a host end through the driving end, reading the position information of the second data from the second message queue, and reading the second data from the memory according to the position information of the second data.
According to the data driving method, apparatus, device and storage medium for accelerated calculation, the driving end writes the first data into the memory in sequence and stores the data write requests in the first message queue in sequence; the heterogeneous acceleration end then performs accelerated calculation on the first data to obtain the second data; the location information of the second data is stored in the second message queue; and finally the second data are read in sequence from the corresponding locations in the memory. The real-time performance of the calculation is thereby improved and time loss is reduced.
Drawings
FIG. 1 is a diagram of an application environment for a data-driven approach to accelerated computing in one embodiment;
FIG. 2 is a flow diagram of a data-driven method for accelerating computations in one embodiment;
FIG. 3 is a flow diagram illustrating storage of a data write request in a first message queue, according to one embodiment;
FIG. 4 is a flow diagram illustrating a process for writing second data to memory according to one embodiment;
FIG. 5 is a flow diagram illustrating storing location information for second data in one embodiment;
FIG. 6 is a flow chart illustrating a process of storing location information of second data according to another embodiment;
FIG. 7 is a flow diagram illustrating a process for reading second data from memory according to one embodiment;
FIG. 8 is a schematic diagram of a process for obtaining a time parameter according to an embodiment;
FIG. 9 is a block diagram of a data driving apparatus for accelerating computations in one embodiment;
FIG. 10 is a diagram showing an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In the following embodiments, the first data is the data to be accelerated, and the second data is the data whose accelerated calculation has been completed.
The data driving method for accelerated computing provided by the application can be applied in the environment shown in FIG. 1, in which the terminal 102 communicates with the server 104 via a network. For example, the method can be applied wherever an acceleration board accelerates computation over data, such as heterogeneous accelerated computing for a PostgreSQL database. When the data to be accelerated at the host end is so large that write requests are issued block by block, a new acceleration request could otherwise be issued only after the previous accelerated calculation had finished and its result had been obtained, which is time-consuming and hurts real-time performance. Instead, the driving end writes the first data into the memory in sequence and stores the data write requests in the first message queue in sequence; the heterogeneous acceleration end performs accelerated calculation on the first data to obtain the second data; the location information of the second data is stored in the second message queue; and the second data are finally read in sequence from the corresponding memory locations, improving real-time performance and reducing time loss. In some implementations, the processors are organized as a Central Processing Unit (CPU) plus a Field Programmable Gate Array (FPGA): the CPU serves as the host's main processor, while the FPGA serves as the acceleration end and accelerates large data sets from the host's database, freeing CPU resources and achieving heterogeneous hardware acceleration.
In one embodiment, as shown in FIG. 2, a data-driven method for accelerating computations is provided, comprising the steps of:
s1: establishing a first message queue and a second message queue at a driving end, acquiring at least one first data and a corresponding data writing request from a host end through the driving end, sequentially writing the first data into a memory, and sequentially storing the data writing request in the first message queue, wherein the data writing request comprises: location information of the first data;
s2: reading the first message queue to obtain the first data and the corresponding position information thereof, so that the acceleration end compiles the first data to obtain second data, writes the second data into the memory, and writes the position information of the second data into a register;
s3: reading the register through the driving end, acquiring the position information of the second data, and storing the position information of the second data in the second message queue;
s4: and acquiring a data reading request from a host end through the driving end, reading the position information of the second data from the second message queue, and reading the second data from the memory according to the position information of the second data.
Through the above steps, the following problem is addressed: when the host application end issues acceleration requests block by block because the data to be accelerated is too large, a new acceleration request can be issued only after the previous accelerated calculation has completed and its result has been obtained, which is time-consuming and hurts real-time performance. The data driving method of this embodiment can run on a Linux operating system with the following flow: initialize the high-speed serial computer expansion bus (PCI Express), register the driver, add the device, obtain resources, request the direct-memory-access cache, enable interrupts, and initialize the direct-memory-access read/write channels; establish the first message queue and the second message queue at the driving end; write the first data into the memory in sequence through the driving end and store the data write requests in the first message queue in sequence, so that the heterogeneous acceleration end performs accelerated calculation on the first data and obtains the second data; store the location information of the second data in the second message queue; and finally read the second data in sequence from the corresponding memory locations, improving the real-time performance of the calculation and reducing time loss.
When the first data is too large, the host end may split it into a plurality of first data blocks, write each block into the memory, and issue a write request for each; without the present method, a new acceleration request could be issued only after the accelerated calculation of the previous first data had finished and the second data had been obtained, so the complete process would be repeated for every block. To speed this up, in step S1 a first message queue and a second message queue are established at the driving end, the first data and the corresponding data write requests from the host end are obtained, the first data are written into the memory in sequence, and the data write requests are stored in the first message queue in sequence. For example, the host end may split the data to be accelerated into a plurality of first data blocks with different start addresses, write the blocks into the memory in sequence according to each block's start address and length, and issue the corresponding write requests, which are stored in the first message queue in sequence; the acceleration end can then read the write requests one by one and complete the accelerated calculation as needed, simplifying the data-transmission process. The memory includes, but is not limited to, double-data-rate synchronous dynamic random access memory.
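The host-end block splitting described above can be sketched as follows. `split_blocks` and its fields are illustrative assumptions, not an interface from the patent; each block receives a start address and a length, and the final block may be shorter than the rest.

```c
#include <assert.h>
#include <stdint.h>

struct block {
    uint32_t addr; /* start address of this first-data block */
    uint32_t len;  /* block length in bytes                  */
};

/* Split [base, base+total) into blocks of at most blk_len bytes with
 * consecutive start addresses, as the host end does when the data to
 * be accelerated is too large for a single write request.
 * Returns the number of blocks written into out[]. */
static unsigned split_blocks(uint32_t base, uint32_t total,
                             uint32_t blk_len, struct block *out,
                             unsigned max_out)
{
    unsigned n = 0;
    uint32_t off = 0;
    while (off < total && n < max_out) {
        out[n].addr = base + off;
        out[n].len  = (total - off < blk_len) ? total - off : blk_len;
        off += out[n].len;
        n++;
    }
    return n;
}
```

Each resulting `(addr, len)` pair is exactly the location information that one data write request would carry into the first message queue.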
To read the first data from the memory for accelerated computation and store the second data appropriately, in step S2 the acceleration end reads the location information of the first data, reads the first data from the memory, compiles it to obtain the second data, and writes the second data into the memory. For example, after the heterogeneous acceleration end obtains the address and length of the first data, it reads the first data from the memory, performs the accelerated calculation to obtain the second data, and writes the second data back into the memory. When there are a plurality of first data blocks in the memory, their addresses and lengths are obtained in sequence, the blocks are read from the memory in sequence, accelerated calculation is performed on each in turn to obtain a plurality of second data blocks, and the second data blocks are finally written into the memory in sequence. This allows multiple results to be stored at the same time and improves data-storage performance.
To convey the location information of the second data to the host end, in step S3 the driving end stores the location information of the second data in the second message queue. For example, the length of the second data is stored in the second message queue so that the driving end can later read the second data from the memory according to that length; when there are a plurality of second data blocks, their lengths can be stored in the second message queue in sequence so that the driving end later reads the blocks from the memory in order.
To acquire the second data, in step S4 the location information of the second data is obtained from the second message queue according to the data read request, and the second data is read from the memory. For example, when the driving end receives, through an interface function, a data read request sent by the host end, it reads the location information of the second data from the second message queue to obtain the second data's length, and then reads the second data from the memory according to that length and the second data's storage address.
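Step S4 can be sketched in user-space C, with the second message queue and the direct-memory-access read simulated by plain buffers; all identifiers (`handle_read_request`, `dma_read`, and the queue layout) are assumptions for illustration, not the patent's driver API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEM_SIZE 4096
#define MQ_DEPTH 16

struct loc_info { uint32_t addr, len; };      /* second-data location */

struct msg_queue {                             /* second message queue */
    struct loc_info slot[MQ_DEPTH];
    unsigned head, tail;
};

static uint8_t device_mem[MEM_SIZE];           /* simulated memory */

/* Stand-in for the memory-access read function. */
static int dma_read(uint32_t addr, void *dst, uint32_t len)
{
    if (addr + len > MEM_SIZE)
        return -1;
    memcpy(dst, device_mem + addr, len);
    return 0;
}

/* S4: on a data read request, take the next second-data location
 * from the second message queue and read that data from memory. */
static int handle_read_request(struct msg_queue *q, void *dst,
                               uint32_t *out_len)
{
    struct loc_info li;
    if (q->head == q->tail)
        return -1;                             /* no result ready yet */
    li = q->slot[q->head++ % MQ_DEPTH];
    if (dma_read(li.addr, dst, li.len))
        return -1;
    *out_len = li.len;
    return 0;
}
```

Because results are popped in queue order, several outstanding accelerated calculations can be collected one read request at a time without re-synchronizing with the acceleration end.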
Before accelerated computation can be performed on the first data, the first data must be acquired and its information stored. As shown in FIG. 3, the step of acquiring, by the driving end, at least one first data and the corresponding data write request from the host end, writing the first data into the memory in sequence, and storing the data write request in the first message queue in sequence includes:
s11: the driving end acquires at least one data writing request from the host end through an interface function;
s12: the driving end obtains at least one first data through a writing function of an operating system, and sequentially writes the first data into the memory through a memory access writing function;
s13: and sequentially storing the position information of the first data in the first message queue.
As shown in FIG. 3, in step S11 the data write request is obtained through an interface function. For example, the driving end obtains the data write request sent by the host end through an input/output interface control function, and the first data address and first data length can be extracted from the request. When there are a plurality of first data blocks, the host end sends the data write requests in sequence, and the driving end obtains the addresses and lengths in sequence through the input/output interface control function; these represent the location information of the first data blocks, so that each block can be written into the memory at its corresponding location and its location information passed on.
As shown in FIG. 3, in step S12, after the first data is obtained it is written into the memory through a memory-access write function. For example, the host end sends the service data to be accelerated, as the first data, to the driving end through the operating system's write function; after the driving end acquires the first data, it writes it into the memory according to the first data address and first data length through a direct-memory-access write function. When there are a plurality of first data blocks, the address and length of each block form one set of location information; from these sets, the corresponding memory locations for the blocks are determined, and the blocks are written into the memory in sequence, ready for accelerated calculation by the acceleration end. In some implementations the location information of the first data further includes the first data address, the first data length, and the second data address, i.e., the memory address at which the second data obtained after accelerated calculation will be stored is preset before the first data is accelerated. When there are a plurality of first data blocks, the spacing of the second data addresses can be set according to the first data length, e.g., to one quarter of it: if the first data length is N megabytes, the second data addresses can be spaced N/4 megabytes apart.
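The quarter-length spacing rule above reduces to a one-line computation; the function name and parameters here are illustrative, assuming results are laid out from some base address `base2`.

```c
#include <assert.h>
#include <stdint.h>

/* Preset second-data addresses spaced at one quarter of the first
 * data length, per the example above: first data length N bytes
 * gives a spacing of N/4 bytes. base2 is the (assumed) address at
 * which the first result is stored; index selects the result slot. */
static uint64_t second_data_addr(uint64_t base2, uint64_t first_len,
                                 unsigned index)
{
    return base2 + (uint64_t)index * (first_len / 4);
}
```

For a first data length of 8 MiB, consecutive second-data slots would be 2 MiB apart.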
As shown in FIG. 3, in step S13 the location information of the first data is stored in the first message queue. For example, after the data write request is obtained and the first data is stored in the memory at the corresponding location, the location information of the first data is stored in the first message queue; when there are a plurality of first data blocks, the location information carried by the data write requests is stored in the first message queue in sequence for the subsequent acceleration of the blocks. In some implementations the location information of the first data comprises the first data address and the first data length; in others it further includes the second data address, i.e., the memory address, preset before acceleration, at which the second data will be stored after the accelerated calculation completes.
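Steps S11-S13 can be sketched together in user-space C; the memory and the direct-memory-access engine are simulated with plain buffers, and every identifier (`handle_write_request`, `dma_write`, the ring-buffer queue) is an illustrative assumption rather than the patent's actual driver interface.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MQ_DEPTH 64
#define MEM_SIZE 4096

/* Location information carried by a data write request (S11). */
struct loc_info {
    uint32_t addr; /* first data address in device memory */
    uint32_t len;  /* first data length in bytes          */
};

/* A simple ring buffer standing in for the first message queue. */
struct msg_queue {
    struct loc_info slot[MQ_DEPTH];
    unsigned head, tail;
};

static uint8_t device_mem[MEM_SIZE];   /* simulated DDR memory */

/* Store a request's location information in order (S13). */
static int mq_push(struct msg_queue *q, struct loc_info li)
{
    if (q->tail - q->head == MQ_DEPTH)
        return -1;                     /* queue full */
    q->slot[q->tail++ % MQ_DEPTH] = li;
    return 0;
}

static int mq_pop(struct msg_queue *q, struct loc_info *li)
{
    if (q->head == q->tail)
        return -1;                     /* queue empty */
    *li = q->slot[q->head++ % MQ_DEPTH];
    return 0;
}

/* Stand-in for the direct-memory-access write function (S12). */
static int dma_write(uint32_t addr, const void *src, uint32_t len)
{
    if (addr + len > MEM_SIZE)
        return -1;
    memcpy(device_mem + addr, src, len);
    return 0;
}

/* S11-S13 combined: write one first-data block into memory and
 * queue its location information for the acceleration end. */
static int handle_write_request(struct msg_queue *q, struct loc_info li,
                                const void *first_data)
{
    if (dma_write(li.addr, first_data, li.len))
        return -1;
    return mq_push(q, li);
}
```

Because the queue decouples enqueueing requests from consuming them, the host can keep submitting blocks while the acceleration end drains the queue at its own pace.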
As shown in FIG. 4, in some embodiments, after the driving end writes the first data into the memory and stores the request in the first message queue, the first data must be made available to the acceleration end. The step of reading the first message queue to obtain the first data and its location information, so that the acceleration end compiles the first data into the second data and writes the second data into the memory, includes:
s21: sequentially reading the position information of the first data from the position information of the first data in the first message queue, wherein the position information of the first data comprises: a first data address, a first data length;
s22: writing the first data address into a first data address register, and writing the first data length into a first data length register;
s23: reading the first data address register and the first data length register to obtain the first data address and the first data length, so that an acceleration end reads the first data from the memory, performs accelerated compilation on the first data to obtain the second data, writes the second data into the memory, and writes the position information of the second data into a register, where the register includes: a second data address register and a second data length register.
As shown in FIG. 4, in step S21 the location information of the first data in the first message queue is read in sequence, for example starting from the first data write request in the first message queue until all the location information has been obtained. The location information of the first data comprises the first data address and the first data length; in some implementations it further includes the second data address, i.e., the memory address, preset before acceleration, at which the second data will be stored after the accelerated calculation completes. The acceleration end can then locate the first data by its location information and read it from the memory for accelerated calculation.
As shown in fig. 4, in step S22, it is exemplarily illustrated that the location information of the first data is stored in the corresponding registers: the first data address is written into the first data address register, and the first data length is written into the first data length register. In some implementations, the second data address is also written into the second data address register. When there are multiple first data, the multiple first data addresses are sequentially written into the first data address register, the multiple first data lengths into the first data length register, and the multiple second data addresses into the second data address register, so that the acceleration end reads the contents of the registers in order and sequentially reads the first data from the memory according to the stored location information. The high-speed serial computer expansion bus (PCIe) hardware provides six base address registers, of which this embodiment uses two: the first base address register holds the self-defined registers such as the first data address, first data length, second data address, second data length, first time parameter, and second time parameter, and the second base address register holds the control and status registers related to direct memory access.
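The self-defined registers in the first base address register can be pictured as a packed layout. The following struct is purely illustrative — the patent does not give offsets or field names, so both are assumptions.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical layout of the self-defined registers in the first base
 * address register (BAR0); offsets and names are illustrative only. The
 * second base address register would hold the direct memory access
 * control/status registers. */
struct accel_bar0_regs {
    uint64_t first_data_addr;   /* where the acceleration end reads input */
    uint32_t first_data_len;
    uint32_t pad0;              /* keep 8-byte alignment */
    uint64_t second_data_addr;  /* where the result is written back */
    uint32_t second_data_len;
    uint32_t pad1;
    uint64_t start_time;        /* first time parameter  (step S51) */
    uint64_t end_time;          /* second time parameter (step S52) */
};
```

In a real driver this struct would be overlaid on the `ioremap`-ed BAR0 region, which is why the explicit padding fields matter: they keep the field offsets fixed regardless of compiler.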
As shown in fig. 4, in step S23, it is exemplarily illustrated that the first data is read from the memory, the first data is accelerated and compiled to obtain the second data, the second data is written into the memory, and the location information of the second data is written into the corresponding registers. For example, the acceleration end reads the first data address register and the first data length register to obtain the first data address and the first data length, reads the first data from the corresponding location of the memory, and performs accelerated calculation on the first data to obtain the second data. In some implementations, the acceleration module may employ a heterogeneous acceleration board that performs the accelerated calculation in an acceleration calculation logic unit, so as to save the calculation resources of the CPU; the acceleration calculation logic unit includes, but is not limited to, an OpenCL kernel. After the second data is obtained, the second data length is calculated, the second data is written into the memory according to the second data address, the second data address is written into the second data address register, and the second data length is written into the second data length register. In some implementation processes, the second data address may be obtained from the location information of the first data; in other implementation processes, it may be derived from the first data address, the first data length, and the second data length. For example, when there are multiple first data, the multiple first data are subjected to accelerated calculation to obtain multiple second data; the address of the last first data plus its length gives the address of the first second data, adding the actual length of the first second data gives the address of the second second data, and so on until all the second data addresses are obtained. Finally, the addresses of the multiple second data are written into the second data address register, and the lengths of the multiple second data into the second data length register.
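The address chaining just described is simple pointer arithmetic: each output block starts where the previous one ended, beginning right after the last input block. A minimal sketch with assumed names:

```c
#include <stdint.h>
#include <stddef.h>

/* Illustrative sketch of the address chaining described above: the first
 * second-data block is placed immediately after the last first-data block,
 * and each later output block follows the previous output block. */
static void chain_output_addrs(uint64_t last_in_addr, uint32_t last_in_len,
                               const uint32_t *out_lens, uint64_t *out_addrs,
                               size_t n)
{
    uint64_t next = last_in_addr + last_in_len; /* end of the input region */
    for (size_t i = 0; i < n; i++) {
        out_addrs[i] = next;        /* address of the i-th second data */
        next += out_lens[i];        /* advance by its actual length */
    }
}
```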
After the location information of the second data has been written into the corresponding register, the content of the register needs to be read, so as to obtain the location information of the second data, as shown in fig. 5, the step of storing the location information of the second data in the second message queue includes:
s31: reading a second data length register through the driving end to obtain a second data length;
s32: and sequentially storing the second data length in the second message queue.
Through the steps, the position information of the second data can be obtained and stored in the second message queue in sequence, so that the driving end can read the second data from the memory in sequence according to the position information of the second data.
As shown in fig. 5, in step S31, it is exemplarily illustrated that the second data length register is read to obtain the second data length. For example, the driving end is connected to the heterogeneous acceleration board via version 3.0 of the high-speed serial computer expansion bus and reads the second data length register to obtain the second data length. The number of link channels of the bus may be 16, so the 16-channel bidirectional bandwidth may reach 32 GB/s, and the interrupt mode used in the driver may be Message Signaled Interrupt (MSI). When there are multiple second data, the second data length register is read repeatedly to obtain the multiple second data lengths, which are then stored in the second message queue in sequence. In other embodiments, the interrupt mode used in the driver may instead be Message Signaled Interrupt eXtended (MSI-X).
As shown in fig. 5, in step S32, it is exemplarily illustrated that the second data lengths are stored in the second message queue in sequence. For example, the driving end stores each second data length in the second message queue; when there are multiple second data lengths, they are stored in order, so that after the subsequent host end sends a read request, the driving end can read the second data from the memory through the direct memory access read function.
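The second message queue holding lengths in FIFO order can be sketched as a small ring buffer. Depth, names, and the single-producer assumption are all illustrative.

```c
#include <stdint.h>
#include <stddef.h>

#define QUEUE_DEPTH 8  /* assumed depth; power of two keeps wrap-around cheap */

/* Minimal single-producer stand-in for the second message queue: the
 * driving end enqueues each second data length in order, then dequeues
 * them in the same order when serving host read requests. */
struct len_queue {
    uint32_t lens[QUEUE_DEPTH];
    unsigned head, tail;  /* head: next read slot, tail: next write slot */
};

static int lenq_push(struct len_queue *q, uint32_t len)
{
    if (q->tail - q->head == QUEUE_DEPTH)
        return -1;                       /* queue full */
    q->lens[q->tail % QUEUE_DEPTH] = len;
    q->tail++;
    return 0;
}

static int lenq_pop(struct len_queue *q, uint32_t *len)
{
    if (q->tail == q->head)
        return -1;                       /* queue empty */
    *len = q->lens[q->head % QUEUE_DEPTH];
    q->head++;
    return 0;
}
```

FIFO order is exactly what makes "sequentially storing" and "sequentially reading" line up: the n-th stored length always describes the n-th result block.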
In other implementation processes, the address and the length of the second data may also be obtained at the same time, as shown in fig. 6, the step of storing the location information of the second data in the second message queue further includes:
s33: reading a second data length register and a second data address register through the driving end to obtain a second data length and a second data address;
s34: and sequentially storing the second data length and the second data address in the second message queue.
Through the steps, the position information of the second data can be obtained and stored in the second message queue in sequence, so that the driving end can read the second data from the memory in sequence according to the position information of the second data.
As shown in fig. 6, in step S33, it is exemplarily illustrated that the second data length register and the second data address register are read to obtain the second data length and the second data address. For example, the driving end is connected to the heterogeneous acceleration board via the high-speed serial computer expansion bus and reads the two registers to obtain the second data length and the second data address. When there are multiple second data, the second data length register and the second data address register are read repeatedly to obtain multiple second data lengths and second data addresses, which are then stored in the second message queue in sequence.
As shown in fig. 6, in step S34, it is exemplarily illustrated that the second data length and the second data address are stored in the second message queue in sequence. For example, the driving end stores the second data length and the second data address in the second message queue; when there are multiple second data, the multiple lengths and addresses are stored in order, so that after the subsequent host end sends a read request, the driving end can read the second data from the memory through the direct memory access read function.
After the location information of the second data is stored in the second message queue, the content stored in the second message queue needs to be read at a proper time, so as to read the second data from the memory in sequence, as shown in fig. 7, the step of obtaining a data read request from a host through the driving end, reading the location information of the second data from the second message queue, and reading the second data from the memory according to the location information of the second data includes:
s41: the driving end acquires the data reading request from the host end through an interface function;
s42: reading the position information of the second data from the second message queue according to the data reading request;
s43: and according to the position information of the second data, reading the second data from the memory through a memory access read function.
Through the steps, the position information of the second data can be obtained from the second message queue according to the data reading request of the host side, and the second data can be sequentially read from the corresponding positions in the memory according to the position information of the second data.
As shown in fig. 7, in step S41, it is exemplarily illustrated that the data read request sent by the host end is obtained through an interface function; for example, the driving end obtains the data read request sent by the host end through an input/output interface control function, so as to subsequently obtain the location information of the second data.
As shown in fig. 7, in step S42, it is exemplarily illustrated that the location information of the second data is read from the second message queue according to the data read request. For example, after the driving end obtains the data read request sent by the host end, it obtains the location information of the second data from the second message queue. The location information of the second data includes the second data length, so that the second data can be read from the corresponding location in the memory according to the acquired second data length and a preset second data address. In other implementation processes, the location information of the second data further includes the second data address, so that the second data is read from the corresponding location in the memory using the read second data address and second data length. When there are multiple second data, their location information is obtained from the second message queue in sequence.
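Steps S41–S43 can be sketched under toy assumptions: "memory" is a plain byte buffer, the second message queue is an array of (address, length) pairs, and the direct memory access read is a `memcpy`. All names are illustrative.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Position information of one second-data block (names assumed). */
struct pos_info {
    uint64_t addr;
    uint32_t len;
};

/* Serve one host read request: take the next position entry from the
 * second message queue and copy that many bytes out of device memory.
 * Returns the number of bytes delivered, or -1 if no result is ready. */
static int serve_read_request(const uint8_t *mem,
                              const struct pos_info *queue, size_t count,
                              size_t *head, uint8_t *out)
{
    if (*head >= count)
        return -1;                         /* second message queue empty */
    const struct pos_info *p = &queue[*head];
    memcpy(out, mem + p->addr, p->len);    /* stands in for the DMA read */
    (*head)++;
    return (int)p->len;
}
```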
To measure the time consumed by heterogeneous acceleration, the host end can subtract the time at which writing the data was initiated from the time at which reading the data completed, obtaining the time consumed by the whole heterogeneous acceleration process. However, this figure includes the software system call time as well as the time consumed by the heterogeneous acceleration calculation, so the precise time consumed by the heterogeneous acceleration hardware cannot be obtained this way. Therefore, as shown in fig. 8, the data driving method for acceleration calculation further includes:
s51: recording a time node of starting acceleration calculation by an acceleration end as a first time parameter;
s52: recording a time node for the acceleration end to finish acceleration calculation as a second time parameter;
s53: and the driving end transmits the first time parameter and the second time parameter to a host program end through an interface function.
Through the steps, the time spent on simultaneously acquiring the software system calling time and the heterogeneous acceleration calculation can be avoided, and the accurate time consumed by the heterogeneous acceleration hardware acceleration calculation can be independently acquired.
As shown in fig. 8, in step S51, it is exemplarily described that the time node at which the acceleration end starts performing the acceleration calculation is recorded, for example, when the acceleration end acquires the first data and starts the acceleration calculation process, the time node at the current time is recorded as the first time parameter, and the first time parameter is written into the first time parameter register, so as to accurately record the hardware acceleration consumed time of the heterogeneous acceleration end.
As shown in fig. 8, in step S52, it is exemplarily illustrated that the time node when the acceleration end finishes the acceleration calculation is recorded, for example, when the acceleration end obtains the last first data and finishes the acceleration calculation process, the time node at the current time is recorded as the second time parameter, and the second time parameter is written into the second time parameter register, so as to accurately record the hardware acceleration consumption time of the heterogeneous acceleration end.
As shown in fig. 8, in step S53, for example, the host end reads the values of the first time parameter register and the second time parameter register through the input/output control interface to obtain the first time parameter and the second time parameter; subtracting the first time parameter from the second time parameter then yields the precise time consumed by the acceleration calculation of the heterogeneous acceleration hardware, from which the acceleration performance can be evaluated.
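The host-side arithmetic of S53 is a single subtraction of the two time parameter registers, plus a tick-to-time conversion. The clock frequency below is an assumption purely for illustration — the patent does not specify one.

```c
#include <stdint.h>

#define TICKS_PER_SEC 250000000ULL  /* assumed 250 MHz board clock */

/* Sketch of step S53: subtract the first time parameter (acceleration
 * start) from the second (acceleration end), so the result covers only
 * the hardware acceleration, not the software call path. */
static uint64_t accel_elapsed_ns(uint64_t start_ticks, uint64_t end_ticks)
{
    uint64_t ticks = end_ticks - start_ticks;
    return ticks * 1000000000ULL / TICKS_PER_SEC;
}
```

Because both timestamps are latched by the hardware itself (steps S51 and S52), the software system call overhead discussed above never enters this difference.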
In one embodiment, as shown in fig. 9, there is provided a data driving apparatus for accelerating calculation, the data driving apparatus for accelerating calculation comprising:
the first message queue processing module is configured to establish a first message queue and a second message queue at a driving end, acquire at least one first data and a data write request corresponding to the first data from a host through the driving end, sequentially write the first data into a memory, and sequentially store the data write request in the first message queue, where the data write request includes: location information of the first data;
the first data processing module is used for reading the first message queue to obtain the first data and the corresponding position information thereof, so that the acceleration end compiles the first data to obtain second data, and writes the second data into the memory;
the second message queue processing module is used for reading the register through the driving end, acquiring the position information of the second data and storing the position information of the second data in the second message queue;
and the second data reading module is used for acquiring a data reading request from a host end through the driving end, reading the position information of the second data from the second message queue, and reading the second data from the memory according to the position information of the second data.
In the first message queue processing module, it is exemplarily described that the first data and the corresponding data write requests are obtained from the host end, the first data are written into the memory in sequence, and the data write requests are stored in the first message queue in sequence. For example, the host end may split the data to be accelerated into multiple data blocks with different head addresses, write the blocks into the memory in sequence according to their head addresses and lengths, and initiate multiple write requests that are stored in the first message queue in order, so that the subsequent acceleration end reads the write requests in sequence and completes the accelerated calculation as needed, while simplifying the data transmission flow.
In the first data processing module, it is exemplarily illustrated that the location information of the first data is read, the first data is read from the memory and compiled to obtain the second data, and the second data is written into the memory. For example, after the acceleration end obtains the first data address and the first data length, it reads the first data from the memory, performs accelerated calculation on the first data to obtain the second data, and then writes the second data into the memory. When there are multiple first data in the memory, their addresses and lengths are obtained in sequence, the multiple first data are read from the memory and subjected to accelerated calculation in sequence to obtain multiple second data, and finally the multiple second data are written into the memory in sequence, so that the results are stored together and the data storage performance is improved.
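The batch flow of the first data processing module can be sketched end to end under toy assumptions: "memory" is a byte buffer, and the "accelerated compilation" is a stand-in transform (an XOR with a constant) rather than any real acceleration logic — the patent does not fix the computation itself.

```c
#include <stdint.h>
#include <stddef.h>

/* Position information of one data block (names assumed). */
struct blk {
    uint64_t addr;
    uint32_t len;
};

/* Process every first-data block in order: read it from "memory", apply a
 * stand-in transform, write the result after the input region, and record
 * each second-data block's position. Output addresses are chained as in
 * the description above. */
static void accelerate_all(uint8_t *mem, const struct blk *in, size_t n,
                           struct blk *out)
{
    uint64_t next_out = in[n - 1].addr + in[n - 1].len; /* after the inputs */
    for (size_t i = 0; i < n; i++) {
        out[i].addr = next_out;
        out[i].len  = in[i].len;          /* toy transform keeps the length */
        for (uint32_t j = 0; j < in[i].len; j++)
            mem[next_out + j] = mem[in[i].addr + j] ^ 0x5A; /* stand-in */
        next_out += out[i].len;
    }
}
```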
In the second message queue processing module, it is exemplarily described that the location information of the second data is stored in the second message queue, for example, the length of the second data is stored in the second message queue, so that the subsequent driving end reads the second data from the memory according to the length of the second data, and in case that a plurality of second data exist, the lengths of the plurality of second data may be sequentially stored in the second message queue, so that the subsequent driving end sequentially reads the plurality of second data from the memory according to the lengths of the plurality of second data.
In the second data reading module, it is exemplarily illustrated that, according to the data reading request, the location information of the second data is obtained from the second message queue, and the second data is read from the memory, for example, when the driving end obtains the data reading request sent by the host end through the interface function, the location information of the second data is read from the second message queue, so as to obtain the length of the second data, and the second data is read from the memory according to the length of the second data and the storage address of the second data.
The device can be applied to a Linux operating system, where the initialization process includes: high-speed serial computer expansion bus initialization, driver registration, device addition, resource acquisition, applying for a direct memory access buffer, enabling MSI interrupts, and initializing the direct memory access read and write channels.
For specific limitations of the data driving apparatus for accelerating the calculation, reference may be made to the above limitations of the data driving method for accelerating the calculation, which are not described herein again. The modules in the data driving apparatus for accelerating the computation may be wholly or partially implemented by software, hardware, or a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 10. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing driving data for data acceleration. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a data-driven method for accelerating computations.
Those skilled in the art will appreciate that the architecture shown in fig. 10 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program:
establishing a first message queue and a second message queue at a driving end, acquiring at least one first data and a corresponding data writing request from a host end through the driving end, sequentially writing the first data into a memory, and sequentially storing the data writing request in the first message queue, wherein the data writing request comprises: location information of the first data;
reading the first message queue to obtain the first data and the corresponding position information thereof, so that the acceleration end compiles the first data to obtain second data, writes the second data into the memory, and writes the position information of the second data into a register;
reading the register through the driving end, acquiring the position information of the second data, and storing the position information of the second data in the second message queue;
and acquiring a data reading request from a host end through the driving end, reading the position information of the second data from the second message queue, and reading the second data from the memory according to the position information of the second data.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:
establishing a first message queue and a second message queue at a driving end, acquiring at least one first data and a corresponding data writing request from a host end through the driving end, sequentially writing the first data into a memory, and sequentially storing the data writing request in the first message queue, wherein the data writing request comprises: location information of the first data;
reading the first message queue to obtain the first data and the corresponding position information thereof, so that the acceleration end compiles the first data to obtain second data, writes the second data into the memory, and writes the position information of the second data into a register;
reading the register through the driving end, acquiring the position information of the second data, and storing the position information of the second data in the second message queue;
and acquiring a data reading request from a host end through the driving end, reading the position information of the second data from the second message queue, and reading the second data from the memory according to the position information of the second data.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (10)

1. A data-driven method for accelerating computations, comprising:
establishing a first message queue and a second message queue at a driving end, acquiring at least one first data and a corresponding data writing request from a host end through the driving end, sequentially writing the first data into a memory, and sequentially storing the data writing request in the first message queue, wherein the data writing request comprises: location information of the first data;
reading the first message queue to obtain the first data and the corresponding position information thereof, so that the acceleration end compiles the first data to obtain second data, writes the second data into the memory, and writes the position information of the second data into a register;
reading the register through the driving end, acquiring the position information of the second data, and storing the position information of the second data in the second message queue;
and acquiring a data reading request from a host end through the driving end, reading the position information of the second data from the second message queue, and reading the second data from the memory according to the position information of the second data.
2. The data driving method for accelerating computation according to claim 1, wherein the steps of obtaining at least one first data and a data write request corresponding thereto from a host, sequentially writing the first data into a memory, and sequentially storing the data write request in the first message queue include:
the driving end acquires at least one data writing request from the host end through an interface function;
the driving end obtains at least one first data through a writing function of an operating system, and sequentially writes the first data into the memory through a memory access writing function;
and sequentially storing the position information of the first data in the first message queue.
3. The data driving method for accelerating computation according to claim 1, wherein the step of obtaining the first data and the corresponding location information thereof by reading the first message queue, so that the accelerating end compiles the first data to obtain second data, and writes the second data into the memory comprises:
sequentially reading the position information of the first data from the first message queue, wherein the position information of the first data comprises: a first data address, a first data length;
writing the first data address into a first data address register, and writing the first data length into a first data length register;
reading the first data address register and the first data length register to obtain the first data address and the first data length, so that an acceleration end reads the first data from the memory, performs accelerated compilation on the first data to obtain the second data, writes the second data into the memory, and writes the position information of the second data into a register, where the register includes: a second data address register and a second data length register.
4. The data driving method for accelerating computation according to claim 1, wherein the step of reading the register through the driving end, acquiring the location information of the second data, and storing the location information of the second data in the second message queue includes:
reading a second data length register through the driving end to obtain a second data length;
and sequentially storing the second data length in the second message queue.
5. The data driving method for accelerating computation according to claim 1, wherein the step of reading the register by the driving end, obtaining the location information of the second data, and storing the location information of the second data in the second message queue further includes:
reading a second data length register and a second data address register through the driving end to obtain a second data length and a second data address;
and sequentially storing the second data length and the second data address in the second message queue.
6. The data driving method for accelerating computation according to claim 1, wherein the step of obtaining, by the driving end, a data read request from a host end, reading the location information of the second data from the second message queue, and reading the second data from the memory according to the location information of the second data includes:
the driving end acquires the data reading request from the host end through an interface function;
reading the position information of the second data from the second message queue according to the data reading request;
and according to the position information of the second data, reading the second data from the memory through a memory access read function.
7. The data-driven method for accelerating computations according to claim 1, further comprising:
recording a time node of starting acceleration calculation by an acceleration end as a first time parameter;
recording a time node for the acceleration end to finish acceleration calculation as a second time parameter;
and the driving end transmits the first time parameter and the second time parameter to a host end through an interface function.
8. A data driven apparatus for accelerating computations, comprising:
a first message queue processing module, configured to establish a first message queue and a second message queue at a driving end, acquire, through the driving end, at least one piece of first data and a data write request corresponding to the first data from a host end, sequentially write the first data into a memory, and sequentially store the data write request in the first message queue, where the data write request includes position information of the first data;
a first data processing module, configured to read the first message queue to obtain the first data and its corresponding position information, so that an acceleration end performs accelerated computation on the first data to obtain second data and writes the second data into the memory;
a second message queue processing module, configured to read a register through the driving end, acquire the position information of the second data, and store the position information of the second data in the second message queue;
and a second data reading module, configured to acquire a data read request from the host end through the driving end, read the position information of the second data from the second message queue, and read the second data from the memory according to the position information of the second data.
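The four modules of claim 8 compose a write, accelerate, publish, and read pipeline. The toy end-to-end sketch below models that flow under stated assumptions: the queues are plain deques, the shared memory is a bytearray, and the acceleration step is a placeholder uppercase transform rather than the real accelerator.

```python
from collections import deque

class DriverPipeline:
    """Toy model of the claimed apparatus: two message queues carrying
    position information, one shared memory region, and an acceleration
    step modeled as an uppercase transform (an assumption, not the patent's
    actual computation)."""

    def __init__(self, size: int = 1024):
        self.memory = bytearray(size)
        self.first_queue = deque()   # write requests: (offset, length)
        self.second_queue = deque()  # second-data position information

    def write_first_data(self, data: bytes, offset: int) -> None:
        # First message queue processing module: store the first data in
        # memory and queue its position information as the write request.
        self.memory[offset:offset + len(data)] = data
        self.first_queue.append((offset, len(data)))

    def accelerate(self, out_offset: int) -> None:
        # First data processing + second message queue modules: consume a
        # write request, transform the data, write the result back into
        # memory, and publish the result's position information.
        offset, length = self.first_queue.popleft()
        second = bytes(self.memory[offset:offset + length]).upper()
        self.memory[out_offset:out_offset + len(second)] = second
        self.second_queue.append((out_offset, len(second)))

    def read_second_data(self) -> bytes:
        # Second data reading module: resolve the position information
        # from the second queue and read the result out of memory.
        offset, length = self.second_queue.popleft()
        return bytes(self.memory[offset:offset + length])

pipe = DriverPipeline()
pipe.write_first_data(b"first data", offset=0)
pipe.accelerate(out_offset=512)
out = pipe.read_second_data()
```

The point of the model is the data flow, not the transform: each module only ever exchanges position records through a queue, so producer and consumer never need to agree on timing, only on the queue discipline.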
9. A computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the data driving method for accelerating computation according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the data driving method for accelerating computation according to any one of claims 1 to 7.
CN202111520962.6A 2021-12-13 2021-12-13 Data driving method, apparatus, device and storage medium for accelerating computation Active CN114296640B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111520962.6A CN114296640B (en) 2021-12-13 2021-12-13 Data driving method, apparatus, device and storage medium for accelerating computation

Publications (2)

Publication Number Publication Date
CN114296640A 2022-04-08
CN114296640B 2023-08-15

Family

ID=80967317

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111520962.6A Active CN114296640B (en) 2021-12-13 2021-12-13 Data driving method, apparatus, device and storage medium for accelerating computation

Country Status (1)

Country Link
CN (1) CN114296640B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017012096A1 (en) * 2015-07-22 2017-01-26 华为技术有限公司 Computer device and data read-write method for computer device
CN111104232A (en) * 2019-11-09 2020-05-05 苏州浪潮智能科技有限公司 Method, device and medium for accelerating message writing of message queue
CN113094296A (en) * 2021-04-29 2021-07-09 深圳忆联信息系统有限公司 SSD read acceleration implementation method and device, computer equipment and storage medium


Similar Documents

Publication Publication Date Title
US20230259468A1 (en) Multi-core processing system and inter-core communication method therefor, and storage medium
WO2023123849A1 (en) Method for accelerated computation of data and related apparatus
US9542122B2 (en) Logical block addresses used for executing host commands
CN110716845B (en) Log information reading method of Android system
US11977500B2 (en) Apparatus and method and computer program product for executing host input-output commands
CN109542346A (en) Dynamic data cache allocation method, device, computer equipment and storage medium
CN116225992A (en) NVMe verification platform and method supporting virtualized simulation equipment
CN113094296B (en) SSD read acceleration realization method, SSD read acceleration realization device, computer equipment and storage medium
CN116627867B (en) Data interaction system, method, large-scale operation processing method, equipment and medium
CN117453318B (en) IOMMU-based DSP firmware using method, system chip and vehicle machine
CN114625584A (en) Test verification method and device for dynamic conversion of data transmission rate of solid state disk
CN110515872B (en) Direct memory access method, device, special computing chip and heterogeneous computing system
CN114296640A (en) Data driving method, apparatus, device and storage medium for accelerating computation
WO2024113680A1 (en) Firmware interaction method and apparatus, and server and storage medium
CN116360925A (en) Paravirtualization implementation method, device, equipment and medium
CN116225314A (en) Data writing method, device, computer equipment and storage medium
CN112764897B (en) Task request processing method, device and system and computer readable storage medium
CN116312730A (en) UFS storage device monomer test driving method and device based on MT6891 platform
US20230393782A1 (en) Io request pipeline processing device, method and system, and storage medium
CN115563021A (en) Method and device for improving repeated reading performance based on solid state disk and computer equipment
CN109284260B (en) Big data file reading method and device, computer equipment and storage medium
CN112764673A (en) Storage rate optimization method and device, computer equipment and storage medium
CN113448517B (en) Solid state disk big data writing processing method, device, equipment and medium
CN117539802B (en) Cache operation method, system and related device
CN116954676A (en) Remote upgrade system, method and equipment for automobile controller and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant