CN115934768A - Data processing method, display adapter, electronic device and storage medium - Google Patents


Info

Publication number
CN115934768A
Authority
CN
China
Prior art keywords
data
operation result
processed
host
display adapter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211536798.2A
Other languages
Chinese (zh)
Inventor
Name withheld at the inventor's request
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Moore Threads Technology Co Ltd
Original Assignee
Moore Threads Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Moore Threads Technology Co Ltd
Priority to CN202211536798.2A
Publication of CN115934768A
Legal status: Pending

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The present disclosure relates to a data processing method, a display adapter, an electronic device, and a storage medium. The processing method is applied to a display adapter that includes a display memory and a cache and is connected to a host. The processing method includes: acquiring data to be processed sent by the host and an operation instruction corresponding to the data to be processed, where the operation instruction represents the operation to be performed on the data to be processed; operating on the data to be processed according to the operation instruction to obtain a target operation result, and storing the target operation result in the cache; and, in response to the target operation result being a final operation result, sending the final operation result to the host through the display memory. Embodiments of the present disclosure can reduce how often operation results are stored, which helps improve the efficiency of processing the data to be processed.

Description

Data processing method, display adapter, electronic device and storage medium
Technical Field
The present disclosure relates to the field of information processing technologies, and in particular, to a data processing method, a display adapter, an electronic device, and a storage medium.
Background
With the continuous development of the electronics industry, the performance of graphics cards (also called display adapters) is of concern to more and more users. A display adapter is suited not only to image processing scenarios but also to training artificial intelligence models, so its data processing efficiency directly affects how efficiently upper-layer tasks in the host are completed. How to improve data processing efficiency is therefore a technical problem that developers urgently need to solve.
Disclosure of Invention
The present disclosure provides a data processing technical solution.
According to an aspect of the present disclosure, a data processing method is provided. The method is applied to a display adapter that includes a display memory and a cache and is connected to a host, and the method includes: acquiring data to be processed sent by the host and an operation instruction corresponding to the data to be processed, where the operation instruction represents the operation to be performed on the data to be processed; operating on the data to be processed according to the operation instruction to obtain a target operation result, and storing the target operation result in the cache, where the target operation result is an intermediate operation result or a final operation result, and, when the storage state is a first state, the final storage position of the intermediate operation result is the cache, the final storage position being the last position at which the intermediate operation result is stored before it is released; and, in response to the target operation result being the final operation result, sending the final operation result to the host through the display memory.
In one possible implementation, the operation instruction comprises an atomic operation instruction; and the atomic operation instruction is used for carrying out atomic operation on the data to be processed.
In a possible implementation, operating on the data to be processed according to the operation instruction to obtain the target operation result includes: allocating a thread bundle for the data to be processed, where the thread bundle is used to operate on the data to be processed acquired by the display adapter, the thread bundle includes a plurality of threads arranged in sequence, and the plurality of threads process the data to be processed in parallel and/or serially; determining, in sequence, a current thread among the plurality of threads, and determining reference data corresponding to the current thread, where the reference data is the data to be processed or an intermediate operation result; and operating on the reference data through the current thread according to the operation instruction to obtain the target operation result.
In a possible implementation, the host further includes a driver, and the processing method further includes: changing the storage state to a second state in response to a change instruction, issued in the driver, for the processing manner of intermediate operation results; when the storage state is the second state, the final storage position of the intermediate operation result is the host.
In a possible implementation, the data to be processed includes any one of image data and training data related to artificial intelligence.
According to an aspect of the present disclosure, a display adapter is provided. The display adapter includes a display core, a display memory, and a cache, and is connected to a host. The display core is configured to: acquire data to be processed sent by the host and an operation instruction corresponding to the data to be processed, where the operation instruction represents the operation to be performed on the data to be processed; operate on the data to be processed according to the operation instruction to obtain a target operation result, and store the target operation result in the cache, where the target operation result is an intermediate operation result or a final operation result, and, when the storage state is a first state, the final storage position of the intermediate operation result is the cache, the final storage position being the last position at which the intermediate operation result is stored before it is released; and, in response to the target operation result being the final operation result, send the final operation result to the host through the display memory.
In one possible implementation, the operation instruction comprises an atomic operation instruction; and the atomic operation instruction is used for carrying out atomic operation on the data to be processed.
In one possible implementation, the display core is further configured to: allocate a thread bundle for the data to be processed, where the thread bundle is used to operate on the data to be processed acquired by the display adapter, the thread bundle includes a plurality of threads arranged in sequence, and the plurality of threads process the data to be processed in parallel and/or serially; determine, in sequence, a current thread among the plurality of threads, and determine reference data corresponding to the current thread, where the reference data is the data to be processed or an intermediate operation result; and operate on the reference data through the current thread according to the operation instruction to obtain the target operation result.
In one possible implementation, the host further includes a driver, and the display core is further configured to: change the storage state to a second state in response to a change instruction, issued in the driver, for the processing manner of intermediate operation results; when the storage state is the second state, the final storage position of the intermediate operation result is the host.
In a possible implementation, the data to be processed includes any one of image data and training data related to artificial intelligence.
According to an aspect of the present disclosure, there is provided an electronic device including: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to invoke the memory-stored instructions to perform the above-described method.
According to an aspect of the present disclosure, there is provided a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the above-described method.
In the embodiments of the present disclosure, data to be processed sent by a host and an operation instruction corresponding to the data to be processed may be acquired; the data to be processed is then operated on according to the operation instruction to obtain a target operation result, and the target operation result is stored in the cache; finally, in response to the target operation result being the final operation result, the final operation result is sent to the host through the display memory. By setting the cache as the final storage position of the intermediate operation result, the embodiments of the present disclosure can reduce how often operation results are stored and can improve the efficiency of processing the data to be processed.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure. Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure.
Fig. 1 shows a flow chart of a method of processing data provided according to an embodiment of the present disclosure.
FIG. 2 illustrates a block diagram of a display adapter provided in accordance with an embodiment of the present disclosure.
Fig. 3 shows a block diagram of an electronic device provided in accordance with an embodiment of the present disclosure.
Detailed Description
Various exemplary embodiments, features and aspects of the present disclosure will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers can indicate functionally identical or similar elements. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used exclusively herein to mean "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
The term "and/or" herein merely describes an association between associated objects and means that three relationships may exist. For example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the term "at least one" herein means any one of a plurality or any combination of at least two of a plurality; for example, "including at least one of A, B, and C" may mean including any one or more elements selected from the group consisting of A, B, and C.
Furthermore, in the following detailed description, numerous specific details are set forth in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements and circuits that are well known to those skilled in the art have not been described in detail so as not to obscure the present disclosure.
In the related art, the data processing flow of a display adapter usually generates intermediate operation results. These are related to the final operation result, but the host usually has no interest in them: an upper-layer task in the host usually needs only the final operation result. In the related art, each intermediate operation result is stored in the cache of the display adapter, then transferred to the display memory for storage, and then transmitted to the host through a PCIE (Peripheral Component Interconnect Express) interface. The whole process is cumbersome and each round trip takes a long time, so the display adapter takes a long time to process the data to be processed.
In view of this, an embodiment of the present disclosure provides a data processing method that may acquire data to be processed sent by a host and an operation instruction corresponding to the data to be processed, operate on the data to be processed according to the operation instruction to obtain a target operation result, store the target operation result in the cache, and, in response to the target operation result being the final operation result, send the final operation result to the host through the display memory. By setting the cache as the final storage position of the intermediate operation result, the embodiments of the present disclosure can reduce how often operation results are stored and can improve the efficiency of processing the data to be processed.
In one possible implementation, the display adapter of the embodiments of the present disclosure may include a display core for processing instructions sent by the host, a display memory for storing data, a cache, a PCIE interface for interacting with the host, and other related components. The host may include a central processing unit for processing instructions in the host, host storage (for example, a cache and main memory) for storing data, a PCIE interface for connecting to the display adapter, a driver module for invoking the hardware in the display adapter, and other related components.
Referring to fig. 1, fig. 1 is a flowchart illustrating a data processing method according to an embodiment of the present disclosure. As shown in fig. 1, the processing method may be applied to a display adapter that includes a display memory and a cache. The display adapter can be connected to a host to process upper-layer tasks of an application program in the host (for example, rendering images or training artificial intelligence models). The processing method includes the following steps.
Step S100: acquire data to be processed sent by the host and an operation instruction corresponding to the data to be processed. The operation instruction represents the operation to be performed on the data to be processed. For example, when the host executes an application program, the data that the application program needs processed may be sent to the display adapter as the data to be processed, such as image data to be rendered in a game or a video, or sample features to be extracted in an artificial intelligence model. The embodiments of the present disclosure do not limit the specific type of the data to be processed, provided that the display adapter can process it.
In one example, the operation instruction may include a specific operation function and a reference value. The operation function indicates the type of operation to be performed on the data to be processed, for example summation, multiplication, or atomic addition; the embodiments of the present disclosure are not limited in this respect. The reference value represents the operand applied to the data to be processed when the operation is performed.
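The pairing of an operation function with a reference value can be pictured with a minimal sketch. This is only an illustration under assumptions: the class name `OperationInstruction`, its fields, and the `apply` method are hypothetical and do not come from the disclosure, which does not specify how an operation instruction is encoded.

```python
import operator
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class OperationInstruction:
    """Hypothetical encoding: an operation function plus a reference value."""
    function: Callable[[float, float], float]  # operation type, e.g. add or multiply
    reference_value: float                     # operand applied to the data

    def apply(self, value):
        """Apply the operation function to value and the reference value."""
        return self.function(value, self.reference_value)


add_one = OperationInstruction(operator.add, 1)
double = OperationInstruction(operator.mul, 2)
print(add_one.apply(41))  # 42
print(double.apply(21))   # 42
```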
Taking atomic addition as an example of the operation type, the operation instruction may be atomicAdd(value, add_num), where atomicAdd() is the operation function corresponding to atomic addition, value is the data to be processed or an intermediate operation result derived from it, and add_num is the reference value to be added to value. It should be understood that an operation instruction may also represent other types of operations. In a possible implementation, the data to be processed includes any one of image data and training data related to artificial intelligence. In one possible implementation, the operation instruction may include an atomic operation instruction, which is used to perform an atomic operation on the data to be processed. For the specific instructions that an atomic operation instruction may include, reference may be made to the related art; details are not described herein. In one example, an atomic operation is an operation that cannot be interrupted by the thread scheduling mechanism.
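To make the "not interrupted" property of the atomic addition described above concrete, here is a minimal host-side sketch. It is an assumption-laden illustration, not the display adapter's mechanism: `AtomicCell`, `atomic_add`, and `run_workers` are hypothetical names, and a Python lock stands in for whatever hardware support a real device provides.

```python
import threading


class AtomicCell:
    """Hypothetical stand-in for one cached value that supports atomic addition."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def atomic_add(self, add_num):
        """Add add_num as one indivisible read-modify-write; return the old value."""
        with self._lock:
            old = self._value
            self._value = old + add_num
            return old

    def load(self):
        with self._lock:
            return self._value


def run_workers(cell, n_workers, adds_per_worker):
    """Start n_workers host threads that each call atomic_add(1) repeatedly."""
    def work():
        for _ in range(adds_per_worker):
            cell.atomic_add(1)

    threads = [threading.Thread(target=work) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cell.load()


total = run_workers(AtomicCell(0), n_workers=32, adds_per_worker=1000)
print(total)  # 32000
```

Because every read-modify-write happens under the lock, no update is lost regardless of how the threads are scheduled.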
Step S200: operate on the data to be processed according to the operation instruction to obtain a target operation result, and store the target operation result in the cache. The target operation result is an intermediate operation result or a final operation result. When the storage state is the first state, the final storage position of the intermediate operation result is the cache, the final storage position being the last position at which the intermediate operation result is stored before it is released. For example, in the related art each intermediate operation result is stored in the cache, transferred to the display memory, and sent to the host through the PCIE interface. In the processing method provided by the embodiments of the present disclosure, the intermediate operation result does not need to be stored in the display memory or sent to the host, which saves the storage cost of each intermediate operation result and reduces the overall processing time of the data to be processed.
In one possible implementation, step S200 may include allocating a thread bundle for the data to be processed. The thread bundle is used to operate on the data to be processed acquired by the display adapter. The thread bundle includes a plurality of threads arranged in sequence, and the plurality of threads process the data to be processed in parallel and/or serially. For example, the display core in the display adapter may allocate an idle thread bundle to the data to be processed; the embodiments of the present disclosure do not limit the specific manner of allocation. In one example, the thread bundle may be a warp as known in the related art, which may include 32 threads. It should be understood that the data to be processed may also be processed via multiple threads, and the embodiments of the present disclosure are not limited in this respect. A current thread is then determined, in sequence, among the plurality of threads, and reference data corresponding to the current thread is determined, where the reference data is the data to be processed or an intermediate operation result. Each thread may correspond to a unique number, and the display core may determine the current thread by that number.
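The way the current thread and its reference data are chosen can be sketched as follows for the serial case. This is a simplified model under assumptions: `run_thread_bundle` is a hypothetical helper, the threads are visited strictly in numbered order, and the sketch ignores parallel execution, which the disclosure also allows.

```python
def run_thread_bundle(pending_data, operation, n_threads=32):
    """Visit threads in numbered order and record what each one reads and writes.

    Thread 0 takes the data to be processed as its reference data; every later
    thread takes the intermediate result left by the previous thread.
    """
    trace = []
    reference = pending_data
    for thread_id in range(n_threads):   # the unique number selects the current thread
        result = operation(reference)    # target operation result of this thread
        trace.append((thread_id, reference, result))
        reference = result               # becomes the next thread's reference data
    return trace


trace = run_thread_bundle(0, lambda v: v + 1)
print(trace[0])   # (0, 0, 1)
print(trace[-1])  # (31, 31, 32)
```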
Then, according to the operation instruction, the reference data is operated on through the current thread to obtain the target operation result. For example, suppose the 32 threads of one thread bundle in the display adapter perform an atomic addition, the execution order is thread 0 to thread 31 (in some examples the threads may also execute concurrently), and an initial value of 0 (or the data to be processed) is stored in the cache. Thread 0 adds 1, updates the value to 1, and stores it in the cache, and so on; after all 32 threads have run, the value 32 is obtained. Values such as 11, 15, and 30 may be regarded as intermediate operation results; the value 11 may be regarded as the reference data required to compute the value 12; and the value 32 obtained by the last operation may be regarded as the final operation result. Assuming the display core generates an operation completion instruction after the 32 threads have executed, the operation stops and the value obtained by thread 31 is taken as the final operation result. The final operation result is transferred from the cache to the display memory and then returned to the host through the PCIE interface. In the related art, every intermediate result obtained has to be sent to the host, which makes the procedure cumbersome; the embodiments of the present disclosure can therefore improve the efficiency of processing the data to be processed. In addition, because there is less interaction with the host than in the related art, bandwidth usage between the display adapter and the host can also be reduced.
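The saving in the 32-thread example can be tallied with a small counting model. The sketch below is illustrative only: `accumulate` and the event names are hypothetical, each store or transfer is counted as one unit regardless of its real cost, and the threads run serially.

```python
from collections import Counter


def accumulate(start, add_num, n_threads, keep_intermediates_in_cache):
    """Serially add add_num once per thread and count store/transfer events."""
    events = Counter()
    value = start
    for thread_id in range(n_threads):
        value += add_num
        events["cache_store"] += 1                 # every result is written to the cache
        is_final = thread_id == n_threads - 1
        if is_final or not keep_intermediates_in_cache:
            events["display_memory_store"] += 1    # copy from cache to display memory
            events["pcie_transfer"] += 1           # send to the host
    return value, events


related_value, related_events = accumulate(0, 1, 32, keep_intermediates_in_cache=False)
new_value, new_events = accumulate(0, 1, 32, keep_intermediates_in_cache=True)
print(related_value, related_events["pcie_transfer"])  # 32 32
print(new_value, new_events["pcie_transfer"])          # 32 1
```

Both variants reach the same final value; only the number of display memory stores and host transfers differs.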
When the data to be processed includes floating-point numbers, the intermediate result of each thread is represented as a floating-point value, and each floating-point value requires precision alignment. A rounding mode (for example, round to nearest even or round toward zero) is usually selected to round the floating-point value; that is, each thread must additionally perform a rounding operation to obtain its intermediate operation result, so the computational cost grows with the number of threads. In the embodiments of the present disclosure, because the intermediate operation result is kept in the cache, no rounding operation is needed before a transfer to the display memory, which reduces the number of rounding operations executed over the whole operation. Performing fewer rounding operations also helps improve the precision of the data throughout the data processing flow.
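The effect of rounding fewer times can be illustrated on the host with ordinary IEEE 754 arithmetic. The sketch below rests on assumptions the disclosure does not state: intermediate values are held in binary64, the stored format is binary32, and rounding is to nearest even. The helper names are hypothetical.

```python
import struct


def to_float32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sum_round_each_step(term, n):
    """Round the running total to binary32 after every addition (n roundings)."""
    total = 0.0
    for _ in range(n):
        total = to_float32(total + term)
    return total


def sum_round_once(term, n):
    """Keep the running total in binary64 and round to binary32 once at the end."""
    total = 0.0
    for _ in range(n):
        total += term
    return to_float32(total)


n, term = 32, 0.1
exact = 3.2
err_each = abs(sum_round_each_step(term, n) - exact)
err_once = abs(sum_round_once(term, n) - exact)
print(err_each, err_once)
```

In this example the single rounding at the end yields the binary32 value nearest to 3.2, so its error cannot exceed that of the version that rounds after every step.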
In one possible implementation, the host further includes a driver, and the method further includes: changing the storage state to a second state in response to a change instruction, issued in the driver, for the processing manner of intermediate operation results. When the storage state is the second state, the final storage position of the intermediate operation result is the host. For example, a user can use a visual interface in the host to change, through the driver, how the display adapter handles intermediate operation results, which helps meet the operating requirements of different users. In one example, the driver may also assign a different processing manner for intermediate operation results to each application program in the host, so as to further meet specific user requirements; the embodiments of the present disclosure are not limited in this respect.
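The two storage states can be modelled as a small state machine. This is a hypothetical sketch: `StorageState`, `DisplayCoreModel`, and `on_change_instruction` are illustrative names, and the assumption that results bound for the host are first written to the cache in the second state follows the related-art flow described earlier rather than anything the disclosure specifies for that state.

```python
from enum import Enum


class StorageState(Enum):
    FIRST = "cache"   # intermediate results end their life in the cache
    SECOND = "host"   # intermediate results are forwarded to the host


class DisplayCoreModel:
    """Hypothetical model of how the storage state selects the final storage position."""

    def __init__(self):
        self.state = StorageState.FIRST
        self.cache = []
        self.sent_to_host = []

    def on_change_instruction(self, new_state):
        """Handle a change instruction issued through the host driver."""
        self.state = new_state

    def store(self, result, is_final):
        self.cache.append(result)             # assumed: every result is written to the cache first
        if is_final or self.state is StorageState.SECOND:
            self.sent_to_host.append(result)  # assumed path: display memory, then PCIE


core = DisplayCoreModel()
core.store(1, is_final=False)                     # first state: stays in the cache
core.on_change_instruction(StorageState.SECOND)
core.store(2, is_final=False)                     # second state: forwarded to the host
core.store(3, is_final=True)                      # final results always reach the host
print(core.sent_to_host)  # [2, 3]
```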
Continuing to refer to fig. 1, in step S300, in response to the target operation result being the final operation result, the final operation result is sent to the host through the display memory. For example, after the host obtains the final operation result, the host may call the final operation result through an application program running in the host, so that the application program may start to perform a preset next operation.
It can be understood that the method embodiments of the present disclosure described above can be combined with one another to form combined embodiments without departing from their principles and logic; owing to space limitations, the details are not repeated in the present disclosure. Those skilled in the art will appreciate that, in the methods of the specific embodiments above, the specific order in which the steps are executed should be determined by their functions and possible inherent logic.
In addition, the present disclosure also provides a display adapter, an electronic device, a computer-readable storage medium, and a program, each of which can be used to implement any data processing method provided by the present disclosure. For the corresponding technical solutions and descriptions, refer to the corresponding passages in the method section; they are not repeated here for brevity.
Referring to fig. 2, fig. 2 is a block diagram illustrating a display adapter provided according to an embodiment of the present disclosure. As shown in fig. 2, the display adapter 100 includes a display core 110, a display memory 120, and a cache 130, and is connected to a host. The display core is configured to: acquire data to be processed sent by the host and an operation instruction corresponding to the data to be processed, where the operation instruction represents the operation to be performed on the data to be processed; operate on the data to be processed according to the operation instruction to obtain a target operation result, and store the target operation result in the cache, where the target operation result is an intermediate operation result or a final operation result, and, when the storage state is the first state, the final storage position of the intermediate operation result is the cache, the final storage position being the last position at which the intermediate operation result is stored before it is released; and, in response to the target operation result being the final operation result, send the final operation result to the host through the display memory.
In one possible implementation, the operation instruction comprises an atomic operation instruction; and the atomic operation instruction is used for carrying out atomic operation on the data to be processed.
In one possible implementation, the display core is further configured to: allocate a thread bundle for the data to be processed, where the thread bundle is used to operate on the data to be processed acquired by the display adapter, the thread bundle includes a plurality of threads arranged in sequence, and the plurality of threads process the data to be processed in parallel and/or serially; determine, in sequence, a current thread among the plurality of threads, and determine reference data corresponding to the current thread, where the reference data is the data to be processed or an intermediate operation result; and operate on the reference data through the current thread according to the operation instruction to obtain the target operation result.
In one possible implementation, the host further includes a driver, and the display core is further configured to: change the storage state to a second state in response to a change instruction, issued in the driver, for the processing manner of intermediate operation results; when the storage state is the second state, the final storage position of the intermediate operation result is the host.
In a possible implementation, the data to be processed includes any one of image data and training data related to artificial intelligence.
The method is specifically and technically tied to the internal structure of the computer system and can solve the technical problem of how to improve hardware operating efficiency or execution performance (including reducing the amount of data stored, reducing the amount of data transmitted, and increasing hardware processing speed), thereby achieving a technical effect, in conformity with the laws of nature, of improving the internal performance of the computer system.
In some embodiments, functions of or modules included in the apparatus provided in the embodiments of the present disclosure may be used to execute the method described in the above method embodiments, and specific implementation thereof may refer to the description of the above method embodiments, and for brevity, will not be described again here.
Embodiments of the present disclosure also provide a computer-readable storage medium having stored thereon computer program instructions, which when executed by a processor, implement the above-mentioned method. The computer readable storage medium may be a volatile or non-volatile computer readable storage medium.
An embodiment of the present disclosure further provides an electronic device, including: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to invoke the memory-stored instructions to perform the above-described method.
The embodiments of the present disclosure also provide a computer program product comprising computer readable code, or a non-transitory computer readable storage medium carrying computer readable code; when the computer readable code runs in a processor of an electronic device, the processor performs the above method.
The electronic device may be provided as a server, a terminal device, or other modality of device.
Referring to fig. 3, fig. 3 illustrates a block diagram of an electronic device 1900 provided according to an embodiment of the present disclosure. For example, the electronic device 1900 may be provided as a server or terminal device. Referring to fig. 3, electronic device 1900 includes a processing component 1922 further including one or more processors and memory resources, represented by memory 1932, for storing instructions, e.g., applications, that are executable by processing component 1922. The application programs stored in memory 1932 may include one or more modules that each correspond to a set of instructions. Further, the processing component 1922 is configured to execute instructions to perform the methods described above.
The electronic device 1900 may further include a power component 1926 configured to perform power management of the electronic device 1900, a wired or wireless network interface 1950 configured to connect the electronic device 1900 to a network, and an input-output interface 1958. The electronic device 1900 may operate based on an operating system stored in the memory 1932, such as Microsoft's server operating system (Windows Server™), Apple's graphical-user-interface-based operating system (Mac OS X™), the multi-user, multi-process computer operating system (Unix™), the free and open-source Unix-like operating system (Linux™), the open-source Unix-like operating system (FreeBSD™), or the like.
In an exemplary embodiment, a non-transitory computer readable storage medium, such as the memory 1932, is also provided that includes computer program instructions executable by the processing component 1922 of the electronic device 1900 to perform the above-described methods.
The present disclosure may be systems, methods, and/or computer program products. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied thereon for causing a processor to implement various aspects of the present disclosure.
The computer-readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as a punch card or a raised structure in a groove with instructions stored thereon, and any suitable combination of the foregoing. A computer-readable storage medium, as used herein, is not to be construed as a transitory signal per se, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (e.g., a light pulse passing through a fiber optic cable), or an electrical signal transmitted through a wire.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions for carrying out operations of the present disclosure may be assembler instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, an electronic circuit, such as a programmable logic circuit, a field programmable gate array (FPGA), or a programmable logic array (PLA), is personalized using state information of the computer-readable program instructions; the electronic circuit can then execute the computer-readable program instructions, thereby implementing aspects of the present disclosure.
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.
These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The computer program product may be embodied in hardware, software, or a combination thereof. In one alternative embodiment, the computer program product is embodied as a computer storage medium; in another alternative embodiment, the computer program product is embodied as a software product, such as a software development kit (SDK).
The foregoing description of the various embodiments tends to emphasize the differences between them; for their same or similar aspects, the embodiments may be referred to one another, and, for brevity, these are not repeated herein.
It will be understood by those skilled in the art that, in the method of the present invention, the order in which the steps are written does not imply a strict order of execution or impose any limitation on the implementation; the actual order of execution of the steps should be determined by their functions and possible inherent logic.
If the technical solution of the present application involves personal information, a product applying the technical solution clearly informs the individual of the personal information processing rules and obtains the individual's separate consent before processing the personal information. If the technical solution involves sensitive personal information, a product applying it obtains the individual's separate consent before processing the sensitive personal information and also satisfies the requirement of "express consent". For example, at a personal information collection device such as a camera, a clear and conspicuous sign is set up to inform people that they are entering the personal information collection range and that personal information will be collected; a person who voluntarily enters the collection range is deemed to consent to the collection of his or her personal information. Alternatively, on a device that processes personal information, where the personal information processing rules are communicated by conspicuous signs or information, personal authorization is obtained through a pop-up message, by asking the person to upload his or her personal information, or by similar means. The personal information processing rules may include information such as the personal information processor, the purpose of the processing, the processing method, and the types of personal information processed.
Having described embodiments of the present disclosure, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen in order to best explain the principles of the embodiments, the practical application, or improvements to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (12)

1. A data processing method, applied to a display adapter, wherein the display adapter comprises a display memory and a cache and is connected to a host, the processing method comprising:
acquiring data to be processed sent by the host and an operation instruction corresponding to the data to be processed, wherein the operation instruction represents the operation to be performed on the data to be processed;
operating on the data to be processed according to the operation instruction to obtain a target operation result, and storing the target operation result in the cache, wherein the target operation result is an intermediate operation result or a final operation result; when a storage state is a first state, the final storage location corresponding to the intermediate operation result is the cache; and the final storage location is the last location in which the intermediate operation result is stored before it is released; and
in response to the target operation result being the final operation result, sending the final operation result to the host through the display memory.
2. The processing method of claim 1, wherein the operation instruction comprises an atomic operation instruction; and the atomic operation instruction is used for carrying out atomic operation on the data to be processed.
3. The processing method of claim 2, wherein operating on the data to be processed according to the operation instruction to obtain the target operation result comprises:
allocating a thread bundle for the data to be processed, wherein the thread bundle is used for operating on the data to be processed acquired by the display adapter, the thread bundle comprises a plurality of threads arranged in sequence, and the plurality of threads operate on the data to be processed in parallel and/or serially;
sequentially determining a current thread among the plurality of threads, and determining reference data corresponding to the current thread, wherein the reference data is the data to be processed or the intermediate operation result; and
operating on the reference data through the current thread according to the operation instruction to obtain the target operation result.
4. The processing method of claim 1, wherein the host comprises a driver, the processing method further comprising:
changing the storage state to a second state in response to an instruction, issued in the driver, to change the processing manner for intermediate operation results, wherein, when the storage state is the second state, the final storage location corresponding to the intermediate operation result is the host.
5. The processing method according to any one of claims 1 to 4, wherein the data to be processed comprises any one of image data and artificial-intelligence-related training data.
6. A display adapter, comprising a display core, a display memory, and a cache, wherein the display adapter is connected to a host, and the display core is configured to:
acquire data to be processed sent by the host and an operation instruction corresponding to the data to be processed, wherein the operation instruction represents the operation to be performed on the data to be processed;
operate on the data to be processed according to the operation instruction to obtain a target operation result, and store the target operation result in the cache, wherein the target operation result is an intermediate operation result or a final operation result; when a storage state is a first state, the final storage location corresponding to the intermediate operation result is the cache; and the final storage location is the last location in which the intermediate operation result is stored before it is released; and
in response to the target operation result being the final operation result, send the final operation result to the host through the display memory.
7. The display adapter of claim 6, wherein the operation instruction comprises an atomic operation instruction; and the atomic operation instruction is used for carrying out atomic operation on the data to be processed.
8. The display adapter of claim 7, wherein the display core is further configured to:
allocate a thread bundle for the data to be processed, wherein the thread bundle is used for operating on the data to be processed acquired by the display adapter, the thread bundle comprises a plurality of threads arranged in sequence, and the plurality of threads operate on the data to be processed in parallel and/or serially;
sequentially determine a current thread among the plurality of threads, and determine reference data corresponding to the current thread, wherein the reference data is the data to be processed or the intermediate operation result; and
operate on the reference data through the current thread according to the operation instruction to obtain the target operation result.
9. The display adapter of claim 6, wherein the host comprises a driver, and the display core is further configured to:
change the storage state to a second state in response to an instruction, issued in the driver, to change the processing manner for intermediate operation results, wherein, when the storage state is the second state, the final storage location corresponding to the intermediate operation result is the host.
10. The display adapter according to any one of claims 6 to 9, wherein the data to be processed comprises any one of image data and artificial-intelligence-related training data.
11. An electronic device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to invoke the instructions stored in the memory to perform the processing method according to any one of claims 1 to 5.
12. A computer-readable storage medium on which computer program instructions are stored, wherein the computer program instructions, when executed by a processor, implement the processing method according to any one of claims 1 to 5.
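
The data flow recited in claims 1, 3 and 4 can be illustrated with a small host-side simulation. The following is only a minimal sketch under stated assumptions, not the claimed implementation: the class names (Host, DisplayAdapter, StorageState), the use of Python lists to stand in for the cache and the display memory, the modelling of the thread bundle as a sequential loop with one data element per thread, and the routing of intermediate results through the display memory in the second state are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class StorageState(Enum):
    FIRST = "first"    # intermediate results end their life in the cache
    SECOND = "second"  # intermediate results are also written back to the host


@dataclass
class Host:
    # Values the host has received from the adapter (hypothetical model).
    received: List[int] = field(default_factory=list)


@dataclass
class DisplayAdapter:
    host: Host
    storage_state: StorageState = StorageState.FIRST
    cache: List[int] = field(default_factory=list)
    display_memory: List[int] = field(default_factory=list)

    def _send_to_host(self, result: int) -> None:
        # Assumption: every transfer to the host passes through display memory.
        self.display_memory.append(result)
        self.host.received.append(result)

    def _store(self, result: int, is_final: bool) -> None:
        # Every target operation result is first stored in the cache.
        self.cache.append(result)
        # Final results always go to the host; intermediate results only do so
        # when the storage state has been switched to the second state.
        if is_final or self.storage_state is StorageState.SECOND:
            self._send_to_host(result)

    def process(self, data: List[int], op: Callable[[int, int], int]) -> int:
        """Apply `op` cumulatively, one simulated thread per data element."""
        if not data:
            raise ValueError("data to be processed must not be empty")
        result = data[0]
        for thread_id in range(len(data)):
            # Thread 0 uses the raw data as its reference data; every later
            # thread combines the previous intermediate result with its element.
            if thread_id > 0:
                result = op(result, data[thread_id])
            self._store(result, is_final=(thread_id == len(data) - 1))
        return result


def add(a: int, b: int) -> int:
    return a + b


# First state: only the final result crosses to the host.
host_a = Host()
DisplayAdapter(host_a).process([1, 2, 3, 4], add)
print(host_a.received)  # [10]

# Second state: every intermediate result is also sent to the host.
host_b = Host()
DisplayAdapter(host_b, storage_state=StorageState.SECOND).process([1, 2, 3, 4], add)
print(host_b.received)  # [1, 3, 6, 10]
```

In this sketch the first state produces a single host transfer while the second state produces one transfer per thread, which is the difference in host traffic that the storage state of claims 1 and 4 controls.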
CN202211536798.2A 2022-12-01 2022-12-01 Data processing method, display adapter, electronic device and storage medium Pending CN115934768A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211536798.2A CN115934768A (en) 2022-12-01 2022-12-01 Data processing method, display adapter, electronic device and storage medium

Publications (1)

Publication Number Publication Date
CN115934768A true CN115934768A (en) 2023-04-07

Family

ID=86700347

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211536798.2A Pending CN115934768A (en) 2022-12-01 2022-12-01 Data processing method, display adapter, electronic device and storage medium

Country Status (1)

Country Link
CN (1) CN115934768A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102298567A (en) * 2010-06-28 2011-12-28 安凯(广州)微电子技术有限公司 Mobile processor architecture integrating central operation and graphic acceleration
CN107992329A (en) * 2017-07-20 2018-05-04 上海寒武纪信息科技有限公司 A kind of computational methods and Related product
CN109791519A (en) * 2016-10-28 2019-05-21 西部数据技术公司 The optimization purposes of Nonvolatile memory system and local fast storage with integrated computing engines
CN111488177A (en) * 2020-04-14 2020-08-04 腾讯科技(深圳)有限公司 Data processing method, data processing device, computer equipment and storage medium
CN111899150A (en) * 2020-08-28 2020-11-06 Oppo广东移动通信有限公司 Data processing method and device, electronic equipment and storage medium
CN111898137A (en) * 2020-06-30 2020-11-06 深圳致星科技有限公司 Private data processing method, equipment and system for federated learning
CN112801856A (en) * 2021-02-04 2021-05-14 西安万像电子科技有限公司 Data processing method and device
CN114330689A (en) * 2021-12-29 2022-04-12 北京字跳网络技术有限公司 Data processing method and device, electronic equipment and storage medium
CN114356529A (en) * 2022-01-10 2022-04-15 深圳市商汤科技有限公司 Image processing method and device, electronic equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
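Song Jie; Sun Zongzhe; Mao Keming; Bao Yubin; Yu Ge: "Research progress on MapReduce big data processing platforms and algorithms", Journal of Software (软件学报), no. 03, pages 514 - 543 *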

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination