CN115712580B - Memory address allocation method, memory address allocation device, computer equipment and storage medium - Google Patents

Info

Publication number
CN115712580B
Authority
CN
China
Prior art keywords
vertex
target
memory
marking information
memory address
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211490098.4A
Other languages
Chinese (zh)
Other versions
CN115712580A (en)
Inventor
李清
肖恒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Glenfly Tech Co Ltd
Original Assignee
Glenfly Tech Co Ltd

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application relates to a memory address allocation method, a memory address allocation device, computer equipment and a storage medium. The method comprises: acquiring first vertex marking information of a vertex to be detected, wherein the first vertex marking information is used for distinguishing different vertices; comparing the first vertex marking information in parallel with the second vertex marking information corresponding to each vertex stored in the memory to obtain comparison information; if the comparison information indicates that a target vertex whose second vertex marking information is the same as the first vertex marking information exists in the memory, determining that the vertex to be detected is a repeated vertex; and determining, according to the comparison information, the memory address of the target vertex as the memory address corresponding to the vertex to be detected. Because the comparison test is performed in parallel, the result of comparing the vertex to be detected with all vertices in the memory can be obtained in one clock cycle and the memory address of the target vertex can be found quickly, which improves vertex detection efficiency and reduces the amount of computation performed by the processes.

Description

Memory address allocation method, memory address allocation device, computer equipment and storage medium
Technical Field
The present disclosure relates to the field of GPU rendering technologies, and in particular, to a memory address allocation method, apparatus, computer device, and storage medium.
Background
When a graphics processor (GPU) performs rendering, pixels are generated as follows: vertices are determined and combined into primitives (points, line segments, triangles or polygons); the primitives are divided into fragments by rasterization; and the fragments are converted into pixel data after testing (such as depth testing). Taking triangle primitives as an example, the first three vertices define the first triangle, and each subsequent vertex forms the next triangle together with two vertices of the preceding triangle. The vertices of each triangle after the first are automatically reordered so that the triangle winding order stays consistent.
When the GPU performs rendering, it receives a large number of triangle vertices and assembles them into processes. Assuming, for example, that one process handles 128 vertices, each process operates according to a preset program and stores its calculation results in a memory unit. Therefore, if the processes contain a large number of repeated vertices, computing and storage resources are heavily consumed and computing efficiency is reduced.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a memory address allocation method, apparatus, computer device, and storage medium capable of improving vertex detection efficiency and reducing the amount of computation of processes.
In a first aspect, the present application provides a memory address allocation method. The method comprises the following steps:
acquiring first vertex marking information of a vertex to be detected, wherein the first vertex marking information is used for distinguishing different vertices;
comparing the first vertex marking information in parallel with the second vertex marking information corresponding to each vertex stored in the memory to obtain comparison information; if the comparison information indicates that a target vertex whose second vertex marking information is the same as the first vertex marking information exists in the memory, determining that the vertex to be detected is a repeated vertex;
and determining the memory address of the target vertex as the memory address corresponding to the vertex to be detected according to the comparison information.
In one embodiment, the comparison information includes comparison result marking information, target process number marking information, and offset marking information; the first vertex marking information and the second vertex marking information are numeric identifiers; and comparing the first vertex marking information in parallel with the second vertex marking information corresponding to each vertex stored in the memory to obtain comparison information comprises:
comparing the first vertex marking information in parallel with the second vertex marking information corresponding to each vertex stored in the memory to obtain comparison result marking information, and determining, according to the comparison result marking information, the target vertex in the memory whose second vertex marking information is the same as the first vertex marking information;
generating the target process number marking information of the process to which the target vertex belongs according to the mapping relation between the target vertex and the process;
and, for the target process corresponding to the target process number marking information, generating the offset marking information of the target vertex according to the difference between the second vertex marking information of the target vertex and the vertex marking information of the first vertex that the target process needs to process.
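The three pieces of comparison information described above can be sketched in Python. This is a minimal illustration, not the patented circuit: the names `ComparisonInfo` and `compare` are hypothetical, the loop only models the result that hardware would produce with simultaneous comparators, and it assumes each process stores its vertex numbers in arrival order so that the first stored number belongs to the first vertex of that process.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ComparisonInfo:
    hit: bool                # comparison result marking: True means repeated vertex
    process: Optional[int]   # target process number, None when nothing matched
    offset: Optional[int]    # target vertex number minus the process's first vertex number


def compare(vertex_no: int, stored: List[List[int]]) -> ComparisonInfo:
    """stored[p] holds the vertex numbers kept for process p (assumed layout)."""
    # Sequential stand-in for the parallel comparison; only the outcome is modelled.
    for p, numbers in enumerate(stored):
        if vertex_no in numbers:
            return ComparisonInfo(True, p, vertex_no - numbers[0])
    return ComparisonInfo(False, None, None)


stored = [[100, 101, 102, 103], [200, 201, 202]]
print(compare(202, stored))  # matches in process 1, two numbers after its first vertex
print(compare(300, stored))  # no match, so the vertex would be treated as new
```

Returning the process number and offset together with the hit flag mirrors the text: the later addressing steps need only these values, not the vertex data itself.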
In one embodiment, determining, according to the comparison information, the memory address of the target vertex as the memory address corresponding to the vertex to be detected includes:
determining the offset of the target vertex relative to the first vertex required to be processed by the target process according to the offset marking information;
and calculating the memory address of the target vertex in the memory unit corresponding to the target process according to the offset.
In one embodiment, calculating the memory address of the target vertex in the memory unit corresponding to the target process according to the offset includes:
for the target storage space of the target vertex in the memory unit corresponding to the target process, determining, according to the offset, a target offset address of the target vertex relative to the head address of the target storage space;
and searching the memory address corresponding to the target offset address in the memory unit corresponding to the target process in parallel to be used as the memory address of the target vertex.
In one embodiment, for a target storage space of a target vertex in a memory unit corresponding to a target process, determining a target offset address of the target vertex relative to a head address of the target storage space according to an offset includes:
comparing the offset in parallel with the maximum second vertex marking information stored in each storage space of the memory unit, and determining the target storage space of the target vertex in the memory unit corresponding to the target process;
and determining a target offset address of the target vertex relative to the head address of the target storage space according to the difference value between the offset and the minimum second vertex mark information stored in the target storage space.
In one embodiment, searching the memory address corresponding to the target offset address in parallel in the memory unit corresponding to the target process as the memory address of the target vertex includes:
adding the base address of the target process with the offset address corresponding to each vertex stored in the target storage space in parallel to obtain the memory address corresponding to each vertex in the target storage space;
and selecting a target memory address corresponding to the target offset address from the plurality of memory addresses as the memory address of the target vertex.
In one embodiment, adding the base address of the target process in parallel to the offset address corresponding to each vertex stored in the target storage space to obtain the memory address corresponding to each vertex in the target storage space further includes:
comparing the maximum memory address corresponding to the target storage space with the plurality of memory addresses in parallel, and if a memory address exceeds the maximum memory address corresponding to the target storage space, taking the address difference between that memory address and the maximum memory address corresponding to the target storage space as the final memory address corresponding to the vertex.
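The addressing embodiments above can be read as a short chain of arithmetic steps, sketched here in Python. Everything concrete in this sketch is an assumption made for illustration: the function names `locate` and `memory_address`, the representation of each storage space as a `(min, max)` pair sorted in ascending order, and one address per vertex slot. The wrap-around rule follows the wording of the text literally (address minus maximum address).

```python
from typing import List, Tuple


def locate(offset: int, spaces: List[Tuple[int, int]]) -> Tuple[int, int]:
    """Pick the storage space for `offset` and the offset address inside it.

    spaces[i] = (minimum, maximum) marking value held by storage space i,
    assumed sorted in ascending order.
    """
    for i, (lowest, highest) in enumerate(spaces):
        # Stand-in for comparing the offset with every maximum at once.
        if offset <= highest:
            return i, offset - lowest
    raise ValueError("offset is not held by any storage space")


def memory_address(base: int, target_offset_addr: int, slots: int, max_addr: int) -> int:
    """Form base + offset address for every slot, then select the target one."""
    candidates = [base + k for k in range(slots)]  # one candidate per stored vertex
    addr = candidates[target_offset_addr]
    if addr > max_addr:
        # Per the text: use the difference from the maximum address instead.
        addr -= max_addr
    return addr


space, offset_addr = locate(70, [(0, 63), (64, 127)])
print(space, offset_addr)
print(memory_address(base=0, target_offset_addr=offset_addr, slots=64, max_addr=63))
print(memory_address(base=60, target_offset_addr=5, slots=8, max_addr=63))
```

In the last call the candidate address 65 exceeds the assumed maximum of 63, so the sketch returns their difference, which is how the final embodiment describes handling addresses that run past the end of the storage space.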
In one embodiment, the method further comprises:
if the target vertex with the second vertex marking information identical to the first vertex marking information of the vertex to be detected does not exist in the memory, the vertex to be detected is judged to be a new vertex, a target process and a target storage space corresponding to the vertex to be detected are determined, and the vertex to be detected is stored in the target storage space.
In a second aspect, the present application further provides a memory address allocation apparatus. The device comprises:
the acquisition module is used for acquiring first vertex marking information of the vertices to be detected, wherein the first vertex marking information is used for distinguishing different vertices;
the comparison test module is used for comparing the first vertex marking information in parallel with the second vertex marking information corresponding to each vertex stored in the memory to obtain comparison information, and, if the comparison information indicates that a target vertex whose second vertex marking information is the same as the first vertex marking information exists in the memory, determining that the vertex to be detected is a repeated vertex;
and the memory address allocation module is used for determining the memory address of the target vertex according to the comparison information and taking that memory address as the memory address corresponding to the vertex to be detected.
In one embodiment, the comparison information includes comparison result marking information, target process number marking information, and offset marking information; the memory address allocation module further comprises an addressing module;
the addressing module is used for determining the offset of the target vertex relative to the first vertex required to be processed by the target process according to the offset marking information; and calculating the memory address of the target vertex in the memory unit corresponding to the target process according to the offset.
In a third aspect, the present application also provides a computer device. The computer device comprises a memory storing a computer program and a processor which when executing the computer program performs the steps of:
acquiring first vertex marking information of a vertex to be detected, wherein the first vertex marking information is used for distinguishing different vertices;
comparing the first vertex marking information in parallel with the second vertex marking information corresponding to each vertex stored in the memory to obtain comparison information; if the comparison information indicates that a target vertex whose second vertex marking information is the same as the first vertex marking information exists in the memory, determining that the vertex to be detected is a repeated vertex;
and determining the memory address of the target vertex as the memory address corresponding to the vertex to be detected according to the comparison information.
In a fourth aspect, the present application also provides a computer-readable storage medium. The computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of:
acquiring first vertex marking information of a vertex to be detected, wherein the first vertex marking information is used for distinguishing different vertices;
comparing the first vertex marking information in parallel with the second vertex marking information corresponding to each vertex stored in the memory to obtain comparison information; if the comparison information indicates that a target vertex whose second vertex marking information is the same as the first vertex marking information exists in the memory, determining that the vertex to be detected is a repeated vertex;
and determining the memory address of the target vertex as the memory address corresponding to the vertex to be detected according to the comparison information.
In a fifth aspect, the present application also provides a computer program product. The computer program product comprises a computer program which, when executed by a processor, implements the steps of:
acquiring first vertex marking information of a vertex to be detected, wherein the first vertex marking information is used for distinguishing different vertices;
comparing the first vertex marking information in parallel with the second vertex marking information corresponding to each vertex stored in the memory to obtain comparison information; if the comparison information indicates that a target vertex whose second vertex marking information is the same as the first vertex marking information exists in the memory, determining that the vertex to be detected is a repeated vertex;
and determining the memory address of the target vertex as the memory address corresponding to the vertex to be detected according to the comparison information.
The memory address allocation method, device, computer equipment and storage medium compare the first vertex marking information in parallel with the second vertex marking information corresponding to each vertex stored in the memory to obtain comparison information; if the comparison information indicates that a target vertex whose second vertex marking information is the same as the first vertex marking information exists in the memory, the vertex to be detected is determined to be a repeated vertex. Because the comparison test is performed in parallel, the result of comparing the vertex to be detected with all vertices in the memory can be obtained in one clock cycle. Meanwhile, the operation result stored at the memory address of the target vertex is used as the operation result of the vertex to be detected, so the memory address of the target vertex can be found quickly, vertex detection efficiency is improved, and the amount of computation performed by the processes is reduced.
Drawings
FIG. 1 is a diagram of an application environment of a memory address allocation method in one embodiment;
FIG. 2 is a flow diagram of GPU processor rendering in one embodiment;
FIG. 3 is a flow chart illustrating a memory address allocation method according to an embodiment;
FIG. 4 is a schematic diagram of triangle vertices in one embodiment;
FIG. 5 is a flowchart of a method for obtaining comparison information according to another embodiment;
FIG. 6 is a block diagram of a comparison test module in one embodiment;
FIG. 7 is a flowchart illustrating a method for determining a memory address of a vertex to be detected according to one embodiment;
FIG. 8 is a block diagram of a decode module in one embodiment;
FIG. 9 is a flowchart of a method for calculating a memory address of a target vertex in a memory cell corresponding to a target process according to one embodiment;
FIG. 10 is a block diagram of a memory address allocation module in one embodiment;
FIG. 11 is a block diagram of an output module in one embodiment;
FIG. 12 is a detailed flowchart of a memory address allocation method according to one embodiment;
FIG. 13 is a block diagram illustrating an embodiment of a memory address allocation apparatus;
FIG. 14 is an internal structural diagram of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
The memory address allocation method provided by the embodiments of the application can be applied in the application environment shown in fig. 1. The computer device 102 obtains first vertex marking information of a vertex to be detected, wherein the first vertex marking information is used for distinguishing different vertices; compares the first vertex marking information in parallel with the second vertex marking information corresponding to each vertex stored in the memory to obtain comparison information; if the comparison information indicates that a target vertex whose second vertex marking information is the same as the first vertex marking information exists in the memory, determines that the vertex to be detected is a repeated vertex; and determines, according to the comparison information, the memory address of the target vertex as the memory address corresponding to the vertex to be detected. The data storage system may store the vertices that the computer device 102 still needs to detect as well as the vertices it has already detected. The data storage system may be integrated on the computer device 102 or located on a cloud or other network server. The computer device 102 may be, but is not limited to, a terminal or a server. The terminal may be, but is not limited to, a personal computer, notebook computer, smart phone, tablet computer, internet of things device, or portable wearable device; the internet of things device may be a smart television, a smart vehicle-mounted device, or the like, and the portable wearable device may be a smart watch, a headset, or the like. The server may be implemented as a stand-alone server or as a server cluster composed of a plurality of servers.
In a graphics processor (GPU) rendering pipeline such as Microsoft DirectX (which aims to make Windows-based computers an ideal platform for running and displaying applications rich in multimedia elements, e.g. full-color graphics, video, 3D animation and rich audio), both the Vertex Shader (VS) and the Domain Shader (DS) receive a large number of input vertices. The VS inputs are the vertices of the original triangles, while the DS faces the vertices of the many subdivided triangles produced by the Tessellator (TS). Many of these vertices coincide, and the amount of computation is very large if all of them take part in the operation. To reduce the amount of computation and improve performance, the vertices are checked, and a vertex identical to one already seen is kept out of subsequent computation. The same vertex detection technique is therefore introduced in both the VS and DS stages.
As shown in fig. 2, in the VS stage the pipeline starts at the input assembly module (IA), which prepares the vertex data and outputs it to the VS. After receiving the triangle vertices, the VS first checks them with the same vertex detection technique, and only distinct vertices take part in subsequent operations. The VS stage is divided into a front end and a back end. The front end (VSFE, VS Front End) is mainly responsible for assembling processes; the back end (VSBE, VS Back End) is mainly responsible for computation and storage. The same vertex detection technique described in this embodiment is implemented in the VSFE, so the relevant background on same vertex detection in the VSFE is described below.
First, the VSFE receives a large number of triangle vertices and assembles them into processes. Assuming that 128 vertices form one process, each assembled process is given a process number, and the assembled process information is sent to the VSBE, which operates according to a preset program and stores the calculation results in a memory unit. Therefore, if the processes contain a large number of repeated vertices, the computing and storage resources of the VSBE are heavily consumed. If the VSFE can reject these repeated vertices, the number of processes falls, the load on the VSBE and its resource consumption are reduced, and more distinct vertex results can be computed for the same resource consumption.
To keep these repeated vertices out of the VSBE computation, the vertices of several processes must first be saved. When the VSFE receives a new vertex, its vertex number is compared with the vertex numbers held in the memory; this is the comparison test (CT). CT shows whether the incoming vertex coincides with a vertex in the memory: if it does, the vertex is marked as an OLD vertex; if not, it is marked as a NEW vertex. So that OLD vertices do not consume the operation and storage resources of the VSBE, this embodiment assembles processes from NEW vertices only. Subsequent processes are thus guaranteed to consist entirely of NEW vertices, which can greatly reduce the number of processes and hence the load on the VSBE. When the GPU renders more complex scenes, the amount of data culled by the same vertex detection technique is considerable.
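The OLD/NEW marking and the NEW-only process assembly described above can be illustrated with a small Python sketch. It is a simplified model under stated assumptions: the function name `assemble` and the parameter `per_process` are hypothetical, vertices are identified by integer vertex numbers, and every vertex seen so far is remembered, whereas the text only keeps the vertices of several processes.

```python
from typing import List, Set, Tuple


def assemble(vertex_stream: List[int], per_process: int) -> Tuple[List[List[int]], List[str]]:
    """Label each incoming vertex OLD or NEW and pack only NEW vertices into processes."""
    processes: List[List[int]] = []
    seen: Set[int] = set()        # stands in for the vertex numbers held in memory
    labels: List[str] = []
    for vertex_no in vertex_stream:
        if vertex_no in seen:     # the comparison test found a coincident vertex
            labels.append("OLD")
            continue
        labels.append("NEW")
        seen.add(vertex_no)
        if not processes or len(processes[-1]) == per_process:
            processes.append([])  # current process is full, start the next one
        processes[-1].append(vertex_no)
    return processes, labels


processes, labels = assemble([0, 1, 2, 1, 3, 2, 2, 3, 4], per_process=4)
print(processes)
print(labels)
```

With nine incoming vertices and a process size of four, the sketch packs five NEW vertices into two processes instead of the three that nine unfiltered vertices would need, which is the load reduction the paragraph describes.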
After a vertex has been identified as an OLD vertex or a NEW vertex, the same vertex detection technique has one more important step: calculating the memory address corresponding to the OLD or NEW vertex. This memory address is where the result of the VSBE's processing of the NEW vertex is stored.
It should be noted that, in the VS stage, this embodiment only keeps the OLD vertex out of the VSBE operation; the operation result of the OLD vertex still needs to be sent to the Hull Shader (HS) shown in fig. 2. Since the OLD vertex coincides with a NEW vertex, obtaining the operation result of the OLD vertex only requires obtaining the memory address of the OLD vertex.
As shown in fig. 2, in the DS stage the DS faces a large number of triangle vertices output by the TS. To obtain a finer rendering effect, the original triangles are often split into many smaller triangles, which yields even more vertices. Like the VS stage, the DS stage is divided into a front end and a back end. The front end is denoted DSFE (DS Front End) and the back end DSBE (DS Back End). In the DSFE, CT is performed on each newly received vertex to determine whether it is an OLD or a NEW vertex, DS processes are assembled from the NEW vertices, and the related information is sent to the DSBE for subsequent processing. This procedure is very similar to the VS stage and is not described in detail here.
In one embodiment, as shown in fig. 3, a memory address allocation method is provided. Taking its application to the computer device in fig. 1 as an example, the method includes the following steps:
Step 302, obtaining first vertex marking information of a vertex to be detected, where the first vertex marking information is used to distinguish different vertices.
When a graphics processing unit (GPU) performs rendering, pixels are generated as follows: vertices are determined and combined into primitives (points, line segments, triangles or polygons); the primitives are divided into fragments by rasterization; and the fragments are converted into pixel data after testing (such as depth testing). Taking triangle primitives as an example, the first three vertices define the first triangle, and each subsequent vertex forms the next triangle together with two vertices of the preceding triangle. The vertices of each triangle after the first are automatically reordered so that the triangle winding order stays consistent.
As shown in fig. 4, all 6 triangles have clockwise winding, and the input triangles are ΔV0V1V2, ΔV1V3V2, ΔV2V3V4, ΔV3V5V4, ΔV4V5V6, and ΔV5V7V6. Without same vertex detection, all 18 vertices of the 6 triangles would take part in the VSBE operation; with same vertex detection, only 8 vertices need to take part in the subsequent VSBE operation: V0, V1, V2, V3, V4, V5, V6 and V7. It follows that the amount of computation saved by CT is considerable when the number of triangles becomes very large.
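The count in this example can be checked with a few lines of Python. This is only an illustrative sketch: the integer indices stand in for the vertex labels of fig. 4, and the variable names are assumptions rather than anything defined in the patent.

```python
# Each tuple lists the three vertex indices of one input triangle from the example.
triangles = [(0, 1, 2), (1, 3, 2), (2, 3, 4), (3, 5, 4), (4, 5, 6), (5, 7, 6)]

# Every vertex reference that would reach the back end without detection.
references = [v for triangle in triangles for v in triangle]

# Distinct vertices that remain once repeated ones are filtered out.
distinct = sorted(set(references))

print(len(references), len(distinct))  # 18 8
```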
The vertices to be detected may be the vertices input by the input assembly module (IA) in fig. 2, the vertices of the many subdivided triangles that the DS faces after the TS in fig. 2, or the vertices of polygon primitives. In this embodiment, the vertex to be detected is a vertex input by the input assembly module (IA) in fig. 2. The first vertex marking information is used to distinguish different vertices; it may be, for example, a vertex number (each vertex is assigned a vertex number before the calculation is performed) or vertex coordinates.
Optionally, the computer device receives the vertex data output by the input assembly module (IA) that the GPU uses during rendering, where the vertex data includes information such as vertex numbers and vertex coordinates, and obtains the first vertex marking information of the vertex to be detected from that vertex data.
Step 304, comparing the first vertex marking information in parallel with the second vertex marking information corresponding to each vertex stored in the memory to obtain comparison information; if the comparison information indicates that a target vertex whose second vertex marking information is the same as the first vertex marking information exists in the memory, determining that the vertex to be detected is a repeated vertex.
The second vertex marking information and the first vertex marking information are both ordered sequence codes, established under the same standard, that are used to mark vertices, for example vertex numbers. The memory stores the vertices processed by each process; each process handles multiple vertices, for example 128.
The conventional same vertex detection technique uses a serial circuit structure and needs many clock cycles to complete, which reduces vertex test efficiency. To solve this problem, this embodiment uses a parallel comparison test that compares against all vertices in the memory simultaneously, so the result of comparing the vertex to be detected with all vertices in the memory is obtained in one clock cycle, improving vertex detection efficiency.
The comparison information includes comparison result marking information, target process number marking information and offset marking information. The comparison result marking information marks whether the vertex to be detected is a repeated vertex; for example, '1' may mark a repeated vertex and '0' a new vertex. The target process number marking information marks the target process in which the target vertex is located. The offset marking information marks the position of the target vertex within the target process.
Optionally, the computer device compares the first vertex marking information of the vertex to be detected in parallel with the second vertex marking information corresponding to each vertex stored in the memory to obtain comparison information, and if the comparison information indicates that a target vertex whose second vertex marking information is the same as the first vertex marking information of the vertex to be detected exists in the memory, determines that the vertex to be detected is a repeated vertex.
Step 306, determining the memory address of the target vertex as the memory address corresponding to the vertex to be detected.
In the VS stage, the OLD vertex does not take part in the VSBE operation, but the operation result of the OLD vertex still needs to be sent to the Hull Shader (HS) shown in fig. 2. Since the OLD vertex coincides with a NEW vertex, obtaining the operation result of the OLD vertex only requires obtaining the memory address of the OLD vertex.
Optionally, the computer device looks up the memory address of the target vertex in the memory and uses it as the memory address corresponding to the vertex to be detected.
In the memory address allocation method, the first vertex marking information is compared in parallel with the second vertex marking information corresponding to each vertex stored in the memory to obtain comparison information; if the comparison information indicates that a target vertex whose second vertex marking information is the same as the first vertex marking information exists in the memory, the vertex to be detected is determined to be a repeated vertex. Because the comparison test is performed in parallel, the result of comparing the vertex to be detected with all vertices in the memory can be obtained in one clock cycle. Meanwhile, the operation result stored at the memory address of the target vertex is used as the operation result of the vertex to be detected, so the memory address of the target vertex can be found quickly, vertex detection efficiency is improved, and the amount of computation performed by the processes is reduced.
In one embodiment, assuming that the memory has N processes, each process includes M vertices, when performing the same vertex test, each input vertex number needs to be compared with n×m vertices in the memory, where in the conventional scheme, n×m clock cycles are required to complete the same vertex test in a serial manner, and many clock cycles are required to complete the vertex test, so that the vertex test efficiency is reduced. Therefore, to solve the above problem, the present embodiment adopts a parallel comparison method to complete the comparison test of all vertices in the memory within one clock cycle. As shown in fig. 5, the comparison information includes comparison result flag information, target process number flag information, and offset flag information; the first vertex marking information and the second vertex marking information are digital numbers; the first vertex marking information is respectively compared with the second vertex marking information corresponding to the vertices stored in the memory in parallel to obtain comparison information, and the method comprises the following steps:
Step 502, comparing the first vertex marking information in parallel with the second vertex marking information corresponding to each vertex stored in the memory to obtain comparison result marking information, and determining, according to the comparison result marking information, the target vertex in the memory whose second vertex marking information is the same as the first vertex marking information.
The memory is used for storing the operation results of the vertices required to be processed by the processes, and may be an external memory of the computer device or an internal memory of the computer device, and the type of the memory is not limited herein.
In this embodiment, a comparison test module as shown in fig. 6 is used to compare the first vertex marking information in parallel with the second vertex marking information corresponding to each vertex stored in the memory. The memory shown in fig. 6 includes a storage space of 4×128×16 bit for vertex numbers and a memory address space of 4×16 bit for storing the memory address of the first NEW vertex processed by each process. The memory shown in fig. 6 holds 4 processes, each process handles 128 vertices, and each vertex is assigned a vertex number in sequence from 0 to 127.
The number of first comparators shown in fig. 6 is greater than or equal to the number of vertices stored in the memory. The output of each first comparator is 0 or 1, where "0" indicates that the two inputs of that first comparator are not equal and "1" indicates that they are equal. In this embodiment, 4×128 first comparators compare the first vertex marking information of the vertex to be detected with the second vertex marking information of the 4×128 vertices stored in the memory. Within one clock cycle, the 4×128 first comparators execute the comparison in parallel and produce a plurality of 1-bit comparison results. If one of these 1-bit comparison results is "1", the comparison result marking information is 1, which indicates that a target vertex whose second vertex marking information is the same as the first vertex marking information exists in the memory, and the vertex corresponding to the second vertex marking information input to that first comparator is the target vertex.
For example, the ranges of vertex numbers processed by the 4 processes stored in the memory shown in fig. 6 are 0-127, 128-255, 256-383 and 384-511, respectively, the vertex number of the vertex to be detected is 125, and after parallel comparison by 4×128 first comparators, it is determined that one vertex with the vertex number of 125 exists in the first process, the first process is determined to be the target process of the vertex to be detected, the vertex with the vertex number of 125 in the first process is determined to be the target vertex, and the vertex to be detected is determined to be the repeated vertex.
It should be noted that: if the comparison result marking information indicates that the vertex to be detected is not a repeated vertex, that is, the vertex to be detected is a new vertex, the vertex to be detected needs to be stored in the memory unit corresponding to the target process in the memory.
Optionally, the computer device inputs the first vertex marking information to the first input ends of the first comparators, inputs the second vertex marking information corresponding to each vertex stored in the memory to the second input ends of the first comparators, and obtains the output results of the first comparators. If an output result indicates that the two inputs of the current first comparator are equal, the comparison result marking information indicates that a target vertex whose second vertex marking information is the same as the first vertex marking information exists in the memory, and the vertex corresponding to the second vertex marking information input to the current first comparator is the target vertex.
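The parallel comparison of step 502 can be illustrated with a small software model. This is only a sketch under stated assumptions: the function name compare_all and the nested-list layout of the stored vertex numbers are hypothetical, and the list comprehension evaluates the comparisons one after another, whereas the first comparators of fig. 6 evaluate them within the same clock cycle.

```python
from typing import List, Tuple

NUM_PROCESSES = 4           # processes held in the memory (fig. 6 configuration)
VERTICES_PER_PROCESS = 128  # vertices handled by each process (fig. 6 configuration)


def compare_all(first_mark: int, stored: List[List[int]]) -> Tuple[int, List[List[int]]]:
    """Model the bank of first comparators (hypothetical helper).

    Returns the comparison result flag and the individual 1-bit results.
    """
    # One 1-bit result per comparator: 1 if both inputs are equal, otherwise 0.
    bits = [[1 if mark == first_mark else 0 for mark in row] for row in stored]
    # The comparison result flag is the OR-reduction of all 1-bit results.
    result_flag = 1 if any(any(row) for row in bits) else 0
    return result_flag, bits


# Vertex numbers 0-127, 128-255, 256-383 and 384-511, as in the example above.
stored = [
    list(range(p * VERTICES_PER_PROCESS, (p + 1) * VERTICES_PER_PROCESS))
    for p in range(NUM_PROCESSES)
]
flag, bits = compare_all(125, stored)
```

With the vertex number 125 from the example, the only comparator that reports equality is the one at position 125 of the first process, so the vertex to be detected is treated as a repeated vertex.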
Step 504, generating the target process number mark information to which the target vertex belongs according to the mapping relation between the target vertex and the process.
Each vertex is allocated to a corresponding process, so there is a mapping relationship between the vertex number of each vertex and the process number; for example, the vertices with vertex numbers 0-127 need to be processed by the process with process number 1.
For example, as shown in fig. 6, the vertex number of the vertex to be detected is 125, after parallel comparison, it is determined that there is a vertex with the vertex number of 125 in the first process, and then the first process is the target process of the vertex to be detected, that is, the process number of the target process to which the target vertex with the vertex number of 125 belongs is 1, and the target process number marking information to which the target vertex belongs is generated according to the target process number.
Optionally, after determining the vertex number of the target vertex, the computer device determines the target process number to which the target vertex belongs according to the vertex number of the target vertex, namely the second vertex marking information of the target vertex, and a preset mapping relation, and generates the target process number marking information according to the target process number.
Step 506, for the target process corresponding to the target process number mark information, generating offset mark information of the target vertex according to the difference between the second vertex mark information of the target process and the vertex mark information of the first vertex to be processed by the target process.
The offset refers to the offset between the target vertex and the first vertex of the target process to which the target vertex belongs, and the offset may be an offset in vertex numbers or an offset address.
For example, the offset between the vertex number of the target vertex shown in fig. 6 and the vertex number of the first vertex in the target process is 125, and offset mark information is generated according to the offset.
Optionally, the computer device determines a target process corresponding to the target vertex according to the target process number marking information, and a vertex number range processed by the target process, and generates offset marking information of the target vertex according to a difference value between the second vertex marking information of the target process and vertex marking information of the first vertex required to be processed by the target process.
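Steps 504 and 506 can be sketched as follows. The one-hot encoding of the two flags is an assumption made for illustration; it is chosen because the later decoding step uses a 4-bit-input and a 128-bit-input decoder, but the text does not spell out the bit layout. The name make_flags and the data layout are hypothetical, and the software loop stands in for logic that the hardware evaluates in parallel.

```python
from typing import List, Optional, Tuple


def make_flags(first_mark: int, stored: List[List[int]]) -> Optional[Tuple[int, int]]:
    """Return (process_flag, offset_flag) as one-hot integers, or None for a new vertex.

    Hypothetical helper: stored[p] holds the vertex numbers of process p in order.
    """
    for proc_index, row in enumerate(stored):
        if first_mark in row:
            # Offset = number of the target vertex minus the number of the
            # first vertex that the target process needs to handle.
            offset = first_mark - row[0]
            process_flag = 1 << proc_index  # 4 bits wide in the fig. 6 configuration
            offset_flag = 1 << offset       # 128 bits wide in the fig. 6 configuration
            return process_flag, offset_flag
    return None


# Four processes with vertex numbers 0-127, 128-255, 256-383 and 384-511.
stored = [list(range(p * 128, (p + 1) * 128)) for p in range(4)]
flags_125 = make_flags(125, stored)
```

For vertex number 125 the sketch sets bit 0 of the process flag (the first process) and bit 125 of the offset flag, which matches the offset of 125 given in the example.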
In this embodiment, the first vertex marking information is compared in parallel with the second vertex marking information corresponding to each vertex stored in the memory, so the comparison test against all vertices in the memory is completed in one clock cycle. The comparison yields the comparison result marking information, which marks whether the vertex to be detected is a repeated vertex; the target process number marking information, which marks the target process where the target vertex is located; and the offset marking information, which marks the offset of the position of the target vertex within the target process. When the vertex to be detected is a repeated vertex, this information provides the data basis for the subsequent calculation of the memory address of the repeated vertex.
In one embodiment, as shown in fig. 7, determining, according to the comparison information, a memory address of a target vertex as a memory address corresponding to a vertex to be detected includes:
Step 702, determining the offset of the target vertex relative to the first vertex to be processed by the target process according to the offset marking information.
In this embodiment, the comparison result flag information is 1-bit data, the target process flag information is 4-bit data, and the offset flag information is 128-bit data. To address the process number where the target vertex is located, a decoding module as shown in fig. 8 is adopted: a decoder with 4-bit input and 2-bit output decodes the target process flag information to obtain a 2-bit process number. To address the position of the target vertex within the target process, a decoder with 128-bit input and 7-bit output decodes the offset marking information to obtain a 7-bit offset.
Optionally, the computer device decodes the target process number marking information through a 4bit input and a 2bit output decoder to obtain a target process number of a target process corresponding to the target vertex, and determines the target process corresponding to the target vertex according to the target process number; the computer equipment decodes the offset marking information through a 128bit input and 7bit output decoder to obtain the offset of the target vertex relative to the first vertex processed by the target process.
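The two decoders can be modelled with one helper, assuming the flags are one-hot values (an assumption, as the text only gives the input and output widths). The name decode_one_hot is hypothetical, and the decoded process value is a zero-based index, so the first process decodes to 0.

```python
def decode_one_hot(flag: int, width: int) -> int:
    """Convert a one-hot flag of `width` bits into the index of its set bit.

    Hypothetical model of the 4-to-2 and 128-to-7 decoders of fig. 8.
    """
    # Reject values that are out of range or do not have exactly one bit set.
    if flag <= 0 or flag >= (1 << width) or flag & (flag - 1):
        raise ValueError("flag must have exactly one bit set within the given width")
    return flag.bit_length() - 1


process_index = decode_one_hot(0b0001, 4)  # 4-bit input, result fits in 2 bits
offset = decode_one_hot(1 << 125, 128)     # 128-bit input, result fits in 7 bits
```

The 7-bit result is sufficient because the largest offset within a process of 128 vertices is 127.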
Step 704, calculating the memory address of the target vertex in the memory unit corresponding to the target process according to the offset.
The memory address is equal to the sum of the base address of the target process and the offset address, and the base address of the target process can be obtained by querying a lookup table of the memory. The offset address refers to the offset of the target vertex, within its target storage space in the memory unit corresponding to the target process, relative to the head address of that target storage space.
Optionally, the computer device determines the base address of the target process through the lookup table of the memory, determines, according to the offset, the target storage space of the target vertex in the memory unit corresponding to the target process and the offset address of the target vertex relative to the head address of the target storage space, and uses the sum of the base address and the offset address as the memory address of the target vertex in the memory unit corresponding to the target process.
In this embodiment, the offset of the target vertex relative to the first vertex to be processed by the target process is determined according to the offset marking information, and then the memory address of the target vertex in the memory unit corresponding to the target process is calculated according to the offset, so that the memory address of the target vertex can be calculated in parallel, the operation result stored in the memory address of the target vertex is used as the operation result of the vertex to be detected, the memory address of the target vertex can be found out quickly, and the calculation amount of the process is reduced.
In one embodiment, the existing method generally stores the memory address of each vertex, so that when the vertex to be detected is a repeated vertex, its memory address can be obtained directly by lookup. However, this approach consumes N×M×16 bit of register resources, where N is the number of processes stored in the memory and M is the number of vertices processed by each process. Therefore, in order to save chip area, the present embodiment obtains the memory address of the target vertex by calculation. Specifically, as shown in fig. 9, calculating the memory address of the target vertex in the memory unit corresponding to the target process according to the offset includes the following steps:
Step 902, for the target storage space of the target vertex in the memory unit corresponding to the target process, determining a target offset address of the target vertex relative to the head address of the target storage space according to the offset.
The memory unit corresponding to the target process allocates memory addresses for the vertexes according to the processing sequence, the memory unit corresponding to the target process is divided into a plurality of storage spaces, and each storage space allocates memory addresses for the vertexes of the storage space according to the processing sequence. The target storage space refers to the storage position of the target vertex in the memory unit corresponding to the target process. The first address of the target storage space refers to the memory address corresponding to the vertex processed first by the target storage space.
Optionally, the computer device obtains the target process number by decoding the target process number marking information with the decoder described in the foregoing embodiment, and obtains the base address of the memory unit of the target process by looking up the table with the target process number. The computer device then determines, according to the offset, the target storage space of the target vertex, performs a memory range check within the target storage space, and finally obtains the offset address.
In some embodiments, for a target storage space of a target vertex in a memory unit corresponding to a target process, determining a target offset address of the target vertex relative to a head address of the target storage space according to an offset, including the following steps:
Step 1, comparing the offset in parallel with the maximum second vertex mark information stored in each storage space of the memory unit to determine the target storage space of the target vertex in the memory unit corresponding to the target process.
Each storage space stores different vertices, and the vertices in each storage space are stored in order of vertex number, so each storage space corresponds to a storage range of vertex numbers. If the vertex number of the target vertex is not within the storage range of a storage space, the target vertex is not in that storage space. Specifically, the offset is compared in parallel with the maximum second vertex mark information stored in each storage space of the memory unit; if the offset of the target vertex is smaller than the maximum second vertex mark information stored in a storage space, that storage space is the target storage space of the target vertex.
Optionally, a memory address allocation module as shown in fig. 10 is adopted. The offset is input to the first input ends of a plurality of second comparators, and the maximum second vertex marking information stored in each storage space is input to the second input ends of the second comparators, to obtain the output results of the second comparators. If an output result indicates that the first input of the current second comparator is smaller than the second input, the storage space to which the maximum second vertex marking information at the second input of that second comparator belongs is taken as the target storage space.
Step 2, determining a target offset address of the target vertex relative to the head address of the target storage space according to the difference between the offset and the minimum second vertex mark information stored in the target storage space.
The difference between the offset and the minimum second vertex mark information stored in the target storage space is a vertex number difference. For example, the memory unit corresponding to the target process has 16 storage spaces, and each storage space stores the operation results of 8 vertices. If the offset of the target vertex is 125, the target storage space is the 16th storage space, the maximum second vertex mark information stored in the target storage space is 127, and the minimum second vertex mark information stored in the target storage space is 120. The difference between the offset and the minimum second vertex mark information stored in the target storage space is 5, so the offset address is the memory address located 5 addresses from the head address of the target storage space.
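The two sub-steps of step 902 can be sketched with the numbers from the example. The name locate_in_storage_space and the list layout are hypothetical. Two simplifications are assumptions of this sketch rather than statements of the text: the comparison uses "less than or equal" so that the largest vertex number of a storage space also resolves to that space, and, because the storage spaces cover ascending ranges, the first storage space whose comparator fires is taken as the target storage space.

```python
from typing import List, Tuple


def locate_in_storage_space(
    offset: int, max_marks: List[int], min_marks: List[int]
) -> Tuple[int, int]:
    """Return (storage space index, target offset address) for a vertex offset."""
    # Second comparators: one test per storage space (parallel in hardware).
    hits = [offset <= max_mark for max_mark in max_marks]
    # Ranges are ascending, so the first comparator that fires marks the target space.
    space = hits.index(True)
    # Target offset address = offset minus the minimum number stored in that space.
    return space, offset - min_marks[space]


# 16 storage spaces with 8 vertices each: ranges 0-7, 8-15, ..., 120-127.
min_marks = [8 * i for i in range(16)]
max_marks = [8 * i + 7 for i in range(16)]
space, offset_address = locate_in_storage_space(125, max_marks, min_marks)
```

For the offset 125 the sketch selects index 15, which is the 16th storage space, and the target offset address is 125 - 120 = 5, in line with the example.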
Step 904, the memory addresses corresponding to the target offset addresses are searched in parallel in the memory units corresponding to the target processes, and the memory addresses are used as the memory addresses of the target vertices.
The conventional method searches for the memory address of the target vertex by progressive scanning, which reduces the timing efficiency of the hardware; therefore, the present embodiment searches for the memory address of the target vertex by parallel scanning. Because the memory address is equal to the sum of the base address and the offset address, and the base address is the same within one memory unit, the parallel scanning mode first calculates in parallel the memory address corresponding to each vertex stored in the target storage space, and then selects the memory address corresponding to the offset address as the memory address of the target vertex.
For example, the memory unit corresponding to the target process has 16 storage spaces, and the target storage space stores the operation results of 8 vertices; in this embodiment, the memory address of the target vertex is searched by scanning 8 memory addresses in parallel.
In some embodiments, the memory address corresponding to the target offset address is searched in parallel in the memory unit corresponding to the target process, and is used as the memory address of the target vertex, and specifically includes the following steps:
Step 1, adding the base address of the target process and the offset address corresponding to each vertex stored in the target storage space in parallel to obtain the memory address corresponding to each vertex in the target storage space.
Wherein the base address of the target process may be obtained by querying a lookup table of the memory. The addition of the base address and the offset address of the target process may be handled by an adder.
Optionally, as shown in fig. 10, the computer device obtains the base address of the target process by querying a lookup table of the memory, determines the offset address corresponding to each vertex stored in the target storage space, uses the base address as a first input of the plurality of adders, uses the offset address corresponding to each vertex stored in the target storage space as a second input of the plurality of adders, and the plurality of adders output the memory address corresponding to each vertex stored in the target storage space.
It should be noted that: the memory address corresponding to each vertex is not obtained by simply adding the base address and the offset address, and the memory address range needs to be judged, if the memory address exceeds the memory address range of the memory unit, the memory address of the vertex needs to be determined again.
The specific steps of re-determining the memory address of the vertex are as follows: the maximum memory address corresponding to the target storage space is compared in parallel with the plurality of memory addresses, and if a memory address exceeds the maximum memory address corresponding to the target storage space, the address difference between that memory address and the maximum memory address corresponding to the target storage space is taken as the final memory address corresponding to the vertex.
The address difference between the memory address and the maximum memory address corresponding to the target storage space is computed by a subtracter. As shown in fig. 10, the maximum memory address corresponding to the target storage space is taken as the first input of a third comparator, and the calculated memory address of each vertex in the target storage space is taken as the second input of the third comparator. The third comparator outputs the magnitude relation between the first input and the second input. If the second input is smaller than the first input, no processing is performed; if the second input is larger than the first input, the subtracter calculates the address difference between the memory address and the maximum memory address corresponding to the target storage space.
For example, if the maximum memory address corresponding to the target storage space is 127, when the memory address of each vertex in the target storage space is calculated in parallel, the calculated memory address is 128, and the memory address of the vertex exceeds the maximum memory address corresponding to the target storage space, at this time, the address difference between the calculated memory address and the maximum memory address corresponding to the target storage space is 1, that is, the memory address corresponding to the first vertex in the target storage space is taken as the final memory address of the vertex.
Step 2, selecting a target memory address corresponding to the target offset address from the memory addresses as the memory address of the target vertex.
In the conventional approach, when the memory address of a repeated vertex is calculated, the target storage space where the repeated vertex is located has to be judged step by step, which lengthens the timing of calculating and judging the memory address, so the memory allocation efficiency for repeated vertices cannot be improved.
In this embodiment, a selector is used to select the target memory address corresponding to the target offset address from the plurality of memory addresses. As shown in fig. 10, the offset address of the target vertex is used as the gating end of the 16-to-1 selector, and the memory addresses corresponding to the 16 judgment results of whether the memory addresses of the 8 vertices in the target storage space exceed the maximum memory address corresponding to the target storage space are used as the inputs of the 16-to-1 selector. The 16-to-1 selector selects the target memory address corresponding to the target offset address from the plurality of memory addresses and outputs it as the memory address of the target vertex.
In this embodiment, the memory address of each vertex in the target storage space is obtained by parallel computation, and the target memory address corresponding to the target offset address is selected from the plurality of memory addresses by parallel non-priority scanning and used as the memory address of the target vertex. On the one hand, this shortens the timing of calculating and judging the memory address and improves the memory allocation efficiency for repeated vertices; on the other hand, it reduces the consumption of register resources and saves chip area.
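Step 904 can be sketched as a parallel address calculation followed by a selection. The function names and the numeric values (base address 121, maximum memory address 127) are hypothetical. The range correction follows the text literally: an address above the maximum is replaced by its difference from the maximum. The sketch selects among 8 corrected candidates and does not model the exact input arrangement of the 16-to-1 selector of fig. 10, which the text does not detail.

```python
from typing import List


def candidate_addresses(base: int, offset_addresses: List[int], max_address: int) -> List[int]:
    """Compute one candidate memory address per vertex of the target storage space."""
    # Adders: base address + offset address, one adder per vertex (parallel in hardware).
    raw = [base + off for off in offset_addresses]
    # Third comparators and subtracters: an address beyond the maximum is replaced
    # by the difference between that address and the maximum.
    return [addr - max_address if addr > max_address else addr for addr in raw]


def select_target_address(candidates: List[int], target_offset_address: int) -> int:
    """Selector: the target offset address gates which candidate is output."""
    return candidates[target_offset_address]


# Hypothetical values: 8 vertices per storage space, base 121, maximum address 127.
candidates = candidate_addresses(base=121, offset_addresses=list(range(8)), max_address=127)
target_address = select_target_address(candidates, 5)
```

With these values the last candidate is computed as 128, exceeds 127 and is replaced by 1, while the target offset address 5 selects the candidate 126. No candidate depends on another, which is what allows the hardware to evaluate all of them in the same cycle.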
In one embodiment, if the vertex to be tested is a new vertex, the vertex number of the vertex to be tested is stored in the corresponding memory unit, and if the vertex to be tested is the first vertex in the process, the memory address corresponding to the vertex to be tested is also stored and used as the base address corresponding to the process. Specifically, if the target vertex with the second vertex marking information identical to the first vertex marking information of the vertex to be detected does not exist in the memory, the vertex to be detected is judged to be a new vertex, a target process and a target storage space corresponding to the vertex to be detected are determined, and the vertex to be detected is stored in the target storage space.
For example, for a new vertex, the memory address of the new vertex can be obtained quickly by adding 1 to the memory allocation pointer. In this embodiment, the output module shown in fig. 11 is adopted: the comparison result marking information is used as the gating end of a 2-to-1 selector, and the two input ends of the 2-to-1 selector receive the memory address of the new vertex and the memory address of the repeated vertex, respectively. When the comparison result marking information indicates that the vertex to be detected is a repeated vertex, the 2-to-1 selector outputs the memory address of the repeated vertex stored in the memory; when the comparison result marking information indicates that the vertex to be detected is a new vertex, the 2-to-1 selector outputs the memory address newly allocated by the memory for the new vertex.
In this embodiment, if the vertex to be detected is the first new vertex in the process, the memory address corresponding to the vertex to be detected is also stored and used as the base address corresponding to the process, so that the consumption of register resources can be reduced.
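The output stage can be modelled as a 2-to-1 selection. The name output_address and the way the allocation pointer is passed in and returned are hypothetical choices for this sketch; the text only states that the address of a new vertex is the memory allocation pointer plus 1.

```python
from typing import Tuple


def output_address(result_flag: int, repeated_address: int, alloc_pointer: int) -> Tuple[int, int]:
    """Return (memory address of the vertex, updated allocation pointer)."""
    if result_flag == 1:
        # Repeated vertex: reuse the address of the matching stored vertex.
        return repeated_address, alloc_pointer
    # New vertex: the address is the allocation pointer plus 1.
    new_address = alloc_pointer + 1
    return new_address, new_address


repeated_case = output_address(1, repeated_address=126, alloc_pointer=40)
new_case = output_address(0, repeated_address=126, alloc_pointer=40)
```

In the repeated case the pointer is left unchanged, since no new storage is consumed; in the new case the returned pointer advances so that the next new vertex receives the following address.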
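The register saving can be checked with a short calculation based on the figures given in the text: N×M×16 bit when every vertex address is stored, against one 16-bit base address per process when only the address of the first new vertex is kept. The function names are hypothetical.

```python
def bits_storing_every_address(num_processes: int, vertices_per_process: int, addr_bits: int = 16) -> int:
    """Register bits needed when the memory address of every vertex is stored."""
    return num_processes * vertices_per_process * addr_bits


def bits_storing_base_only(num_processes: int, addr_bits: int = 16) -> int:
    """Register bits needed when only one base address per process is stored."""
    return num_processes * addr_bits


full_cost = bits_storing_every_address(4, 128)  # 4 processes x 128 vertices x 16 bit
base_cost = bits_storing_base_only(4)           # 4 processes x 16 bit
```

For the configuration of fig. 6 this gives 8192 bits against 64 bits for the address registers alone; the vertex numbers themselves still have to be stored for the comparison.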
In one embodiment, as shown in fig. 12, the memory address allocation method of the present embodiment specifically includes the following steps:
Step 1202, obtaining first vertex marking information of a vertex to be detected, wherein the first vertex marking information is used for distinguishing different vertices.
Step 1204, comparing the first vertex marking information with the second vertex marking information corresponding to the vertices stored in the memory in parallel to obtain comparison information; the comparison information includes comparison result flag information, target process number flag information, and offset flag information.
Step 1206, determining whether the vertex to be detected is a repeated vertex according to the comparison result marking information; if it is a repeated vertex, executing step 1208; if it is a new vertex, executing step 1224.
Specifically, if the comparison result marking information indicates that a target vertex whose second vertex marking information is the same as the first vertex marking information exists in the memory, the vertex to be detected is determined to be a repeated vertex.
Step 1208, determining the target process number corresponding to the target vertex according to the target process number marking information.
Step 1210, determining the offset of the target vertex relative to the first vertex to be processed by the target process according to the offset marking information.
Step 1212, comparing the offset with the maximum second vertex mark information stored in each storage space of the memory unit in parallel, and determining a target storage space of the memory unit corresponding to the target process.
Step 1214, determining a target offset address of the target vertex relative to the first address of the target storage space based on the difference between the offset and the minimum second vertex marking information stored in the target storage space.
Step 1216, adding the base address of the target process and the offset address corresponding to each vertex stored in the target storage space in parallel to obtain the memory address corresponding to each vertex in the target storage space.
Step 1218, comparing the maximum memory address corresponding to the target storage space with the plurality of memory addresses in parallel; if a memory address exceeds the maximum memory address corresponding to the target storage space, executing step 1220; if the memory address does not exceed the maximum memory address corresponding to the target storage space, executing step 1222.
Step 1220, taking the address difference between the memory address and the maximum memory address corresponding to the target storage space as the final memory address corresponding to the vertex.
Step 1222, selecting a target memory address corresponding to the target offset address from the plurality of memory addresses as the memory address of the target vertex, and executing step 1226.
Step 1224, determining the vertex to be detected as a new vertex, determining a target process and a target storage space corresponding to the vertex to be detected, and storing the vertex to be detected in the target storage space.
Step 1226, outputting the memory address of the vertex to be detected according to the comparison result marking information.
In this embodiment, when the same-vertex detection is performed, all vertices are compared at the same time, so that the vertex detection efficiency can be improved; when the memory address of a repeated vertex is calculated, the non-priority mode allows the hardware circuit structure to be well optimized, so that the circuit structure becomes simpler and the overall operating frequency of the hardware circuit is improved.
It should be understood that, although the steps in the flowcharts related to the embodiments described above are shown sequentially as indicated by the arrows, these steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated herein, the steps are not strictly limited to this order of execution and may be executed in other orders. Moreover, at least some of the steps in the flowcharts described in the above embodiments may include a plurality of sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with at least some of the other steps or stages.
Based on the same inventive concept, the embodiment of the application also provides a memory address allocation device for implementing the above related memory address allocation method. The implementation of the solution provided by the device is similar to the implementation described in the above method, so the specific limitation of one or more embodiments of the memory address allocation device provided below may refer to the limitation of the memory address allocation method described above, and will not be repeated here.
In one embodiment, as shown in fig. 13, there is provided a memory address allocation apparatus, including: an acquisition module, a comparison test module and a memory address allocation module, wherein:
The acquisition module 100 is configured to obtain first vertex marking information of a vertex to be detected, where the first vertex marking information is used to distinguish different vertices.
The comparison test module 200 is configured to compare the first vertex marking information with second vertex marking information corresponding to vertices stored in the memory in parallel, so as to obtain comparison information; and if the comparison information indicates that a target vertex whose second vertex marking information is the same as the first vertex marking information exists in the memory, determine that the vertex to be detected is a repeated vertex.
The memory address allocation module 300 is configured to determine, according to the comparison information, the memory address of the target vertex as the memory address corresponding to the vertex to be detected.
In one embodiment, the comparison information includes comparison result marking information, target process number marking information and offset marking information, and the first vertex marking information and the second vertex marking information are numeric values. The comparison test module 200 is further configured to:
compare, in parallel, the first vertex marking information with the second vertex marking information corresponding to each vertex stored in the memory to obtain the comparison result marking information, and determine, according to the comparison result marking information, the target vertex in the memory whose second vertex marking information is identical to the first vertex marking information;
generate, according to the mapping relation between the target vertex and the processes, the target process number marking information of the process to which the target vertex belongs;
and, for the target process corresponding to the target process number marking information, generate the offset marking information of the target vertex according to the difference between the second vertex marking information of the target vertex and the vertex marking information of the first vertex that the target process needs to process.
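The three pieces of comparison information described above can be illustrated with a short sketch. The Python below is only a software model of what the embodiment describes as a parallel hardware comparison; the data layout (a list of `(second_id, process_id)` pairs and a `first_id_of_process` table) and every name in it are assumptions made for illustration and do not come from the application.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class ComparisonInfo:
    hit_flags: List[bool]       # comparison result marking information, one flag per stored vertex
    process_id: Optional[int]   # target process number marking information (None if no match)
    offset: Optional[int]       # offset marking information (None if no match)


def compare_vertex(first_id: int,
                   stored: List[Tuple[int, int]],
                   first_id_of_process: Dict[int, int]) -> ComparisonInfo:
    """Compare first_id against every stored (second_id, process_id) pair."""
    # Each comparison is independent of the others, so hardware could evaluate
    # them all at once; the list comprehension stands in for that comparator bank.
    hit_flags = [second_id == first_id for second_id, _ in stored]
    if not any(hit_flags):
        return ComparisonInfo(hit_flags, None, None)

    target_index = hit_flags.index(True)
    second_id, process_id = stored[target_index]
    # Offset = marking number of the target vertex minus the marking number of
    # the first vertex handled by the target process.
    offset = second_id - first_id_of_process[process_id]
    return ComparisonInfo(hit_flags, process_id, offset)


# Hypothetical contents: process 0 holds vertices 10..12, process 1 holds 20..21.
stored_vertices = [(10, 0), (11, 0), (12, 0), (20, 1), (21, 1)]
first_ids = {0: 10, 1: 20}
info = compare_vertex(21, stored_vertices, first_ids)
print(info.process_id, info.offset)
```

In this model a vertex with no matching stored number yields all-false flags and no process number or offset, which corresponds to the new-vertex case handled elsewhere in the description.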
It should be noted that the hardware structure of the comparison test module 200 is shown in fig. 6, and that its working principle is described in the embodiments above and is not repeated here.
In one embodiment, the memory address allocation module 300 is further configured to determine, according to the offset marking information, the offset of the target vertex relative to the first vertex that the target process needs to process, and to calculate, according to the offset, the memory address of the target vertex in the memory unit corresponding to the target process.
It should be noted that the hardware structure of the memory address allocation module 300 is shown in fig. 10, and that its working principle is described in the embodiments above and is not repeated here.
In one embodiment, the memory address allocation module 300 is further configured to determine, for the target storage space in which the target vertex is located in the memory unit corresponding to the target process, a target offset address of the target vertex relative to the head address of the target storage space according to the offset;
and to search, in parallel, the memory unit corresponding to the target process for the memory address corresponding to the target offset address, which is used as the memory address of the target vertex.
In one embodiment, the memory address allocation module 300 is further configured to compare, in parallel, the offset with the maximum second vertex marking information stored in each storage space of the memory unit, and to determine the target storage space, in the memory unit corresponding to the target process, in which the target vertex is located;
and to determine the target offset address of the target vertex relative to the head address of the target storage space according to the difference between the offset and the minimum second vertex marking information stored in the target storage space.
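As a rough illustration of this two-step lookup, the sketch below models each storage space by the minimum and maximum marking values it holds. It assumes that the spaces are sorted in ascending order, do not overlap, and use the same numbering as the offset; the function name and the tuple layout are hypothetical. Whether the resulting offset address is further scaled by a per-vertex size is not stated here, so the sketch leaves it unscaled.

```python
from typing import List, Tuple


def locate_in_storage_spaces(offset: int,
                             spaces: List[Tuple[int, int]]) -> Tuple[int, int]:
    """Return (index of the target storage space, target offset address).

    spaces holds one (min_marking, max_marking) pair per storage space,
    assumed sorted in ascending order and non-overlapping.
    """
    # Compare the offset with the maximum of every storage space; the
    # comparisons are independent, so they could all run in parallel.
    fits = [offset <= max_marking for _, max_marking in spaces]
    if not any(fits):
        raise ValueError("offset lies beyond the last storage space")

    # The first space whose maximum is not exceeded is taken as the target space.
    space_index = fits.index(True)
    min_marking, _ = spaces[space_index]
    if offset < min_marking:
        raise ValueError("offset falls in a gap between storage spaces")

    # Target offset address = offset minus the minimum marking of the target space.
    return space_index, offset - min_marking


# Hypothetical memory unit with three storage spaces of four vertices each.
example_spaces = [(0, 3), (4, 7), (8, 11)]
print(locate_in_storage_spaces(5, example_spaces))
```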
In one embodiment, the memory address allocation module 300 is further configured to add, in parallel, the base address of the target process to the offset address corresponding to each vertex stored in the target storage space, so as to obtain the memory address corresponding to each vertex in the target storage space;
and to select, from the plurality of memory addresses, the target memory address corresponding to the target offset address as the memory address of the target vertex.
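The add-then-select step can be sketched as follows. This is a software stand-in for the parallel adders and the selector, written under the assumption that the offset addresses of one storage space are available as a plain list; `select_target_address` and its parameters are illustrative names only.

```python
from typing import List


def select_target_address(base_address: int,
                          offset_addresses: List[int],
                          target_offset_address: int) -> int:
    """Form all candidate addresses, then pick the one for the target offset."""
    # Every addition is independent, so all candidate addresses could be
    # produced in the same cycle by parallel adders.
    memory_addresses = [base_address + off for off in offset_addresses]

    # One-hot select signal: true only for the slot whose offset address
    # equals the target offset address.
    select = [off == target_offset_address for off in offset_addresses]
    if not any(select):
        raise ValueError("target offset address is not in this storage space")
    return memory_addresses[select.index(True)]


# Hypothetical values: base address 0x1000, four slots spaced 16 bytes apart.
address = select_target_address(0x1000, [0, 16, 32, 48], 32)
print(hex(address))
```

Computing every candidate address before selecting trades extra adders for a shorter dependency chain, which is one plausible reading of why the embodiment performs the additions in parallel.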
In one embodiment, the memory address allocation module 300 is further configured to compare, in parallel, the maximum memory address corresponding to the target storage space with each of the plurality of memory addresses, and, if a memory address exceeds the maximum memory address corresponding to the target storage space, to use the difference between that memory address and the maximum memory address corresponding to the target storage space as the final memory address of the corresponding vertex.
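Read literally, this bounds check behaves like a wrap-around. The sketch below applies exactly the stated rule (address minus maximum address when the maximum is exceeded) and nothing more; whether the hardware additionally rebases the result onto the head of the storage space is not stated in this passage, so that is deliberately left out. The function name is hypothetical.

```python
from typing import List


def wrap_addresses(memory_addresses: List[int], max_address: int) -> List[int]:
    """Apply the overflow rule to every candidate address."""
    # Each comparison against max_address is independent and could be done in
    # parallel; addresses that do not exceed the maximum pass through unchanged.
    return [addr - max_address if addr > max_address else addr
            for addr in memory_addresses]


# Hypothetical values: the third candidate exceeds the maximum address 250.
final_addresses = wrap_addresses([100, 200, 300], 250)
print(final_addresses)
```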
In one embodiment, the comparison test module 200 is further configured to determine that the vertex to be detected is a new vertex if no target vertex whose second vertex marking information is the same as the first vertex marking information of the vertex to be detected exists in the memory, to determine the target process and the target storage space corresponding to the vertex to be detected, and to store the vertex to be detected in the target storage space.
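The repeated-vertex and new-vertex branches can be put side by side in a small model. How the target process and target storage space are chosen for a new vertex is not detailed in this passage, so the sketch makes the simplest possible assumption: a single storage space that is filled in arrival order. The class `VertexStore` and its method are hypothetical.

```python
from typing import List, Tuple


class VertexStore:
    """Toy model of one storage space holding vertex marking numbers."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.stored_ids: List[int] = []   # second vertex marking information

    def lookup_or_store(self, first_id: int) -> Tuple[int, bool]:
        """Return (slot index, is_repeated) for the vertex to be detected."""
        # Compare against every stored number (independent comparisons).
        hit_flags = [second_id == first_id for second_id in self.stored_ids]
        if any(hit_flags):
            # Repeated vertex: reuse the slot of the matching target vertex.
            return hit_flags.index(True), True
        if len(self.stored_ids) >= self.capacity:
            raise MemoryError("storage space is full")
        # New vertex: store it in the next free slot (assumed placement policy).
        self.stored_ids.append(first_id)
        return len(self.stored_ids) - 1, False


store = VertexStore(capacity=4)
results = [store.lookup_or_store(v) for v in (7, 9, 7)]
print(results)
```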
Each module in the memory address allocation device described above may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call them and execute the operations corresponding to each module.
In one embodiment, a computer device is provided, which may be a terminal, and whose internal structure may be as shown in fig. 14. The computer device includes a processor, a memory, an input/output interface, a communication interface, a display unit and an input device. The processor, the memory and the input/output interface are connected through a system bus, and the communication interface, the display unit and the input device are connected to the system bus through the input/output interface. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The input/output interface of the computer device is used to exchange information between the processor and external devices. The communication interface of the computer device is used for wired or wireless communication with an external terminal; the wireless communication may be implemented through WIFI, a mobile cellular network, NFC (near field communication) or other technologies. The computer program, when executed by the processor, implements a memory address allocation method. The display unit of the computer device is used to form a visible image and may be a display screen, a projection device or a virtual reality imaging device. The display screen may be a liquid crystal display screen or an electronic ink display screen. The input device of the computer device may be a touch layer covering the display screen, a key, a track ball or a touch pad arranged on the housing of the computer device, or an external keyboard, touch pad or mouse.
It will be appreciated by those skilled in the art that the structure shown in fig. 14 is merely a block diagram of a portion of the structure associated with the present application and is not limiting of the computer device to which the present application applies, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the method embodiments described above when the computer program is executed.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when executed by a processor, implements the steps of the method embodiments described above.
In an embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the steps of the method embodiments described above.
It should be noted that the user information (including, but not limited to, user equipment information, user personal information, etc.) and the data (including, but not limited to, data for analysis, stored data, presented data, etc.) referred to in the present application are information and data authorized by the user or fully authorized by all parties, and that the collection, use and processing of the related data must comply with the relevant laws, regulations and standards of the relevant countries and regions.
Those skilled in the art will appreciate that all or part of the methods of the above embodiments may be implemented by a computer program stored on a non-transitory computer-readable storage medium, and that the computer program, when executed, may include the steps of the method embodiments described above. Any reference to memory, database or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, and the like. Volatile memory may include random access memory (RAM) or external cache memory, and the like. By way of illustration and not limitation, RAM may take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM). The databases referred to in the embodiments provided herein may include at least one of relational databases and non-relational databases. The non-relational database may include, but is not limited to, a blockchain-based distributed database, and the like. The processors referred to in the embodiments provided herein may be general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic units, data processing logic units based on quantum computing, and the like, without being limited thereto.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features contains no contradiction, it should be considered to be within the scope of this description.
The above examples represent only a few embodiments of the present application. They are described in relative detail, but are not to be construed as limiting the scope of the present application. It should be noted that those skilled in the art could make various modifications and improvements without departing from the spirit of the present application, and that these would fall within the scope of protection of the present application. Accordingly, the scope of protection of the present application shall be subject to the appended claims.

Claims (10)

1. A memory address allocation method, the method comprising:
acquiring first vertex marking information of a vertex to be detected, wherein the first vertex marking information is used for distinguishing different vertices;
the first vertex marking information is respectively compared with second vertex marking information corresponding to the vertexes stored in the memory in parallel to obtain comparison information;
if the comparison information indicates that a target vertex whose second vertex marking information is the same as the first vertex marking information exists in the memory, judging that the vertex to be detected is a repeated vertex;
determining the memory address of the target vertex according to the comparison information, and taking the memory address as the memory address corresponding to the vertex to be detected;
the comparison information comprises comparison result marking information, target process number marking information and offset marking information; the first vertex marking information and the second vertex marking information are numeric values; and comparing the first vertex marking information in parallel with the second vertex marking information corresponding to the vertices stored in the memory to obtain the comparison information comprises:
the first vertex marking information is respectively compared with second vertex marking information corresponding to the vertexes stored in the memory in parallel to obtain comparison result marking information, and a target vertex with the same second vertex marking information as the first vertex marking information in the memory is determined according to the comparison result marking information;
generating target process number marking information to which the target vertex belongs according to the mapping relation between the target vertex and the process;
and, for the target process corresponding to the target process number marking information, generating offset marking information of the target vertex according to the difference between the first vertex marking information and the vertex marking information of the first vertex that the target process needs to process.
2. The method according to claim 1, wherein determining, according to the comparison information, the memory address of the target vertex as the memory address corresponding to the vertex to be detected includes:
determining the offset of the target vertex relative to the first vertex required to be processed by the target process according to the offset marking information;
and according to the offset, calculating the memory address of the target vertex in the memory unit corresponding to the target process.
3. The method of claim 2, wherein the calculating, according to the offset, the memory address of the target vertex in the memory unit corresponding to the target process includes:
determining a target offset address of the target vertex relative to a head address of a target storage space according to the offset aiming at the target storage space of the target vertex in a memory unit corresponding to the target process;
and searching, in parallel, the memory unit corresponding to the target process for the memory address corresponding to the target offset address, to be used as the memory address of the target vertex.
4. The method of claim 3, wherein the determining, for the target storage space of the target vertex in the memory unit corresponding to the target process, the target offset address of the target vertex with respect to the head address of the target storage space according to the offset amount comprises:
comparing, in parallel, the offset with the maximum second vertex marking information stored in each storage space of the memory unit, and determining the target storage space, in the memory unit corresponding to the target process, in which the target vertex is located;
and determining a target offset address of the target vertex relative to the head address of the target storage space according to the difference between the offset and the minimum second vertex marking information stored in the target storage space.
5. The method of claim 3, wherein the concurrently searching the memory address corresponding to the target offset address in the memory unit corresponding to the target process as the memory address of the target vertex comprises:
adding, in parallel, the base address of the target process to the offset address corresponding to each vertex stored in the target storage space to obtain the memory address corresponding to each vertex in the target storage space;
and selecting a target memory address corresponding to the target offset address from a plurality of memory addresses as the memory address of the target vertex.
6. The method of claim 5, wherein the adding the base address of the target process to the offset address corresponding to each vertex stored in the target storage space in parallel to obtain the memory address corresponding to each vertex in the target storage space, further comprises:
and respectively comparing the maximum memory address corresponding to the target storage space with a plurality of memory addresses in parallel, and if the memory address exceeds the maximum memory address corresponding to the target storage space, taking the address difference between the memory address and the maximum memory address corresponding to the target storage space as the final memory address corresponding to the vertex.
7. The method according to claim 1, wherein the method further comprises:
if no target vertex whose second vertex marking information is the same as the first vertex marking information of the vertex to be detected exists in the memory, judging that the vertex to be detected is a new vertex, determining a target process and a target storage space corresponding to the vertex to be detected, and storing the vertex to be detected in the target storage space.
8. A memory address allocation apparatus for use in the method of claim 1, said apparatus comprising:
the acquisition module is used for acquiring first vertex marking information of the vertices to be detected, wherein the first vertex marking information is used for distinguishing different vertices;
the comparison test module is used for respectively comparing the first vertex marking information with second vertex marking information corresponding to the vertexes stored in the memory in parallel to obtain comparison information; if the comparison information represents that the target vertexes with the second vertex marking information being the same as the first vertex marking information exist in the memory, judging that the vertex to be detected is a repeated vertex;
and the memory address allocation module is used for determining the memory address of the target vertex according to the comparison information and taking the memory address as the memory address corresponding to the vertex to be detected.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 7 when the computer program is executed.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 7.
CN202211490098.4A 2022-11-25 2022-11-25 Memory address allocation method, memory address allocation device, computer equipment and storage medium Active CN115712580B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211490098.4A CN115712580B (en) 2022-11-25 2022-11-25 Memory address allocation method, memory address allocation device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211490098.4A CN115712580B (en) 2022-11-25 2022-11-25 Memory address allocation method, memory address allocation device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN115712580A CN115712580A (en) 2023-02-24
CN115712580B true CN115712580B (en) 2024-01-30

Family

ID=85234711

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211490098.4A Active CN115712580B (en) 2022-11-25 2022-11-25 Memory address allocation method, memory address allocation device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115712580B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102227752A (en) * 2008-12-09 2011-10-26 高通股份有限公司 Discarding of vertex points during two-dimensional graphics rendering using three-dimensional graphics hardware
CN106683162A (en) * 2017-01-18 2017-05-17 天津大学 Rear vertex Cache design method of multi-shader architecture for embedded GPU
CN106780686A (en) * 2015-11-20 2017-05-31 网易(杭州)网络有限公司 The merging rendering system and method, terminal of a kind of 3D models
CN111338988A (en) * 2020-02-20 2020-06-26 西安芯瞳半导体技术有限公司 Memory access method and device, computer equipment and storage medium
CN111754383A (en) * 2020-05-13 2020-10-09 中国科学院信息工程研究所 Detection method of strong connection graph based on warp reuse and colored partition on GPU
CN111773688A (en) * 2020-06-30 2020-10-16 完美世界(北京)软件科技发展有限公司 Flexible object rendering method and device, storage medium and electronic device
CN114503084A (en) * 2020-08-27 2022-05-13 清华大学 Parallel program expandability bottleneck detection method and computing device
CN114565708A (en) * 2020-11-13 2022-05-31 华为技术有限公司 Method, device and equipment for selecting anti-aliasing algorithm and readable storage medium
CN115375822A (en) * 2022-08-15 2022-11-22 网易(杭州)网络有限公司 Cloud model rendering method and device, storage medium and electronic device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160379331A1 (en) * 2015-06-23 2016-12-29 Freescale Semiconductor, Inc. Apparatus and method for verifying the integrity of transformed vertex data in graphics pipeline processing

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
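Vertex Deformation Algorithm of Skeleton Animation Based on Programmable GPU; Ailin Zeng; 2015 Sixth International Conference on Intelligent Systems Design and Engineering Applications (ISDEA); full text *
Design and implementation of a scalable unified-architecture graphics processor for a 55 nm process; Huang Liang; Qin Xingang; Wu Lingjuan; Xiong Tinggang; Computer Engineering & Science (12); full text *
GPU-accelerated computation of affine transformations of complex mesh models; Zhong Qing; Hua Shungang; Electro-Optic Technology Application (01); full text *
Research on area- and bandwidth-optimized programmable shader architecture for embedded GPUs; Chang Yisong; China Doctoral Dissertations Full-text Database (Information Science and Technology) (No. 12); full text *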

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant