CN112328511B - Data processing method, computing device and readable storage medium - Google Patents

Data processing method, computing device and readable storage medium

Info

Publication number
CN112328511B
Authority
CN
China
Prior art keywords
data object
data
absolute value
starting address
bit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110000639.XA
Other languages
Chinese (zh)
Other versions
CN112328511A (en)
Inventor
杨堃
刘昌辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Uniontech Software Technology Co Ltd
Original Assignee
Uniontech Software Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Uniontech Software Technology Co Ltd
Priority to CN202110000639.XA
Priority to CN202110367882.5A (CN113064841B)
Publication of CN112328511A
Application granted
Publication of CN112328511B
Status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00: Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02: Addressing or allocation; Relocation
    • G06F12/08: Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802: Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806: Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/0811: Multiuser, multiprocessor or multiprocessing cache systems with multilevel cache hierarchies
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46: Multiprogramming arrangements
    • G06F9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005: Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011: Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016: Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Executing Machine-Instructions (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data processing method suitable for execution in a computing device. A numeric data object is stored in a memory of the computing device according to a predetermined storage structure, and the structure information of the predetermined storage structure comprises a data start address, a data bit width and a data sign of the data object, the data start address being the start address of the absolute value of the data object in the memory. The data objects comprise a first data object and a second data object to be processed, and the method comprises the following steps: acquiring a first start address at which the structure information of the first data object is stored and a second start address at which the structure information of the second data object is stored; obtaining the sign of the first data object according to the first start address, and obtaining the sign of the second data object according to the second start address; and operating on the first data object and the second data object using SIMD instructions based at least on the signs of the first data object and the second data object. The data processing method can improve data processing efficiency.

Description

Data processing method, computing device and readable storage medium
Technical Field
The present invention relates to the field of computers, and in particular, to a data processing method, a computing device, and a readable storage medium.
Background
In practical data processing, it is common to encounter operands that are very large or results that must be computed to very high precision. For example, in astronomy, when the volume and area of some stars are calculated, the value of pi must be accurate to several million digits or more in order to reduce the error. Such cases call for large number operations. A large number operation is a data operation in which the values taking part in the operation are very large or the required precision of the result is very high.
At present, large number operations are widely applied in fields such as cryptography, scientific computing, astronomy and weather prediction, and play an important role in scientific research in each of these fields. However, the efficiency of large number operations is currently low. How to improve the efficiency of large number operations has therefore become important.
Disclosure of Invention
To this end, the present invention provides a data processing method, a computing device and a readable storage medium in an attempt to solve or at least alleviate the problems presented above.
According to an aspect of the present invention, there is provided a data processing method adapted to be executed in a computing device, a memory of the computing device storing numeric data objects according to a predetermined storage structure, structure information of the predetermined storage structure including a data start address, a data bit width and a data sign of the data object, the data start address being the start address of the absolute value of the data object in the memory, the data objects including a first data object and a second data object to be processed, the method including: acquiring a first start address at which the structure information of the first data object is stored and a second start address at which the structure information of the second data object is stored; acquiring the data sign of the first data object according to the first start address, and acquiring the data sign of the second data object according to the second start address; and operating on the first data object and the second data object using SIMD instructions based at least on the data signs of the first data object and the second data object, the operation including at least one of a comparison operation, an addition operation, a subtraction operation and a multiplication operation, wherein the addition operation, the subtraction operation or the multiplication operation is performed on groups of N bits, N being the bit width of a vector register in a processor of the computing device.
Optionally, in the data processing method according to the present invention, the step of performing a comparison operation on the first data object and the second data object includes: judging whether the data signs of the first data object and the second data object are the same; if the data signs are the same, acquiring the data bit width of the first data object according to the first start address, and acquiring the data bit width of the second data object according to the second start address; if the data bit widths are the same, acquiring the data start address of the first data object according to the first start address, and acquiring the data start address of the second data object according to the second start address; and, according to the data start addresses of the first data object and the second data object, sequentially extracting data from the absolute value of the first data object and from the absolute value of the second data object in groups of N bits, from the high-order bits to the low-order bits, and, each time one group of data has been extracted from each absolute value, comparing the two groups using a SIMD (single instruction, multiple data) instruction, until a comparison result of the absolute value of the first data object and the absolute value of the second data object is obtained.
Optionally, in the data processing method according to the present invention, the step of performing a comparison operation on the first data object and the second data object further includes: if the data signs of the first data object and the second data object are different, determining from the data signs which of the first data object and the second data object is larger.
Optionally, in the data processing method according to the present invention, the step of performing a comparison operation on the first data object and the second data object further includes: if the data bit widths of the first data object and the second data object are different, determining which of the first data object and the second data object is larger from the data bit widths in combination with the data signs of the first data object and the second data object.
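The comparison steps above can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the patented implementation: the `BigNum` class, the helper names, the 128-bit group size and the 0/1 sign encoding are hypothetical choices made for the example, the bit width is assumed to be stored without leading zero bits, and an ordinary byte-string comparison stands in for the SIMD compare instruction.

```python
from dataclasses import dataclass

GROUP_BITS = 128  # assumed vector-register width N (hypothetical choice)


@dataclass
class BigNum:
    sign: int          # assumed encoding: 0 = non-negative, 1 = negative
    bit_width: int     # bits in the absolute value, no leading zero bits
    magnitude: bytes   # absolute value, most significant byte first


def make(value: int) -> BigNum:
    """Build the sign / bit width / absolute value triple for an integer."""
    mag = abs(value)
    width = max(mag.bit_length(), 1)
    return BigNum(1 if value < 0 else 0, width,
                  mag.to_bytes((width + 7) // 8, "big"))


def compare_abs(a: BigNum, b: BigNum) -> int:
    """Compare equal-width absolute values group by group, high to low."""
    step = GROUP_BITS // 8
    for off in range(0, len(a.magnitude), step):
        ga = a.magnitude[off:off + step]   # one group from each operand
        gb = b.magnitude[off:off + step]
        if ga != gb:                       # stand-in for a SIMD compare
            return 1 if ga > gb else -1
    return 0


def compare(a: BigNum, b: BigNum) -> int:
    """Return 1 if a > b, -1 if a < b, 0 if equal."""
    if a.sign != b.sign:                   # different signs decide at once
        return 1 if a.sign == 0 else -1
    if a.bit_width != b.bit_width:         # different widths decide |a| vs |b|
        abs_order = 1 if a.bit_width > b.bit_width else -1
    else:
        abs_order = compare_abs(a, b)
    # For two negative numbers the larger absolute value is the smaller number.
    return abs_order if a.sign == 0 else -abs_order
```

Only when both the signs and the bit widths match does the sketch touch the stored absolute values, which mirrors the order of checks in the text.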
Optionally, in the data processing method according to the present invention, the step of performing an addition operation on the first data object and the second data object includes: judging whether the data signs of the first data object and the second data object are the same; if the data signs are the same, taking the data sign of the first data object as the data sign of the sum of the first data object and the second data object; acquiring the data start address of the first data object according to the first start address, and acquiring the data start address of the second data object according to the second start address; according to the data start addresses of the first data object and the second data object, sequentially extracting data from the absolute value of the first data object and from the absolute value of the second data object in groups of N bits, from the low-order bits to the high-order bits, and, each time one group of data has been extracted from each absolute value, adding the two groups using a SIMD (single instruction, multiple data) instruction, so as to obtain the sum of the absolute value of the first data object and the absolute value of the second data object; and storing the sum of the first data object and the second data object in the memory according to the predetermined storage structure.
Optionally, in the data processing method according to the present invention, the step of performing an addition operation on the first data object and the second data object further includes: if the data signs of the first data object and the second data object are different, comparing the absolute value of the first data object with the absolute value of the second data object, and taking the data sign of whichever data object has the larger absolute value as the data sign of the sum of the first data object and the second data object; converting the addition of the first data object and the second data object into a subtraction of the absolute value of the first data object and the absolute value of the second data object, wherein the larger of the two absolute values is taken as the minuend; acquiring the data start address of the first data object according to the first start address, and acquiring the data start address of the second data object according to the second start address; according to the data start addresses of the first data object and the second data object, sequentially extracting data from the absolute value of the first data object and from the absolute value of the second data object in groups of N bits, from the low-order bits to the high-order bits, and, each time one group of data has been extracted from each absolute value, subtracting the two groups using a SIMD (single instruction, multiple data) instruction, so as to obtain the difference between the absolute value of the first data object and the absolute value of the second data object; and storing the sum of the first data object and the second data object in the memory according to the predetermined storage structure.
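As a rough illustration of the two addition cases above, the following Python sketch adds sign-and-magnitude numbers group by group from the low-order end. It is a sketch under assumptions rather than the patented implementation: the 128-bit group size, the list-of-groups representation and all function names are hypothetical, and plain integer arithmetic on one group stands in for each SIMD add or subtract instruction.

```python
GROUP_BITS = 128                     # assumed vector-register width N
MASK = (1 << GROUP_BITS) - 1


def to_groups(magnitude):
    """Split a non-negative integer into N-bit groups, lowest group first."""
    groups = [magnitude & MASK]
    magnitude >>= GROUP_BITS
    while magnitude:
        groups.append(magnitude & MASK)
        magnitude >>= GROUP_BITS
    return groups


def from_groups(groups):
    return sum(g << (i * GROUP_BITS) for i, g in enumerate(groups))


def cmp_abs(x, y):
    """Compare two group lists that have no high-order zero groups."""
    if len(x) != len(y):
        return 1 if len(x) > len(y) else -1
    for gx, gy in zip(reversed(x), reversed(y)):
        if gx != gy:
            return 1 if gx > gy else -1
    return 0


def add_abs(x, y):
    """|x| + |y|, one group at a time, carrying into the next group."""
    out, carry = [], 0
    for i in range(max(len(x), len(y))):
        total = (x[i] if i < len(x) else 0) + (y[i] if i < len(y) else 0) + carry
        out.append(total & MASK)
        carry = total >> GROUP_BITS
    if carry:
        out.append(carry)
    return out


def sub_abs(big, small):
    """|big| - |small| with |big| >= |small|, borrowing between groups."""
    out, borrow = [], 0
    for i in range(len(big)):
        diff = big[i] - (small[i] if i < len(small) else 0) - borrow
        borrow = 1 if diff < 0 else 0
        out.append(diff & MASK)
    while len(out) > 1 and out[-1] == 0:   # drop high-order zero groups
        out.pop()
    return out


def add(a, b):
    """Add two (sign, groups) pairs; sign 0 = non-negative, 1 = negative."""
    (sa, xa), (sb, xb) = a, b
    if sa == sb:                           # same sign: add absolute values
        return sa, add_abs(xa, xb)
    order = cmp_abs(xa, xb)                # different signs: subtract
    if order == 0:
        return 0, [0]
    return (sa, sub_abs(xa, xb)) if order > 0 else (sb, sub_abs(xb, xa))


def add_ints(p, q):
    """Convenience wrapper that encodes, adds and decodes Python integers."""
    enc = lambda v: (1 if v < 0 else 0, to_groups(abs(v)))
    sign, groups = add(enc(p), enc(q))
    mag = from_groups(groups)
    return -mag if sign else mag
```

For example, `add_ints(2**128 - 1, 1)` exercises the carry out of the lowest group, and `add_ints(3, -(10**50))` takes the different-sign path in which the operand with the larger absolute value becomes the minuend.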
Optionally, in the data processing method according to the present invention, the step of performing a subtraction operation on the first data object and the second data object includes: judging whether the data signs of the first data object and the second data object are the same; if the data signs are different, converting the subtraction of the first data object and the second data object into an addition of the absolute value of the first data object and the absolute value of the second data object, and taking the sign of the minuend as the data sign of the difference of the first data object and the second data object; acquiring the data start address of the first data object according to the first start address, and acquiring the data start address of the second data object according to the second start address; according to the data start addresses of the first data object and the second data object, sequentially extracting data from the absolute value of the first data object and from the absolute value of the second data object in groups of N bits, from the low-order bits to the high-order bits, and, each time one group of data has been extracted from each absolute value, adding the two groups using a SIMD (single instruction, multiple data) instruction, so as to obtain the sum of the absolute value of the first data object and the absolute value of the second data object; and storing the difference of the first data object and the second data object in the memory according to the predetermined storage structure.
Optionally, in the data processing method according to the present invention, the step of performing a subtraction operation on the first data object and the second data object further includes: if the data signs of the first data object and the second data object are the same, comparing the first data object with the second data object, and determining the data sign of the difference of the first data object and the second data object; converting the subtraction of the first data object and the second data object into a subtraction of the absolute value of the first data object and the absolute value of the second data object, wherein the larger of the two absolute values is taken as the minuend; acquiring the data start address of the first data object according to the first start address, and acquiring the data start address of the second data object according to the second start address; according to the data start addresses of the first data object and the second data object, sequentially extracting data from the absolute value of the first data object and from the absolute value of the second data object in groups of N bits, from the low-order bits to the high-order bits, and, each time one group of data has been extracted from each absolute value, subtracting the two groups using a SIMD (single instruction, multiple data) instruction, so as to obtain the difference between the absolute value of the first data object and the absolute value of the second data object; and storing the difference of the first data object and the second data object in the memory according to the predetermined storage structure.
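The sign handling in the two subtraction cases above reduces every subtraction to one operation on absolute values. The following Python sketch shows only that decision step; the function names and the 0/1 sign encoding are assumptions made for the example, and Python integers stand in for the group-wise SIMD arithmetic on the absolute values.

```python
def plan_subtraction(sign_a, abs_a, sign_b, abs_b):
    """Decide how to compute a - b from signs and absolute values.

    Returns (result_sign, operation, left, right), where operation is
    "add" or "sub" and left >= right whenever operation is "sub".
    Signs use the assumed encoding 0 = non-negative, 1 = negative.
    """
    if sign_a != sign_b:
        # Different signs: |a| + |b|, and the result takes the minuend's sign.
        return sign_a, "add", abs_a, abs_b
    if abs_a == abs_b:
        return 0, "sub", abs_a, abs_b          # the difference is zero
    if abs_a > abs_b:
        return sign_a, "sub", abs_a, abs_b     # larger |.| is the minuend
    return 1 - sign_a, "sub", abs_b, abs_a     # operands swapped, sign flipped


def subtract(a, b):
    """Compute a - b for Python integers via the plan above."""
    sign, op, left, right = plan_subtraction(
        1 if a < 0 else 0, abs(a), 1 if b < 0 else 0, abs(b))
    mag = left + right if op == "add" else left - right
    return -mag if sign else mag
```

Because the larger absolute value is always placed on the left of a "sub" step, the group-wise subtraction never has to produce a negative intermediate result.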
Optionally, in the data processing method according to the present invention, the step of performing a multiplication operation on the first data object and the second data object includes: determining the data sign of the product of the first data object and the second data object based on the data signs of the first data object and the second data object; acquiring the data start address of the first data object according to the first start address, and acquiring the data start address of the second data object according to the second start address; according to the data start address of the first data object, sequentially extracting each numerical value in the absolute value of the first data object from the low-order bits to the high-order bits, until all numerical values in the absolute value of the first data object have been extracted; each time a numerical value i is extracted from the absolute value of the first data object, sequentially extracting data from the absolute value of the second data object in groups of N bits, from the low-order bits to the high-order bits, according to the data start address of the second data object, and multiplying each group of data by the numerical value i using a SIMD (single instruction, multiple data) instruction, until the whole absolute value of the second data object has been multiplied by the numerical value i, so as to obtain the product of the absolute value of the second data object and the numerical value i; using a SIMD instruction, adding position by position the products of each numerical value in the absolute value of the first data object with the absolute value of the second data object, so as to obtain the product of the absolute value of the first data object and the absolute value of the second data object; and storing the product of the first data object and the second data object in the memory according to the predetermined storage structure.
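The multiplication procedure above is essentially schoolbook multiplication carried out one group at a time. The Python sketch below illustrates the idea under several assumptions that are not taken from the source: each "numerical value" is modeled as a 32-bit digit, a group is four such digits (as if held in a 128-bit register), the partial products are accumulated as they are produced rather than summed in a final pass, and a list comprehension stands in for the SIMD multiply. All names are hypothetical.

```python
DIGIT_BITS = 32            # assumed size of one "numerical value"
BASE = 1 << DIGIT_BITS
LANES = 4                  # assumed: one 128-bit group holds four 32-bit values


def to_digits(magnitude):
    """Split a non-negative integer into base-2**32 digits, lowest first."""
    digits = [magnitude % BASE]
    magnitude //= BASE
    while magnitude:
        digits.append(magnitude % BASE)
        magnitude //= BASE
    return digits


def from_digits(digits):
    return sum(d << (i * DIGIT_BITS) for i, d in enumerate(digits))


def mul_abs(x, y):
    """Multiply two digit lists; returns len(x) + len(y) result digits."""
    acc = [0] * (len(x) + len(y))
    for pos, value_i in enumerate(x):              # each value i, low to high
        carry = 0
        for start in range(0, len(y), LANES):      # one group of y at a time
            group = y[start:start + LANES]
            partial = [value_i * g for g in group]  # stand-in for SIMD multiply
            for lane, product in enumerate(partial):
                k = pos + start + lane              # aligned result position
                total = acc[k] + product + carry
                acc[k] = total % BASE
                carry = total // BASE
        acc[pos + len(y)] += carry                  # top digit of this row
    return acc


def multiply(p, q):
    """Multiply Python integers via sign handling plus mul_abs."""
    negative = (p < 0) ^ (q < 0)       # product is negative iff signs differ
    mag = from_digits(mul_abs(to_digits(abs(p)), to_digits(abs(q))))
    return -mag if negative and mag else mag
```

The sign of the product is settled before any digit is touched, so the inner loops only ever see absolute values.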
According to yet another aspect of the present invention, there is provided a data storage method adapted to be executed in a computing device, the method comprising: receiving a data object to be stored, and acquiring the data bit width and the data sign of the data object to be stored; storing the absolute value of the data object to be stored in a memory according to the data bit width of the data object to be stored, and acquiring the data start address of the data object to be stored, the data start address being the start address of the absolute value of the data object to be stored in the memory; and storing the structure information of the data object to be stored in the memory and recording the start address at which the structure information is stored, so that the data object can be obtained according to the structure information for operation, the operation being performed on groups of N bits, wherein the structure information comprises the data start address, the data bit width and the data sign of the data object to be stored, and N is the bit width of a vector register in a processor of the computing device.
Optionally, in the data storage method according to the present invention, the step of storing the structure information of the data object to be stored in the memory includes: determining the number of bytes to be occupied by each of the data start address, the data bit width and the data sign of the data object to be stored, and the order in which the data start address, the data bit width and the data sign are stored; and storing the structure information of the data object to be stored in the memory according to the determined numbers of bytes and the determined order.
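The storage structure described above can be pictured with a small Python model in which a `bytearray` plays the role of memory. Everything specific here is an assumption for illustration: the `Memory` class, the field sizes (8 bytes for the data start address, 8 bytes for the bit width, 1 byte for the sign), the field order, and the little-endian byte order of the absolute value are not specified by the source.

```python
import struct

# Assumed layout of the structure information: data start address (8 bytes),
# data bit width (8 bytes), data sign (1 byte), in that order, little-endian.
INFO_FORMAT = "<QQB"


class Memory:
    """A toy flat memory; addresses are offsets into a bytearray."""

    def __init__(self):
        self.cells = bytearray()

    def alloc(self, payload: bytes) -> int:
        """Append payload and return the address where it starts."""
        addr = len(self.cells)
        self.cells += payload
        return addr


def store(mem: Memory, value: int) -> int:
    """Store a number and return the start address of its structure info."""
    sign = 1 if value < 0 else 0                  # assumed: 1 means negative
    mag = abs(value)
    bit_width = max(mag.bit_length(), 1)
    # First the absolute value goes into one contiguous region ...
    data_addr = mem.alloc(mag.to_bytes((bit_width + 7) // 8, "little"))
    # ... then the structure information goes into another region.
    return mem.alloc(struct.pack(INFO_FORMAT, data_addr, bit_width, sign))


def load(mem: Memory, info_addr: int) -> int:
    """Read the structure info, then the absolute value it points to."""
    data_addr, bit_width, sign = struct.unpack_from(
        INFO_FORMAT, mem.cells, info_addr)
    nbytes = (bit_width + 7) // 8
    mag = int.from_bytes(mem.cells[data_addr:data_addr + nbytes], "little")
    return -mag if sign else mag
```

A caller keeps only the address returned by `store`; the sign and bit width can then be read from fixed offsets without parsing the number itself.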
Optionally, in the data storage method according to the present invention, the data object to be stored includes a large number, where the large number is a number exceeding a maximum value that can be represented by a single register in a CPU architecture of the computing device.
According to yet another aspect of the invention, there is provided a computing device comprising: at least one processor; and a memory storing program instructions, wherein the program instructions are configured to be executed by the at least one processor, the program instructions comprising instructions for performing any of the methods above.
According to yet another aspect of the present invention, there is provided a readable storage medium storing program instructions which, when read and executed by a computing device, cause the computing device to perform any of the above methods.
According to the data storage method of the invention, after the data to be stored is received, the data bit width and the data sign of the data are first acquired. The numerical value of the data to be stored is then stored in the memory. Finally, the start address of that numerical value in the memory, the data bit width and the data sign of the data to be stored are stored in another segment of storage space in the memory. When data stored in this way is processed, the sign, bit width and actual numerical value of the data can be extracted directly from the memory without parsing the data, so that the processing efficiency of the data can be improved.
Drawings
To the accomplishment of the foregoing and related ends, certain illustrative aspects are described herein in connection with the following description and the annexed drawings, which are indicative of various ways in which the principles disclosed herein may be practiced, and all aspects and equivalents thereof are intended to be within the scope of the claimed subject matter. The above and other objects, features and advantages of the present disclosure will become more apparent from the following detailed description read in conjunction with the accompanying drawings. Throughout this disclosure, like reference numerals generally refer to like parts or elements.
FIG. 1 shows a block diagram of a computing device 100, according to one embodiment of the invention;
FIG. 2 illustrates a flow diagram of a data storage method 200 according to one embodiment of the invention;
FIG. 3 shows a schematic diagram of a data storage structure according to one embodiment of the invention;
FIG. 4 shows a flow diagram of a data processing method 400 according to one embodiment of the invention;
FIG. 5 illustrates a schematic diagram of a data comparison operation according to one embodiment of the invention;
FIG. 6 shows a schematic diagram of absolute value addition of data according to one embodiment of the invention;
FIG. 7 shows a schematic diagram of absolute value subtraction of data according to one embodiment of the invention;
FIG. 8 shows a schematic diagram of data multiplication according to one embodiment of the invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
FIG. 1 shows a block diagram of a computing device 100, according to one embodiment of the invention. It should be noted that the computing device 100 shown in FIG. 1 is only an example. In practice, the computing device used to implement the data storage and processing methods of the present invention may be any type of device, and its hardware configuration may be the same as or different from that of the computing device 100 shown in FIG. 1. In practice, a computing device implementing the data storage and processing methods of the present invention may add or omit hardware components relative to the computing device 100 shown in FIG. 1, and the present invention does not limit the specific hardware configuration of the computing device.
As shown in FIG. 1, in a basic configuration 102, a computing device 100 typically includes a system memory 106 and one or more processors 104. A memory bus 108 may be used for communication between the processor 104 and the system memory 106.
Depending on the desired configuration, the processor 104 may be any type of processor, including but not limited to: a microprocessor (μP), a microcontroller (μC), a digital signal processor (DSP), or any combination thereof. The processor 104 may include one or more levels of cache, such as a level one cache 110 and a level two cache 112, a processor core 114, and registers 116. The example processor core 114 may include an Arithmetic Logic Unit (ALU), a Floating Point Unit (FPU), a digital signal processing core (DSP core), or any combination thereof. The example memory controller 118 may be used with the processor 104, or in some implementations the memory controller 118 may be an internal part of the processor 104.
Depending on the desired configuration, system memory 106 may be any type of memory, including but not limited to: volatile memory (such as RAM), non-volatile memory (such as ROM, flash memory, etc.), or any combination thereof. Physical memory in a computing device usually refers to volatile memory (RAM), and data on disk must be loaded into physical memory before it can be read by the processor 104. System memory 106 may include an operating system 120, one or more applications 122, and program data 124. In some implementations, the applications 122 may be arranged to be executed on the operating system by the one or more processors 104 using the program data 124. Operating system 120 may be, for example, Linux, Windows, etc., which includes program instructions for handling basic system services and performing hardware dependent tasks. The application 122 includes program instructions for implementing various user-desired functions, and the application 122 may be, for example, but not limited to, a browser, an instant messenger, a software development tool (e.g., an integrated development environment IDE, a compiler, etc.), and the like. When the application 122 is installed into the computing device 100, a driver module may be added to the operating system 120.
When the computing device 100 is started, the processor 104 reads program instructions of the operating system 120 from the memory 106 and executes them. The application 122 runs on top of the operating system 120, utilizing the operating system 120 and interfaces provided by the underlying hardware to implement various user-desired functions. When the user starts the application 122, the application 122 is loaded into the memory 106, and the processor 104 reads the program instructions of the application 122 from the memory 106 and executes the program instructions.
The computing device 100 also includes a storage device 132, the storage device 132 including removable storage 136 and non-removable storage 138, the removable storage 136 and the non-removable storage 138 each connected to the storage interface bus 134.
Computing device 100 may also include an interface bus 140 that facilitates communication from various interface devices (e.g., output devices 142, peripheral interfaces 144, and communication devices 146) to the basic configuration 102 via the bus/interface controller 130. The example output device 142 includes a graphics processing unit 148 and an audio processing unit 150. They may be configured to facilitate communication with various external devices, such as a display or speakers, via one or more A/V ports 152. Example peripheral interfaces 144 may include a serial interface controller 154 and a parallel interface controller 156, which may be configured to facilitate communication with external devices such as input devices (e.g., keyboard, mouse, pen, voice input device, touch input device) or other peripherals (e.g., printer, scanner, etc.) via one or more I/O ports 158. An example communication device 146 may include a network controller 160, which may be arranged to facilitate communications with one or more other computing devices 162 over a network communication link via one or more communication ports 164.
A network communication link may be one example of a communication medium. Communication media may typically be embodied by computer readable instructions, data structures or program modules in a modulated data signal, such as a carrier wave or other transport mechanism, and may include any information delivery media. A "modulated data signal" may be a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of non-limiting example, communication media may include wired media such as a wired network or dedicated-line network, and various wireless media such as acoustic, Radio Frequency (RF), microwave, Infrared (IR), or other wireless media. The term computer readable media as used herein may include both storage media and communication media.
In a computing device 100 according to the present invention, the application 122 includes instructions for performing the data storage method 200 and the processing method 400 of the present invention, which may instruct the processor 104 to perform the data storage method and the processing method of the present invention. It will be appreciated by those skilled in the art that the application 122 may include other applications 126 for implementing other functions in addition to the instructions for performing the data storage method 200 and the processing method 400.
When a piece of data is so large that it exceeds the range that any basic data type in a computer can represent (for example, a 100-digit decimal number), the data is called a large number. Operations performed on such large numbers are referred to as large number operations.
Large numbers exceed the range that the basic data types of a computer can represent, and therefore cannot be stored directly in those types. In view of this, in the prior art, a large number is often stored as a character string. However, when a large number is stored as a character string, the string must first be parsed to determine the sign, the number of digits and the actual value of the large number before any operation can be performed on it. This parsing is needed every time the large number is operated on, which reduces the operation efficiency.
In order to solve the above problems, the present invention provides a data storage method. After a large number to be stored is received, the bit width and the sign of the large number are first obtained. The value of the large number is then stored in a contiguous segment of memory. Finally, the start address of the memory storing the value of the large number, together with the bit width and the sign of the large number, is stored in another contiguous storage space of the memory. Thus, when the large number is operated on, it does not need to be parsed, which improves the efficiency of large number operations.
FIG. 2 illustrates a flow diagram of a data storage method 200, according to one embodiment of the invention, the method 200 being suitable for execution in a computing device (e.g., the computing device 100 shown in FIG. 1). As shown in fig. 2, the method 200 begins at step S210.
In step S210, a data object to be stored is received, and the data bit width and the data symbol of the data object to be stored are obtained. The data object to be stored comprises a large number, that is, a number exceeding the maximum value that a single register in the CPU architecture of the computing device can represent. When data to be stored (namely, a data object to be stored) is received, the computing device first determines whether the data exceeds the maximum value that a single register in its CPU architecture can represent. If so, the computing device recognizes the data as a large number and reads its data bit width and data symbol. According to one embodiment of the present invention, the data symbol is set to 0 to indicate that the data is a positive number and to 1 to indicate that the data is a negative number. In a specific embodiment, a person skilled in the art may choose the representation of the data symbol according to actual needs, which is not specifically limited herein.
Then, in step S220, the absolute value of the data object to be stored is stored in the memory according to the data bit width of the data object to be stored, and a data start address of the data object to be stored is obtained, where the data start address is the start address of the absolute value of the data object to be stored in the memory. Specifically, the size of the storage space required to store the data is determined according to the bit width of the data to be stored (i.e., the data bit width). Then, a contiguous segment of storage space is allocated in the memory, and the absolute value of the data to be stored is stored in that segment (i.e., the segment stores only the numerical value of the data to be stored). At the same time, the start address of the segment storing the absolute value is obtained. The absolute value may be stored in the memory sequentially from the high-order bits to the low-order bits, or sequentially from the low-order bits to the high-order bits. The specific way of storing the absolute value is not limited herein, and a person skilled in the art can choose it according to actual needs in a specific embodiment.
Then, in step S230, the structure information of the data object to be stored is stored in the memory, and the start address at which the structure information is stored is recorded, so that the data object can later be obtained from the structure information and operated on. During an operation, the data is processed in groups of N bits. The structure information includes the data start address, the data bit width and the data symbol of the data object to be stored, and N is the bit width of a vector register in a processor of the computing device.
According to an embodiment of the present invention, storing the structure information of the data object to be stored in the memory includes the following. First, the number of bytes occupied by each of the data start address, the data bit width and the data symbol of the data object to be stored is determined, together with the order in which these three items are stored. Then, the structure information of the data object to be stored is stored in the memory according to the determined byte counts and order.
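The check in step S210 can be illustrated with a minimal Python sketch. The 64-bit register width, the function name `describe` and the choice to treat the register maximum as an unsigned value are assumptions made for illustration only; the 0/1 sign encoding follows the embodiment described above.

```python
REGISTER_BITS = 64  # assumed width of a single CPU register (hypothetical)


def describe(value, register_bits=REGISTER_BITS):
    """Return (is_large, bit_width, sign) for an integer.

    sign follows the embodiment: 0 for a positive number, 1 for a negative one.
    """
    magnitude = abs(value)
    # "Large" here means the magnitude exceeds the largest unsigned value
    # that fits in one register (an assumption of this sketch).
    is_large = magnitude > (1 << register_bits) - 1
    bit_width = max(magnitude.bit_length(), 1)
    sign = 1 if value < 0 else 0
    return is_large, bit_width, sign
```

For instance, `describe(-(1 << 100))` reports a large number with a 101-bit absolute value and sign 1.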
As an example, the data start address, the data bit width, and the data symbol of the data object to be stored need w bytes each, and the sequence stored in the memory is the data start address, the data bit width, and the data symbol of the data object to be stored. After acquiring the data start address, the data bit width and the data symbol of the data object to be stored, the computing device allocates continuous storage spaces of 3w bytes in the memory, and sequentially stores the data start address, the data bit width and the data symbol of the data object to be stored in the 3w bytes. Meanwhile, the start address of the first byte in the 3w bytes, that is, the start address of the structure information in the memory, is recorded, and the specific storage structure can be seen in fig. 3. Thus, when processing the data, the data start address, the data bit width and the data symbol of the data object can be obtained first through the start address of the 3w bytes. Then, the absolute value of the data object is obtained according to the data start address of the data object.
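The layout in this example can be sketched in Python as follows. This is only an illustration under stated assumptions: memory is modelled as a `bytearray`, `alloc` is a hypothetical bump allocator, w is taken to be 8 bytes, and the absolute value is stored from the low-order byte to the high-order byte. None of these choices is prescribed by the text.

```python
import struct

W = 8                       # assumed size in bytes of each field (the "w" above)
memory = bytearray(4096)    # toy flat memory; an index plays the role of an address
_next_free = 0


def alloc(size):
    """Hypothetical bump allocator returning the start address of a free block."""
    global _next_free
    if _next_free + size > len(memory):
        raise MemoryError("toy memory exhausted")
    addr = _next_free
    _next_free += size
    return addr


def store_big(value):
    """Store |value| contiguously, then the 3w-byte structure information."""
    magnitude = abs(value)
    bit_width = max(magnitude.bit_length(), 1)
    sign = 1 if value < 0 else 0
    nbytes = (bit_width + 7) // 8
    data_addr = alloc(nbytes)
    memory[data_addr:data_addr + nbytes] = magnitude.to_bytes(nbytes, "little")
    info_addr = alloc(3 * W)
    # Order: data start address, data bit width, data symbol.
    struct.pack_into("<QQQ", memory, info_addr, data_addr, bit_width, sign)
    return info_addr


def load_big(info_addr):
    """Recover the number from the start address of its structure information."""
    data_addr, bit_width, sign = struct.unpack_from("<QQQ", memory, info_addr)
    nbytes = (bit_width + 7) // 8
    magnitude = int.from_bytes(memory[data_addr:data_addr + nbytes], "little")
    return -magnitude if sign else magnitude
```

Reading the number back needs no parsing: the three fields are found at fixed offsets from the recorded start address.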
When a large number has been stored according to this data storage method, its bit width, symbol and value can be extracted directly from the start address of its structure information whenever it is processed. The large number does not need to be parsed first to obtain its symbol, number of digits and actual value. Therefore, storing large numbers with the data storage method of the invention improves the efficiency of processing them.
When large numbers are operated, in order to further improve efficiency, the invention provides that Single Instruction Multiple Data (SIMD) is used to operate the large numbers on the basis of storing the large numbers by using the method.
SIMD is a set of instructions that can copy multiple operands and pack them into wide registers, so that the same operation is applied to multiple data simultaneously. For example, after a single-instruction single-data CPU decodes an addition instruction, the execution unit accesses the memory to obtain the first operand, accesses the memory again to obtain the second operand, and then performs the summation. In a SIMD CPU, several execution units access the memory at the same time after the instruction is decoded, and all operands are obtained at once for the operation. That is, the work of several single-instruction single-data instructions can be completed by a single SIMD instruction, which improves the efficiency of program execution. The source and destination operands of a SIMD instruction are vector registers.
Currently, when a SIMD instruction is used to operate on data, a data length of 32 × 4 = 128 bits is generally selected, that is, one SIMD instruction processes 4 data items of 32 bits each in parallel. Taking addition as an example again, data d and data f are added using a SIMD instruction. After the instruction is decoded, the lower 128 bits of the data d and the data f are obtained at one time, and 4 groups of 32-bit data are added in parallel to complete the addition of the lower 128 bits of the data d and the data f. Specifically, it is assumed that d1 and f1 are the 1st to 32nd bits of data d and f, respectively, d2 and f2 are the 33rd to 64th bits of data d and f, respectively, d3 and f3 are the 65th to 96th bits of data d and f, respectively, and d4 and f4 are the 97th to 128th bits of data d and f, respectively. After the lower 128 bits of the data d and the data f are obtained, the 4 sets of data d1 and f1, d2 and f2, d3 and f3, d4 and f4 are added simultaneously, thereby obtaining the result of adding the lower 128 bits of the data d and the data f. Then, the next set of data (128 bits) is extracted from the data d and the data f for addition, until the addition of the data d and the data f is completed.
FIG. 4 illustrates a flow diagram of a data processing method 400 according to one embodiment of the invention, the method 400 being suitable for execution in a computing device (e.g., the computing device 100 shown in FIG. 1). Numerical data objects are stored in a memory of the computing device according to a preset storage structure. The structure information of the preset storage structure comprises a data start address, a data bit width and a data symbol of the data object, where the data start address is the start address of the absolute value of the data object in the memory. The data objects comprise a first data object and a second data object to be processed. The first data object and the second data object may be data in the field of cryptography, such as cryptographic data; data in the field of weather prediction, such as temperature data; or data in the field of astronomy, such as volume data and area data. As shown in FIG. 4, the method 400 begins at step S410.
In step S410, a first start address storing structure information of a first data object and a second start address storing structure information of a second data object are acquired. In this embodiment, the first data object and the second data object are stored based on the data storage method 200 of the present invention, so that the first start address of the memory storing the structure information of the first data object and the second start address of the memory storing the structure information of the second data object can be obtained. The structure information of the first data object comprises a data start address, a data bit width and a data symbol of the first data object. The structure information of the second data object includes a data start address, a data bit width, and a data symbol of the second data object.
Then, in step S420, the data symbol of the first data object is obtained according to the first start address, and the data symbol of the second data object is obtained according to the second start address. Specifically, after the first start address is obtained, the data symbol of the first data object is located and read using the order in which the data start address, the data bit width and the data symbol of the first data object are stored in the memory and the number of bytes each of them occupies. The data symbol of the second data object is obtained in the same way, which is not described again here.
Then, in step S430, an operation is performed on the first data object and the second data object using SIMD instructions, based at least on the data symbols of the first data object and the second data object. The operation includes at least one of a comparison operation, an addition operation, a subtraction operation and a multiplication operation. When the addition, subtraction or multiplication operation is performed, the data is processed in groups of N bits, where N is the bit width of a vector register in a processor of the computing device.
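The 4 × 32-bit lane addition described above can be modelled in plain Python. This is a sketch rather than real SIMD code: the names `split_lanes` and `lane_add` are hypothetical, and returning a per-lane carry flag is an addition made here for illustration, since the paragraph above does not say how carries between lanes are handled.

```python
LANE_BITS = 32
LANES = 4
LANE_MASK = (1 << LANE_BITS) - 1


def split_lanes(group):
    """Split a 128-bit group into four 32-bit lanes, lowest lane first."""
    return [(group >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def lane_add(d_group, f_group):
    """Add d1+f1, d2+f2, d3+f3, d4+f4 independently, as one SIMD add would.

    Returns the four wrapped 32-bit sums and, as an illustrative extra,
    a 0/1 flag per lane telling whether that lane overflowed.
    """
    sums, carries = [], []
    for d_lane, f_lane in zip(split_lanes(d_group), split_lanes(f_group)):
        total = d_lane + f_lane
        sums.append(total & LANE_MASK)
        carries.append(total >> LANE_BITS)
    return sums, carries
```

In hardware the four lane additions happen in one instruction; the Python loop only mimics the result.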
According to an embodiment of the present invention, a comparison operation is performed on the first data object and the second data object, as shown in FIG. 5. Here a and b are the two data to be compared, namely the first data object and the second data object. a (0) represents bit 1 of the first data object a and b (0) represents bit 1 of the second data object b. a (n-1) represents the nth bit of the first data object a and b (n-1) represents the nth bit of the second data object b. H represents high, L represents low. VEQ is used in this embodiment as the comparison instruction to compare the first data object a and the second data object b. The comparison specifically comprises the following steps:
It is determined whether the data symbols of the first data object and the second data object are the same. If the data symbols of the first data object and the second data object are different, one of the first data object and the second data object is a positive number and the other is a negative number. Since a positive number is greater than a negative number, the comparison result of the first data object and the second data object is obtained directly.
If the data symbols of the first data object and the second data object are the same, the data bit width of the first data object is acquired according to the first start address, and the data bit width of the second data object is acquired according to the second start address. It is then determined whether the data bit widths of the first data object and the second data object are the same.
If the data bit widths of the first data object and the second data object are different, the two data objects are compared using their bit widths together with their data symbols. If both data symbols are positive, the data object with the larger bit width is the larger one. If both data symbols are negative, the data object with the smaller bit width is the larger one. For example, if both data symbols are negative and the data bit width of the first data object is less than the data bit width of the second data object, the first data object is larger than the second data object.
If the data bit widths of the first data object and the second data object are the same, the data start address of the first data object is acquired according to the first start address, and the data start address of the second data object is acquired according to the second start address.
Then, according to the data start address of the first data object and the data start address of the second data object, the data in the absolute value of the first data object and in the absolute value of the second data object is extracted sequentially from the high-order bits to the low-order bits in groups of N bits. Each time a group is extracted from each absolute value, the two groups are compared using a SIMD (single instruction multiple data) instruction, until a comparison result of the absolute value of the first data object and the absolute value of the second data object is obtained.
In this embodiment, first, a first set of data a (0) … a (n-1) in the absolute value of the first data object is extracted starting from the most significant bit, based on the data start address of the first data object, and stored in a vector register in the CPU. Likewise, a first set of data b (0) … b (n-1) in the absolute value of the second data object is extracted starting from the most significant bit, based on the data start address of the second data object, and stored in a vector register in the CPU. In this embodiment, the bit width of the first set of data a (0) … a (n-1) and of the first set of data b (0) … b (n-1) is the same as the bit width of the vector register in the CPU, for example 128 bits. It should be noted that the bit width of the groups extracted from the absolute value of the first data object and the absolute value of the second data object is not specifically limited herein; in a specific embodiment, a person skilled in the art can choose it according to actual needs.
Then, the first set of data a (0) … a (n-1) extracted from the absolute value of the first data object is compared with the first set of data b (0) … b (n-1) extracted from the absolute value of the second data object. If the m-th bit in the first set of data a (0) … a (n-1) and the m-th bit in the first set of data b (0) … b (n-1) are equal, the m-th bit of the result is 0; if they are not equal, the m-th bit of the result is 1. Thus, a first set of vector results is obtained by comparing the first set of data a (0) … a (n-1) and the first set of data b (0) … b (n-1).
Finally, each bit value in the first set of vector results is examined from high to low. As long as a non-zero value is present in the first set of vector results, indicating that the first set of data a (0) … a (n-1) is not equal to the first set of data b (0) … b (n-1), the check is ended. If the first set of data a (0) … a (n-1) is greater than the first set of data b (0) … b (n-1), the absolute value of the first data object is greater than the absolute value of the second data object, and if the first set of data a (0) … a (n-1) is less than the first set of data b (0) … b (n-1), the absolute value of the first data object is less than the absolute value of the second data object.
If each bit value in the first set of vector results is all zero, representing that the first set of data a (0) … a (n-1) and the first set of data b (0) … b (n-1) are equal, then the next set of data continues to be extracted from the absolute values of the first data object and the second data object and compared (in the same way as the first set of data a (0) … a (n-1) and the first set of data b (0) … b (n-1)) until a comparison is obtained of the absolute values of the first data object and the second data object. If a non-zero vector result does not occur until all values in the absolute values of the first data object and the second data object are traversed, it means that the absolute values of the first data object and the second data object are equal.
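The group-by-group comparison of two absolute values of equal bit width can be sketched in Python as below. This is a model under assumptions, not the VEQ instruction itself: N is taken as 128, the XOR of two groups stands in for the vector result (0 where bits are equal, 1 where they differ), and groups are aligned from the low-order end, so the highest group may be shorter than N bits.

```python
N = 128                      # assumed vector register width in bits
GROUP_MASK = (1 << N) - 1


def compare_abs(a_abs, b_abs, bit_width):
    """Compare two magnitudes of the same bit width, high group first.

    Returns 1 if a_abs > b_abs, -1 if a_abs < b_abs, and 0 if they are equal.
    """
    groups = (bit_width + N - 1) // N
    for g in reversed(range(groups)):          # from high-order to low-order
        a_group = (a_abs >> (g * N)) & GROUP_MASK
        b_group = (b_abs >> (g * N)) & GROUP_MASK
        vector_result = a_group ^ b_group      # non-zero means the groups differ
        if vector_result:
            return 1 if a_group > b_group else -1
    return 0                                   # every group matched
```

The loop stops at the first group whose vector result is non-zero, which mirrors the early exit described above.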
After the comparison result of the absolute value of the first data object and the absolute value of the second data object is obtained, the comparison result of the first data object and the second data object is determined in combination with the data signs of the first data object and the second data object. If the data signs of the first data object and the second data object are both positive, the data object with the larger absolute value is the larger one. If the data signs of the first data object and the second data object are both negative, the data object with the smaller absolute value is the larger one. The comparison result of the first data object and the second data object is then stored.
In summary, when large numbers stored according to the data storage method 200 of the present invention are compared, the symbol of each large number is obtained from the start address at which its structure information is stored, and it is determined whether the symbols are the same. If the symbols are different, the result is obtained directly. If the symbols are the same, the bit width of each large number is obtained from the start address of its structure information, and the bit widths are compared. If the bit widths are different, the result is obtained directly. If the bit widths are the same, the absolute values of the large numbers are compared using SIMD instructions to obtain the comparison result. Thus, when the invention compares large numbers, they do not need to be parsed first, which improves the efficiency of the large number comparison operation.
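The decision order summarized above (symbol first, then bit width, then absolute value) can be written as a short Python sketch. The triple layout `(sign, bit_width, magnitude)` and the helper names are hypothetical, the bit width is assumed to carry no leading zeros, and an ordinary integer comparison stands in for the SIMD group comparison of the absolute values.

```python
def to_triple(value):
    """Hypothetical helper: (sign, bit_width, magnitude), sign 0 = +, 1 = -."""
    magnitude = abs(value)
    return (1 if value < 0 else 0, max(magnitude.bit_length(), 1), magnitude)


def compare_big(a, b):
    """Return 1 if a > b, -1 if a < b, 0 if equal, for two triples."""
    a_sign, a_width, a_abs = a
    b_sign, b_width, b_abs = b
    if a_sign != b_sign:                 # different symbols: positive wins
        return 1 if a_sign == 0 else -1
    if a_width != b_width:               # same symbol, different bit widths
        abs_cmp = 1 if a_width > b_width else -1
    else:                                # stand-in for the SIMD comparison
        abs_cmp = (a_abs > b_abs) - (a_abs < b_abs)
    # For negative numbers the smaller absolute value is the larger number.
    return abs_cmp if a_sign == 0 else -abs_cmp
```

Only the last branch touches the stored absolute values; the first two are decided from the structure information alone.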
According to an embodiment of the present invention, the adding operation of the first data object and the second data object specifically includes the following steps:
It is determined whether the data symbols of the first data object and the second data object are the same. If the data symbols of the first data object and the second data object are the same, the data symbol of the first data object is taken as the data symbol of the sum of the first data object and the second data object. Since the data symbols of the first data object and the second data object are the same, the symbol of either data object can be selected as the symbol of the sum of the first data object and the second data object.
After the data symbols of the sum of the first data object and the second data object have been determined, the absolute value of the first data object and the absolute value of the second data object are added.
The data start address of the first data object is acquired according to the first start address, and the data start address of the second data object is acquired according to the second start address.
According to the data start address of the first data object and the data start address of the second data object, the data in the absolute value of the first data object and in the absolute value of the second data object is extracted sequentially from the low-order bits to the high-order bits in groups of N bits. Each time a group is extracted from each absolute value, the two groups are added using a SIMD (single instruction multiple data) instruction, until the sum of the absolute value of the first data object and the absolute value of the second data object is obtained.
The data sign of the first data object, together with the sum of the absolute value of the first data object and the absolute value of the second data object, is taken as the result of adding the first data object and the second data object. Using the data storage method 200 of the present invention, the result of the addition is stored, that is, the summation result of the first data object and the second data object is stored in the memory according to the preset storage structure.
FIG. 6 shows a schematic diagram of absolute value addition of data according to one embodiment of the invention. Here a and b are the two data to be added, namely the first data object and the second data object. a (0) represents bit 1 of the first data object a and b (0) represents bit 1 of the second data object b. a (n-1) represents the nth bit of the first data object a and b (n-1) represents the nth bit of the second data object b. H represents high, L represents low. In this embodiment, the SIMD addition instruction VADD is used to add the absolute value of the first data object a and the absolute value of the second data object b. The addition specifically comprises the following steps:
The data bit width of the first data object is acquired according to the first start address, and the data bit width of the second data object is acquired according to the second start address. The data bit width of the first data object is compared with the data bit width of the second data object, and the larger bit width is taken as the reference bit width. For example, if the data bit width of the first data object is greater than the data bit width of the second data object, the bit width of the first data object is taken as the reference bit width.
A first set of data a (0) … a (n-1) in the absolute value of the first data object is extracted from the lowest order bits based on the data start address of the first data object and stored in a vector register in the CPU. A first set of data b (0) … b (n-1) in the absolute value of the second data object is extracted from the lowest order bits and stored in a vector register in the CPU based on the data start address of the second data object.
The first group of data a (0) … a (n-1) extracted from the absolute value of the first data object and the first group of data b (0) … b (n-1) extracted from the absolute value of the second data object are added, and the result of the addition is saved as a sum vector. If adding the m-th bit in the first set of data a (0) … a (n-1) and the m-th bit in the first set of data b (0) … b (n-1) generates a carry (i.e., an addition overflow bit), the carry is denoted by the numeral 1; if no carry is generated, it is denoted by the numeral 0. Thus, adding data a (0) … a (n-1) and data b (0) … b (n-1) also yields an overflow vector.
Each bit value in the overflow vector is checked. Whenever there is a non-zero value in the overflow vector, the sum vector and the overflow vector are added using SIMD instructions to obtain a new sum vector and a new overflow vector. This step is repeated until all the bit values in the resulting overflow vector are 0.
The next group of data is then extracted from the absolute value of the first data object and the absolute value of the second data object and added in the same way, until the number of bits of all the data extracted from the absolute value of the first data object reaches the reference bit width. Since a corresponding group is extracted from the absolute value of the second data object each time a group is extracted from the absolute value of the first data object, the termination condition of the traversal may also be that the number of bits of all data extracted from the absolute value of the second data object reaches the reference bit width.
When the last addition has been performed, the overflow bit of the highest bit is checked. If the overflow bit of the highest bit is 0, the sign of the first data object, together with the sum of the absolute value of the first data object and the absolute value of the second data object, is taken as the result of adding the first data object and the second data object, and the bit width of the sum is still the reference bit width. If the overflow bit of the highest bit is 1, a one-bit value of 1 is prepended before the highest bit of the sum of the absolute value of the first data object and the absolute value of the second data object, and the obtained data, together with the sign of the first data object, is taken as the result of adding the first data object and the second data object. In this case, the bit width of the sum of the first data object and the second data object is the reference bit width plus one. The summation result of the first data object and the second data object is stored in the memory according to the preset storage structure.
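One possible reading of the sum-vector and overflow-vector procedure is the classic carry-propagation loop sketched below in Python. It is an interpretation under assumptions, not the patented VADD sequence: N is taken as 128, each carry is shifted one position toward the high-order end before it is added back (the text does not state this shift explicitly), and the carry leaving a group is passed to the next group as a hypothetical `carry_in` bit.

```python
N = 128                      # assumed vector register width in bits
MASK = (1 << N) - 1


def add_group(a_group, b_group, carry_in):
    """Add two N-bit groups plus a carry-in bit using only XOR, AND and shifts."""
    total = a_group ^ b_group                          # sum vector, carries ignored
    overflow = ((a_group & b_group) << 1) | carry_in   # carries, moved up one bit
    carry_out = 0
    while overflow:                                    # repeat until all zero
        carry_out |= overflow >> N                     # carry leaving this group
        overflow &= MASK
        total, overflow = total ^ overflow, (total & overflow) << 1
    return total, carry_out


def add_abs(a_abs, a_width, b_abs, b_width):
    """Return (sum, bit_width) of two non-negative magnitudes."""
    reference_width = max(a_width, b_width)
    groups = (reference_width + N - 1) // N
    result, carry = 0, 0
    for g in range(groups):                            # low-order group first
        a_group = (a_abs >> (g * N)) & MASK
        b_group = (b_abs >> (g * N)) & MASK
        total, carry = add_group(a_group, b_group, carry)
        result |= total << (g * N)
    result |= carry << (groups * N)                    # carry out of the top group
    return result, max(result.bit_length(), 1)
```

When the inputs carry their exact bit widths, the returned width is either the reference bit width or the reference bit width plus one, matching the two cases described above.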
If the data symbols of the first data object and the second data object are different, comparing the absolute value of the first data object with the absolute value of the second data object, and taking the data symbol of the data object with the larger absolute value of the first data object and the second data object as the data symbol of the sum of the first data object and the second data object. The SIMD instruction is used to compare the absolute value of the first data object with the absolute value of the second data object, and the specific comparison method is disclosed in detail in the above embodiment of comparison operation, and is not described herein again.
The addition operation of the first data object and the second data object is converted into a subtraction operation of the absolute value of the first data object and the absolute value of the second data object, wherein the larger of the absolute value of the first data object and the absolute value of the second data object is taken as the minuend.
The data start address of the first data object is acquired according to the first start address, and the data start address of the second data object is acquired according to the second start address.
According to the data start address of the first data object and the data start address of the second data object, the data in the absolute value of the first data object and in the absolute value of the second data object is extracted sequentially from the low-order bits to the high-order bits in groups of N bits. Each time a group is extracted from each absolute value, the two groups are subtracted using a SIMD (single instruction multiple data) instruction, until the subtraction of the absolute value of the first data object and the absolute value of the second data object is completed and the difference between the two absolute values is obtained.
The obtained sign of the sum of the first data object and the second data object, together with the difference between the absolute value of the first data object and the absolute value of the second data object, is taken as the result of adding the first data object and the second data object. The summation result of the first data object and the second data object is stored in the memory according to the preset storage structure.
As one example, the first data object is a positive number, the second data object is a negative number, and the absolute value of the first data object is greater than the absolute value of the second data object. Since the absolute value of the first data object is greater than the absolute value of the second data object, the data sign of the first data object is taken as the sign of the sum of the first data object and the second data object, i.e. the sign of the sum of the first data object and the second data object is a positive sign. The addition of the first data object and the second data object is converted into a subtraction of the absolute value of the first data object and the absolute value of the second data object, wherein the absolute value of the first data object is taken as the minuend, i.e. the absolute value of the second data object is subtracted from the absolute value of the first data object.
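The sign handling for addition can be summarized in a small Python sketch. The `(sign, magnitude)` pair and the function name are hypothetical, Python's built-in integer arithmetic stands in for the SIMD addition and subtraction of the absolute values, and returning a positive zero when the absolute values are equal is an assumption, since the text does not discuss that case.

```python
def add_signed(a, b):
    """Add two (sign, magnitude) pairs; sign 0 = positive, 1 = negative."""
    a_sign, a_abs = a
    b_sign, b_abs = b
    if a_sign == b_sign:
        # Same symbol: keep it and add the absolute values.
        return a_sign, a_abs + b_abs
    if a_abs == b_abs:
        # Assumption of this sketch: the result zero is stored as positive.
        return 0, 0
    if a_abs > b_abs:
        # The larger absolute value is the minuend and supplies the symbol.
        return a_sign, a_abs - b_abs
    return b_sign, b_abs - a_abs
```

With a positive first operand of larger absolute value, as in the example above, the result keeps the positive symbol and its magnitude is the difference of the two absolute values.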
The step of subtracting the absolute value of the second data object from the absolute value of the first data object can be seen in FIG. 7. Here a and b are the two data to be processed, namely the first data object and the second data object. a (0) represents bit 1 of the first data object a and b (0) represents bit 1 of the second data object b. a (n-1) represents the nth bit of the first data object a and b (n-1) represents the nth bit of the second data object b. H represents high, L represents low. In particular, VSUB is used to subtract the absolute value of the second data object b from the absolute value of the first data object a, as follows:
The data bit width of the first data object is acquired according to the first start address, and the data bit width of the first data object is taken as the reference bit width.
A first set of data a (0) … a (n-1) in the absolute value of the first data object is extracted from the lowest order bits based on the data start address of the first data object and stored in a vector register in the CPU. A first set of data b (0) … b (n-1) in the absolute value of the second data object is extracted from the lowest order bits and stored in a vector register in the CPU based on the data start address of the second data object.
The first set of data b (0) … b (n-1) extracted from the absolute value of the second data object is subtracted from the first set of data a (0) … a (n-1) extracted from the absolute value of the first data object, and the result of the subtraction is stored as a difference vector. If subtracting the mth bit of the first set of data b (0) … b (n-1) from the mth bit of the first set of data a (0) … a (n-1) generates a borrow (i.e., a subtraction overflow bit), the borrow is denoted by the numeral 1; if no borrow is generated, it is denoted by the numeral 0. Thus, the subtraction of the two sets of data also yields an overflow vector.
Each bit value in the overflow vector is checked. Whenever there is a non-zero value in the overflow vector, the overflow vector is subtracted from the difference vector using a SIMD instruction to get a new difference vector and a new overflow vector. This step is repeated until all the bit values in the resulting overflow vector are 0.
The next group of data is then extracted from the absolute value of the first data object and the absolute value of the second data object and subtracted in the same way, until the number of bits extracted from the absolute value of the first data object reaches the reference bit width. Finally, the difference between the absolute value of the first data object and the absolute value of the second data object is obtained, and the bit width of that difference is calculated.
The sign of the first data object and the difference between the absolute value of the first data object and the absolute value of the second data object are together taken as the result of adding the first data object and the second data object. The result is stored using the data storage method 200 of the present invention.
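The borrow handling described above can be sketched in plain C. This is only an illustrative sketch under stated assumptions, not the patented implementation: it assumes one decimal digit per 32-bit lane, four lanes per group, least-significant digit first, and an operand width that is a multiple of the group size. The names `LANES`, `BASE`, `lanes_sub`, `shift_up` and `mag_sub` are hypothetical, and the scalar loop in `lanes_sub` merely stands in for a single SIMD subtraction such as VSUB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LANES 4  /* assumed number of digits held by one vector register */
#define BASE 10  /* assumed: one decimal digit per 32-bit lane */

/* Lane-wise x - y. diff receives the digit result, borrow[m] is 1 where
 * lane m needed a borrow. This loop stands in for one SIMD subtraction. */
static void lanes_sub(int32_t *diff, int32_t *borrow,
                      const int32_t *x, const int32_t *y)
{
    for (int m = 0; m < LANES; m++) {
        int32_t d = x[m] - y[m];
        borrow[m] = d < 0;
        diff[m] = d < 0 ? d + BASE : d;
    }
}

/* Move each borrow one lane up so it is charged to the next higher digit.
 * carry_in is the borrow left over from the previous (lower) group.
 * Returns nonzero if any borrow is still pending. */
static int shift_up(int32_t *pending, const int32_t *borrow, int32_t carry_in)
{
    int any = carry_in != 0;
    pending[0] = carry_in;
    for (int m = 1; m < LANES; m++) {
        pending[m] = borrow[m - 1];
        any |= pending[m];
    }
    return any;
}

/* out = a - b on magnitudes, requiring a >= b. Both operands hold `width`
 * digits (b zero-padded), least-significant digit first. */
void mag_sub(int32_t *out, const int32_t *a, const int32_t *b, size_t width)
{
    assert(width % LANES == 0);
    int32_t borrow_in = 0;
    for (size_t g = 0; g < width; g += LANES) {
        int32_t diff[LANES], borrow[LANES], pending[LANES];
        lanes_sub(diff, borrow, a + g, b + g);
        int32_t borrow_out = borrow[LANES - 1];
        int any = shift_up(pending, borrow, borrow_in);
        while (any) {  /* repeat until the overflow vector is all zero */
            lanes_sub(diff, borrow, diff, pending);
            borrow_out |= borrow[LANES - 1];
            any = shift_up(pending, borrow, 0);
        }
        memcpy(out + g, diff, sizeof diff);
        borrow_in = borrow_out;  /* charged to the next group */
    }
}
```

In this sketch a borrow produced in the top lane of a group is carried into the lowest lane of the next group, which is one plausible reading of how successive groups are chained.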
According to an embodiment of the present invention, the subtracting operation performed on the first data object and the second data object specifically includes the following steps:
Whether the data symbols of the first data object and the second data object are the same is judged.
If the data symbols of the first data object and the second data object differ, the subtraction of the first data object and the second data object is converted into an addition of the absolute value of the first data object and the absolute value of the second data object, and the data symbol of the minuend is taken as the data symbol of the difference of the first data object and the second data object. For example, the first data object is a positive number and is the minuend, and the second data object is a negative number. The subtraction of the first data object and the second data object is converted into an addition of the absolute value of the first data object and the absolute value of the second data object, and the data sign (plus sign) of the first data object is taken as the sign of the difference of the first data object and the second data object.
The data start address of the first data object is acquired according to the first start address, and the data start address of the second data object is acquired according to the second start address.
According to the data start address of the first data object and the data start address of the second data object, data in the absolute value of the first data object and the absolute value of the second data object are extracted sequentially from the lower order to the upper order in groups of N-bit numbers. Each time a group of data is extracted from the absolute value of the first data object and the absolute value of the second data object, an addition is performed using a SIMD (single instruction multiple data) instruction, until the sum of the absolute value of the first data object and the absolute value of the second data object is obtained. The specific method for adding the absolute value of the first data object and the absolute value of the second data object is described in detail in the above embodiment of adding the first data object and the second data object, and is not repeated here.
The obtained sign of the difference and the sum of the absolute value of the first data object and the absolute value of the second data object are together taken as the result of the subtraction of the first data object and the second data object. The result is then stored in memory according to the predetermined storage structure.
If the data symbols of the first data object and the second data object are identical, the sizes of the first data object and the second data object are compared to determine the data symbol of the difference between the first data object and the second data object. The sign of the difference is positive if the minuend is greater than the subtrahend, and negative if the minuend is less than the subtrahend. A SIMD instruction is used to compare the first data object with the second data object; the specific comparison method is described in detail in the embodiment of the comparison operation and is not repeated here.
The subtraction of the first data object and the second data object is converted into a subtraction of the absolute value of the first data object and the absolute value of the second data object, in which the larger of the two absolute values is taken as the minuend. For example, if the absolute value of the first data object is smaller than the absolute value of the second data object, the absolute value of the second data object is taken as the minuend, i.e. the absolute value of the first data object is subtracted from the absolute value of the second data object.
The data start address of the first data object is acquired according to the first start address, and the data start address of the second data object is acquired according to the second start address.
According to the data start address of the first data object and the data start address of the second data object, data in the absolute value of the first data object and the absolute value of the second data object are extracted sequentially from the lower order to the upper order in groups of N-bit numbers. Each time a group of data is extracted from the absolute value of the first data object and the absolute value of the second data object, a subtraction is performed using a SIMD (single instruction multiple data) instruction, until the subtraction of the two absolute values is completed and the difference between the absolute value of the first data object and the absolute value of the second data object is obtained. The specific method for subtracting the absolute values is described in detail in the above embodiment of adding the first data object and the second data object, and is not repeated here.
The obtained sign of the difference and the difference between the absolute value of the first data object and the absolute value of the second data object are together taken as the result of the subtraction of the first data object and the second data object. The result is stored using the data storage method 200 of the present invention.
Thus, when large numbers of the same sign are added or large numbers of different signs are subtracted, the signs of the large numbers are first obtained from the start addresses at which the structure information of the large numbers is stored. The sign of the operation result is then determined according to the signs of the large numbers participating in the operation and the operator. Finally, the operation is converted into an addition of the absolute values of the large numbers, and the absolute values are added using SIMD (single instruction multiple data) instructions to obtain the operation result. When large numbers of different signs are added or large numbers of the same sign are subtracted, the signs and the data start addresses of the large numbers are first obtained from the start addresses at which the structure information of the large numbers is stored, and the absolute values of the large numbers are compared using SIMD instructions. The sign of the operation result is then determined according to the comparison result, the signs of the large numbers participating in the operation, and the operator. Finally, the operation is converted into a subtraction of the absolute values of the large numbers (with the larger absolute value as the minuend and the smaller as the subtrahend), and the absolute values are subtracted using SIMD instructions to obtain the operation result.
In summary, when large numbers stored according to the data storage method 200 of the present invention are added or subtracted, the sign of the operation result is determined according to the signs of the large numbers and the operator. The addition or subtraction of the large numbers is then converted into an addition or subtraction of their absolute values. Finally, the addition or subtraction of the absolute values is performed using SIMD instructions to obtain the operation result. Therefore, when large numbers are added or subtracted, the large numbers do not need to be parsed, and the operation is performed using SIMD instructions, which improves operation efficiency.
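The case analysis above can be condensed into a small decision routine. The following C sketch is illustrative only: the type and function names (`op_t`, `mag_op_t`, `plan_t`, `mag_cmp`, `plan_add_sub`) are hypothetical, signs are assumed to be encoded as +1 and -1, and the scalar loop in `mag_cmp` stands in for the SIMD comparison from the high order to the low order (it assumes both magnitudes have been padded to the same width, least-significant digit first).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { OP_ADD, OP_SUB } op_t;

/* Which operation to run on the absolute values. */
typedef enum {
    MAG_ADD,     /* |a| + |b| */
    MAG_SUB_AB,  /* |a| - |b|, |a| is the minuend */
    MAG_SUB_BA   /* |b| - |a|, |b| is the minuend */
} mag_op_t;

typedef struct {
    int sign;        /* sign of the result: +1 or -1 */
    mag_op_t mag_op; /* operation to perform on the absolute values */
} plan_t;

/* Compare two magnitudes of equal width, most significant digit first.
 * Returns 1 if |a| > |b|, -1 if |a| < |b|, 0 if equal. */
int mag_cmp(const int32_t *a, const int32_t *b, size_t width)
{
    for (size_t i = width; i-- > 0;) {
        if (a[i] != b[i])
            return a[i] > b[i] ? 1 : -1;
    }
    return 0;
}

/* Decide the result sign and the magnitude operation for a + b or a - b.
 * cmp is the result of comparing |a| with |b| (only used when needed). */
plan_t plan_add_sub(int sign_a, int sign_b, op_t op, int cmp)
{
    assert((sign_a == 1 || sign_a == -1) && (sign_b == 1 || sign_b == -1));
    /* a - b equals a + (-b), so subtraction flips the second sign. */
    int eff_b = (op == OP_SUB) ? -sign_b : sign_b;
    plan_t p;
    if (sign_a == eff_b) {          /* same effective sign: add magnitudes */
        p.sign = sign_a;
        p.mag_op = MAG_ADD;
    } else if (cmp >= 0) {          /* |a| >= |b|: |a| is the minuend */
        p.sign = sign_a;
        p.mag_op = MAG_SUB_AB;
    } else {                        /* |a| < |b|: |b| is the minuend */
        p.sign = eff_b;
        p.mag_op = MAG_SUB_BA;
    }
    return p;
}
```

When the two magnitudes are equal the result is zero, and this sketch simply reports the sign of the first operand; how a zero result should be signed is not specified in the text and is left as an open choice here.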
According to an embodiment of the present invention, the multiplying operation performed on the first data object and the second data object specifically includes the following steps:
The data symbol of the product of the first data object and the second data object is determined from the data symbols of the first data object and the second data object. The sign of the product is positive if the data signs of the first data object and the second data object are the same, and negative if they are different.
After the sign of the product of the first data object and the second data object is determined, the absolute value of the first data object and the absolute value of the second data object are multiplied.
The data start address of the first data object is acquired according to the first start address, and the data start address of the second data object is acquired according to the second start address.
Each bit value in the absolute value of the first data object is extracted in turn from the lower order to the upper order according to the data start address of the first data object, until all values in the absolute value of the first data object have been extracted.
Each time a value i is extracted from the absolute value of the first data object, data in the absolute value of the second data object are extracted sequentially from the lower order to the upper order in groups of N-bit numbers according to the data start address of the second data object, and each group of data is multiplied by the value i using a SIMD (single instruction multiple data) instruction, until the whole absolute value of the second data object has been multiplied by the value i, giving the product of the absolute value of the second data object and the value i.
Using a SIMD instruction, the products of each bit value of the absolute value of the first data object with the absolute value of the second data object are added with their positions aligned, giving the product of the absolute value of the first data object and the absolute value of the second data object. Position-aligned addition means that the values occupying the same position in the different products are added together. That is, the products are aligned by position before they are added.
For example, the lowest position of the product of the lowest value (i.e., the ones digit) of the absolute value of the first data object and the absolute value of the second data object is the ones position. The lowest position of the product of the next lowest value (i.e., the tens digit) of the absolute value of the first data object and the absolute value of the second data object is the tens position. Therefore, when these two products are added, the lowest position of the second product is aligned with the tens position of the first product, and then the addition is performed.
The obtained sign of the product and the product of the absolute value of the first data object and the absolute value of the second data object are together taken as the result of multiplying the first data object by the second data object. The result is stored using the data storage method 200 of the present invention.
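As a small worked illustration of the alignment (the operands are chosen here purely for illustration and do not come from the original text), suppose the absolute value of the first data object is 45 and the absolute value of the second data object is 123. Each digit of 45 is multiplied by 123, and the second partial product is shifted by one position before the addition:

```latex
\begin{aligned}
123 \times 5 &= 615 && \text{(ones digit of 45, lowest position is the ones position)}\\
123 \times 4 &= 492 && \text{(tens digit of 45, lowest position is the tens position)}\\
123 \times 45 &= 615 + 492 \times 10 = 615 + 4920 = 5535
\end{aligned}
```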
FIG. 8 shows a schematic diagram of data multiplication according to one embodiment of the invention. Where a and b are two data to be multiplied, namely a first data object and a second data object. a (0) represents bit 1 of the first data object a and b (0) represents bit 1 of the second data object b. a (n-1) represents the nth bit of the first data object a and b (n-1) represents the nth bit of the second data object b. H represents high, L represents low. In this embodiment, the SIMD multiplication instruction VMUL is used to multiply the first data object a and the second data object b, specifically including the following steps:
The sign of the product of the first data object and the second data object is determined from the signs of the two.
The data starting address of the first data object is acquired according to the first starting address. A first bit value a (0) of the absolute value of the first data object is extracted from the lowest bit based on the data start address of the first data object, and the value a (0) is stored in a temporary vector register in the CPU.
The data starting address of the second data object is acquired according to the second starting address. A first set of data b (0) … b (n-1) in the absolute value of the second data object is extracted from the lowest order bits based on the data start address of the second data object, and the first set of data b (0) … b (n-1) is stored in a temporary vector register in the CPU.
The first set of data b (0) … b (n-1) extracted from the absolute value of the second data object is multiplied by the value a (0) using SIMD instructions.
The next group of data is then extracted from the absolute value of the second data object and multiplied by the value a (0), and so on until the number of bits extracted from the absolute value of the second data object reaches the bit width of the second data object. This gives the product of the absolute value of the second data object and the value a (0), namely the product of the absolute value of the second data object and the lowest bit value in the absolute value of the first data object. The product of the absolute value of the second data object and the value a (0) is stored in a temporary vector register in the CPU.
The next bit of data is then extracted from the absolute value of the first data object and multiplied by the absolute value of the second data object, and so on until the number of bits extracted from the absolute value of the first data object reaches the bit width of the first data object. This gives the product of the absolute value of the second data object and each bit value in the absolute value of the first data object. These products are stored in temporary vector registers in the CPU. In this step, the method for multiplying each bit value in the absolute value of the first data object by the absolute value of the second data object is the same as the method for multiplying the value a (0) by the absolute value of the second data object, and is not repeated here.
Using the SIMD instruction, the product of each bit value in the absolute value of the first data object and the absolute value of the second data object is bit-added (see the schematic diagram of adding vector product 0 and vector product 1 in fig. 8), the product of the absolute value of the first data object and the absolute value of the second data object is obtained, and the bit width of the product of the absolute value of the first data object and the absolute value of the second data object is calculated. The method for performing the bit-wise addition of the product of each bit value in the absolute value of the first data object and the absolute value of the second data object by using the SIMD instruction is disclosed in detail in the above-mentioned embodiment of the addition operation, and is not described herein again.
The sign of the obtained product of the first data object and the second data object, the product of the absolute value of the first data object and the absolute value of the second data object are taken as the product of the first data object and the second data object. The result of multiplying the first data object by the second data object is stored using the data storage method 200 of the present invention.
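The multiplication steps above can be sketched in scalar C as follows. This is a simplified illustration under stated assumptions rather than the patented implementation: it assumes one decimal digit per 32-bit lane, four lanes per group, least-significant digit first, and a second operand whose width is a multiple of the group size. The names `LANES`, `BASE`, `mul_sign` and `mag_mul` are hypothetical. The inner lane loop stands in for one SIMD multiplication such as VMUL, and carries are deferred to a single normalization pass at the end, which is a simplification of the carry handling described in the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 4  /* assumed number of digits held by one vector register */
#define BASE 10  /* assumed: one decimal digit per 32-bit lane */

/* Sign of the product: positive for equal signs, negative otherwise. */
int mul_sign(int sign_a, int sign_b)
{
    return sign_a == sign_b ? 1 : -1;
}

/* out = a * b on magnitudes. a has wa digits, b has wb digits (wb a
 * multiple of LANES, zero-padded), out must hold wa + wb digits.
 * Deferred carries stay within int32_t for operands of practical size,
 * since each position accumulates at most min(wa, wb) * 81. */
void mag_mul(int32_t *out, const int32_t *a, size_t wa,
             const int32_t *b, size_t wb)
{
    assert(wb % LANES == 0);
    for (size_t k = 0; k < wa + wb; k++)
        out[k] = 0;

    for (size_t j = 0; j < wa; j++) {            /* one digit of a at a time */
        for (size_t g = 0; g < wb; g += LANES) { /* one group of b at a time */
            int32_t prod[LANES];
            for (int m = 0; m < LANES; m++)      /* stands in for SIMD multiply */
                prod[m] = b[g + m] * a[j];
            for (int m = 0; m < LANES; m++)      /* add aligned at position j */
                out[j + g + m] += prod[m];
        }
    }

    /* Normalize every position back to a single digit. */
    int32_t carry = 0;
    for (size_t k = 0; k < wa + wb; k++) {
        int32_t v = out[k] + carry;
        out[k] = v % BASE;
        carry = v / BASE;
    }
    assert(carry == 0);  /* the product always fits in wa + wb digits */
}
```

The offset `j` in `out[j + g + m]` is what aligns each partial product with the position of the digit of `a` that produced it.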
An exemplary code for a large number multiplication operation, according to one embodiment of the invention, is as follows:
// storing large numbers according to a predetermined storage structure
typedef struct bn
{
int *data;
int width;
int signal;
} bn;
void getmemory(int width)
{
bn *bignum = malloc(sizeof(bn));
bignum->data = malloc(4 * width); // 4 bytes per int element
}
// calculating the sign of the product
void getsignal(bn *output)
{
output-> signal = 1;
}
// extracting multiplicand, storing in vector register, and performing shift multiplication
void mul_bn()
{
v4i32 input_a, input_b; //v4i32 = vector * 4 (int32)
for(int j=0;j<bn_b->width;j++)
{
for(int i=0;i<bn_a->width/4; i++)
{
simd_load(input_a, bn_a->data); //input_a = {bn_a[0], bn_a[1], bn_a[2], bn_a[3]}
simd_cpy(input_b, bn_b->data[0]); //input_b = {bn_b[0], bn_b[0], bn_b[0], bn_b[0]}
simd_mul(temp[i], input_a, input_b); //temp[i] = input_a[i] * input_b[i]
v4i32 carry[i] = getcarry; //get carry of multiply
bn_a->data += 4;
}
bn_b->data++;
}
}
// adding the obtained intermediate results according to the memory alignment position and simultaneously adding the carry
void sum_bn()
{
v4i32 output;
for(int i = 0; i < (bn_a->width + bn_b->width) / 4; i++)
{
simd_shift_add(output, temp[i], carry[i], shift);
output += 1;
}
}
/*
+ temp[0]
+ temp[1]
+ temp[i]
*/
// calculating the number of bits of the multiplication result
void get_width(bn *output)
{
output->width = get_width(output->bn); //traverse the width until no data available
}
In summary, a large number is stored according to the data storage method 200 of the present invention, and when a multiplication operation is performed on large numbers, the sign of the operation result is determined according to the signs of the large numbers. The absolute values of the large numbers are then multiplied using SIMD instructions to obtain the operation result. Therefore, when large numbers are multiplied, they do not need to be parsed, and the multiplication is performed using SIMD instructions, which improves the efficiency of the multiplication.
According to the data storage method of the present invention, when a large number to be stored is received, the bit width and the sign of the large number are first obtained. The value of the large number is then stored in a contiguous segment of memory. Finally, the start address of the memory storing the value of the large number, together with the bit width and the sign of the large number, is stored in another contiguous storage space of the memory. Therefore, when the large number is processed, its sign, bit width and actual value can be extracted directly from the memory without parsing the large number, which improves the efficiency of processing the large number. On this basis, the invention uses SIMD instructions to process the large number, further improving processing efficiency.
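The storage steps summarized above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented layout: it assumes the large number arrives as a decimal string, stores one decimal digit per 32-bit element with the least-significant digit first, and encodes the sign as +1 or -1. The names `bignum_t`, `bignum_from_str` and `bignum_free` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Structure information, kept in its own contiguous block of memory. */
typedef struct {
    int32_t *data; /* start address of the absolute value (digit array) */
    size_t width;  /* bit width, here the number of decimal digits */
    int sign;      /* +1 or -1 */
} bignum_t;

/* Build a stored large number from text such as "-12345".
 * Returns NULL on malformed input or allocation failure. */
bignum_t *bignum_from_str(const char *s)
{
    assert(s != NULL);
    int sign = 1;
    if (*s == '-' || *s == '+') {       /* obtain the sign first */
        sign = (*s == '-') ? -1 : 1;
        s++;
    }
    size_t n = strlen(s);               /* then the bit width */
    if (n == 0)
        return NULL;
    for (size_t i = 0; i < n; i++) {
        if (s[i] < '0' || s[i] > '9')
            return NULL;
    }

    bignum_t *x = malloc(sizeof *x);    /* block for structure information */
    if (x == NULL)
        return NULL;
    x->data = malloc(n * sizeof *x->data); /* contiguous block for the value */
    if (x->data == NULL) {
        free(x);
        return NULL;
    }
    for (size_t i = 0; i < n; i++)      /* least-significant digit first */
        x->data[i] = s[n - 1 - i] - '0';
    x->width = n;
    x->sign = sign;
    return x;
}

/* Release both memory blocks. */
void bignum_free(bignum_t *x)
{
    if (x != NULL) {
        free(x->data);
        free(x);
    }
}
```

Once a number has been stored this way, later operations can read the sign, width and digits directly through the structure pointer instead of re-reading the original text.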
The various techniques described herein may be implemented in connection with hardware or software or, alternatively, with a combination of both. Thus, the methods and apparatus of the present invention, or certain aspects or portions thereof, may take the form of program code (i.e., instructions) embodied in tangible media, such as removable hard drives, USB flash drives, floppy disks, CD-ROMs, or any other machine-readable storage medium, wherein, when the program is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention.
In the case of program code execution on programmable computers, the computing device will generally include a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. Wherein the memory is configured to store program code; the processor is configured to execute the data processing method of the present invention according to instructions in the program code stored in the memory.
By way of example, and not limitation, readable media may comprise readable storage media and communication media. Readable storage media store information such as computer readable instructions, data structures, program modules or other data. Communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. Combinations of any of the above are also included within the scope of readable media.
In the description provided herein, algorithms and displays are not inherently related to any particular computer, virtual system, or other apparatus. Various general purpose systems may also be used with examples of this invention. The required structure for constructing such a system will be apparent from the description above. Moreover, the present invention is not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
It should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules or units or components of the devices in the examples disclosed herein may be arranged in a device as described in this embodiment or alternatively may be located in one or more devices different from the devices in this example. The modules in the foregoing examples may be combined into one module or may be further divided into multiple sub-modules.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features included in other embodiments, rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
Furthermore, some of the described embodiments are described herein as a method or combination of method elements that can be performed by a processor of a computer system or by other means of performing the described functions. A processor having the necessary instructions for carrying out the method or method elements thus forms a means for carrying out the method or method elements. Further, the elements of the apparatus embodiments described herein are examples of the following apparatus: the apparatus is used to implement the functions performed by the elements for the purpose of carrying out the invention.
As used herein, unless otherwise specified the use of the ordinal adjectives "first", "second", "third", etc., to describe a common object, merely indicate that different instances of like objects are being referred to, and are not intended to imply that the objects so described must be in a given sequence, either temporally, spatially, in ranking, or in any other manner.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of this description, will appreciate that other embodiments can be devised which do not depart from the scope of the invention as described herein. Furthermore, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. The present invention has been disclosed in an illustrative rather than a restrictive sense, and the scope of the present invention is defined by the appended claims.

Claims (10)

1. A data processing method, adapted to be executed in a computing device, wherein a data object of a numerical type is stored in a memory of the computing device according to a predetermined storage structure, structure information of the predetermined storage structure includes a data start address, a data bit width, and a data symbol of the data object, the data start address is a start address of an absolute value of the data object in the memory, and the data object includes a first data object and a second data object to be processed, the method includes:
acquiring a first start address storing structure information of a first data object and a second start address storing structure information of a second data object;
acquiring the data symbol of the first data object according to the first starting address, and acquiring the data symbol of the second data object according to the second starting address;
performing an operation on the first data object and the second data object using a SIMD instruction based on at least data symbols of the first data object and the second data object, the operation including at least one of a comparison operation, an addition operation, a subtraction operation, and a multiplication operation, wherein the operation is performed in groups of N-bit numbers when performing the addition operation, the subtraction operation, or the multiplication operation, and N is a bit width of a vector register in a processor of the computing device.
2. The method of claim 1, wherein the step of performing a comparison operation on the first data object and the second data object comprises:
judging whether the data signs of the first data object and the second data object are the same;
if the data signs of the first data object and the second data object are the same, acquiring the data bit width of the first data object according to the first starting address, and acquiring the data bit width of the second data object according to the second starting address;
if the data bit width of the first data object is the same as that of the second data object, acquiring a data starting address of the first data object according to the first starting address, and acquiring a data starting address of the second data object according to the second starting address;
and, according to the data starting addresses of the first data object and the second data object, sequentially extracting data from the absolute value of the first data object and the absolute value of the second data object from the high-order end to the low-order end in groups of N-bit numbers, wherein each time one group of data is extracted from each absolute value, a comparison operation is performed on the two groups using a SIMD (single instruction multiple data) instruction, until a comparison result of the absolute value of the first data object and the absolute value of the second data object is obtained.
3. The method of claim 2, wherein said step of performing a comparison operation on the first data object and the second data object further comprises:
if the data signs of the first data object and the second data object are different, determining which of the first data object and the second data object is larger;
and if the data bit widths of the first data object and the second data object are different, determining which of the first data object and the second data object is larger in combination with the data signs of the first data object and the second data object.
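The comparison flow of claims 2 and 3 can be sketched as follows. This is an assumption-laden illustration: `compare`, `compare_magnitude` and the `lanes` parameter are hypothetical names, absolute values are taken to be lists of decimal digits stored least significant digit first without leading zeros, and an ordinary Python list comparison stands in for the SIMD compare of one group.

```python
def compare_magnitude(a, b, lanes=4):
    """Compare two equal-length absolute values, one group at a time, high to low.

    Returns 1 if a > b, -1 if a < b, 0 if equal.
    """
    for end in range(len(a), 0, -lanes):
        start = max(0, end - lanes)
        group_a = a[start:end][::-1]  # one group, most significant digit first
        group_b = b[start:end][::-1]
        if group_a != group_b:        # stand-in for a vector compare
            return 1 if group_a > group_b else -1
    return 0


def compare(sign_a, a, sign_b, b, lanes=4):
    """Return 1, -1 or 0 for first > second, first < second, or equal."""
    if sign_a != sign_b:
        # Different signs: the non-negative object is the larger one.
        return 1 if sign_a > sign_b else -1
    if len(a) != len(b):
        # Same sign, different widths: the wider object has the larger magnitude.
        magnitude_order = 1 if len(a) > len(b) else -1
    else:
        magnitude_order = compare_magnitude(a, b, lanes)
    # For two negative objects the larger magnitude is the smaller value.
    return magnitude_order * sign_a
```

Only the last branch touches the stored digits; a sign or width mismatch settles the comparison from the structure information alone.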
4. The method of claim 1, wherein the step of adding the first data object and the second data object comprises:
judging whether the data signs of the first data object and the second data object are the same;
if the data signs of the first data object and the second data object are the same, taking the data sign of the first data object as the data sign of the sum of the first data object and the second data object;
acquiring a data starting address of the first data object according to the first starting address, and acquiring a data starting address of the second data object according to the second starting address;
sequentially extracting data in the absolute value of the first data object and the absolute value of the second data object from the lower order to the upper order by taking N-bit numbers as a group according to the data starting addresses of the first data object and the second data object, performing addition operation by using a SIMD (single instruction multiple data) instruction every time one group of data is extracted from the absolute value of the first data object and the absolute value of the second data object, and obtaining the sum of the absolute value of the first data object and the absolute value of the second data object;
and storing the summation result of the first data object and the second data object into a memory according to the preset storage structure.
5. The method of claim 4, wherein said step of adding said first data object and said second data object further comprises:
if the data signs of the first data object and the second data object are different, comparing the absolute value of the first data object with the absolute value of the second data object, and taking the data sign of the data object with the larger absolute value of the first data object and the second data object as the data sign of the sum of the first data object and the second data object;
converting the addition operation of the first data object and the second data object into a subtraction operation of the absolute value of the first data object and the absolute value of the second data object, wherein the greater of the absolute value of the first data object and the absolute value of the second data object is taken as the minuend;
acquiring a data starting address of the first data object according to the first starting address, and acquiring a data starting address of the second data object according to the second starting address;
sequentially extracting data in the absolute value of the first data object and the absolute value of the second data object from the lower order to the upper order by taking N-bit numbers as a group according to the data starting addresses of the first data object and the second data object, performing subtraction operation by using a SIMD (single instruction multiple data) instruction every time one group of data is extracted from the absolute value of the first data object and the absolute value of the second data object, and obtaining the difference between the absolute value of the first data object and the absolute value of the second data object until the subtraction of the absolute value of the first data object and the absolute value of the second data object is completed;
and storing the summation result of the first data object and the second data object into a memory according to the preset storage structure.
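The addition flow of claims 4 and 5 can be sketched as follows, under stated assumptions: absolute values are lists of decimal digits stored least significant digit first, `lanes` is a hypothetical stand-in for the number of elements that fit in an N-bit vector register, and the list comprehensions stand in for the SIMD add and subtract of one group. All function names are illustrative only.

```python
def add_magnitudes(a, b, lanes=4):
    """Add two absolute values group by group, from low order to high order."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    out, carry = [], 0
    for start in range(0, n, lanes):
        # Lane-wise add of one group (what a vector add would produce).
        sums = [x + y for x, y in zip(a[start:start + lanes], b[start:start + lanes])]
        for s in sums:                 # carry propagation across the lanes
            s += carry
            out.append(s % 10)
            carry = s // 10
    if carry:
        out.append(carry)
    return out


def sub_magnitudes(big, small, lanes=4):
    """Subtract small from big (big is the minuend and must not be smaller)."""
    small = small + [0] * (len(big) - len(small))
    out, borrow = [], 0
    for start in range(0, len(big), lanes):
        diffs = [x - y for x, y in zip(big[start:start + lanes], small[start:start + lanes])]
        for d in diffs:                # borrow propagation across the lanes
            d -= borrow
            borrow = 1 if d < 0 else 0
            out.append(d + 10 * borrow)
    while len(out) > 1 and out[-1] == 0:
        out.pop()                      # drop leading zeros
    return out


def magnitude_key(digits):
    """Sort key: width first, then digits from high order to low order."""
    return (len(digits), digits[::-1])


def add(sign_a, a, sign_b, b, lanes=4):
    """Return (sign, digits) of the sum of two sign/absolute-value pairs."""
    if sign_a == sign_b:
        return sign_a, add_magnitudes(a, b, lanes)
    if magnitude_key(a) == magnitude_key(b):
        return 1, [0]                  # zero is stored with a positive sign here
    if magnitude_key(a) > magnitude_key(b):
        return sign_a, sub_magnitudes(a, b, lanes)
    return sign_b, sub_magnitudes(b, a, lanes)
```

Using the larger absolute value as the minuend keeps every intermediate difference non-negative, so the sign of the sum can be fixed before any digits are processed.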
6. The method of claim 1, wherein the step of subtracting the first data object and the second data object comprises:
judging whether the data signs of the first data object and the second data object are the same;
if the data signs of the first data object and the second data object are different, converting the subtraction operation of the first data object and the second data object into an addition operation of the absolute value of the first data object and the absolute value of the second data object, and taking the data sign of the minuend as the data sign of the difference of the first data object and the second data object;
acquiring a data starting address of the first data object according to the first starting address, and acquiring a data starting address of the second data object according to the second starting address;
sequentially extracting data in the absolute value of the first data object and the absolute value of the second data object from the lower order to the upper order by taking N-bit numbers as a group according to the data starting addresses of the first data object and the second data object, performing addition operation by using a SIMD (single instruction multiple data) instruction every time one group of data is extracted from the absolute value of the first data object and the absolute value of the second data object, and obtaining the sum of the absolute value of the first data object and the absolute value of the second data object;
and storing the difference result of the first data object and the second data object into a memory according to the preset storage structure.
7. The method of claim 6, wherein the step of subtracting the first data object and the second data object further comprises:
if the data signs of the first data object and the second data object are the same, comparing the magnitudes of the first data object and the second data object, and determining the data sign of the difference of the first data object and the second data object;
converting the subtraction operation of the first data object and the second data object into a subtraction operation of the absolute value of the first data object and the absolute value of the second data object, wherein the greater of the absolute value of the first data object and the absolute value of the second data object is taken as the minuend;
acquiring a data starting address of the first data object according to the first starting address, and acquiring a data starting address of the second data object according to the second starting address;
sequentially extracting data in the absolute value of the first data object and the absolute value of the second data object from the lower order to the upper order by taking N-bit numbers as a group according to the data starting addresses of the first data object and the second data object, performing subtraction operation by using a SIMD (single instruction multiple data) instruction every time one group of data is extracted from the absolute value of the first data object and the absolute value of the second data object, and obtaining the difference between the absolute value of the first data object and the absolute value of the second data object until the subtraction of the absolute value of the first data object and the absolute value of the second data object is completed;
and storing the difference result of the first data object and the second data object into a memory according to the preset storage structure.
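The subtraction flow of claims 6 and 7 can be sketched as follows. This is a hedged illustration rather than the claimed implementation: absolute values are assumed to be lists of decimal digits stored least significant digit first, `lanes` is a hypothetical stand-in for the vector register capacity, and `combine_magnitudes` and `subtract` are illustrative names. A single helper serves both the add and subtract paths, which is a choice made for brevity here and is not taken from the claims.

```python
def combine_magnitudes(a, b, subtract_mode, lanes=4):
    """Add (or subtract b from a) group by group, from low order to high order.

    In subtract mode, a must be the larger absolute value (the minuend).
    """
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    out, carry = [], 0
    for start in range(0, n, lanes):
        group_a = a[start:start + lanes]
        group_b = b[start:start + lanes]
        # Lane-wise operation on one group (stand-in for a vector add/sub).
        raw = [x - y if subtract_mode else x + y for x, y in zip(group_a, group_b)]
        for value in raw:
            # divmod floors, so a borrow shows up as carry == -1.
            carry, digit = divmod(value + carry, 10)
            out.append(digit)
    if carry > 0:
        out.append(carry)
    while len(out) > 1 and out[-1] == 0:
        out.pop()                      # drop leading zeros
    return out


def subtract(sign_a, a, sign_b, b, lanes=4):
    """Return (sign, digits) of first minus second."""
    if sign_a != sign_b:
        # Different signs: add the absolute values, keep the sign of the minuend.
        return sign_a, combine_magnitudes(a, b, False, lanes)
    key_a, key_b = (len(a), a[::-1]), (len(b), b[::-1])
    if key_a == key_b:
        return 1, [0]                  # zero is stored with a positive sign here
    if key_a > key_b:
        return sign_a, combine_magnitudes(a, b, True, lanes)
    # The second object has the larger absolute value: swap and flip the sign.
    return -sign_a, combine_magnitudes(b, a, True, lanes)
```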
8. The method of claim 1, wherein the step of multiplying the first data object and the second data object comprises:
determining the data sign of the product of the first data object and the second data object from the data signs of the first data object and the second data object;
acquiring a data starting address of the first data object according to the first starting address, and acquiring a data starting address of the second data object according to the second starting address;
sequentially extracting each numerical value in the absolute values of the first data object from a low order to a high order according to the data starting address of the first data object until all numerical values in the absolute values of the first data object are extracted;
each time a numerical value i is extracted from the absolute value of the first data object, sequentially extracting data from the absolute value of the second data object from the low-order end to the high-order end in groups of N-bit numbers according to the data starting address of the second data object, and multiplying each extracted group by the numerical value i using a SIMD (single instruction multiple data) instruction, until the whole absolute value of the second data object has been multiplied by the numerical value i, to obtain the product of the absolute value of the second data object and the numerical value i;
using a SIMD instruction, adding together, aligned by position, the products of the absolute value of the second data object with each numerical value in the absolute value of the first data object, to obtain the product of the absolute value of the first data object and the absolute value of the second data object;
and storing the product result of the first data object and the second data object into a memory according to the preset storage structure.
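The multiplication flow of claim 8 can be sketched as follows. The sketch assumes that absolute values are lists of decimal digits stored least significant digit first; `lanes` is a hypothetical stand-in for the vector register capacity, and the list comprehension stands in for the SIMD multiply of one group by the value i. Function names are illustrative only.

```python
def multiply_magnitudes(a, b, lanes=4):
    """Schoolbook product of two absolute values, processed group by group."""
    acc = [0] * (len(a) + len(b))              # room for every partial product
    for pos, digit in enumerate(a):            # each value i of the first object, low to high
        for start in range(0, len(b), lanes):  # one group of the second object at a time
            # Lane-wise multiply of the group by i (stand-in for a vector multiply).
            partial = [digit * y for y in b[start:start + lanes]]
            for offset, p in enumerate(partial):
                acc[pos + start + offset] += p  # position-aligned accumulation
    carry = 0
    for k in range(len(acc)):                  # single carry pass at the end
        carry, acc[k] = divmod(acc[k] + carry, 10)
    while len(acc) > 1 and acc[-1] == 0:
        acc.pop()                              # drop leading zeros
    return acc


def multiply(sign_a, a, sign_b, b, lanes=4):
    """Return (sign, digits) of the product of two sign/absolute-value pairs."""
    magnitude = multiply_magnitudes(a, b, lanes)
    # Same signs give a positive product; zero is stored as positive in this sketch.
    sign = 1 if magnitude == [0] or sign_a == sign_b else -1
    return sign, magnitude
```

Deferring carry propagation to one final pass keeps the inner loop free of cross-lane dependencies, which is the property a vectorised version would rely on.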
9. A computing device, comprising:
at least one processor; and
a memory storing program instructions, wherein the program instructions are configured to be executed by the at least one processor, the program instructions comprising instructions for performing the method of any of claims 1-8.
10. A readable storage medium storing program instructions that, when read and executed by a computing device, cause the computing device to perform the method of any of claims 1-8.
CN202110000639.XA 2021-01-04 2021-01-04 Data processing method, computing device and readable storage medium Active CN112328511B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110000639.XA CN112328511B (en) 2021-01-04 2021-01-04 Data processing method, computing device and readable storage medium
CN202110367882.5A CN113064841B (en) 2021-01-04 2021-01-04 Data storage method, processing method, computing device and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110000639.XA CN112328511B (en) 2021-01-04 2021-01-04 Data processing method, computing device and readable storage medium

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202110367882.5A Division CN113064841B (en) 2021-01-04 2021-01-04 Data storage method, processing method, computing device and readable storage medium

Publications (2)

Publication Number Publication Date
CN112328511A (en) 2021-02-05
CN112328511B (en) 2021-05-04

Family

ID=74302394

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202110000639.XA Active CN112328511B (en) 2021-01-04 2021-01-04 Data processing method, computing device and readable storage medium
CN202110367882.5A Active CN113064841B (en) 2021-01-04 2021-01-04 Data storage method, processing method, computing device and readable storage medium

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202110367882.5A Active CN113064841B (en) 2021-01-04 2021-01-04 Data storage method, processing method, computing device and readable storage medium

Country Status (1)

Country Link
CN (2) CN112328511B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1174353A (en) * 1996-08-19 1998-02-25 三星电子株式会社 Single-instruction-multiple-data processing using multiple banks of vector registers
CN1842779A (en) * 2003-09-08 2006-10-04 飞思卡尔半导体公司 Data processing system for implementing SIMD operations and method thereof
CN101493760A (en) * 2008-12-24 2009-07-29 京信通信系统(中国)有限公司 High speed divider and method thereof for implementing high speed division arithmetic
CN103593159A (en) * 2012-08-14 2014-02-19 重庆重邮信科通信技术有限公司 High efficiency high accuracy division implementation method and device
WO2014210363A1 (en) * 2013-06-28 2014-12-31 Intel Corporation Multiple register memory access instructions, processors, methods, and systems
CN104598197A (en) * 2015-01-26 2015-05-06 中国科学院自动化研究所 Operation method for reciprocal value and/or reciprocal square root of floating-point number and operation device
CN105808206A (en) * 2016-03-04 2016-07-27 广州海格通信集团股份有限公司 Method and system for realizing multiplication on the basis of RAM (Random Access Memory)
CN108365849A (en) * 2018-01-10 2018-08-03 东南大学 The long LDPC code coding/decoding method of multi code Rate of Chinese character multi-code based on SIMD instruction collection
CN109582231A (en) * 2018-11-21 2019-04-05 金色熊猫有限公司 Date storage method, device, electronic equipment and storage medium
CN110084361A (en) * 2017-10-30 2019-08-02 上海寒武纪信息科技有限公司 A kind of arithmetic unit and method
CN110880038A (en) * 2019-11-29 2020-03-13 中国科学院自动化研究所 System for accelerating convolution calculation based on FPGA and convolution neural network

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7127593B2 (en) * 2001-06-11 2006-10-24 Broadcom Corporation Conditional execution with multiple destination stores
CN103577153A (en) * 2012-07-28 2014-02-12 王妍丹 Quick great number modulus solving method suitable for embedded system
CN105701200B (en) * 2016-01-12 2019-08-20 中国人民大学 A kind of Data Warehouse Security OLAP method on memory cloud computing platform
CN109600795B (en) * 2017-09-30 2021-09-28 智邦科技股份有限公司 Processing method of A-MSDU subframe and wireless network access device
CN111752955A (en) * 2020-06-29 2020-10-09 深圳前海微众银行股份有限公司 Data processing method, device, equipment and computer readable storage medium

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1174353A (en) * 1996-08-19 1998-02-25 三星电子株式会社 Single-instruction-multiple-data processing using multiple banks of vector registers
CN1842779A (en) * 2003-09-08 2006-10-04 飞思卡尔半导体公司 Data processing system for implementing SIMD operations and method thereof
CN101493760A (en) * 2008-12-24 2009-07-29 京信通信系统(中国)有限公司 High speed divider and method thereof for implementing high speed division arithmetic
CN103593159A (en) * 2012-08-14 2014-02-19 重庆重邮信科通信技术有限公司 High efficiency high accuracy division implementation method and device
WO2014210363A1 (en) * 2013-06-28 2014-12-31 Intel Corporation Multiple register memory access instructions, processors, methods, and systems
CN105247477A (en) * 2013-06-28 2016-01-13 英特尔公司 Multiple register memory access instructions, processors, methods, and systems
CN104598197A (en) * 2015-01-26 2015-05-06 中国科学院自动化研究所 Operation method for reciprocal value and/or reciprocal square root of floating-point number and operation device
CN105808206A (en) * 2016-03-04 2016-07-27 广州海格通信集团股份有限公司 Method and system for realizing multiplication on the basis of RAM (Random Access Memory)
CN110084361A (en) * 2017-10-30 2019-08-02 上海寒武纪信息科技有限公司 A kind of arithmetic unit and method
CN108365849A (en) * 2018-01-10 2018-08-03 东南大学 The long LDPC code coding/decoding method of multi code Rate of Chinese character multi-code based on SIMD instruction collection
CN109582231A (en) * 2018-11-21 2019-04-05 金色熊猫有限公司 Date storage method, device, electronic equipment and storage medium
CN110880038A (en) * 2019-11-29 2020-03-13 中国科学院自动化研究所 System for accelerating convolution calculation based on FPGA and convolution neural network

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
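"Fast Algorithms for Large-Number Multiplication and Real-Number Multiplication"; Chen Ting et al.; Journal of Nanjing University of Posts and Telecommunications (Natural Science Edition); 2019-03-31; Vol. 39, No. 01; full text *
"Algorithm Description of Large-Number Arithmetic"; Tan Zhenjiang et al.; Journal of Jilin Normal University (Natural Science Edition); 2019-08-31; Vol. 40, No. 03; full text *
"Research on Implementation Methods of Security Technology for E-Commerce Transaction Systems: Four Arithmetic Operations on Real Numbers of Arbitrary Length"; Zhong Hongshan; Digital Technology and Application; 2013-11-30; full text *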

Also Published As

Publication number Publication date
CN112328511A (en) 2021-02-05
CN113064841A (en) 2021-07-02
CN113064841B (en) 2023-06-06

Similar Documents

Publication Publication Date Title
CN110036368B (en) Apparatus and method for performing arithmetic operations to accumulate floating point numbers
EP3651017B1 (en) Systems and methods for performing 16-bit floating-point matrix dot product instructions
RU2263947C2 (en) Integer-valued high order multiplication with truncation and shift in architecture with one commands flow and multiple data flows
EP3719639B1 (en) Systems and methods to perform floating-point addition with selected rounding
TWI635446B (en) Weight-shifting appratus, method, system and machine accessible storage medium
US7430578B2 (en) Method and apparatus for performing multiply-add operations on packed byte data
US20040122887A1 (en) Efficient multiplication of small matrices using SIMD registers
US7395298B2 (en) Method and apparatus for performing multiply-add operations on packed data
JP7481069B2 (en) System and method for performing chained tile operations - Patents.com
JP2024038122A (en) Apparatus, method, and system for instruction of matrix operation accelerator
EP3623941A2 (en) Systems and methods for performing instructions specifying ternary tile logic operations
JP5201641B2 (en) SIMD inner product operation using duplicate operands
EP3314407B1 (en) Methods, apparatus, instructions and logic to provide vector packed histogram functionality
JP2006107463A (en) Apparatus for performing multiply-add operations on packed data
EP3623940A2 (en) Systems and methods for performing horizontal tile operations
US20210389948A1 (en) Mixed-element-size instruction
US9766886B2 (en) Instruction and logic to provide vector linear interpolation functionality
EP4020169A1 (en) Apparatuses, methods, and systems for 8-bit floating-point matrix dot product instructions
US7644115B2 (en) System and methods for large-radix computer processing
WO2018138469A1 (en) An apparatus and method for processing input operand values
CN111124495B (en) Data processing method, decoding circuit and processor
CN112328511B (en) Data processing method, computing device and readable storage medium
EP3716050B1 (en) Using fuzzy-jbit location of floating-point multiply-accumulate results
US20050154773A1 (en) Data processing apparatus and method for performing data processing operations on floating point data elements
TW202333041A (en) System and method performing floating-point operations

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant