CN112506591A - Character string copying method and device, readable storage medium and computing device - Google Patents
- Publication number
- CN112506591A (application CN202110162279.3A)
- Authority
- CN
- China
- Prior art keywords
- character string
- bit
- copying
- value
- copy
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/448—Execution paradigms, e.g. implementations of programming paradigms
- G06F9/4482—Procedural
- G06F9/4484—Executing subprograms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
- G06F9/3887—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple data lanes [SIMD]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/448—Execution paradigms, e.g. implementations of programming paradigms
- G06F9/4488—Object-oriented
- G06F9/449—Object-oriented method invocation or resolution
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Executing Machine-Instructions (AREA)
Abstract
An embodiment of the invention provides a method, an apparatus, a readable storage medium and a computing device for copying a character string. Rather than repeatedly copying the original character string, the method completes the copy using the longer intermediate result formed by the previous memcpy call, which exploits the performance advantages of SIMD and improves copying efficiency. The method comprises: acquiring a source character string to be copied and a binary representation of the number of copies; writing the source character string into the address space of a target character string; sequentially acquiring the value of each bit of the binary representation of the number of copies, from the lowest bit up to the bit immediately to the right of the highest bit whose value is 1, and, after acquiring each bit, copying content from the beginning of the target character string, of a length equal to a specified multiple of the source character string length, to the end of the target character string; if the acquired bit value is 1, copying that content from the beginning of the target character string to the end of the target character string once more; and outputting the target character string.
Description
Technical Field
The present invention relates to the field of computer data processing technologies, and in particular, to a method and an apparatus for copying a character string, a readable storage medium, and a computing device.
Background
The string function repeat is a basic function commonly found in general-purpose programming languages (Java, C++) and relational database systems (MySQL, PostgreSQL, DorisDB, ClickHouse). It concatenates a given string with itself a number of times to generate a new string.
Taking C language as an example, the repeat function is implemented as:
void repeat(
    const char *s,
    size_t n,
    size_t repeat_times,
    char *new_s
);
where s denotes the string to be copied, n denotes the length of the string, and repeat_times denotes the number of copies. For example, given s = "foobar", n = 6 and repeat_times = 3, the result new_s has length 18 and content "foobarfoobarfoobar".
Existing algorithms generally allocate a memory buffer of the required size and then call memcpy repeatedly to copy the given character string multiple times. Taking the number of memcpy calls as the basic operation, the time complexity of the algorithm is O(N), where N is the number of copies. However, when the given string is short (for example, shorter than 8 bytes), memcpy implemented with SIMD (Single Instruction, Multiple Data) instructions on the x86_64 platform degenerates to ordinary data-move instructions (MOVE) and cannot exploit the superscalar, multi-issue computing power of modern processors.
Specifically, the existing repeat function calls memcpy repeat_times times; each call copies s, of length n, and appends it to the end of the new character string new_s. The length of the new string is repeat_times multiplied by n. This algorithm is unfriendly to short strings (for example, shorter than 16 bytes): the length passed to memcpy does not reach the word length that SIMD instructions process (16 bytes for SSE2, 32 bytes for AVX, 64 bytes for AVX512), so memcpy may degrade to a slower version that copies one or two bytes at a time. For example, executing repeat("a", 1, 1024, new_s) calls memcpy 1024 times, whereas executing repeat("aaaaaaaaaaaaaaaa", 16, 64, new_s) calls memcpy 64 times; the latter is optimized by SIMD instructions and performs 5 or more times better than the former.
Therefore, the existing implementation of the repeat function has the following two disadvantages: 1. the time complexity of the algorithm is O(N); 2. the performance of SIMD instructions cannot be exploited.
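The existing approach described in the Background can be sketched roughly as follows. This is an illustrative baseline only: the name repeat_naive is hypothetical, and the sketch assumes the caller has already allocated new_s with room for n * repeat_times bytes and does not need a terminating NUL.

```c
#include <stddef.h>
#include <string.h>

/* Baseline sketch (hypothetical name): one memcpy call per repetition,
 * i.e. repeat_times calls in total.
 * Assumption: new_s has room for n * repeat_times bytes. */
void repeat_naive(const char *s, size_t n, size_t repeat_times, char *new_s)
{
    for (size_t i = 0; i < repeat_times; ++i) {
        memcpy(new_s, s, n);  /* each call copies only n bytes */
        new_s += n;           /* advance to the end of the output so far */
    }
}
```

With s = "a" and repeat_times = 1024, this loop issues 1024 one-byte memcpy calls, which is the case the Background identifies as slow.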
Disclosure of Invention
To this end, the present invention provides a method, an apparatus, a readable storage medium, and a computing device for copying a character string in an effort to solve or at least alleviate at least one of the problems presented above.
According to an aspect of an embodiment of the present invention, there is provided a method for copying a character string, including:
acquiring a source character string to be copied and a binary representation of the number of copies;
initializing an address space of a target character string, writing the source character string at the start position of the address space of the target character string, setting an iteration counter whose initial value is 0, and setting a position variable whose initial value is the position obtained by adding the length of the source character string to the start position of the target character string;
sequentially acquiring the value of each bit of the binary representation of the number of copies, from the lowest bit up to the bit immediately to the right of the highest bit whose value is 1, wherein after each bit value is acquired the method comprises: calculating a copy length as the length of the source character string multiplied by 2 raised to the power of the current value of the iteration counter; copying the first copy-length bytes of the target character string to the position pointed to by the position variable, and incrementing the position variable by the copy length; if the acquired bit value is 1, again copying the first copy-length bytes of the target character string to the position pointed to by the position variable, and incrementing the position variable by the copy length; and incrementing the iteration counter by 1;
and outputting the target character string.
Optionally, sequentially acquiring the value of each bit of the binary representation of the number of copies, from the lowest bit up to the bit immediately to the right of the highest bit whose value is 1, comprises:
repeatedly performing a logical right shift by 1 bit on the binary representation of the number of copies until the number of copies is no longer greater than 0, and acquiring the lowest bit of the binary representation of the number of copies before each right shift.
Optionally, before acquiring the source character string to be copied and the binary representation of the number of copies, the method further comprises:
acquiring the number of copies;
judging whether the number of copies is 0, and proceeding when it is determined that the number of copies is not 0;
wherein, when it is determined that the number of copies is 0, the target character string is set to an empty character string and output;
and,
before sequentially acquiring the value of each bit of the binary representation of the number of copies, from the lowest bit up to the bit immediately to the right of the highest bit whose value is 1, the method further comprises:
judging whether the number of copies is 1, and proceeding when it is determined that the number of copies is not 1;
wherein, when it is determined that the number of copies is 1, the target character string is output.
According to another aspect of the embodiments of the present invention, there is provided a method for copying a character string, including:
acquiring a source character string to be copied and a binary representation of the number of copies;
initializing an address space of a target character string, writing the source character string at the start position of the address space of the target character string, setting an iteration counter whose initial value is 0, and setting a position variable whose initial value is the position obtained by adding the length of the source character string to the start position of the target character string;
sequentially acquiring the value of each bit of the binary representation of the number of copies, from the lowest bit up to the bit immediately to the right of the highest bit whose value is 1, wherein after each bit value is acquired the method comprises: calculating a copy length as the length of the source character string multiplied by 2 raised to the power of the current value of the iteration counter; if the acquired bit value is 0, copying the first copy-length bytes of the target character string to the position pointed to by the position variable, and incrementing the position variable by the copy length; if the acquired bit value is 1, copying the first copy-length bytes of the target character string to the position pointed to by the position variable, incrementing the position variable by the copy length, and again copying the first copy-length bytes of the target character string to the position pointed to by the position variable; and incrementing the iteration counter by 1;
and outputting the target character string.
Optionally, sequentially acquiring the value of each bit of the binary representation of the number of copies, from the lowest bit up to the bit immediately to the right of the highest bit whose value is 1, comprises:
repeatedly performing a logical right shift by 1 bit on the binary representation of the number of copies until the number of copies is no longer greater than 0, and acquiring the lowest bit of the binary representation of the number of copies before each right shift.
Optionally, before acquiring the source character string to be copied and the binary representation of the number of copies, the method further comprises:
acquiring the number of copies;
judging whether the number of copies is 0, and proceeding when it is determined that the number of copies is not 0;
wherein, when it is determined that the number of copies is 0, the target character string is set to an empty character string and output;
and,
before sequentially acquiring the value of each bit of the binary representation of the number of copies, from the lowest bit up to the bit immediately to the right of the highest bit whose value is 1, the method further comprises:
judging whether the number of copies is 1, and proceeding when it is determined that the number of copies is not 1;
wherein, when it is determined that the number of copies is 1, the target character string is output.
According to still another aspect of the embodiments of the present invention, there is provided an apparatus for copying a character string, including:
a basic information acquisition unit, configured to acquire a source character string to be copied and a binary representation of the number of copies;
an initialization unit, configured to initialize an address space of a target character string, write the source character string at the start position of the address space of the target character string, set an iteration counter whose initial value is 0, and set a position variable whose initial value is the position obtained by adding the length of the source character string to the start position of the target character string;
an efficient copying unit, configured to sequentially acquire the value of each bit of the binary representation of the number of copies, from the lowest bit up to the bit immediately to the right of the highest bit whose value is 1, wherein after each bit value is acquired the unit: calculates a copy length as the length of the source character string multiplied by 2 raised to the power of the current value of the iteration counter; copies the first copy-length bytes of the target character string to the position pointed to by the position variable, and increments the position variable by the copy length; if the acquired bit value is 1, again copies the first copy-length bytes of the target character string to the position pointed to by the position variable, and increments the position variable by the copy length; and increments the iteration counter by 1;
and a result output unit, configured to output the target character string.
According to still another aspect of the embodiments of the present invention, there is provided an apparatus for copying a character string, including:
a basic information acquisition unit, configured to acquire a source character string to be copied and a binary representation of the number of copies;
an initialization unit, configured to initialize an address space of a target character string, write the source character string at the start position of the address space of the target character string, set an iteration counter whose initial value is 0, and set a position variable whose initial value is the position obtained by adding the length of the source character string to the start position of the target character string;
an efficient copying unit, configured to sequentially acquire the value of each bit of the binary representation of the number of copies, from the lowest bit up to the bit immediately to the right of the highest bit whose value is 1, wherein after each bit value is acquired the unit: calculates a copy length as the length of the source character string multiplied by 2 raised to the power of the current value of the iteration counter; if the acquired bit value is 0, copies the first copy-length bytes of the target character string to the position pointed to by the position variable, and increments the position variable by the copy length; if the acquired bit value is 1, copies the first copy-length bytes of the target character string to the position pointed to by the position variable, increments the position variable by the copy length, and again copies the first copy-length bytes of the target character string to the position pointed to by the position variable; and increments the iteration counter by 1;
and a result output unit, configured to output the target character string.
According to still another aspect of embodiments of the present invention, there is provided a readable storage medium having executable instructions thereon, which when executed, cause a computer to perform the above-described character string replication method.
According to still another aspect of an embodiment of the present invention, there is provided a computing device including: one or more processors, a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors to perform the above-described method of copying a character string.
According to the technical solution provided by the embodiments of the present invention, a source character string to be copied and a binary representation of the number of copies are acquired; an address space of a target character string is initialized, the source character string is written at the start position of that address space, an iteration counter whose initial value is 0 is set, and a position variable is set whose initial value is the position obtained by adding the length of the source character string to the start position of the target character string; the value of each bit of the binary representation of the number of copies is then acquired in sequence, from the lowest bit up to the bit immediately to the right of the highest bit whose value is 1, and after each bit value is acquired: a copy length is calculated as the length of the source character string multiplied by 2 raised to the power of the current value of the iteration counter; the first copy-length bytes of the target character string are copied to the position pointed to by the position variable, and the position variable is incremented by the copy length; if the acquired bit value is 1, the first copy-length bytes of the target character string are copied again to the position pointed to by the position variable, and the position variable is incremented by the copy length; and the iteration counter is incremented by 1; finally, the target character string is output. On the one hand, the time complexity of the string copying algorithm is reduced to O(log N); on the other hand, the advantages of SIMD instructions are fully exploited, degradation of memcpy is avoided, and string copying efficiency is greatly improved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate exemplary embodiments of the invention and together with the description serve to explain the principles of the invention.
FIG. 1 is a block diagram of an exemplary computing device;
FIG. 2 is a flow chart illustrating a method for copying a string according to an embodiment of the present invention;
FIG. 3 is a flow chart illustrating a method for copying a string according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a string replication process according to an embodiment of the present invention;
FIG. 5 is a flow chart illustrating a method for copying a character string according to another embodiment of the present invention;
FIG. 6 is a schematic structural diagram of an apparatus for copying a character string according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
FIG. 1 is a block diagram of an example computing device 100 arranged to implement a method of copying a string in accordance with the present invention. In a basic configuration 102, computing device 100 typically includes system memory 106 and one or more processors 104. A memory bus 108 may be used for communication between the processor 104 and the system memory 106.
Depending on the desired configuration, the processor 104 may be any type of processor, including but not limited to: a microprocessor (μP), a microcontroller (μC), a digital signal processor (DSP), or any combination thereof. The processor 104 may include one or more levels of cache, such as a level one cache 110 and a level two cache 112, a processor core 114, and registers 116. The example processor core 114 may include an Arithmetic Logic Unit (ALU), a Floating Point Unit (FPU), a digital signal processing core (DSP core), or any combination thereof. The example memory controller 118 may be used with the processor 104, or in some implementations the memory controller 118 may be an internal part of the processor 104.
Depending on the desired configuration, system memory 106 may be any type of memory, including but not limited to: volatile memory (such as RAM), non-volatile memory (such as ROM, flash memory, etc.), or any combination thereof. System memory 106 may include an operating system 120, one or more programs 122, and program data 124. In some implementations, the programs 122 may be arranged to be executed on the operating system by the one or more processors 104 using the program data 124.
Computing device 100 may also include an interface bus 140 that facilitates communication from various interface devices (e.g., output devices 142, peripheral interfaces 144, and communication devices 146) to the basic configuration 102 via the bus/interface controller 130. The example output device 142 includes a graphics processing unit 148 and an audio processing unit 150. They may be configured to facilitate communication with various external devices, such as a display terminal or speakers, via one or more A/V ports 152. Example peripheral interfaces 144 may include a serial interface controller 154 and a parallel interface controller 156, which may be configured to facilitate communication with external devices such as input devices (e.g., keyboard, mouse, pen, voice input device, touch input device) or other peripherals (e.g., printer, scanner, etc.) via one or more I/O ports 158. An example communication device 146 may include a network controller 160, which may be arranged to facilitate communications with one or more other computing devices 162 over a network communication link via one or more communication ports 164.
A network communication link may be one example of a communication medium. Communication media may typically be embodied by computer readable instructions, data structures, or program modules in a modulated data signal, such as a carrier wave or other transport mechanism, and may include any information delivery media. A "modulated data signal" may be a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of non-limiting example, communication media may include wired media such as a wired network or dedicated-line network, and various wireless media such as acoustic, Radio Frequency (RF), microwave, Infrared (IR), or other wireless media. The term computer readable media as used herein may include both storage media and communication media.
The computing device 100 may be implemented as various forms of personal computers and server devices, and several computing devices 100 may constitute a cluster to provide cloud services to the outside.
Among other things, one or more programs 122 of computing device 100 include instructions for performing a method for copying character strings in accordance with the present invention.
FIG. 2 illustrates a flow chart of a method for copying a character string according to the present invention. The method starts at step S210.
In step S210, a source character string to be copied and a binary representation of the number of copies are acquired.
Specifically, the source character string may be acquired through an input interface of the function; it may be a character string entered manually by a user or a character string output by another function. The number of copies is represented in binary form so that subsequent steps can perform an efficient copying operation based on that binary form.
Subsequently, in step S220, the address space of the target character string is initialized, the source character string is written at the start position of the address space of the target character string, an iteration counter whose initial value is 0 is set, and a position variable is set whose initial value is the position obtained by adding the length of the source character string to the start position of the target character string.
If the number of copies is exactly 1, the process ends after step S220; if the number of copies is greater than 1, step S230 is performed. Further, if the number of copies is 0, an empty character string is output directly.
Subsequently, in step S230, the value of each bit of the binary representation of the number of copies is acquired in sequence, from the lowest bit up to the bit immediately to the right of the highest bit whose value is 1. After each bit value is acquired: a copy length is calculated as the length of the source character string multiplied by 2 raised to the power of the current value of the iteration counter; the first copy-length bytes of the target character string are copied to the position pointed to by the position variable, and the position variable is incremented by the copy length; if the acquired bit value is 1, the first copy-length bytes of the target character string are copied again to the position pointed to by the position variable, and the position variable is incremented by the copy length; and the iteration counter is incremented by 1.
Incrementing by 1 means adding 1 to the value of a variable and storing the result back in the same variable.
With the calculation provided in this step, each round's single copy operation moves twice as much data as the previous round's, so copying becomes more efficient as more bits of the binary representation of the number of copies are processed. In particular, when the length of the source character string is smaller than the minimum word length required by SIMD instructions, the operation in step S230 quickly raises the amount of data per copy operation above that minimum word length, which avoids a large number of inefficient low-level MOVE operations and significantly increases the copying speed.
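Steps S210 to S240 can be sketched in C roughly as follows. This is a minimal sketch under stated assumptions, not the patent's reference implementation: the name repeat_doubling is hypothetical, the caller is assumed to have allocated new_s with room for n * repeat_times bytes, no terminating NUL is written, and the copy length n * 2^k is kept by doubling a variable each round instead of computing a power. Bits are obtained with the logical right shift that the disclosure lists as an option.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name. Assumption: new_s has room for n * repeat_times bytes. */
void repeat_doubling(const char *s, size_t n, size_t repeat_times, char *new_s)
{
    if (repeat_times == 0) {
        return;                        /* empty result: nothing is written */
    }
    char *const new_s_hdr = new_s;     /* start of the target string, never changes */
    memcpy(new_s, s, n);               /* S220: write the source string once */
    new_s += n;                        /* position variable = start + n */
    size_t cpy_size = n;               /* equals n * 2^k for iteration counter k */

    for (;;) {
        int bit = (int)(repeat_times & 1u);  /* lowest bit, read before the shift */
        repeat_times >>= 1;                  /* logical right shift by 1 bit */
        if (repeat_times == 0) {
            break;                           /* the bit just read was the highest 1 bit */
        }
        memcpy(new_s, new_s_hdr, cpy_size);  /* copy the first cpy_size bytes to the end */
        new_s += cpy_size;
        if (bit) {
            memcpy(new_s, new_s_hdr, cpy_size);  /* second copy when the bit is 1 */
            new_s += cpy_size;
        }
        cpy_size *= 2;                       /* corresponds to k = k + 1 */
    }
}
```

For n = 1 and repeat_times = 1024 this sketch issues 11 memcpy calls (the initial write plus ten doubling copies) instead of 1024. The source and destination ranges of each memcpy do not overlap, because at least cpy_size bytes have already been written before each round.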
This embodiment provides, at the level of the algorithm, a way of managing addresses during the copy operation so that the copy can be implemented with memcpy, which is a preferred choice among many possible implementations. A user may also determine the splicing position of each newly copied segment in other ways, for example by recording the length of the target character string, to achieve a similar technical effect; details are not repeated here.
Subsequently, in step S240, the target character string is output.
Optionally, in step S230, sequentially acquiring the value of each bit of the binary representation of the number of copies, from the lowest bit up to the bit immediately to the right of the highest bit whose value is 1, comprises: repeatedly performing a logical right shift by 1 bit on the binary representation of the number of copies until the number of copies is no longer greater than 0, and acquiring the lowest bit of the binary representation of the number of copies before each right shift.
This embodiment realizes the sequential acquisition of each bit from the lowest bit up to the bit immediately to the right of the highest bit whose value is 1. Note that the source character string has already been written once into the target character string in step S220, that at least one copy operation is performed for each acquired bit of the binary representation, and that each round copies twice as much data as the previous round; the total number of copies of the source character string in the final result is therefore consistent with the input number of copies.
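The consistency claim can be checked with a short derivation. The symbols here are introduced only for illustration and do not appear in the original: N is the number of copies, b_k is bit k of its binary representation, m is the index of the highest bit whose value is 1, and c_k is the number of copies of the source string present in the target string before round k.

```latex
\begin{aligned}
c_0 &= 1, \\
c_{k+1} &= c_k + 2^{k}\,(1 + b_k), \qquad 0 \le k < m, \\
c_k &= 2^{k} + \sum_{j=0}^{k-1} b_j\,2^{j} \qquad \text{(by induction on } k\text{)}, \\
c_m &= 2^{m} + \sum_{j=0}^{m-1} b_j\,2^{j} = N .
\end{aligned}
```

For example, with N = 5 (binary 101) the counts are c_0 = 1, c_1 = 3 and c_2 = 5. The third line also shows that c_k is at least 2^k, so the first 2^k copies that each round reads have already been written.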
In addition to the foregoing embodiments, an embodiment of the present invention further provides a method for sequentially obtaining a value of each bit from a lowest bit of the binary representation of the number of times of copying to a right bit of a highest bit having a value of 1, where the method specifically includes: and sequentially extracting the data of the nth bit of the copying times of the binary expression from right to left, wherein N is added by 1 after each extraction until N = N-1, the initial value of N is 1, and N represents the total number of the copying times of the binary expression.
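The index-based alternative can be sketched in the same spirit. The name `low_bits_by_index` is hypothetical, and the sketch assumes that N is the bit length of the number of copies, which Python exposes as `int.bit_length()`.

```python
def low_bits_by_index(repeat_times: int) -> list:
    """Extract bit n (counted from the right, starting at 1) for n = 1..N-1."""
    total_bits = repeat_times.bit_length()   # N in the description
    bits = []
    n = 1
    while n <= total_bits - 1:               # stop before the highest 1 bit
        bits.append((repeat_times >> (n - 1)) & 1)
        n += 1
    return bits
```

Both extraction methods should produce the same sequence of bits; the index-based form leaves the number of copies unmodified, which may be convenient when the value is needed again later.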
The concept of the present invention is further explained below with reference to a complete algorithm implementation flow, and with reference to fig. 3, the method includes:
Step 1: Determine whether repeat_times is greater than 0; if repeat_times is less than or equal to 0, the result is an empty character string, and the flow jumps directly to the final Step 9.
Step 2: Initialize k = 0 and let new_s_hdr point to the first byte address of the byte array; new_s_hdr remains unchanged throughout the algorithm, and the memory starting at new_s_hdr holds the intermediate result of each loop iteration. Copy s to new_s, and advance new_s by n bytes.
Step 3: Set is_odd to the lowest bit of repeat_times, then logically shift repeat_times right by 1 bit.
Step 4: Determine whether repeat_times is greater than 0; if so, continue with the next step; otherwise, jump to the final Step 9.
Step 5: Compute cpy_size as n multiplied by 2 raised to the power k; copy cpy_size bytes from new_s_hdr to the current position of new_s, and advance new_s by cpy_size.
Step 6: Determine whether is_odd is true; if so, continue with the next step; otherwise, go to Step 8.
Step 7: Copy cpy_size bytes from new_s_hdr to the current position of new_s once more, and advance new_s by cpy_size.
Step 8: Increase k by 1 and jump to Step 3.
Step 9: The algorithm ends.
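The nine steps above can be sketched as follows. This is an illustrative sketch rather than the reference implementation: the function name `repeat` and its two-argument signature are assumptions, index 0 of the buffer plays the role of new_s_hdr, the integer `pos` plays the role of the moving pointer new_s, and slice assignment on a `bytearray` stands in for memcpy.

```python
def repeat(s: bytes, repeat_times: int) -> bytes:
    n = len(s)
    if repeat_times <= 0:                  # Step 1: empty result
        return b""
    new_s = bytearray(n * repeat_times)    # target buffer, start = new_s_hdr
    new_s[0:n] = s                         # Step 2: first copy of s
    pos = n                                # current position of new_s
    k = 0
    while True:
        is_odd = repeat_times & 1          # Step 3: lowest bit, then shift
        repeat_times >>= 1
        if repeat_times <= 0:              # Step 4: nothing left, finish
            break
        cpy_size = n << k                  # Step 5: n * 2**k
        new_s[pos:pos + cpy_size] = new_s[0:cpy_size]
        pos += cpy_size
        if is_odd:                         # Steps 6-7: copy once more
            new_s[pos:pos + cpy_size] = new_s[0:cpy_size]
            pos += cpy_size
        k += 1                             # Step 8: next round
    return bytes(new_s)                    # Step 9: done
```

In every round the source range starts at the head of the buffer and ends before the write position, so the source and destination never overlap, which is the condition a real memcpy call would require.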
The number of loop iterations is log(repeat_times), determined by the position of the most significant 1 bit of repeat_times, and each iteration performs memcpy once or twice depending on whether the lowest bit of the current repeat_times is 1. The algorithm therefore performs log(repeat_times) memcpy calls in the best case and 2log(repeat_times) in the worst case, for an average time complexity of Θ(log N). Each round reuses the intermediate result produced by the previous round, so the length of the string copied per round doubles as the loop proceeds. After the first few iterations the copied length exceeds the processing word length of the SIMD instructions, so the SIMD optimization of memcpy takes effect, and the performance degradation that memcpy suffers when it repeatedly copies a short string of fixed length is avoided.
Benchmarks and system tests show that, compared with the traditional repeat function, the algorithm improves performance by more than 5 times.
The character string copying method provided by the invention exploits the fact that the SIMD optimization of memcpy only shows its performance advantage on strings longer than the SIMD processing word length: instead of repeatedly copying the original string, it completes each copy operation using the longer intermediate result produced by the previous memcpy calls. Taking repeat(s, n, 1024, new_s) as an example and referring to fig. 4, the first copy uses the string s, and each subsequent copy uses the current content of new_s. For example, once new_s already holds 16 copies of s after some operation, the next copy can directly use that 16-copy intermediate result, so the length of the string in new_s doubles. After the first few doublings the intermediate result reaches the processing word length of the SIMD instructions, and memcpy then starts using SIMD instructions to copy the string quickly.
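To make the best and worst cases concrete, the following sketch counts the loop iterations and the copy operations performed inside the loop for a given number of copies. The helper name `count_loop_copies` is hypothetical, and the initial write of the source string before the loop is deliberately not counted.

```python
def count_loop_copies(repeat_times: int):
    """Return (loop iterations, copy operations inside the loop)."""
    loops = 0
    copies = 0
    while True:
        is_odd = repeat_times & 1
        repeat_times >>= 1
        if repeat_times <= 0:
            break
        loops += 1
        copies += 2 if is_odd else 1   # one copy, plus one more for a 1 bit
    return loops, copies
```

A power of two such as 1024 gives the best case, one copy per iteration (10 iterations, 10 copies), while 1023, whose bits are all 1, gives two copies per iteration (9 iterations, 18 copies).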
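The doubling effect can be traced with a short sketch that lists the number of bytes copied per round. The name `copy_sizes`, the 3-byte source length and the 16-byte threshold used in the usage note are illustrative assumptions only; the actual word length at which a given memcpy switches to SIMD instructions depends on the platform and library.

```python
def copy_sizes(n: int, repeat_times: int) -> list:
    """Bytes moved by the first copy operation of each round."""
    sizes = []
    k = 0
    repeat_times >>= 1              # the lowest bit only decides extra copies
    while repeat_times > 0:
        sizes.append(n << k)        # n * 2**k bytes in round k
        repeat_times >>= 1
        k += 1
    return sizes
```

For a 3-byte source string repeated 1024 times, the per-round sizes are 3, 6, 12, 24 and so on up to 1536 bytes, so under the assumed 16-byte threshold every round from the fourth onward copies more than one SIMD word at a time.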
Referring to fig. 5, an embodiment of the present invention further provides a method for copying a character string, including:
S510, obtaining a source character string to be copied and a number of copies in binary representation;
S520, initializing an address space of the target character string, writing the source character string at the start position of the address space of the target character string, setting an iteration counter with an initial value of 0, and setting a position variable whose initial value is the position obtained by adding the length of the source character string to the start position of the target character string;
S530, sequentially obtaining the value of each bit from the lowest bit of the binary representation of the number of copies up to the bit immediately to the right of the highest bit whose value is 1; after each bit value is obtained: calculating a copy length equal to the length of the source character string multiplied by 2 raised to the power of the current value of the iteration counter; if the obtained bit value is 0, copying the content of the copy length in bytes, starting from the beginning of the target character string, to the position pointed to by the position variable, and increasing the position variable by the copy length; if the obtained bit value is 1, copying the content of the copy length in bytes, starting from the beginning of the target character string, to the position pointed to by the position variable and increasing the position variable by the copy length, then copying the same content to the position pointed to by the position variable once more and again increasing the position variable by the copy length; and incrementing the iteration counter by 1;
S540, outputting the target character string.
Since the method provided by the embodiment of the present invention has the same principle as the methods provided in steps S210 to S240, details of implementation are not described herein again.
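Steps S510 to S540 can be sketched with an explicit iteration counter and position variable, branching on the bit value as described. The function name `copy_string` is hypothetical, the early return for a non-positive number of copies is an added assumption that is not part of S510 to S540, and slice assignment on a `bytearray` stands in for memcpy.

```python
def copy_string(source: bytes, copies: int) -> bytes:
    if copies <= 0:                          # assumption: guard, not in S510-S540
        return b""
    length = len(source)
    target = bytearray(length * copies)      # S520: address space of the target
    target[:length] = source                 # source written at the start
    counter = 0                              # iteration counter
    position = length                        # position variable
    for i in range(copies.bit_length() - 1): # S530: bits below the highest 1
        bit = (copies >> i) & 1
        copy_len = length * 2 ** counter
        rounds = 1 if bit == 0 else 2        # bit 0: one copy, bit 1: two copies
        for _ in range(rounds):
            target[position:position + copy_len] = target[:copy_len]
            position += copy_len
        counter += 1
    return bytes(target)                     # S540: output the target string
```

After the loop the position variable equals the total length of the target, which is a convenient sanity check when porting the sketch to a language with manual memory management.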
Referring to fig. 6, an embodiment of the present invention provides an apparatus for copying a character string, including:
a basic information acquisition unit 610 for acquiring a source string for copying and the number of times of copying in binary representation;
an initializing unit 620 configured to initialize an address space of a target character string, write the source character string at the start position of the address space of the target character string, set an iteration counter with an initial value of 0, and set a position variable whose initial value is the position obtained by adding the length of the source character string to the start position of the target character string;
an efficient copy unit 630 configured to sequentially obtain the value of each bit from the lowest bit of the binary representation of the number of copies up to the bit immediately to the right of the highest bit whose value is 1, and, after each bit value is obtained: calculate a copy length equal to the length of the source character string multiplied by 2 raised to the power of the current value of the iteration counter; copy the content of the copy length in bytes, starting from the beginning of the target character string, to the position pointed to by the position variable, and increase the position variable by the copy length; if the obtained bit value is 1, copy the same content to the position pointed to by the position variable once more and increase the position variable by the copy length; and increment the iteration counter by 1;
and a result output unit 640 for outputting the target character string.
Optionally, the efficient copy unit 630, when sequentially obtaining the value of each bit from the lowest bit of the binary representation of the number of copies up to the bit immediately to the right of the highest bit whose value is 1, is configured to: repeatedly perform a logical right shift by 1 bit on the binary representation of the number of copies until the number of copies is no greater than 0, and read the lowest bit of the binary representation of the number of copies before each logical right shift by 1 bit.
Optionally, the apparatus further comprises a determining unit configured to: obtain the number of copies; determine whether the number of copies is 0 and, if it is not 0, proceed; and, when the number of copies is determined to be 0, set the target character string to an empty character string and output the target character string.
Referring to fig. 7, another embodiment of the present invention provides an apparatus for copying a character string, including:
a basic information acquisition unit 710 for acquiring a source string for copying and the number of times of copying in binary representation;
an initializing unit 720 configured to initialize an address space of the target character string, write the source character string at the start position of the address space of the target character string, set an iteration counter with an initial value of 0, and set a position variable whose initial value is the position obtained by adding the length of the source character string to the start position of the target character string;
an efficient copy unit 730 configured to sequentially obtain the value of each bit from the lowest bit of the binary representation of the number of copies up to the bit immediately to the right of the highest bit whose value is 1, and, after each bit value is obtained: calculate a copy length equal to the length of the source character string multiplied by 2 raised to the power of the current value of the iteration counter; if the obtained bit value is 0, copy the content of the copy length in bytes, starting from the beginning of the target character string, to the position pointed to by the position variable, and increase the position variable by the copy length; if the obtained bit value is 1, copy the same content to the position pointed to by the position variable and increase the position variable by the copy length, then copy the same content to the position pointed to by the position variable once more and again increase the position variable by the copy length; and increment the iteration counter by 1;
a result output unit 740 for outputting the target character string.
Optionally, the efficient copy unit 730, when sequentially obtaining the value of each bit from the lowest bit of the binary representation of the number of copies up to the bit immediately to the right of the highest bit whose value is 1, is configured to: repeatedly perform a logical right shift by 1 bit on the binary representation of the number of copies until the number of copies is no greater than 0, and read the lowest bit of the binary representation of the number of copies before each logical right shift by 1 bit.
Optionally, the apparatus further comprises a determining unit configured to: obtain the number of copies; determine whether the number of copies is 0 and, if it is not 0, proceed; and, when the number of copies is determined to be 0, set the target character string to an empty character string and output the target character string.
It should be understood that the various techniques described herein may be implemented in connection with hardware or software or, alternatively, with a combination of both. Thus, the methods and apparatus of the present invention, or certain aspects or portions thereof, may take the form of program code (i.e., instructions) embodied in tangible media, such as floppy diskettes, CD-ROMs, hard drives, or any other machine-readable storage medium, wherein, when the program is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention.
In the case of program code execution on programmable computers, the computing device will generally include a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. Wherein the memory is configured to store program code; the processor is configured to perform the various methods of the present invention according to instructions in the program code stored in the memory.
By way of example, and not limitation, computer readable media may comprise computer storage media and communication media. Computer storage media store information such as computer readable instructions, data structures, program modules or other data. Communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. Combinations of any of the above are also included within the scope of computer readable media.
It should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, this method of disclosure should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules or units or components of the apparatus in the examples disclosed herein may be arranged in an apparatus as described in this embodiment or alternatively may be located in one or more apparatuses different from the apparatus in this example. The modules in the foregoing examples may be combined into one module or may be further divided into multiple sub-modules.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features included in other embodiments, rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
Furthermore, some of the described embodiments are described herein as a method or combination of method elements that can be performed by a processor of a computer system or by other means of performing the described functions. A processor having the necessary instructions for carrying out the method or method elements thus forms a means for carrying out the method or method elements. Further, the elements of the apparatus embodiments described herein are examples of the following apparatus: the apparatus is used to implement the functions performed by the elements for the purpose of carrying out the invention.
As used herein, unless otherwise specified the use of the ordinal adjectives "first", "second", "third", etc., to describe a common object, merely indicate that different instances of like objects are being referred to, and are not intended to imply that the objects so described must be in a given sequence, either temporally, spatially, in ranking, or in any other manner.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of this description, will appreciate that other embodiments can be devised which do not depart from the scope of the invention as described herein. Furthermore, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. The present invention is to be considered as illustrative and not restrictive in character, with the scope of the invention being indicated by the appended claims.
Claims (10)
1. A method for copying a character string, comprising:
acquiring a source character string to be copied and a number of copies in binary representation;
initializing an address space of a target character string, writing the source character string at a start position of the address space of the target character string, setting an iteration counter with an initial value of 0, and setting a position variable whose initial value is the position obtained by adding the length of the source character string to the start position of the target character string;
sequentially obtaining the value of each bit from the lowest bit of the binary representation of the number of copies up to the bit immediately to the right of the highest bit whose value is 1, wherein after each bit value is obtained, the method comprises: calculating a copy length equal to the length of the source character string multiplied by 2 raised to the power of the current value of the iteration counter; copying the content of the copy length in bytes, starting from the beginning of the target character string, to the position pointed to by the position variable, and increasing the position variable by the copy length; if the obtained bit value is 1, copying the content of the copy length in bytes, starting from the beginning of the target character string, to the position pointed to by the position variable, and increasing the position variable by the copy length; and incrementing the iteration counter by 1;
and outputting the target character string.
2. The method of claim 1, wherein obtaining the value of each bit in sequence from the least significant bit of the binary representation of the number of copies to the right of the most significant bit taking the value of 1 comprises:
repeatedly performing a logical right shift by 1 bit on the binary representation of the number of copies until the number of copies is no greater than 0, and reading the lowest bit of the binary representation of the number of copies before each logical right shift by 1 bit.
3. The method of claim 1, wherein prior to obtaining the source string for replication and the number of replications of the binary representation, the method further comprises:
acquiring the copying times;
determining whether the number of copies is 0, and determining that the number of copies is not 0;
wherein, when the number of copies is determined to be 0, a target character string is set to an empty character string and the target character string is output;
and,
before sequentially obtaining the value of each bit from the lowest bit of the binary representation of the number of copies up to the bit immediately to the right of the highest bit whose value is 1, the method further comprises:
determining whether the number of copies is 1, and determining that the number of copies is not 1;
wherein the target character string is output when it is determined that the number of copying times is 1.
4. A method for copying a character string, comprising:
acquiring a source character string to be copied and a number of copies in binary representation;
initializing an address space of a target character string, writing the source character string at a start position of the address space of the target character string, setting an iteration counter with an initial value of 0, and setting a position variable whose initial value is the position obtained by adding the length of the source character string to the start position of the target character string;
sequentially obtaining the value of each bit from the lowest bit of the binary representation of the number of copies up to the bit immediately to the right of the highest bit whose value is 1; after each bit value is obtained, the method comprises: calculating a copy length equal to the length of the source character string multiplied by 2 raised to the power of the current value of the iteration counter; if the obtained bit value is 0, copying the content of the copy length in bytes, starting from the beginning of the target character string, to the position pointed to by the position variable, and increasing the position variable by the copy length; if the obtained bit value is 1, copying the content of the copy length in bytes, starting from the beginning of the target character string, to the position pointed to by the position variable, increasing the position variable by the copy length, and copying the content of the copy length in bytes, starting from the beginning of the target character string, to the position pointed to by the position variable; and incrementing the iteration counter by 1;
and outputting the target character string.
5. The method of claim 4, wherein obtaining the value of each bit in sequence from the least significant bit of the binary representation of the number of copies to the right of the most significant bit taking a value of 1 comprises:
repeatedly performing a logical right shift by 1 bit on the binary representation of the number of copies until the number of copies is no greater than 0, and reading the lowest bit of the binary representation of the number of copies before each logical right shift by 1 bit.
6. The method of claim 4, wherein prior to obtaining the source string for replication and the number of replications of the binary representation, the method further comprises:
acquiring the copying times;
determining whether the number of copies is 0, and determining that the number of copies is not 0;
wherein, when the number of copies is determined to be 0, a target character string is set to an empty character string and the target character string is output;
and,
before sequentially obtaining the value of each bit from the lowest bit of the binary representation of the number of copies up to the bit immediately to the right of the highest bit whose value is 1, the method further comprises:
determining whether the number of copies is 1, and determining that the number of copies is not 1;
wherein the target character string is output when it is determined that the number of copying times is 1.
7. An apparatus for copying a character string, comprising:
a basic information acquisition unit configured to acquire a source character string to be copied and a number of copies in binary representation;
an initialization unit configured to initialize an address space of a target character string, write the source character string at a start position of the address space of the target character string, set an iteration counter with an initial value of 0, and set a position variable whose initial value is the position obtained by adding the length of the source character string to the start position of the target character string;
an efficient copy unit configured to sequentially obtain the value of each bit from the lowest bit of the binary representation of the number of copies up to the bit immediately to the right of the highest bit whose value is 1, and, after each bit value is obtained: calculate a copy length equal to the length of the source character string multiplied by 2 raised to the power of the current value of the iteration counter; copy the content of the copy length in bytes, starting from the beginning of the target character string, to the position pointed to by the position variable, and increase the position variable by the copy length; if the obtained bit value is 1, copy the content of the copy length in bytes, starting from the beginning of the target character string, to the position pointed to by the position variable, and increase the position variable by the copy length; and increment the iteration counter by 1;
and a result output unit configured to output the target character string.
8. An apparatus for copying a character string, comprising:
a basic information acquisition unit configured to acquire a source character string to be copied and a number of copies in binary representation;
an initialization unit configured to initialize an address space of a target character string, write the source character string at a start position of the address space of the target character string, set an iteration counter with an initial value of 0, and set a position variable whose initial value is the position obtained by adding the length of the source character string to the start position of the target character string;
an efficient copy unit configured to sequentially obtain the value of each bit from the lowest bit of the binary representation of the number of copies up to the bit immediately to the right of the highest bit whose value is 1, and, after each bit value is obtained: calculate a copy length equal to the length of the source character string multiplied by 2 raised to the power of the current value of the iteration counter; if the obtained bit value is 0, copy the content of the copy length in bytes, starting from the beginning of the target character string, to the position pointed to by the position variable, and increase the position variable by the copy length; if the obtained bit value is 1, copy the content of the copy length in bytes, starting from the beginning of the target character string, to the position pointed to by the position variable, increase the position variable by the copy length, and copy the content of the copy length in bytes, starting from the beginning of the target character string, to the position pointed to by the position variable; and increment the iteration counter by 1;
and a result output unit configured to output the target character string.
9. A readable storage medium having executable instructions thereon, which when executed, cause a computer to perform a method as comprised in any one of claims 1-3 or cause a computer to perform a method as comprised in any one of claims 4-6.
10. A computing device, comprising: one or more processors, memory, and one or more programs stored in the memory and configured to be executed by the one or more processors for performing a method as included in any of claims 1-3, or stored in the memory and configured to be executed by the one or more processors for performing a method as included in any of claims 4-6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110162279.3A CN112506591A (en) | 2021-02-05 | 2021-02-05 | Character string copying method and device, readable storage medium and computing device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112506591A true CN112506591A (en) | 2021-03-16 |
Family
ID=74953192
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110162279.3A Pending CN112506591A (en) | 2021-02-05 | 2021-02-05 | Character string copying method and device, readable storage medium and computing device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112506591A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104169870A (en) * | 2012-03-15 | 2014-11-26 | 国际商业机器公司 | Copying character data having a termination character from one memory location to another |
US20160063055A1 (en) * | 2014-08-29 | 2016-03-03 | Alvin Roy Reed | Method And Apparatus For Improved Database Searching |
CN108885551A (en) * | 2016-03-31 | 2018-11-23 | 英特尔公司 | memory copy instruction, processor, method and system |
CN110334257A (en) * | 2018-07-09 | 2019-10-15 | 深圳睿尚教育科技有限公司 | A kind of a plurality of data copy method and its device |
Non-Patent Citations (1)
Title |
---|
沉末: "手动实现字符串方法repeat(算法优化)", 《CSDN》 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210316 |