CN111309988A - Character string retrieval method and device based on coding and electronic equipment - Google Patents
Character string retrieval method and device based on coding and electronic equipment
- Publication number
- CN111309988A (application number CN202010078611.3A)
- Authority
- CN
- China
- Prior art keywords
- character string
- value
- character
- length
- string
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/903—Querying
- G06F16/90335—Query processing
- G06F16/90344—Query processing by using string matching techniques
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
Embodiments of the disclosure provide an encoding-based character string retrieval method and device, and electronic equipment, belonging to the technical field of data processing. The method comprises the following steps: calculating the length of a character string to obtain a length value L of the character string; performing an N-value-based modulo operation on the sum of all character values in the character string to obtain a code value C of the character string; constructing an index value list of the character string based on the length value L and the code value C; and performing a retrieval operation on the character string based on the codes in the index value list. This processing scheme improves the retrieval efficiency of character strings.
Description
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to a method and an apparatus for retrieving a character string based on encoding, and an electronic device.
Background
In the computer field, online analytical processing (OLAP) refers to analytical queries on multidimensional data. Online transaction processing (OLTP), by contrast, is the primary application of traditional relational databases and is used mainly for basic, everyday transactions such as banking transactions. OLAP is a major application of data warehouse systems; it supports complex analytical operations, emphasizes decision support, and provides intuitive and understandable query results.
In general, OLAP is classified into ROLAP and MOLAP. ROLAP refers to Relational OLAP, that is, multidimensional modeling on top of relational databases, such as the MDR modeling of IBM Cognos Framework Manager. MOLAP is based directly on a multidimensional database, as in TM1, Essbase, and PowerCube. An OLAP tool is a type of data analysis software. TM1 is a tool based directly on a multidimensional database; in ROLAP tools, the multidimensional model or view is built on a relational database, and the tool provides a layer of encapsulation so that the relational database can support multidimensional queries. In essence, however, a ROLAP query is still a relational query.
In big data scenarios such as OLAP, large numbers of character string comparisons are time-consuming. Because character strings do not have a fixed length, existing comparison algorithms are inherently inefficient and cannot take advantage of advanced hardware features such as parallel instructions.
Disclosure of Invention
In view of the above, embodiments of the present disclosure provide a method and an apparatus for searching character strings based on encoding, and an electronic device, so as to at least partially solve the problems in the prior art.
In a first aspect, an embodiment of the present disclosure provides a character string retrieval method based on encoding, including:
calculating the length of the character string to obtain a length value L of the character string;
performing a modulo operation based on an N value on the sum of all character values in the character string to obtain a coded value C of the character string;
constructing an index value list of the character string based on the character string length value L and the code value C of the character string;
and performing a retrieval operation on the character string based on the codes in the index value list.
According to a specific implementation manner of the embodiment of the present disclosure, the calculating the length of the character string includes:
setting a character string length calculation function;
calculating the length of the character string based on the character string length calculation function.
According to a specific implementation manner of the embodiment of the present disclosure, the calculating the length of the character string based on the character string length calculation function includes:
judging whether the character string contains an end character;
and if so, counting the number of the characters in the character string before the end character to obtain the length of the character string.
According to a specific implementation manner of the embodiment of the present disclosure, before performing an N-value-based modulo operation on a sum of all character values in the character string, the method further includes:
the N value for the modulo calculation is set in advance.
According to a specific implementation manner of the embodiment of the present disclosure, before performing an N-value-based modulo operation on a sum of all character values in the character string, the method further includes:
and pre-calculating the sum of all character values in the character string.
According to a specific implementation manner of the embodiment of the present disclosure, the constructing an index value list of the character string based on the character string length value L and the code value C of the character string includes:
dividing the character strings with the same length value L into the same large group;
within the large group, distributing the character strings with the same code value C into the same small group;
and setting an index value for each character string included in each small group.
According to a specific implementation manner of the embodiment of the present disclosure, the setting of the index value of the character string included in each group includes:
acquiring an original index value of the character string in a previous storage list;
and taking the original index value as the index value of the character string in the group.
According to a specific implementation manner of the embodiment of the present disclosure, the performing a retrieval operation on a string based on an encoding in the index value list includes:
acquiring the length LT of a character string T to be retrieved so as to determine a large group corresponding to the character string based on the LT;
calculating the code CT of the character string T to be retrieved, and finding an index value list corresponding to the CT in the group;
and comparing whether the character string at the corresponding position is the content to be retrieved or not according to the index value list corresponding to the found CT.
In a second aspect, an embodiment of the present disclosure provides an encoding-based character string retrieving apparatus, including:
the calculation module is used for calculating the length of the character string to obtain a length value L of the character string;
the modulo module is used for performing an N-value-based modulo operation on the sum of all character values in the character string to obtain a code value C of the character string;
the construction module is used for constructing an index value list of the character string based on the character string length value L and the code value C of the character string;
and the retrieval module is used for executing retrieval operation on the character string based on the codes in the index value list.
In a third aspect, an embodiment of the present disclosure further provides an electronic device, where the electronic device includes:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of the first aspect or any implementation of the first aspect.
In a fourth aspect, the disclosed embodiments also provide a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the encoding-based string retrieval method in the first aspect or any implementation manner of the first aspect.
In a fifth aspect, the present disclosure also provides a computer program product, which includes a computer program stored on a non-transitory computer-readable storage medium, where the computer program includes program instructions, and when the program instructions are executed by a computer, the computer is caused to execute the encoding-based character string retrieval method in the foregoing first aspect or any implementation manner of the first aspect.
The encoding-based character string retrieval scheme in the embodiments of the disclosure comprises: calculating the length of a character string to obtain a length value L of the character string; performing an N-value-based modulo operation on the sum of all character values in the character string to obtain a code value C of the character string; constructing an index value list of the character string based on the length value L and the code value C; and performing a retrieval operation on the character string based on the codes in the index value list. With this processing scheme and its specific encoding method, the retrieval target can be located quickly, and the retrieval efficiency of character strings is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings used in the embodiments are briefly described below. The drawings in the following description show only some embodiments of the present disclosure; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart of a method for retrieving a character string based on encoding according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a character string retrieval method based on encoding according to an embodiment of the present disclosure;
FIG. 3 is a flow chart of another method for retrieving a string based on encoding according to an embodiment of the present disclosure;
FIG. 4 is a flow chart of another method for retrieving a string based on encoding according to an embodiment of the present disclosure;
FIG. 5 is a schematic structural diagram of an encoding-based string retrieval apparatus according to an embodiment of the present disclosure;
FIG. 6 is a schematic diagram of an electronic device provided in an embodiment of the present disclosure.
Detailed Description
The embodiments of the present disclosure are described in detail below with reference to the accompanying drawings.
The embodiments of the present disclosure are described below with specific examples, and other advantages and effects of the present disclosure will be readily apparent to those skilled in the art from the disclosure in the specification. It is to be understood that the described embodiments are merely illustrative of some, and not restrictive, of the embodiments of the disclosure. The disclosure may be embodied or carried out in various other specific embodiments, and various modifications and changes may be made in the details within the description without departing from the spirit of the disclosure. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.
It is noted that various aspects of the embodiments are described below within the scope of the appended claims. It should be apparent that the aspects described herein may be embodied in a wide variety of forms and that any specific structure and/or function described herein is merely illustrative. Based on the disclosure, one skilled in the art should appreciate that one aspect described herein may be implemented independently of any other aspects and that two or more of these aspects may be combined in various ways. For example, an apparatus may be implemented and/or a method practiced using any number of the aspects set forth herein. Additionally, such an apparatus may be implemented and/or such a method may be practiced using other structure and/or functionality in addition to one or more of the aspects set forth herein.
It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present disclosure, and the drawings only show the components related to the present disclosure rather than the number, shape and size of the components in actual implementation, and the type, amount and ratio of the components in actual implementation may be changed arbitrarily, and the layout of the components may be more complicated.
In addition, in the following description, specific details are provided to facilitate a thorough understanding of the examples. However, it will be understood by those skilled in the art that the aspects may be practiced without these specific details.
The embodiment of the disclosure provides a character string retrieval method based on coding. The encoding-based character string retrieval method provided by the present embodiment may be executed by a computing device, which may be implemented as software, or implemented as a combination of software and hardware, and may be integrally provided in a server, a client, or the like.
Referring to fig. 1, the method for retrieving a character string based on encoding in the embodiment of the present disclosure may include the following steps:
S101, calculating the length of the character string to obtain a character string length value L.
Taking columnar storage, which is common in the big data field, as an example, the data are as shown in Table 1:
Table 1: String storage list
The indexes of the character strings can be denoted I1, I2, I3, …, and the corresponding character string values can be denoted V1, V2, V3, …. The length value L of each character string can then be calculated; the length value indicates the number of characters included in the character string.
To obtain the length values, a character string reading function Length() may be set, and the length value of each character string is read through this function, for example, L1 = Length(V1) = 4. In this way, the lengths L2, L3, … of the other character strings can be obtained in sequence.
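As a small illustration of step S101, the sketch below computes length values for a column of strings. Only V1 = "Jack" comes from the description; the other column values are hypothetical, since Table 1 is not reproduced in this text. Using Python's built-in len, which counts Unicode code points rather than bytes, is an assumption about how "number of characters" is measured.

```python
# Hypothetical column: only V1 = "Jack" is taken from the description.
values = ["Jack", "Rose", "Tom"]


def length(value: str) -> int:
    """Return the number of characters (Unicode code points) in value."""
    return len(value)


# L1, L2, L3 for the column above.
lengths = [length(v) for v in values]
print(lengths)  # [4, 4, 3]
```

If the strings are stored as encoded bytes, the byte length may differ from the character count, so an implementation would need to pick one measure and use it consistently for both indexing and querying.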
S102, performing an N-value-based modulo operation on the sum of all character values in the character string to obtain a code value C of the character string.
In addition to the length of the character string, the sum of its character values is calculated; taking this sum modulo N gives the code value C of the character string. Here N is a natural number, and its value can be adjusted dynamically according to the number of character strings.
As an example, the code value may be calculated by setting a modulo operation function Encode(), such as: Encode(V1) = Encode("Jack") = (106 + 97 + 99 + 107) % 32 = C1. The code values of the other character strings can be calculated in the same manner and are denoted C2, C3, ….
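The worked example can be checked with a short sketch. It assumes that a "character value" is the character's code point as returned by Python's ord, which the description does not specify; the function name encode and the default N = 32 simply mirror the example. Note that the operand 106 in the example is the code of lowercase "j" (uppercase "J" is 74); because the two differ by exactly 32, both give the same remainder when N = 32.

```python
def encode(value: str, n: int = 32) -> int:
    """Sum the character values of value and reduce the sum modulo n."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return sum(ord(ch) for ch in value) % n


print((106 + 97 + 99 + 107) % 32)  # 25, the arithmetic from the example
print(encode("Jack"))              # 25, using ord("J") == 74
```

Because the code depends only on the sum of the character values, strings containing the same characters in a different order receive the same code, which is why the retrieval step still has to compare the candidate strings themselves.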
S103, constructing an index value list of the character string based on the character string length value L and the code value C of the character string.
After the length value of the character string and the code value of the character string are obtained, an index value list of the character string may be constructed based on the length value L of the character string and the code value C of the character string, see table 2, and an index structure of the index value list may be performed in the following manner:
(1) dividing the character strings into large groups, each containing the character strings with the same length L;
(2) in each large group, dividing character strings with the same C value into the same small group by taking the C value as a basis;
(3) within each small group, each C value corresponds to one or more string index values I (I1, I2, …), which may be the index values of the character strings in the original string storage list.
Table 2: List of string index values
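The three construction rules above can be sketched as a nested mapping from length L to code C to a list of original index values. This is only an illustration under stated assumptions: the contents of Table 2 are not reproduced in this text, the column values other than "Jack" are hypothetical, zero-based list positions stand in for I1, I2, …, and the names encode and build_index are chosen for this sketch.

```python
def encode(value: str, n: int = 32) -> int:
    """Code value C: sum of character values modulo n (assumes ord values)."""
    return sum(ord(ch) for ch in value) % n


def build_index(values, n: int = 32):
    """Group original positions first by length L, then by code value C."""
    index = {}
    for i, value in enumerate(values):
        small_groups = index.setdefault(len(value), {})      # large group for L
        small_groups.setdefault(encode(value, n), []).append(i)  # small group for C
    return index


# Hypothetical column; only "Jack" is taken from the description.
column = ["Jack", "Rose", "Tom", "Lily"]
print(build_index(column))
# {4: {25: [0, 1], 26: [3]}, 3: {16: [2]}}
```

In this sample, "Jack" and "Rose" have the same length and the same code, so they land in the same small group; the index narrows the candidates but does not by itself identify a string.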
S104, performing a retrieval operation on the character string based on the codes in the index value list.
With the index value list and its corresponding codes in place, a character string can be retrieved based on its code. Taking a query string T as an example, the length LT of T is calculated first, which narrows the query range to the large group corresponding to LT. The code CT of T is then calculated, and the index value list corresponding to CT is found within that group. Finally, the character strings at the positions given by those index values are compared with T to determine whether they are the content to be retrieved.
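A minimal sketch of this lookup follows, assuming the index is a nested mapping from length to code to original positions and that character values are Python ord values; the names encode, build_index and search and the sample column are hypothetical.

```python
def encode(value: str, n: int = 32) -> int:
    """Code value C: sum of character values modulo n (assumes ord values)."""
    return sum(ord(ch) for ch in value) % n


def build_index(values, n: int = 32):
    """Map length L -> code C -> list of original positions."""
    index = {}
    for i, value in enumerate(values):
        index.setdefault(len(value), {}).setdefault(encode(value, n), []).append(i)
    return index


def search(values, index, target: str, n: int = 32):
    """Return the original positions whose stored string equals target."""
    small_groups = index.get(len(target), {})            # narrow by LT
    candidates = small_groups.get(encode(target, n), [])  # narrow by CT
    # Full comparison only for the few candidates that share L and C.
    return [i for i in candidates if values[i] == target]


column = ["Jack", "Rose", "Tom", "Lily"]  # hypothetical data
index = build_index(column)
print(search(column, index, "Rose"))  # [1]
print(search(column, index, "kcaJ"))  # [] (same L and C as "Jack", different content)
```

The final equality check is what keeps the result correct when different strings share both length and code; the index only reduces how many full comparisons are needed.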
By the mode in the embodiment, the character string can be quickly positioned and retrieved, and the efficiency of character string retrieval is improved.
Referring to fig. 2, according to a specific implementation manner of the embodiment of the present disclosure, the calculating the length of the character string includes:
S201, setting a character string length calculation function.
S202, calculating the length of the character string based on the character string length calculation function.
In implementing steps S201 to S202, a character string reading function Length() may be set, and the length value of each character string may be read through this function, for example, L1 = Length(V1) = 4. In this way, the lengths L2, L3, … of the other character strings can be obtained in sequence.
According to a specific implementation manner of the embodiment of the present disclosure, the calculating the length of the character string based on the character string length calculation function includes: judging whether the character string contains an end character; and if so, counting the number of the characters in the character string before the end character to obtain the length of the character string.
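The end-character variant can be sketched as follows. The description does not say which end character is used or what happens when none is present; this sketch assumes a NUL byte (0) as in C-style strings and returns None when no end character is found. Both choices, and the name length_before_end, are assumptions for illustration.

```python
from typing import Optional


def length_before_end(buf: bytes, end: int = 0) -> Optional[int]:
    """Count the bytes that precede the first end character in buf.

    Returns None when buf contains no end character (an assumed behavior).
    """
    for count, byte in enumerate(buf):
        if byte == end:
            return count
    return None


print(length_before_end(b"Jack\x00padding"))  # 4
print(length_before_end(b"Jack"))             # None
```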
According to a specific implementation manner of the embodiment of the present disclosure, before performing an N-value-based modulo operation on a sum of all character values in the character string, the method further includes: setting in advance the N value used for the modulo calculation. Presetting the N value allows the small groups to be divided flexibly according to the number of character strings.
According to a specific implementation manner of the embodiment of the present disclosure, before performing an N-value-based modulo operation on a sum of all character values in the character string, the method further includes: and pre-calculating the sum of all character values in the character string.
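One way to read these two implementation manners together is that the character-value sums are computed once and then reused whenever N is adjusted. The sketch below illustrates this under assumptions: character values are Python ord values, the sample column is hypothetical apart from "Jack", and group_sizes is a helper invented for this illustration.

```python
from collections import Counter

# Hypothetical column; only "Jack" is taken from the description.
values = ["Jack", "Rose", "Lily", "Anna"]

# Pre-calculated once: the sum of all character values of each string.
char_sums = [sum(ord(ch) for ch in v) for v in values]


def group_sizes(sums, n: int) -> Counter:
    """Count how many strings fall into each code value C = sum % n."""
    return Counter(s % n for s in sums)


print(char_sums)                   # [377, 409, 410, 382]
print(group_sizes(char_sums, 1))   # every string shares code 0
print(group_sizes(char_sums, 32))  # codes 25 (twice), 26 and 30
```

With N = 1 every string falls into a single small group, so the code gives no filtering; a larger N spreads the strings over more small groups, at the cost of more groups to maintain.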
Referring to fig. 3, according to a specific implementation manner of the embodiment of the present disclosure, the constructing an index value list of the character string based on the character string length value L and the code value C of the character string includes:
S301, dividing the character strings with the same length value L into the same large group;
S302, in the large group, distributing the character strings with the same code value C into the same small group;
S303, setting an index value for each character string included in each small group.
In implementing steps S301 to S303, the character strings may be divided into large groups, each containing the character strings with the same length L; in each large group, character strings with the same C value are divided into the same small group; within each small group, each C value corresponds to one or more string index values I, which may be the index values of the character strings in the original string storage list.
According to a specific implementation manner of the embodiment of the present disclosure, the setting of the index value of the character string included in each group includes:
acquiring an original index value of the character string in a previous storage list;
and taking the original index value as the index value of the character string in the group.
Referring to fig. 4, according to a specific implementation manner of the embodiment of the present disclosure, the performing a retrieval operation on a string based on the encoding in the index value list includes:
S401, acquiring the length LT of a character string T to be retrieved so as to determine a large group corresponding to the character string based on the LT;
S402, calculating the code CT of the character string T to be retrieved, and finding an index value list corresponding to the CT in the group;
S403, comparing whether the character string at the corresponding position is the content to be retrieved or not according to the index value list corresponding to the found CT.
Corresponding to the above method embodiment, referring to fig. 5, the disclosed embodiment further provides an encoding-based character string retrieving apparatus 50, including:
The calculating module 501 is configured to calculate the length of the character string to obtain a length value L of the character string.
Taking columnar storage, which is common in the big data field, as an example, the data are as shown in Table 1.
The indexes of the character strings can be denoted I1, I2, I3, …, and the corresponding character string values can be denoted V1, V2, V3, …. The length value L of each character string can then be calculated; the length value indicates the number of characters included in the character string.
To obtain the length values, a character string reading function Length() may be set, and the length value of each character string is read through this function, for example, L1 = Length(V1) = 4. In this way, the lengths L2, L3, … of the other character strings can be obtained in sequence.
The modulo module 502 is configured to perform an N-value-based modulo operation on the sum of all character values in the character string to obtain a code value C of the character string.
In addition to the length of the character string, the sum of its character values is calculated; taking this sum modulo N gives the code value C of the character string. Here N is a natural number, and its value can be adjusted dynamically according to the number of character strings.
As an example, the code value may be calculated by setting a modulo operation function Encode(), such as: Encode(V1) = Encode("Jack") = (106 + 97 + 99 + 107) % 32 = C1. The code values of the other character strings can be calculated in the same manner and are denoted C2, C3, ….
A constructing module 503, configured to construct an index value list of the character string based on the character string length value L and the code value C of the character string.
After the length value of the character string and the code value of the character string are obtained, an index value list of the character string may be constructed based on the length value L of the character string and the code value C of the character string, see table 2, and an index structure of the index value list may be performed in the following manner:
(1) dividing the character strings into large groups, each containing the character strings with the same length L;
(2) in each large group, dividing character strings with the same C value into the same small group by taking the C value as a basis;
(3) within each small group, each C value corresponds to one or more string index values I (I1, I2, …), which may be the index values of the character strings in the original string storage list.
A retrieving module 504, configured to perform a retrieving operation on the string based on the encoding in the index value list.
With the index value list and its corresponding codes in place, a character string can be retrieved based on its code. Taking a query string T as an example, the length LT of T is calculated first, which narrows the query range to the large group corresponding to LT. The code CT of T is then calculated, and the index value list corresponding to CT is found within that group. Finally, the character strings at the positions given by those index values are compared with T to determine whether they are the content to be retrieved.
By the mode in the embodiment, the character string can be quickly positioned and retrieved, and the efficiency of character string retrieval is improved.
For parts not described in detail in this embodiment, reference is made to the contents described in the above method embodiments, which are not described again here.
Referring to fig. 6, an embodiment of the present disclosure also provides an electronic device 60, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the encoding-based string retrieval method of the foregoing method embodiments.
The disclosed embodiments also provide a non-transitory computer-readable storage medium storing computer instructions for causing the computer to execute the encoding-based string retrieval method in the aforementioned method embodiments.
The disclosed embodiments also provide a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions that, when executed by a computer, cause the computer to perform the encoding-based string retrieval method in the aforementioned method embodiments.
Referring now to FIG. 6, a schematic diagram of an electronic device 60 suitable for use in implementing embodiments of the present disclosure is shown. The electronic devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., car navigation terminals), and the like, and fixed terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in FIG. 6, the electronic device 60 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 601 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage means 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the electronic apparatus 60 are also stored. The processing device 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
Generally, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touch pad, keyboard, mouse, image sensor, microphone, accelerometer, gyroscope, etc.; output devices 607 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 608 including, for example, tape, hard disk, etc.; and a communication device 609. The communication means 609 may allow the electronic device 60 to communicate with other devices wirelessly or by wire to exchange data. While the figures illustrate an electronic device 60 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 609, or may be installed from the storage means 608, or may be installed from the ROM 602. The computer program, when executed by the processing device 601, performs the above-described functions defined in the methods of the embodiments of the present disclosure.
It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquiring at least two internet protocol addresses; sending a node evaluation request comprising the at least two internet protocol addresses to node evaluation equipment, wherein the node evaluation equipment selects the internet protocol addresses from the at least two internet protocol addresses and returns the internet protocol addresses; receiving an internet protocol address returned by the node evaluation equipment; wherein the obtained internet protocol address indicates an edge node in the content distribution network.
Alternatively, the computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: receiving a node evaluation request comprising at least two internet protocol addresses; selecting an internet protocol address from the at least two internet protocol addresses; returning the selected internet protocol address; wherein the received internet protocol address indicates an edge node in the content distribution network.
Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including object oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented in software or in hardware. In some cases, the name of a unit does not constitute a limitation of the unit itself; for example, the first retrieving unit may also be described as a "unit for retrieving at least two internet protocol addresses".
It should be understood that portions of the present disclosure may be implemented in hardware, software, firmware, or a combination thereof.
The above description is only for the specific embodiments of the present disclosure, but the scope of the present disclosure is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present disclosure should be covered within the scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.
Claims (11)
1. An encoding-based string retrieval method, comprising:
calculating the length of the character string to obtain a length value L of the character string;
performing a modulo operation based on an N value on the sum of all character values in the character string to obtain a code value C of the character string;
constructing an index value list of the character string based on the character string length value L and the code value C of the character string;
performing a retrieval operation on the character string based on the encoding in the index value list.
2. The method of claim 1, wherein calculating the length of the string comprises:
setting a character string length calculation function;
calculating the length of the character string based on the character string length calculation function.
3. The method of claim 2, wherein calculating the length of the character string based on the character string length calculation function comprises:
judging whether the character string contains an end character;
if so, counting the number of characters in the character string that precede the end character to obtain the length of the character string.
4. The method of claim 1, wherein prior to performing the N-value based modulo operation on the sum of all character values in the string, the method further comprises:
setting in advance the N value used for the modulo operation.
5. The method of claim 1, wherein prior to performing the N-value based modulo operation on the sum of all character values in the string, the method further comprises:
pre-calculating the sum of all character values in the character string.
6. The method according to claim 1, wherein constructing the list of index values of the character string based on the character string length value L and the code value C of the character string comprises:
dividing character strings having the same length value L into the same large group;
within the large group, assigning character strings having the same code value C to the same small group;
setting index values for the character strings contained in each small group.
7. The method according to claim 6, wherein setting the index values for the character strings contained in each small group comprises:
acquiring an original index value of the character string in a previous storage list;
taking the original index value as the index value of the character string in the small group.
8. The method of claim 6, wherein performing a retrieval operation on the character string based on the encoding in the index value list comprises:
acquiring the length LT of a character string T to be retrieved, so as to determine, based on LT, the large group corresponding to the character string T;
calculating the code CT of the character string T to be retrieved, and finding, within the large group, the index value list corresponding to CT;
comparing, according to the index value list found for CT, whether the character string at each corresponding position is the content to be retrieved.
9. An encoding-based character string retrieval apparatus, comprising:
a calculation module, configured to calculate the length of the character string to obtain a length value L of the character string;
a modulo module, configured to perform a modulo operation based on an N value on the sum of all character values in the character string to obtain a code value C of the character string;
a construction module, configured to construct an index value list of the character string based on the character string length value L and the code value C of the character string;
a retrieval module, configured to perform a retrieval operation on the character string based on the encoding in the index value list.
10. An electronic device, characterized in that the electronic device comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the encoding-based string retrieval method of any of the preceding claims 1-8.
11. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the encoding-based string retrieval method of any one of the preceding claims 1-8.
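For illustration only, the method of claims 1 to 8 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation of the disclosure: the function names, the use of `ord()` as the character value, the choice of "\0" as the end character, and the use of list positions as the original index values are all hypothetical choices made for this example.

```python
from collections import defaultdict


def string_length(s, terminator="\0"):
    """Length value L: count the characters before the end character, if present.

    The "\0" terminator is an assumption made for this sketch.
    """
    pos = s.find(terminator)
    return len(s) if pos == -1 else pos


def code_value(s, n):
    """Code value C: sum of all character values modulo N.

    ord() is assumed here as the character value.
    """
    length = string_length(s)
    return sum(ord(ch) for ch in s[:length]) % n


def build_index(strings, n):
    """Build the index value list.

    Large groups are keyed by L, small groups by C; each small group stores
    the positions of its strings in the original storage list.
    """
    index = defaultdict(lambda: defaultdict(list))
    for position, s in enumerate(strings):
        index[string_length(s)][code_value(s, n)].append(position)
    return index


def search(strings, index, target, n):
    """Retrieve the original positions whose content equals the target."""
    lt = string_length(target)   # selects the large group
    ct = code_value(target, n)   # selects the small group inside it
    candidates = index.get(lt, {}).get(ct, [])
    wanted = target[:lt]
    # Different strings can share both L and C, so compare the content itself.
    return [i for i in candidates
            if strings[i][:string_length(strings[i])] == wanted]
```

With the hypothetical list `["cat", "act", "dog", "bird", "cat"]` and N = 16, "cat" and "act" fall into the same small group (L = 3, C = 8), so the final content comparison is what separates them: searching for "cat" returns positions 0 and 4, and searching for "tac" returns nothing.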
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010078611.3A CN111309988B (en) | 2020-02-03 | 2020-02-03 | Character string retrieval method and device based on coding and electronic equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111309988A (en) | 2020-06-19 |
CN111309988B (en) | 2023-05-02 |
Family
ID=71161635
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010078611.3A Active CN111309988B (en) | 2020-02-03 | 2020-02-03 | Character string retrieval method and device based on coding and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111309988B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101165681A (en) * | 2006-10-17 | 2008-04-23 | 中兴通讯股份有限公司 | Character string matching information processing method in communication system |
US20150379127A1 (en) * | 2014-06-27 | 2015-12-31 | Gerd Mueller | Fuzzy substring search |
JP2018081611A (en) * | 2016-11-18 | 2018-05-24 | 日本電信電話株式会社 | Dictionary search method, device, and program |
CN109902142A (en) * | 2019-02-27 | 2019-06-18 | 西安电子科技大学 | A kind of character string fuzzy matching and querying method based on editing distance |
CN110572161A (en) * | 2019-09-10 | 2019-12-13 | 北京中科寒武纪科技有限公司 | data encoding method and device, computer equipment and readable storage medium |
Non-Patent Citations (1)
Title |
---|
于长永; 高明; 柏禄一; 赵宇海: "A string indexing method with length and position constraints" * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111782895A (en) * | 2020-07-02 | 2020-10-16 | 北京字节跳动网络技术有限公司 | Retrieval processing method and device, readable medium and electronic equipment |
CN111782895B (en) * | 2020-07-02 | 2024-03-19 | 北京字节跳动网络技术有限公司 | Retrieval processing method and device, readable medium and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
CN111309988B (en) | 2023-05-02 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||