CN111309988A - Character string retrieval method and device based on coding and electronic equipment - Google Patents

Publication number
CN111309988A
Authority
CN
China
Prior art keywords
character string
value
character
length
string
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010078611.3A
Other languages
Chinese (zh)
Other versions
CN111309988B (en)
Inventor
李育国
刘建辉
舒彦博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing ByteDance Network Technology Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co Ltd
Priority to CN202010078611.3A
Publication of CN111309988A
Application granted
Publication of CN111309988B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G06F16/90344 Query processing by using string matching techniques
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The embodiment of the disclosure provides a character string retrieval method and device based on coding and electronic equipment, belonging to the technical field of data processing, wherein the method comprises the following steps: calculating the length of the character string to obtain a length value L of the character string; performing a modulo operation based on an N value on the sum of all character values in the character string to obtain a coded value C of the character string; constructing an index value list of the character string based on the character string length value L and the code value C of the character string; based on the encoding in the list of index values, a retrieval operation is performed on the string. By the processing scheme, the retrieval efficiency of the character strings is improved.

Description

Character string retrieval method and device based on coding and electronic equipment
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to a method and an apparatus for retrieving a character string based on encoding, and an electronic device.
Background
In the computer field, online analytical processing (OLAP) refers to analytical queries on multidimensional data. Online transaction processing (OLTP) is the primary application of traditional relational databases and is used mainly for basic, everyday transactions such as banking transactions. OLAP is the primary application of data warehouse systems; it supports complex analytical operations, emphasizes decision support, and provides intuitive, easily understood query results.
In general, OLAP is classified into ROLAP and MOLAP. ROLAP refers to relational OLAP, that is, multidimensional modeling on top of a relational database, such as MDR modeling with IBM Cognos Framework Manager. MOLAP is based directly on a multidimensional database; examples include TM1, Essbase, and PowerCube. An OLAP tool is a type of data analysis software. TM1 is a tool based directly on a multidimensional database; in ROLAP tools, the multidimensional model or view is built on a relational database, and the tool provides a layer of encapsulation so that the relational database can support multidimensional queries. In essence, however, a ROLAP query is still a relational query.
In big data scenarios such as OLAP, large numbers of character string comparisons are time-consuming. Because character strings are of variable length, existing comparison algorithms are inherently inefficient and cannot take advantage of advanced features of existing hardware such as parallel instructions.
Disclosure of Invention
In view of the above, embodiments of the present disclosure provide a method and an apparatus for searching character strings based on encoding, and an electronic device, so as to at least partially solve the problems in the prior art.
In a first aspect, an embodiment of the present disclosure provides a character string retrieval method based on encoding, including:
calculating the length of the character string to obtain a length value L of the character string;
performing a modulo operation based on an N value on the sum of all character values in the character string to obtain a coded value C of the character string;
constructing an index value list of the character string based on the character string length value L and the code value C of the character string;
based on the encoding in the list of index values, a retrieval operation is performed on the string.
According to a specific implementation manner of the embodiment of the present disclosure, the calculating the length of the character string includes:
setting a character string length calculation function;
calculating the length of the character string based on the character string length calculation function.
According to a specific implementation manner of the embodiment of the present disclosure, the calculating the length of the character string based on the character string length calculation function includes:
judging whether the character string contains an end character;
and if so, counting the number of the characters in the character string before the end character to obtain the length of the character string.
According to a specific implementation manner of the embodiment of the present disclosure, before performing an N-value-based modulo operation on a sum of all character values in the character string, the method further includes:
the N value for the modulo calculation is set in advance.
According to a specific implementation manner of the embodiment of the present disclosure, before performing an N-value-based modulo operation on a sum of all character values in the character string, the method further includes:
and pre-calculating the sum of all character values in the character string.
According to a specific implementation manner of the embodiment of the present disclosure, the constructing an index value list of the character string based on the character string length value L and the code value C of the character string includes:
dividing the character strings with the same length value L into the same large group;
within each large group, assigning the character strings with the same code value C to the same small group;
setting an index value for the character strings included in each small group.
According to a specific implementation manner of the embodiment of the present disclosure, the setting of the index value of the character string included in each group includes:
acquiring an original index value of the character string in a previous storage list;
and taking the original index value as the index value of the character string in the group.
According to a specific implementation manner of the embodiment of the present disclosure, the performing a retrieval operation on a string based on an encoding in the index value list includes:
acquiring the length LT of a character string T to be retrieved so as to determine a large group corresponding to the character string based on the LT;
calculating the code CT of the character string T to be retrieved, and finding an index value list corresponding to the CT in the group;
and comparing whether the character string at the corresponding position is the content to be retrieved or not according to the index value list corresponding to the found CT.
In a second aspect, an embodiment of the present disclosure provides an encoding-based character string retrieving apparatus, including:
the calculation module is used for calculating the length of the character string to obtain a length value L of the character string;
the modulo module is used for performing an N-value-based modulo operation on the sum of all character values in the character string to obtain a code value C of the character string;
the construction module is used for constructing an index value list of the character string based on the character string length value L and the code value C of the character string;
and the retrieval module is used for executing retrieval operation on the character string based on the codes in the index value list.
In a third aspect, an embodiment of the present disclosure further provides an electronic device, where the electronic device includes:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of the first aspect or any implementation of the first aspect.
In a fourth aspect, the disclosed embodiments also provide a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the encoding-based string retrieval method in the first aspect or any implementation manner of the first aspect.
In a fifth aspect, the present disclosure also provides a computer program product, which includes a computer program stored on a non-transitory computer-readable storage medium, where the computer program includes program instructions, and when the program instructions are executed by a computer, the computer is caused to execute the encoding-based character string retrieval method in the foregoing first aspect or any implementation manner of the first aspect.
The character string retrieval scheme based on the coding in the embodiment of the disclosure comprises the steps of calculating the length of a character string to obtain a length value L of the character string; performing a modulo operation based on an N value on the sum of all character values in the character string to obtain a coded value C of the character string; constructing an index value list of the character string based on the character string length value L and the code value C of the character string; based on the encoding in the list of index values, a retrieval operation is performed on the string. By adopting the processing scheme disclosed by the invention and adopting a specific coding method, the retrieval target can be quickly positioned, and the retrieval efficiency of the character string is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings needed to be used in the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present disclosure, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a flowchart of a method for retrieving a character string based on encoding according to an embodiment of the present disclosure;
Fig. 2 is a schematic diagram of a character string retrieval method based on encoding according to an embodiment of the present disclosure;
Fig. 3 is a flowchart of another method for retrieving a character string based on encoding according to an embodiment of the present disclosure;
Fig. 4 is a flowchart of another method for retrieving a character string based on encoding according to an embodiment of the present disclosure;
Fig. 5 is a schematic structural diagram of an encoding-based character string retrieval apparatus according to an embodiment of the present disclosure;
Fig. 6 is a schematic diagram of an electronic device provided in an embodiment of the present disclosure.
Detailed Description
The embodiments of the present disclosure are described in detail below with reference to the accompanying drawings.
The embodiments of the present disclosure are described below with specific examples, and other advantages and effects of the present disclosure will be readily apparent to those skilled in the art from the disclosure in the specification. It is to be understood that the described embodiments are merely illustrative of some, and not restrictive, of the embodiments of the disclosure. The disclosure may be embodied or carried out in various other specific embodiments, and various modifications and changes may be made in the details within the description without departing from the spirit of the disclosure. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.
It is noted that various aspects of the embodiments are described below within the scope of the appended claims. It should be apparent that the aspects described herein may be embodied in a wide variety of forms and that any specific structure and/or function described herein is merely illustrative. Based on the disclosure, one skilled in the art should appreciate that one aspect described herein may be implemented independently of any other aspects and that two or more of these aspects may be combined in various ways. For example, an apparatus may be implemented and/or a method practiced using any number of the aspects set forth herein. Additionally, such an apparatus may be implemented and/or such a method may be practiced using other structure and/or functionality in addition to one or more of the aspects set forth herein.
It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present disclosure, and the drawings only show the components related to the present disclosure rather than the number, shape and size of the components in actual implementation, and the type, amount and ratio of the components in actual implementation may be changed arbitrarily, and the layout of the components may be more complicated.
In addition, in the following description, specific details are provided to facilitate a thorough understanding of the examples. However, it will be understood by those skilled in the art that the aspects may be practiced without these specific details.
The embodiment of the disclosure provides a character string retrieval method based on coding. The encoding-based character string retrieval method provided by the present embodiment may be executed by a computing device, which may be implemented as software, or implemented as a combination of software and hardware, and may be integrally provided in a server, a client, or the like.
Referring to fig. 1, the method for retrieving a character string based on encoding in the embodiment of the present disclosure may include the following steps:
S101, calculating the length of the character string to obtain a character string length value L.
Take columnar storage, which is common in the big data field, as an example; see Table 1 for sample data:
Table 1 String storage list
[Table 1 appears as an image in the original publication and is not reproduced here.]
The indexes of the character strings can be represented by I1, I2, I3, …, and the corresponding character string values by V1, V2, V3, …. The length value L of each character string can then be calculated; the length value indicates the number of characters included in the character string.
To obtain the length value, a character string length function Length() may be set, and the length value of each character string is read through this function, for example, L1 = Length(V1) = 4. The lengths L2, L3, … of the other character strings can be obtained in the same way.
S102, performing an N-value-based modulo operation on the sum of all character values in the character string to obtain a code value C of the character string.
In addition to the length of the character string, the sum of its character values is calculated; the code value C is obtained by adding up the values of all characters in the string and taking the result modulo N. N is a natural number, and its value can be adjusted dynamically according to the number of character strings.
As an example, the code value may be calculated by setting a modulo operation function Encode(), such as: Encode(V1) = Encode("Jack") = (106 + 97 + 99 + 107) % 32 = C1. The code values of the other character strings can be calculated in the same way and are denoted C2, C3, ….
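The two functions described in S101 and S102 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes that a "character value" is the Unicode code point returned by Python's `ord`, which the source does not specify, and the lowercase names `length` and `encode` are stand-ins for Length() and Encode(). Note that 106 is the code of lowercase "j" while uppercase "J" is 74; because the two differ by exactly 32, both spellings give the same code value when N = 32.

```python
def length(value: str) -> int:
    """Length value L: the number of characters in the string."""
    return len(value)


def encode(value: str, n: int = 32) -> int:
    """Code value C: sum of the character values, taken modulo N.

    Assumption: a character value is its Unicode code point (ord).
    """
    if n <= 0:
        raise ValueError("N must be a positive integer")
    return sum(ord(ch) for ch in value) % n


print(length("Jack"))  # 4
print(encode("jack"))  # (106 + 97 + 99 + 107) % 32 = 25
print(encode("Jack"))  # (74 + 97 + 99 + 107) % 32 = 25
```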
S103, constructing an index value list of the character string based on the character string length value L and the code value C of the character string.
After the length value and the code value of the character string are obtained, an index value list of the character string may be constructed based on the length value L and the code value C (see Table 2). The index value list may be constructed as follows:
(1) dividing the character strings into large groups so that strings of equal length L fall into the same large group;
(2) within each large group, dividing the character strings with the same C value into the same small group;
(3) within each small group, the C value corresponds to one or more character string index values I, which may be the index values of the character strings in the original string storage list.
Table 2 List of character string index values
[Table 2 appears as an image in the original publication and is not reproduced here.]
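One possible in-memory layout for the index value list is a two-level mapping from length L to code C to a list of original index values. The sketch below is an assumption about the data structure, since Table 2 is only available as an image; the sample column, the function name `build_index`, and the use of 0-based list positions as index values are all hypothetical.

```python
def encode(value, n=32):
    """Code value C: sum of character code points modulo N (assumed encoding)."""
    return sum(ord(ch) for ch in value) % n


def build_index(values, n=32):
    """Return {L: {C: [original index values]}} for a list of strings."""
    index = {}
    for position, value in enumerate(values):
        large_group = index.setdefault(len(value), {})     # step (1): group by L
        small_group = large_group.setdefault(encode(value, n), [])  # step (2): group by C
        small_group.append(position)                       # step (3): keep original index
    return index


values = ["Jack", "Rose", "Tom", "Jakc", "Anna"]  # hypothetical sample column
print(build_index(values))
# {4: {25: [0, 1, 3], 30: [4]}, 3: {16: [2]}}
```

Strings that share both L and C ("Jack", "Rose", and "Jakc" here) end up in the same small group, so the list under one C value can hold several index values.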
S104, based on the codes in the index value list, the character string is searched.
With the index value list and the codes it contains, a retrieval operation can be performed on a character string based on its code. Taking a query string T as an example, the length LT of T is calculated first, which narrows the query range to the large group corresponding to LT. Then the code CT of T is calculated, and the index value list corresponding to CT is found within that group. Finally, the character strings at the positions given by those index values are compared with T to determine whether they are the content to be retrieved.
With the approach of this embodiment, character strings can be located and retrieved quickly, which improves the efficiency of character string retrieval.
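The lookup described in S104 can be sketched as below. The nested-dictionary index, the helper names, and the sample data are assumptions made for illustration; the source describes the steps but not a concrete data structure.

```python
def encode(value, n=32):
    """Code value C: sum of character code points modulo N (assumed encoding)."""
    return sum(ord(ch) for ch in value) % n


def build_index(values, n=32):
    """Return {L: {C: [original index values]}}."""
    index = {}
    for position, value in enumerate(values):
        index.setdefault(len(value), {}).setdefault(encode(value, n), []).append(position)
    return index


def retrieve(values, index, target, n=32):
    """Return the original index values whose string equals `target`."""
    large_group = index.get(len(target))         # narrow by length LT
    if large_group is None:
        return []
    candidates = large_group.get(encode(target, n), [])  # narrow by code CT
    # Final comparison: only the candidates are compared character by character.
    return [position for position in candidates if values[position] == target]


values = ["Jack", "Rose", "Tom", "Jack", "Anna"]  # hypothetical sample column
index = build_index(values)
print(retrieve(values, index, "Jack"))  # [0, 3]
print(retrieve(values, index, "Bob"))   # []
```

Only strings that already agree on both L and C reach the full comparison, which is where the reduction in comparison work comes from.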
Referring to fig. 2, according to a specific implementation manner of the embodiment of the present disclosure, the calculating the length of the character string includes:
S201, setting a character string length calculation function.
S202, calculating the length of the character string based on the character string length calculation function.
In implementing steps S201 to S202, a character string length function Length() may be set, and the length value of each character string may be read through this function, for example, L1 = Length(V1) = 4. The lengths L2, L3, … of the other character strings can be obtained in the same way.
According to a specific implementation manner of the embodiment of the present disclosure, the calculating the length of the character string based on the character string length calculation function includes: judging whether the character string contains an end character; and if so, counting the number of the characters in the character string before the end character to obtain the length of the character string.
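For strings stored in a raw buffer, counting the characters before an end character might look like the sketch below. It assumes a byte buffer with a NUL (0) terminator, as in C-style strings; the source does not name the end character, and it does not say what happens when none is present, so falling back to the full buffer length is also an assumption.

```python
def length_before_terminator(buffer: bytes, terminator: int = 0) -> int:
    """Count the bytes that precede the first end character.

    Assumption: if no end character is found, the whole buffer is the string.
    """
    position = buffer.find(bytes([terminator]))
    if position == -1:
        return len(buffer)
    return position


print(length_before_terminator(b"Jack\x00padding"))  # 4
print(length_before_terminator(b"Jack"))             # 4
```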
According to a specific implementation manner of the embodiment of the present disclosure, before performing an N-value-based modulo operation on a sum of all character values in the character string, the method further includes: presetting the N value used for the modulo calculation. By presetting the N value, the small groups can be divided flexibly according to the number of character strings.
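The effect of N on how finely the small groups are divided can be seen with a short experiment. The sample strings and the helper name `subgroup_sizes` are hypothetical, and the source gives no rule for choosing N; this sketch only shows that a larger N tends to spread strings of one length over more, smaller groups.

```python
from collections import Counter


def encode(value: str, n: int) -> int:
    """Code value C: sum of character code points modulo N (assumed encoding)."""
    return sum(ord(ch) for ch in value) % n


def subgroup_sizes(values, n):
    """Count how many strings share each code value C for a given N."""
    return Counter(encode(value, n) for value in values)


values = ["Jack", "Rose", "Anna", "Mike", "Lucy"]  # all of length 4
for n in (1, 32, 64):
    sizes = subgroup_sizes(values, n)
    print(n, len(sizes), max(sizes.values()))
# 1 1 5
# 32 4 2
# 64 5 1
```

With N = 1 every string falls into a single small group and the code gives no filtering; with N = 64 each of these five strings has its own group.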
According to a specific implementation manner of the embodiment of the present disclosure, before performing an N-value-based modulo operation on a sum of all character values in the character string, the method further includes: and pre-calculating the sum of all character values in the character string.
Referring to fig. 3, according to a specific implementation manner of the embodiment of the present disclosure, the constructing an index value list of the character string based on the character string length value L and the code value C of the character string includes:
S301, dividing the character strings with the same length value L into the same large group;
S302, within each large group, assigning the character strings with the same code value C to the same small group;
S303, setting an index value for each character string included in each small group.
In implementing steps S301 to S303, the character strings may be divided into large groups so that strings of equal length L fall into the same large group; within each large group, the character strings with the same C value are divided into the same small group; within each small group, the C value corresponds to one or more character string index values I, which may be the index values of the character strings in the original string storage list.
According to a specific implementation manner of the embodiment of the present disclosure, the setting of the index value of the character string included in each group includes:
acquiring an original index value of the character string in a previous storage list;
and taking the original index value as the index value of the character string in the group.
Referring to fig. 4, according to a specific implementation manner of the embodiment of the present disclosure, the performing a retrieval operation on a string based on the encoding in the index value list includes:
S401, acquiring the length LT of a character string T to be retrieved so as to determine a large group corresponding to the character string based on the LT;
S402, calculating the code CT of the character string T to be retrieved, and finding an index value list corresponding to the CT in the group;
S403, comparing whether the character string at the corresponding position is the content to be retrieved or not according to the index value list corresponding to the found CT.
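The comparison in S403 is needed because different strings can share the same length and the same code value: a sum of character values ignores character order, and the modulo operation folds different sums together. A small sketch, assuming code points as character values and N = 32 (the helper `is_candidate` and the sample strings are hypothetical):

```python
def encode(value: str, n: int = 32) -> int:
    """Code value C: sum of character code points modulo N (assumed encoding)."""
    return sum(ord(ch) for ch in value) % n


def is_candidate(stored: str, target: str, n: int = 32) -> bool:
    """True when `stored` falls in the same (L, C) small group as `target`."""
    return len(stored) == len(target) and encode(stored, n) == encode(target, n)


target = "Jack"
for stored in ("Jack", "Jakc", "Rose", "Anna"):
    # Columns: stored string, shares the small group, is an exact match
    print(stored, is_candidate(stored, target), stored == target)
# Jack True True
# Jakc True False
# Rose True False
# Anna False False
```

"Jakc" is an anagram of the target and "Rose" happens to have the same character sum, so both are candidates that only the final string comparison can reject.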
Corresponding to the above method embodiment, referring to fig. 5, the disclosed embodiment further provides an encoding-based character string retrieving apparatus 50, including:
A calculating module 501, configured to calculate the length of the character string to obtain a length value L of the character string.
Take columnar storage, which is common in the big data field, as an example (see Table 1).
The indexes of the character strings can be represented by I1, I2, I3, …, and the corresponding character string values by V1, V2, V3, …. The length value L of each character string can then be calculated; the length value indicates the number of characters included in the character string.
To obtain the length value, a character string length function Length() may be set, and the length value of each character string is read through this function, for example, L1 = Length(V1) = 4. The lengths L2, L3, … of the other character strings can be obtained in the same way.
A modulo module 502, configured to perform an N-value-based modulo operation on the sum of all character values in the character string to obtain a code value C of the character string.
In addition to the length of the character string, the sum of its character values is calculated; the code value C is obtained by adding up the values of all characters in the string and taking the result modulo N. N is a natural number, and its value can be adjusted dynamically according to the number of character strings.
As an example, the code value may be calculated by setting a modulo operation function Encode(), such as: Encode(V1) = Encode("Jack") = (106 + 97 + 99 + 107) % 32 = C1. The code values of the other character strings can be calculated in the same way and are denoted C2, C3, ….
A constructing module 503, configured to construct an index value list of the character string based on the character string length value L and the code value C of the character string.
After the length value and the code value of the character string are obtained, an index value list of the character string may be constructed based on the length value L and the code value C (see Table 2). The index value list may be constructed as follows:
(1) dividing the character strings into large groups so that strings of equal length L fall into the same large group;
(2) within each large group, dividing the character strings with the same C value into the same small group;
(3) within each small group, the C value corresponds to one or more character string index values I, which may be the index values of the character strings in the original string storage list.
A retrieving module 504, configured to perform a retrieving operation on the string based on the encoding in the index value list.
With the index value list and the codes it contains, a retrieval operation can be performed on a character string based on its code. Taking a query string T as an example, the length LT of T is calculated first, which narrows the query range to the large group corresponding to LT. Then the code CT of T is calculated, and the index value list corresponding to CT is found within that group. Finally, the character strings at the positions given by those index values are compared with T to determine whether they are the content to be retrieved.
With the approach of this embodiment, character strings can be located and retrieved quickly, which improves the efficiency of character string retrieval.
For parts not described in detail in this embodiment, reference is made to the contents described in the above method embodiments, which are not described again here.
Referring to fig. 6, an embodiment of the present disclosure also provides an electronic device 60, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the encoding-based string retrieval method of the foregoing method embodiments.
The disclosed embodiments also provide a non-transitory computer-readable storage medium storing computer instructions for causing the computer to execute the encoding-based string retrieval method in the aforementioned method embodiments.
The disclosed embodiments also provide a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions that, when executed by a computer, cause the computer to perform the encoding-based string retrieval method in the aforementioned method embodiments.
Referring now to FIG. 6, a schematic diagram of an electronic device 60 suitable for use in implementing embodiments of the present disclosure is shown. The electronic devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., car navigation terminals), and the like, and fixed terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 6, the electronic device 60 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 601 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage means 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the electronic apparatus 60 are also stored. The processing device 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
Generally, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touch pad, keyboard, mouse, image sensor, microphone, accelerometer, gyroscope, etc.; output devices 607 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 608 including, for example, tape, hard disk, etc.; and a communication device 609. The communication means 609 may allow the electronic device 60 to communicate with other devices wirelessly or by wire to exchange data. While the figures illustrate an electronic device 60 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 609, or may be installed from the storage means 608, or may be installed from the ROM 602. The computer program, when executed by the processing device 601, performs the above-described functions defined in the methods of the embodiments of the present disclosure.
It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquire at least two internet protocol addresses; send a node evaluation request comprising the at least two internet protocol addresses to node evaluation equipment, wherein the node evaluation equipment selects an internet protocol address from the at least two internet protocol addresses and returns it; and receive the internet protocol address returned by the node evaluation equipment; wherein the obtained internet protocol address indicates an edge node in the content distribution network.
Alternatively, the computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: receive a node evaluation request comprising at least two internet protocol addresses; select an internet protocol address from the at least two internet protocol addresses; and return the selected internet protocol address; wherein the received internet protocol address indicates an edge node in the content distribution network.
Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, or C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. The name of a unit does not, in some cases, constitute a limitation of the unit itself; for example, the first retrieving unit may also be described as a "unit for retrieving at least two internet protocol addresses".
It should be understood that portions of the present disclosure may be implemented in hardware, software, firmware, or a combination thereof.
The above description is only for the specific embodiments of the present disclosure, but the scope of the present disclosure is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present disclosure should be covered within the scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (11)

1. An encoding-based string retrieval method, comprising:
calculating the length of the character string to obtain a length value L of the character string;
performing a modulo operation based on an N value on the sum of all character values in the character string to obtain a coded value C of the character string;
constructing an index value list of the character string based on the character string length value L and the code value C of the character string;
performing a retrieval operation on the character string based on the encoding in the index value list.
2. The method of claim 1, wherein calculating the length of the string comprises:
setting a character string length calculation function;
calculating the length of the character string based on the character string length calculation function.
3. The method of claim 2, wherein calculating the length of the character string based on the character string length calculation function comprises:
judging whether the character string contains an end character;
and if so, counting the number of the characters in the character string before the end character to obtain the length of the character string.
4. The method of claim 1, wherein prior to performing the N-value based modulo operation on the sum of all character values in the string, the method further comprises:
presetting the N value used for the modulo operation.
5. The method of claim 1, wherein prior to performing the N-value based modulo operation on the sum of all character values in the string, the method further comprises:
and pre-calculating the sum of all character values in the character string.
6. The method according to claim 1, wherein constructing the list of index values of the character string based on the character string length value L and the code value C of the character string comprises:
dividing the character strings with the same length value L into the same large group;
within the large group, assigning character strings with the same code value C to the same small group;
setting index values for the character strings contained in each small group.
7. The method according to claim 6, wherein setting the index values for the character strings contained in each small group comprises:
acquiring an original index value of the character string in a previous storage list;
and taking the original index value as the index value of the character string in the small group.
8. The method of claim 6, wherein performing a search operation on a string based on the encoding in the list of index values comprises:
acquiring the length LT of a character string T to be retrieved, so as to determine the large group corresponding to the character string T based on LT;
calculating the code CT of the character string T to be retrieved, and finding the index value list corresponding to CT within the large group;
and comparing, according to the index value list found for CT, whether the character string at each corresponding position is the content to be retrieved.
9. An encoding-based character string retrieval apparatus, comprising:
the calculation module is used for calculating the length of the character string to obtain a length value L of the character string;
the modulo module is used for performing a modulo operation based on the N value on the sum of all character values in the character string to obtain a coded value C of the character string;
the construction module is used for constructing an index value list of the character string based on the character string length value L and the code value C of the character string;
and the retrieval module is used for executing retrieval operation on the character string based on the codes in the index value list.
10. An electronic device, characterized in that the electronic device comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the encoding-based string retrieval method of any of the preceding claims 1-8.
11. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the encoding-based string retrieval method of any one of the preceding claims 1-8.
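
By way of non-limiting illustration only, the steps recited in claims 1 and 6 to 8 might be sketched as follows. The sketch is not part of the claims. It assumes that the "character value" is the Unicode code point returned by Python's `ord`, that the "previous storage list" is an ordinary Python list, and that the function names `encode`, `build_index`, and `retrieve` are hypothetical names chosen for this example; the value of N is left to the caller.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def encode(s: str, n: int) -> Tuple[int, int]:
    """Return (L, C): the string length and the sum of its character
    values modulo n. Using ord() as the character value is an assumption."""
    return len(s), sum(ord(ch) for ch in s) % n


def build_index(strings: List[str], n: int) -> Dict[int, Dict[int, List[int]]]:
    """Group strings by length L (large group), then by code C (small group).
    Each small group stores the original positions in `strings`."""
    index: Dict[int, Dict[int, List[int]]] = defaultdict(lambda: defaultdict(list))
    for original_index, s in enumerate(strings):
        length, code = encode(s, n)
        index[length][code].append(original_index)
    return index


def retrieve(target: str, strings: List[str],
             index: Dict[int, Dict[int, List[int]]], n: int) -> List[int]:
    """Look up the large group by LT and the small group by CT, then compare
    the candidate strings with the target to rule out collisions."""
    lt, ct = encode(target, n)
    candidates = index.get(lt, {}).get(ct, [])
    return [i for i in candidates if strings[i] == target]
```

Because the code value depends only on the sum of character values, distinct strings of equal length can share a small group (for example, anagrams such as "cat" and "act"), which is why the final comparison step of claim 8 is still needed in this sketch.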

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010078611.3A CN111309988B (en) 2020-02-03 2020-02-03 Character string retrieval method and device based on coding and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010078611.3A CN111309988B (en) 2020-02-03 2020-02-03 Character string retrieval method and device based on coding and electronic equipment

Publications (2)

Publication Number Publication Date
CN111309988A true CN111309988A (en) 2020-06-19
CN111309988B CN111309988B (en) 2023-05-02

Family

ID=71161635

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010078611.3A Active CN111309988B (en) 2020-02-03 2020-02-03 Character string retrieval method and device based on coding and electronic equipment

Country Status (1)

Country Link
CN (1) CN111309988B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111782895A (en) * 2020-07-02 2020-10-16 北京字节跳动网络技术有限公司 Retrieval processing method and device, readable medium and electronic equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101165681A (en) * 2006-10-17 2008-04-23 中兴通讯股份有限公司 Character string matching information processing method in communication system
US20150379127A1 (en) * 2014-06-27 2015-12-31 Gerd Mueller Fuzzy substring search
JP2018081611A (en) * 2016-11-18 2018-05-24 日本電信電話株式会社 Dictionary search method, device, and program
CN109902142A (en) * 2019-02-27 2019-06-18 西安电子科技大学 A kind of character string fuzzy matching and querying method based on editing distance
CN110572161A (en) * 2019-09-10 2019-12-13 北京中科寒武纪科技有限公司 data encoding method and device, computer equipment and readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
于长永; 高明; 柏禄一; 赵宇海: "A string indexing method with length and position constraints" (一种带有长度和位置约束的字符串索引方法) *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111782895A (en) * 2020-07-02 2020-10-16 北京字节跳动网络技术有限公司 Retrieval processing method and device, readable medium and electronic equipment
CN111782895B (en) * 2020-07-02 2024-03-19 北京字节跳动网络技术有限公司 Retrieval processing method and device, readable medium and electronic equipment

Also Published As

Publication number Publication date
CN111309988B (en) 2023-05-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant