CN112199922A - Encoding method, apparatus, device and computer readable storage medium - Google Patents

Encoding method, apparatus, device and computer readable storage medium

Info

Publication number
CN112199922A
Authority
CN
China
Prior art keywords
character
coded
character set
characters
encoding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010861534.9A
Other languages
Chinese (zh)
Other versions
CN112199922B (en
Inventor
王毅
邓惠朋
董晓文
张成海
罗秋科
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ARTICLE NUMBERING CENTER OF CHINA
Original Assignee
ARTICLE NUMBERING CENTER OF CHINA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ARTICLE NUMBERING CENTER OF CHINA filed Critical ARTICLE NUMBERING CENTER OF CHINA
Priority to CN202010861534.9A priority Critical patent/CN112199922B/en
Publication of CN112199922A publication Critical patent/CN112199922A/en
Application granted granted Critical
Publication of CN112199922B publication Critical patent/CN112199922B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/126 Character encoding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G06F16/90344 Query processing by using string matching techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G06F16/9566 URL specific, e.g. using aliases, detecting broken or misspelled links

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Transfer Between Computers (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The application provides an encoding method, an encoding apparatus, an encoding device and a computer-readable storage medium. The method includes: acquiring a character string to be encoded corresponding to a URI to be encoded; dividing the character string to obtain characters to be encoded and/or character sequences to be encoded; matching the characters to be encoded against characters in a preset character set, and encoding each according to the encoding mode of the successfully matched character; and/or matching the character sequences to be encoded against character sequences in the preset character set, and encoding each according to the encoding mode of the successfully matched character sequence. The preset character set contains the correspondence between characters and codes and the correspondence between character sequences and codes, so that encoding efficiency is improved and the space occupied by the encoding is reduced.

Description

Encoding method, apparatus, device and computer readable storage medium
Technical Field
The present application relates to the field of encoding, and in particular, to an encoding method, apparatus, device, and computer-readable storage medium.
Background
With the rapid development of internet technology, networks are used ever more widely in daily life. Two-dimensional codes in particular, now visible everywhere, have made everyday life far more convenient: information such as numbers, letters and Chinese characters is encoded and written into a two-dimensional code as static information, and third-party code-scanning software parses that information to jump to a network resource. How the information is encoded is therefore of great importance.
In the related art, taking a Uniform Resource Identifier (URI) as an example, the URI is usually treated as a plain character string composed of numbers, letters and other symbols; that is, every character in the URI is text-encoded without distinction.
However, this encoding method occupies a large amount of space and has low encoding efficiency.
Disclosure of Invention
The application provides a coding method, a coding device, coding equipment and a computer readable storage medium, so that the technical problems of large coding occupation space and low coding efficiency caused by the existing coding mode are solved.
In a first aspect, the present application provides an encoding method, including:
acquiring a character string to be coded corresponding to the URI to be coded;
dividing the character string to be coded to obtain a character to be coded and/or a character sequence to be coded;
matching the characters to be encoded against characters in a preset character set, and encoding the characters to be encoded according to the encoding mode corresponding to the successfully matched characters; and/or matching the character sequence to be encoded against the character sequences in the preset character set, and encoding the character sequence to be encoded according to the encoding mode corresponding to the successfully matched character sequence, wherein the preset character set comprises the correspondence between characters and codes and the correspondence between character sequences and codes.
After the character string to be encoded corresponding to the URI to be encoded is obtained, it is first divided to obtain the characters to be encoded and/or the character sequences to be encoded, which are then encoded according to the encoding modes pre-stored in the preset character set, thereby encoding the URI. Because the correspondences between characters and codes and between character sequences and codes are pre-stored in the preset character set, no single character needs to undergo complex text encoding; the pre-stored encoding mode is applied directly to each successfully matched character and/or character sequence, which saves the time and complexity of text-encoding single characters and improves encoding efficiency. In addition, a character sequence can be encoded as a whole according to the encoding mode pre-stored in the preset character set, which saves the space occupied by the encoding.
Optionally, the preset character set includes a first character set and a second character set, the frequency of use of characters in the first character set is greater than the frequency of use of characters in the second character set, and the frequency of use of character sequences in the first character set is greater than the frequency of use of character sequences in the second character set.
Here, the first character set may consist of the characters and character sequences whose frequency of use is greater than a first preset frequency, and the second character set may consist of those whose frequency of use is greater than a second preset frequency and less than the first preset frequency. Determining the character sets by frequency of use fixes the encoding modes for common characters and character sequences in advance, so that the preset encoding modes are used directly during encoding, no text encoding of single characters is required, and encoding efficiency is further improved.
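The frequency-based split described above can be sketched as follows. This is a minimal illustration only: the function name split_by_frequency, the sample corpus and the threshold values are hypothetical, and the handling of a frequency exactly equal to the first threshold (placed in the second set here) is an assumption, since the text does not specify it.

```python
from collections import Counter


def split_by_frequency(tokens, first_threshold, second_threshold):
    """Partition observed tokens into a first (high-frequency) and a
    second (mid-frequency) set, using two preset frequency thresholds."""
    counts = Counter(tokens)
    first = {t for t, n in counts.items() if n > first_threshold}
    # Assumption: a count equal to first_threshold falls into the second set.
    second = {t for t, n in counts.items()
              if second_threshold < n <= first_threshold}
    return first, second


# Hypothetical corpus of URI segments; real frequencies would come from usage data.
corpus = ["https://"] * 9 + [".com"] * 8 + ["www."] * 4 + [".cn"] * 3 + ["x"]
first_set, second_set = split_by_frequency(corpus, first_threshold=5, second_threshold=2)
```

Tokens seen no more often than the second threshold ("x" in this sample) end up in neither set.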
Optionally, the encoding the character to be encoded and/or the character sequence to be encoded according to a preset character set includes:
according to the first character set, encoding the characters to be encoded and/or the character sequences to be encoded;
and if the encoding is unsuccessful, encoding the character to be encoded and/or the character sequence to be encoded according to the second character set.
Here, the character to be encoded and/or the character sequence to be encoded is first encoded according to the first character set. If that succeeds, the first-character-set encoding is used; otherwise the second character set is used. In other words, the embodiment of the present application tries the high-frequency first character set first, which further improves encoding efficiency.
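A minimal sketch of this first-then-second fallback is shown below. The two tables and their code values are hypothetical placeholders, not the entries of the patent's character sets.

```python
# Hypothetical code tables; the entries and code values are illustrative only.
FIRST_SET = {"https://": 0, ".com": 1, "www.": 2}
SECOND_SET = {".org": 0, "j": 1, "d": 2}


def encode_segment(segment):
    """Return (set_name, code), trying the first set before the second."""
    if segment in FIRST_SET:
        return ("first", FIRST_SET[segment])
    if segment in SECOND_SET:
        return ("second", SECOND_SET[segment])
    # Neither preset set knows this segment; the caller must handle it.
    raise KeyError(f"no code for segment {segment!r}")
```

For example, encode_segment(".com") is resolved from the first set, while encode_segment("j") falls through to the second set.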
Optionally, the preset character set further includes a third character set, and the third character set is the union of the first character set and the second character set.
Optionally, the encoding the character to be encoded and/or the character sequence to be encoded according to a preset character set includes: if the character to be coded is matched with the characters in the first character set and matched with the characters in the third character set, coding the character to be coded according to the first character set;
and/or if the character sequence to be coded is matched with the character sequence in the first character set and matched with the character sequence in the third character set, coding the character sequence to be coded according to the first character set.
Here, if a character to be encoded and/or a character sequence to be encoded can be encoded with either the first character set or the third character set, either may be adopted, and the first character set is preferred. Similarly, if it can be encoded with either the second character set or the third character set, either may be adopted, and the second character set is preferred. Because the first character set and the second character set each store fewer characters and character sequences than the third character set, encoding with the first or second character set saves time and further improves encoding efficiency.
Optionally, the encoding the character to be encoded and/or the character sequence to be encoded according to a preset character set includes:
if the character to be encoded at a given position corresponds to a plurality of codes after being matched with the characters in the first character set, the code with the largest code value is used to encode the character at that position;
and/or
if the character sequence to be encoded at a given position corresponds to a plurality of codes after being matched with the character sequences in the first character set, the code with the largest code value is used to encode the character sequence at that position.
That is, if encoding with the first character set yields a plurality of candidate codes at the same position, the code with the largest code value is used, which further improves encoding accuracy. Similarly, if encoding with the second character set or the third character set yields a plurality of candidate codes at the same position, the code with the largest code value is also used.
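One possible reading of this rule is sketched below: several table entries match at the same position of the string, and the candidate with the largest code value wins. The text does not say how multiple candidates arise, so the overlapping entries in this hypothetical table, the code values and the function name best_match_at are all assumptions.

```python
# Hypothetical table with overlapping entries; code values are illustrative only.
FIRST_SET = {"w": 3, "ww": 17, "www": 42}


def best_match_at(text, pos, table):
    """Among all table entries that match text at pos, return the
    (entry, code) pair with the largest code value, or None if none match."""
    candidates = [(code, entry) for entry, code in table.items()
                  if text.startswith(entry, pos)]
    if not candidates:
        return None
    code, entry = max(candidates)  # tuples compare by code value first
    return entry, code
```

With this table, position 0 of "www.jd.com" has three candidates and "www" (code 42) is chosen; position 2 has only "w".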
Optionally, the encoding the character to be encoded and/or the character sequence to be encoded according to a preset character set includes:
acquiring a first coding bit number for coding the character to be coded and/or the character sequence to be coded according to the first character set and the second character set, and acquiring a second coding bit number for coding the character to be coded and/or the character sequence to be coded according to the third character set;
and if the second coding bit number is less than the first coding bit number, coding the character to be coded and/or the character sequence to be coded according to the third character set.
That is, the character to be encoded and/or the character sequence to be encoded is encoded in both ways, the resulting numbers of coding bits are compared, and the encoding with fewer bits is selected, which further reduces the space occupied by the encoding.
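The comparison can be sketched as below, assuming each candidate encoding is already available as a string of "0" and "1" characters. The function name choose_encoding and the sample bit strings are hypothetical; how each candidate is produced is outside this sketch.

```python
def choose_encoding(bits_first_second, bits_third):
    """Pick the third-set encoding only if it is strictly shorter than the
    encoding obtained with the first and second sets."""
    if len(bits_third) < len(bits_first_second):
        return ("third", bits_third)
    return ("first/second", bits_first_second)


# Hypothetical candidate encodings of the same input.
label, bits = choose_encoding("0000010000100011", "000001000010")
```

On a tie the first/second-set encoding is kept, which matches the condition in the text that the third set is used only when its bit count is smaller.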
In a second aspect, an embodiment of the present application provides a two-dimensional code, where the two-dimensional code stores a URI encoded based on the encoding method according to the first aspect or the optional manner of the first aspect.
Optionally, the two-dimensional code is used for payment, product identification or webpage skipping.
In a third aspect, an embodiment of the present application provides a Radio Frequency Identification (RFID) module, where a URI encoded based on the encoding method according to the first aspect or the optional manner of the first aspect is stored in the RFID module.
Optionally, the RFID module is used in a logistics label, a product label, an electronic certificate, an electronic key, or mobile payment.
In a fourth aspect, an embodiment of the present application provides an encoding apparatus, including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the encoding method of the first aspect or the alternatives of the first aspect.
In a fifth aspect, embodiments of the present application provide a computer-readable storage medium, in which computer-executable instructions are stored, and when the computer-executable instructions are executed by a processor, the computer-readable storage medium is configured to implement the encoding method according to the first aspect or the alternatives of the first aspect.
Optionally, the computer-readable storage medium stores a two-dimensional code, and the two-dimensional code is executed by a processor, where the two-dimensional code is encoded by the encoding method according to the first aspect or the optional manner of the first aspect. Optionally, the two-dimensional code is used for payment, product identification or webpage skipping.
Optionally, the computer-readable storage medium stores therein a radio frequency identification RFID module, the RFID module stores therein data, the data is executed by the processor, and the data is encoded by the encoding method according to the first aspect or the optional manner of the first aspect. Optionally, the RFID module is used in a logistics label, a product label, an electronic certificate, an electronic key, or mobile payment.
In a sixth aspect, embodiments of the present application provide a computer program product, which includes computer executable instructions, and when the computer executable instructions are executed by a processor, the computer executable instructions are used to implement the encoding method according to the first aspect or the optional manner of the first aspect.
The embodiments of the present application provide an encoding method, an encoding apparatus, an encoding device and a computer-readable storage medium. The method includes: acquiring a character string to be encoded corresponding to a URI to be encoded; dividing the character string to obtain characters to be encoded and/or character sequences to be encoded; and encoding the characters to be encoded and/or the character sequences to be encoded according to a preset character set, wherein the preset character set comprises the correspondence between characters and codes and the correspondence between character sequences and codes. As a result, no single character needs to undergo complex text encoding, and the pre-stored encoding mode can be applied directly, which improves encoding efficiency. In addition, a character sequence can be encoded as a whole according to the encoding mode pre-stored in the preset character set, which saves the space occupied by the encoding.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
FIG. 1 is a schematic diagram of an encoding system architecture;
FIG. 2 is a flowchart of an encoding method according to an embodiment of the present application;
FIG. 3 is a flowchart of another encoding method provided in the embodiment of the present application;
FIG. 4 is a flowchart of another encoding method provided in an embodiment of the present application;
FIG. 5 is a flowchart of another encoding method provided in the embodiment of the present application;
FIG. 6 is a flowchart of another encoding method provided in the embodiment of the present application;
FIG. 7 is a schematic structural diagram of an encoding apparatus according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of an encoding device provided in the present application.
With the foregoing drawings in mind, certain embodiments of the disclosure have been shown and described in more detail below. These drawings and written description are not intended to limit the scope of the disclosed concepts in any way, but rather to illustrate the concepts of the disclosure to those skilled in the art by reference to specific embodiments.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "first," "second," "third," and "fourth," if any, in the description and claims of this application and the above-described figures are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
A URI is a character string that identifies the name of an internet resource and is used to locate the resource under a specified network protocol, allowing a user to interact with any resource via a particular protocol. In practical applications, the URI is usually encoded and written into a two-dimensional code as static information; third-party code-scanning software parses the two-dimensional code, recognizes it as a web address, and jumps to the corresponding web page. The encoding of the URI is therefore of great importance.
In order to solve the technical problems, the application provides an encoding method, an encoding device, an encoding apparatus and a computer-readable storage medium, wherein a character string to be encoded corresponding to a URI to be encoded is divided to obtain a character to be encoded and/or a character sequence to be encoded, and then the character to be encoded and/or the character sequence to be encoded are encoded by using a preset character set.
FIG. 1 is a diagram of a coding system architecture. In fig. 1, the above-described architecture includes at least one of a receiving device 101, a processor 102, and a display device 103.
It is to be understood that the illustrated structure of the embodiments of the present application does not constitute a specific limitation to the architecture of the coding system. In other possible embodiments of the present application, the foregoing architecture may include more or less components than those shown in the drawings, or combine some components, or split some components, or arrange different components, which may be determined according to practical application scenarios, and is not limited herein. The components shown in fig. 1 may be implemented in hardware, software, or a combination of software and hardware.
In a specific implementation process, the receiving device 101 may be an input/output interface, and may also be a communication interface, and may be configured to receive information such as a character string to be encoded.
The processor 102 may divide the character string to be encoded corresponding to the URI to be encoded to obtain a character to be encoded and/or a character sequence to be encoded, and then encode the character to be encoded and/or the character sequence to be encoded by using the preset character set, where the encoding is performed without performing complex text encoding on each single character, but directly using a preset and stored encoding method.
The display device 103 may be used to display the above results and the like.
The display device may also be a touch display screen for receiving user instructions while displaying the above-mentioned content to enable interaction with a user.
It should be understood that the processor may be implemented by reading instructions in the memory and executing the instructions, or may be implemented by a chip circuit.
In addition, the network architecture and the service scenario described in the embodiment of the present application are for more clearly illustrating the technical solution of the embodiment of the present application, and do not constitute a limitation to the technical solution provided in the embodiment of the present application, and it can be known by a person skilled in the art that along with the evolution of the network architecture and the appearance of a new service scenario, the technical solution provided in the embodiment of the present application is also applicable to similar technical problems.
The technical scheme of the application is described in detail by combining specific embodiments as follows:
fig. 2 is a flowchart of an encoding method according to an embodiment of the present application. The execution subject of this embodiment may be the processor 102 in fig. 1, and the specific execution subject may be determined according to an actual application scenario. As shown in fig. 2, the method comprises the steps of:
S201: Acquire the character string to be encoded corresponding to the URI to be encoded.
The URI to be encoded may be determined according to actual conditions, and this is not particularly limited in this embodiment of the present application.
S202: Divide the character string to be encoded to obtain the character to be encoded and/or the character sequence to be encoded.
Optionally, the processor divides the character string to be encoded according to a preset division manner, which may be determined according to the actual situation, for example division by preset key characters or division by fixed structure.
Here, the division is described taking the inherent structure defined in RFC 3986 as an example. A URI typically consists of three parts: the naming mechanism for accessing the resource, the host name where the resource is stored, and the name of the resource itself; not every part is necessarily present. For example, the URI to be encoded is the web address https://www.jd.com, a resource accessible via the https protocol and uniquely identified by "www.jd.com".
A URI may be divided into segments and encoded according to the RFC 3986 standard. According to RFC 3986, a typical URI is composed of a protocol, a host name, a domain name, a default port number, a resource path and the like, in a fixed order and structure. Taking the web address https://www.jd.com as an example, the string can be divided according to RFC 3986 into 2 characters and 3 character sequences, namely "https:/", "www", "j", "d" and ".com", which facilitates encoding the character string segment by segment.
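One way such a division could be carried out is a greedy longest-match scan against a list of known character sequences, falling back to single characters. This is a sketch under assumptions: the list KNOWN_SEQUENCES and the exact token boundaries (for example whether the separators "//" and "." belong to the neighbouring token) are hypothetical and are not taken from the patent's tables.

```python
# Hypothetical list of known character sequences; token boundaries are an assumption.
KNOWN_SEQUENCES = ["https://", "www.", ".com"]


def segment(uri, sequences=KNOWN_SEQUENCES):
    """Split uri into known character sequences where possible,
    and into single characters everywhere else."""
    ordered = sorted(sequences, key=len, reverse=True)  # prefer longer matches
    parts, i = [], 0
    while i < len(uri):
        match = next((s for s in ordered if uri.startswith(s, i)), None)
        if match is None:
            match = uri[i]  # no known sequence here: emit a single character
        parts.append(match)
        i += len(match)
    return parts


parts = segment("https://www.jd.com")
```

Under these assumptions the sample address splits into three character sequences and the two single characters "j" and "d", and joining the parts reproduces the original string.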
S203: Encode the character to be encoded and/or the character sequence to be encoded according to the preset character set.
The preset character set comprises a corresponding relation between characters and codes and a corresponding relation between a character sequence and the codes.
Illustratively, the processor matches the characters to be encoded against the characters in the preset character set, and encodes the characters to be encoded according to the encoding mode corresponding to the successfully matched characters;
and/or
matches the character sequence to be encoded against the character sequences in the preset character set, and encodes the character sequence to be encoded according to the encoding mode corresponding to the successfully matched character sequence.
The preset character set comprises a corresponding relation between characters and codes and a corresponding relation between a character sequence and the codes. Exemplarily, as shown in table 1 below, a correspondence between a character and a code and a correspondence between a character sequence and a code are stored, a first column in table 1 is a character and a character sequence stored in a preset character set, a second column is a code value corresponding to the character and the character sequence, and a third column is a binary code corresponding to the character and the character sequence.
TABLE 1
[Table 1 appears as an image in the original publication and is not reproduced here. Columns: character or character sequence; code value; binary code.]
Because the preset character set contains the correspondence between characters and codes and the correspondence between character sequences and codes, the characters to be encoded can be matched against the characters in the preset character set, and the character sequences to be encoded against the character sequences in it; where a match is found, the item is encoded according to the corresponding encoding mode in the preset character set. Encoding the successfully matched characters and/or character sequences with the pre-stored encoding mode avoids the time and complexity of text-encoding single characters and further improves encoding efficiency.
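The lookup itself can be sketched as a dictionary from characters and character sequences to code values, with each code value written out as a fixed-width binary string. The entries of CODE_TABLE, the code values and the 6-bit width are assumptions made for illustration; the actual values are those of Table 1, which is not reproduced in this text.

```python
# Hypothetical preset character set: entry -> code value (not the patent's Table 1).
CODE_TABLE = {"https://": 0, "www.": 1, ".com": 2, "j": 10, "d": 11}
CODE_WIDTH = 6  # assumed fixed width of each binary code, in bits


def encode(segments, table=CODE_TABLE, width=CODE_WIDTH):
    """Look up every segment in the table and concatenate the
    fixed-width binary form of its code value."""
    return "".join(format(table[s], f"0{width}b") for s in segments)


bits = encode(["https://", "www.", "j", "d", ".com"])
```

Each segment costs one table lookup and one fixed-width code, regardless of how many characters the segment contains.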
Exemplarily, Table 2 shows the encoding result of an encoding method provided by the prior art, and Table 3 shows the encoding result of the encoding method provided by the embodiment of the present application. Taking the URI address https://www.jd.com as an example, under the text encoding used by the prior art each character occupies 8 bits, so the 18 characters of the URI require 144 bits in total when written into the two-dimensional code. With the encoding method of the present application, only 30 bits are required. Compared with the text encoding scheme, the space occupied by the encoding is significantly reduced and the efficiency is significantly improved.
TABLE 2
[Table 2 appears as an image in the original publication and is not reproduced here.]
TABLE 3
[Table 3 appears as an image in the original publication and is not reproduced here.]
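The two totals quoted for https://www.jd.com can be checked with a short calculation, reading both figures as bit counts. Two assumptions are made here: each of the five segments from the division example is given one code, and every code is 6 bits wide. The 6-bit width is not stated in the text; it is simply the value under which five codes add up to 30.

```python
uri = "https://www.jd.com"

# Plain text encoding: 8 bits for each of the 18 characters.
baseline_bits = len(uri.encode("ascii")) * 8

# Division example: 2 single characters + 3 character sequences = 5 segments.
segment_count = 2 + 3
bits_per_code = 6  # assumption, chosen so that 5 codes total 30 bits
compact_bits = segment_count * bits_per_code

saving = 1 - compact_bits / baseline_bits  # fraction of space saved
```

Under these assumptions the compact form is a little over one fifth of the size of the plain text encoding.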
In the embodiment of the present application, after the character string to be encoded corresponding to the URI to be encoded is obtained, it is first divided to obtain the characters to be encoded and/or the character sequences to be encoded. These are matched against the characters and character sequences stored in the preset character set and encoded in the pre-stored manner, thereby encoding the URI. In addition, if the divided character string contains one or more character sequences, each such sequence can be encoded as a whole according to the encoding mode pre-stored in the preset character set, which saves the space occupied by the encoding.
In addition, the preset character set comprises a first character set and a second character set, the use frequency of characters in the first character set is greater than that of characters in the second character set, and the use frequency of character sequences in the first character set is greater than that of character sequences in the second character set.
Correspondingly, fig. 3 is a flowchart of another encoding method provided in the embodiment of the present application, where characters to be encoded and/or a sequence of characters to be encoded are encoded according to the first character set and the second character set, as shown in fig. 3, the method includes:
S301: Acquire the character string to be encoded corresponding to the uniform resource identifier (URI) to be encoded.
S302: Divide the character string to be encoded to obtain the character to be encoded and/or the character sequence to be encoded.
The implementation manners of steps S301 to S302 are the same as those of steps S201 to S202, and are not described herein again.
S303: Encode the character to be encoded and/or the character sequence to be encoded according to the first character set.
S304: If the encoding is unsuccessful, encode the character to be encoded and/or the character sequence to be encoded according to the second character set.
Optionally, the first character set may be a character set whose use frequency is greater than a first preset frequency, the second character set may be a character set whose use frequency is greater than a second preset frequency and less than the first preset frequency, and the character set is determined according to the use frequency, and the common characters and the encoding modes corresponding to the character sequences may be determined, so that the preset encoding modes are directly used during encoding, text encoding of a single character is not required, and encoding efficiency is further improved.
Exemplarily, the first character set is a URI-A character set and the second character set is a URI-B character set. If a character or character sequence can be encoded using either the URI-A character set or the URI-B character set, encoding using the URI-A character set is preferred.
The processor first encodes the character to be encoded and/or the character sequence to be encoded according to the first character set. If that encoding succeeds, the first character set is used; otherwise, the second character set is used. In other words, the high-frequency first character set is tried first, so that the encoding efficiency is further improved.
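Steps S303 and S304 amount to a lookup with a fallback. The sketch below assumes each character set is a dictionary from a character or character sequence to a bit string; the tables `URI_A` and `URI_B`, their contents, and the function `encode_piece` are hypothetical names used only for illustration.

```python
# Hypothetical tables. URI_A stands for the higher-frequency first set,
# URI_B for the second set; the entries and codes are invented.
URI_A = {"http://": "000", ".com": "001", "e": "0100"}
URI_B = {".org": "10000", "e": "10001", "q": "10010"}


def encode_piece(piece, first=URI_A, second=URI_B):
    """Return (set_name, code): try the first set, fall back to the second."""
    if piece in first:
        return "A", first[piece]
    if piece in second:
        return "B", second[piece]
    raise KeyError(f"{piece!r} is in neither character set")


print(encode_piece("e"))     # present in both sets, so URI-A is used
print(encode_piece(".org"))  # only in URI-B, so the fallback applies
```

The character "e" appears in both tables on purpose: it shows that the first set takes priority whenever both could encode the same piece.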
Under the condition that the preset character set comprises the first character set and the second character set, the preset character set further comprises a third character set, and the third character set is a collection of the first character set and the second character set. Fig. 4 is a flowchart of another encoding method provided in an embodiment of the present application, and as shown in fig. 4, the method includes:
S401: acquiring a character string to be coded corresponding to the uniform resource identifier (URI) to be coded.
S402: dividing the character string to be coded to obtain the character to be coded and/or the character sequence to be coded.
S403: if the character to be coded is matched with the characters in the first character set and matched with the characters in the third character set, coding the character to be coded according to the first character set.
S404: if the character sequence to be coded is matched with the character sequence in the first character set and is matched with the character sequence in the third character set, coding the character sequence to be coded according to the first character set.
Exemplarily, the first character set is a URI-A character set and the third character set is a URI-C character set. If a character or character sequence can be encoded using either the URI-A character set or the URI-C character set, encoding using the URI-A character set is preferred.
Similarly, if the character to be coded is matched with the characters in the second character set and is matched with the characters in the third character set, the character to be coded is coded according to the second character set;
and/or
And if the character sequence to be coded is matched with the character sequence in the second character set and is matched with the character sequence in the third character set, coding the character sequence to be coded according to the second character set.
Here, during encoding, a character to be encoded and/or a character sequence to be encoded that is found in both the first character set and the third character set could be encoded with either set; the first character set is used. Similarly, a character to be encoded and/or a character sequence to be encoded that is found in both the second character set and the third character set could be encoded with either set; the second character set is used. Since the first character set and the second character set each store fewer characters and character sequences than the third character set, encoding with the first or second character set saves time, and the encoding efficiency is further improved.
In a case that the preset character set includes a first character set, a second character set, and a third character set, fig. 5 is a flowchart of another encoding method provided in an embodiment of the present application, and as shown in fig. 5, the method includes:
S501: acquiring a character string to be coded corresponding to the uniform resource identifier (URI) to be coded.
S502: dividing the character string to be coded to obtain the character to be coded and/or the character sequence to be coded.
S503: coding the character to be coded and/or the character sequence to be coded according to the first character set.
S504: if the characters to be coded at the same position correspond to a plurality of codes after being matched with the characters in the first character set, the codes with the maximum code values are used for coding the characters to be coded at the same position; and/or if the character sequence to be coded at the same position corresponds to a plurality of codes after being matched with the character sequence in the first character set, the code with the largest code value is used for coding the character sequence to be coded at the same position.
Exemplarily, the first character set is a URI-A character set, and if the URI-A character set yields two encodings for the characters at the same position, the encoding with the larger code value is used.
Similarly, if the characters to be coded at the same position correspond to a plurality of codes after being matched with the characters in the second character set, the codes with the maximum code values are used for coding the characters to be coded at the same position;
and/or
And if the character sequence to be coded at the same position corresponds to a plurality of codes after being matched with the character sequence in the second character set, the code with the maximum code value is used for coding the character sequence to be coded at the same position.
Similarly, if the characters to be coded at the same position correspond to a plurality of codes after being matched with the characters in the third character set, the codes with the maximum code values are used for coding the characters to be coded at the same position;
and/or
And if the character sequence to be coded at the same position corresponds to a plurality of codes after being matched with the character sequence in the third character set, the code with the maximum code value is used for coding the character sequence to be coded at the same position.
If encoding with a single character set, for example the first character set, yields multiple codes at the same position, the code with the maximum code value is adopted, and the coding accuracy is further improved.
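The tie-break rule of S504 can be illustrated with a table in which one character has more than one candidate code. The table `URI_A_CANDIDATES` and the helper names are hypothetical, and "code value" is interpreted here as the unsigned integer value of the bit string, which is an assumption rather than something the text states.

```python
# Hypothetical URI-A fragment in which one character has two candidate codes.
URI_A_CANDIDATES = {
    "-": ["0101", "0111"],
    "a": ["0001"],
}


def code_value(bits):
    # Assumption: a code's value is its bit string read as an unsigned integer.
    return int(bits, 2)


def encode_position(piece, table=URI_A_CANDIDATES):
    """Among all candidate codes for this piece, keep the largest value."""
    return max(table[piece], key=code_value)


print(encode_position("-"))  # 0111, since 7 > 5
```

Picking one candidate by a fixed rule makes the output deterministic, so the same input always yields the same code.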
In a case that the preset character set includes a first character set, a second character set, and a third character set, fig. 6 is a flowchart of another encoding method provided in an embodiment of the present application, and as shown in fig. 6, the method includes:
S601: acquiring a character string to be coded corresponding to the uniform resource identifier (URI) to be coded.
S602: dividing the character string to be coded to obtain the character to be coded and/or the character sequence to be coded.
S603: acquiring a first coding bit number for coding the character to be coded and/or the character sequence to be coded according to the first character set and the second character set, and acquiring a second coding bit number for coding the character to be coded and/or the character sequence to be coded according to the third character set.
S604: if the second coding bit number is less than the first coding bit number, coding the character to be coded and/or the character sequence to be coded according to the third character set.
Exemplarily, the first character set is a URI-A character set, the second character set is a URI-B character set, and the third character set is a URI-C character set. The character string is jointly encoded using the URI-A and URI-B character sets (or the URI-B and URI-A character sets), and the total number of encoding bits generated is calculated. The number of encoding bits generated by encoding the character string with the URI-C character set alone is then calculated. Only if the former is greater than or equal to the latter is the encoding performed using the URI-C character set.
Optionally, during encoding the character sets are applied in the order of the first character set and then the second character set.
Similarly, a third encoding bit number for encoding the character to be encoded and/or the character sequence to be encoded according to the first character set and the third character set is obtained, and a fourth encoding bit number for encoding the character to be encoded and/or the character sequence to be encoded according to the third character set is obtained;
and if the fourth coding bit number is less than the third coding bit number, coding the character to be coded and/or the character sequence to be coded according to the third character set.
Similarly, acquiring a fifth encoding bit number for encoding the character to be encoded and/or the character sequence to be encoded according to the second character set and the third character set, and acquiring a sixth encoding bit number for encoding the character to be encoded and/or the character sequence to be encoded according to the third character set;
and if the sixth coding bit number is less than the fifth coding bit number, coding the character to be coded and/or the character sequence to be coded according to the third character set.
The characters to be coded and/or the character sequences to be coded are first coded by combining the first character set with the second character set, and are then coded using the third character set. The number of coded bits produced by the combination of the first and second character sets is compared with the number of coded bits produced by the third character set, and the coding mode with fewer coded bits is selected.
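The bit-count comparison of S603 and S604 can be sketched as below. The three tables and their code widths are invented for illustration, no overhead for switching between sets is modelled, and the strict "less than" comparison of S604 is followed; all function and variable names are hypothetical.

```python
# Hypothetical tables: short codes in URI_A, long codes in URI_B, and
# medium-length codes in URI_C, which covers the entries of both.
URI_A = {"a": "000", "b": "001"}
URI_B = {"x": "100000", "y": "100001"}
URI_C = {"a": "0000", "b": "0001", "x": "0010", "y": "0011"}


def bit_count(pieces, *tables):
    """Total bits when each piece uses the first table that contains it."""
    total = 0
    for piece in pieces:
        for table in tables:
            if piece in table:
                total += len(table[piece])
                break
        else:
            return None  # this combination of sets cannot encode the input
    return total


def choose(pieces):
    combined = bit_count(pieces, URI_A, URI_B)  # first coding bit number
    third = bit_count(pieces, URI_C)            # second coding bit number
    # S604: use the third set only when it needs strictly fewer bits.
    if third is not None and (combined is None or third < combined):
        return "C", third
    return "A+B", combined


print(choose(["a", "b", "a"]))  # ('A+B', 9): 9 bits beat 12 bits
print(choose(["a", "x", "y"]))  # ('C', 12): 12 bits beat 15 bits
```

The second call shows the case the method targets: once enough pieces fall into the second set, the single combined set produces the shorter result.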
The application also provides a two-dimensional code for storing a URI encoded by the above encoding method. Two-dimensional codes are widely used at present and are suitable for storing information of various encoding types and relatively large data volumes. However, the encoding types stored by existing two-dimensional codes are still limited, and only codes with a smaller data volume can be stored; content with a large data volume, such as long texts and long website addresses, requires a two-dimensional code with a large area. The application therefore uses a two-dimensional code that stores a URI encoded by the above encoding method. Because a URI encoded in this way occupies less space, the area of the two-dimensional code storing it can be reduced: when the module size of the two-dimensional code is fixed, the occupied area is greatly reduced. Further, when the printable area of the two-dimensional code is fixed, the modules of the two-dimensional code can be larger, which increases the reading speed and improves properties such as contamination resistance, robustness and distortion resistance.
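The relationship between payload size, module count and module size can be made concrete with a small calculation. The text does not name a symbology, so QR Code is assumed here only because its side length in modules is a known function of the version number (4 × version + 17). The printable side of 20 mm and the versions 5 and 3 are illustrative values, not computed capacities, and the quiet zone is ignored.

```python
def qr_modules_per_side(version):
    """Side length of a QR Code symbol in modules, for versions 1 to 40."""
    if not 1 <= version <= 40:
        raise ValueError("QR Code versions range from 1 to 40")
    return 4 * version + 17


def module_size_mm(print_side_mm, modules_per_side):
    """Module edge length when the symbol fills a fixed printable side."""
    return print_side_mm / modules_per_side


# Assumed scenario: a shorter encoded URI lets the symbol drop from
# version 5 to version 3 within the same 20 mm printable side.
before = module_size_mm(20.0, qr_modules_per_side(5))  # 20 / 37
after = module_size_mm(20.0, qr_modules_per_side(3))   # 20 / 29
print(f"{before:.2f} mm -> {after:.2f} mm")
```

With fewer modules per side, each module is physically larger within the same printed area, which is the effect the paragraph credits for faster and more robust reading.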
Optionally, the two-dimensional code may be used for payment, product identification, or web page hopping.
Exemplarily, the two-dimensional code may store a video covering content such as product information, a manufacturing process or a brand history, may store a website address, may be applied to product packaging in the form of image-text information and the like, and may also be a two-dimensional code given as a gift along with the packaging.
RFID technology realizes non-contact bidirectional communication through radio waves by combining wireless communication with data access technology, and is then connected to a database system to achieve identification. Similar to the two-dimensional code, content with a large data volume, such as long texts and long website addresses, requires an RFID with a large storage space. The application therefore uses an RFID that stores a URI encoded by the above encoding method; because a URI encoded in this way occupies less space, the space occupied in the RFID storing the URI can be reduced.
Optionally, the RFID module is used in a logistics label, a product label, an electronic certificate, an electronic key, or mobile payment.
Exemplarily, the RFID module may be used for cargo tracking, automatic information acquisition, warehousing application, port application, and express delivery in the logistics process, may also be used for real-time statistics, replenishment and theft prevention of sales data of commodities in the retail industry, may also be used for real-time monitoring, quality tracking, automated production and the like of production data in the manufacturing industry, may also be used for automated production, warehouse management, brand management, individual product management, channel management in the clothing industry, and may also be used for identification, for example: various electronic certificates such as electronic passports, identity cards, student cards and the like, and in addition, the electronic certificates can also be used in the fields of anti-counterfeiting, asset management, transportation, food, animal identification, libraries, automobiles, aviation, military and the like.
Fig. 7 is a schematic structural diagram of an encoding apparatus according to an embodiment of the present application, and as shown in fig. 7, the apparatus according to the embodiment of the present application includes:
the obtaining module 701 is configured to obtain a to-be-encoded character string corresponding to a to-be-encoded uniform resource identifier URI.
The first processing module 702 is configured to divide a character string to be encoded to obtain a character to be encoded and/or a character sequence to be encoded.
The second processing module 703 is configured to encode the character to be encoded and/or the character sequence to be encoded according to the preset character set.
The preset character set comprises a corresponding relation between characters and codes and a corresponding relation between a character sequence and the codes.
Optionally, the second processing module 703 is specifically configured to:
matching the characters to be coded with characters in a preset character set, and coding the characters to be coded according to a coding mode corresponding to the successfully matched characters;
and/or
matching the character sequence to be coded with the character sequences in the preset character set, and coding the character sequence to be coded according to the coding mode corresponding to the character sequence successfully matched.
Optionally, the preset character set includes a first character set and a second character set, the frequency of use of characters in the first character set is greater than the frequency of use of characters in the second character set, and the frequency of use of character sequences in the first character set is greater than the frequency of use of character sequences in the second character set.
Optionally, the second processing module 703 is specifically configured to:
according to the first character set, encoding characters to be encoded and/or character sequences to be encoded;
and if the coding is unsuccessful, matching the character to be coded and/or the character sequence to be coded with the character and/or the character sequence in the second character set, and if the matching is successful, coding the character to be coded and/or the character sequence to be coded according to the coding mode stored in the second character set.
Optionally, the preset character set further includes a third character set, and the third character set is a collection of the first character set and the second character set.
Optionally, the second processing module 703 is specifically configured to:
if the character to be coded is matched with the characters in the first character set and matched with the characters in the third character set, coding the character to be coded according to the first character set;
and/or
And if the character sequence to be coded is matched with the character sequence in the first character set and is matched with the character sequence in the third character set, coding the character sequence to be coded according to the first character set.
Optionally, the second processing module 703 is specifically configured to:
if the characters to be coded at the same position correspond to a plurality of codes after being matched with the characters in the first character set, the codes with the maximum code values are used for coding the characters to be coded at the same position;
and/or
And if the character sequences to be coded at the same position correspond to a plurality of codes after being matched with the character sequences in the first character set, the codes with the maximum code values are used for coding the character sequences to be coded at the same position.
Optionally, the second processing module 703 is specifically configured to:
acquiring a first encoding bit number for encoding the character to be encoded and/or the character sequence to be encoded according to the first character set and the second character set, and acquiring a second encoding bit number for encoding the character to be encoded and/or the character sequence to be encoded according to the third character set;
and if the second coding bit number is less than the first coding bit number, coding the character to be coded and/or the character sequence to be coded according to the third character set.
Fig. 8 is a schematic structural diagram of an encoding apparatus provided in the present application. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not limiting to the implementations of the present application described and/or claimed herein.
As shown in fig. 8, the encoding apparatus includes a processor 801 and a memory 802. The components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor 801 may process instructions executed within the encoding device, including instructions stored in or on the memory for displaying graphical information on an external input/output device (such as a display device coupled to an interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories, as desired. Fig. 8 takes one processor 801 as an example.
The memory 802 is a non-transitory computer readable storage medium, and can be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules (e.g., the obtaining module 701, the first processing module 702, and the second processing module 703 shown in fig. 7) corresponding to the method performed by the encoding apparatus in the embodiment of the present application. The processor 801 executes various functional applications of the server and data processing by running non-transitory software programs, instructions, and modules stored in the memory 802, that is, implements the method performed by the encoding device in the above-described method embodiments.
The encoding apparatus may further include: an input device 803 and an output device 804. The processor 801, the memory 802, the input device 803, and the output device 804 may be connected by a bus or other means, and are exemplified by a bus in fig. 8.
The input device 803 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the encoding apparatus, such as a touch screen, a keypad, a mouse, or a plurality of mouse buttons, a trackball, a joystick, or the like. The output device 804 may be an output device such as a display device of the encoding device. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.
The encoding device of the embodiment of the present application may be configured to execute the technical solutions in the method embodiments of the present application, and the implementation principle and the technical effect are similar, which are not described herein again.
The embodiment of the present application further provides a computer-readable storage medium, in which computer-executable instructions are stored, and when the computer-executable instructions are executed by a processor, the computer-readable storage medium is configured to implement any one of the encoding methods described above.
Optionally, the computer-readable storage medium stores a two-dimensional code, and the two-dimensional code is executed by the processor, and the two-dimensional code is obtained by encoding according to the first aspect or the encoding method of the first aspect. Optionally, the two-dimensional code is used for payment, product identification or web page jumping.
Optionally, the computer-readable storage medium stores therein a radio frequency identification RFID module, the RFID module stores therein data, the data is executed by the processor, and the data is encoded by the encoding method according to the first aspect or the optional manner of the first aspect. Optionally, the RFID module is used in a logistics label, a product label, an electronic certificate, an electronic key, or mobile payment.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (13)

1. A method of encoding, comprising:
acquiring a character string to be coded corresponding to a Uniform Resource Identifier (URI) to be coded;
dividing the character string to be coded to obtain a character to be coded and/or a character sequence to be coded;
matching the characters to be coded with characters in a preset character set, and coding the characters to be coded according to a coding mode corresponding to the successfully matched characters; and/or matching the character sequence to be coded with the character sequences in the preset character set, and coding the character sequence to be coded according to the coding mode corresponding to the character sequence successfully matched, wherein the preset character set comprises the corresponding relation between the characters and the codes and the corresponding relation between the character sequence and the codes.
2. The method of claim 1, wherein the preset character set comprises a first character set and a second character set, wherein the first character set has a frequency of use of characters greater than a frequency of use of characters in the second character set, and wherein the first character set has a frequency of use of character sequences greater than a frequency of use of character sequences in the second character set.
3. The method according to claim 2, wherein said encoding the character to be encoded and/or the sequence of characters to be encoded according to a preset character set comprises:
according to the first character set, encoding the characters to be encoded and/or the character sequences to be encoded;
and if the encoding is unsuccessful, encoding the character to be encoded and/or the character sequence to be encoded according to the second character set.
4. The method according to claim 2 or 3, wherein the preset character set further comprises a third character set, and the third character set is a collection of the first character set and the second character set.
5. The method according to claim 4, wherein said encoding the character to be encoded and/or the sequence of characters to be encoded according to a preset character set comprises:
if the character to be coded is matched with the characters in the first character set and matched with the characters in the third character set, coding the character to be coded according to the first character set;
and/or
And if the character sequence to be coded is matched with the character sequence in the first character set and is matched with the character sequence in the third character set, coding the character sequence to be coded according to the first character set.
6. The method according to claim 4, wherein said encoding the character to be encoded and/or the sequence of characters to be encoded according to a preset character set comprises:
if the characters to be coded at the same position correspond to a plurality of codes after being matched with the characters in the first character set, the codes with the maximum code values are used for coding the characters to be coded at the same position;
and/or
And if the character sequence to be coded at the same position corresponds to a plurality of codes after being matched with the character sequence in the first character set, the code with the maximum code value is used for coding the character sequence to be coded at the same position.
7. The method according to claim 4, wherein said encoding the character to be encoded and/or the sequence of characters to be encoded according to a preset character set comprises:
acquiring a first coding bit number for coding the character to be coded and/or the character sequence to be coded according to the first character set and the second character set, and acquiring a second coding bit number for coding the character to be coded and/or the character sequence to be coded according to the third character set;
and if the second coding bit number is less than the first coding bit number, coding the character to be coded and/or the character sequence to be coded according to the third character set.
8. An encoding device, characterized by comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the encoding method of any one of claims 1 to 7.
9. A computer-readable storage medium having computer-executable instructions stored thereon, which when executed by a processor, are configured to implement the encoding method of any one of claims 1 to 7.
10. The computer-readable storage medium according to claim 9, wherein the computer-readable storage medium stores a two-dimensional code, the two-dimensional code being executed by a processor, the two-dimensional code being encoded according to the encoding method of any one of claims 1 to 7.
11. The computer-readable storage medium of claim 10, wherein the two-dimensional code is used for payment of a fee, product identification, or web page hopping.
12. The computer-readable storage medium according to claim 9, wherein the computer-readable storage medium stores therein a Radio Frequency Identification (RFID) module, the RFID module storing therein data, the data being executed by a processor, the data being encoded according to the encoding method of any one of claims 1 to 7.
13. The computer-readable storage medium of claim 12, wherein the RFID module is used in a logistics label, a product label, an electronic certificate, an electronic key, or a mobile payment.
CN202010861534.9A 2020-08-25 2020-08-25 Encoding method, apparatus, device, and computer-readable storage medium Active CN112199922B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010861534.9A CN112199922B (en) 2020-08-25 2020-08-25 Encoding method, apparatus, device, and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010861534.9A CN112199922B (en) 2020-08-25 2020-08-25 Encoding method, apparatus, device, and computer-readable storage medium

Publications (2)

Publication Number Publication Date
CN112199922A true CN112199922A (en) 2021-01-08
CN112199922B CN112199922B (en) 2023-08-22

Family

ID=74005001

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010861534.9A Active CN112199922B (en) 2020-08-25 2020-08-25 Encoding method, apparatus, device, and computer-readable storage medium

Country Status (1)

Country Link
CN (1) CN112199922B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113157985A (en) * 2021-05-08 2021-07-23 北京京东乾石科技有限公司 Code matching method, device and system and storage medium thereof

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101005485A (en) * 2006-12-14 2007-07-25 钟杨 Method and system for compression coding information resource address
CN102131161A (en) * 2010-01-14 2011-07-20 华为技术有限公司 Method, device and system for encoding short message
CN102333082A (en) * 2010-08-23 2012-01-25 微软公司 The URL of safety shortens
US20120143966A1 (en) * 2010-12-07 2012-06-07 Kia Motors Corporation Apparatus and method of generating an sms message
CN102592160A (en) * 2012-01-17 2012-07-18 浙江工商大学 Character two-dimension code encoding and decoding method for short message
CN102801430A (en) * 2012-08-16 2012-11-28 福州大学 Compression algorithm for Chinese parameters of URL
CN103210590A (en) * 2012-08-21 2013-07-17 华为技术有限公司 Compression method and apparatus
CN104283568A (en) * 2013-07-12 2015-01-14 中国科学院声学研究所 Data compression encoding method based on partial Huffman tree
CN108900196A (en) * 2018-06-28 2018-11-27 郑州云海信息技术有限公司 Data decoding method, device, equipment and medium based on LZW algorithm
CN109525249A (en) * 2018-09-30 2019-03-26 湖南瑞利德信息科技有限公司 Encoding and decoding method, system, readable storage medium and computer equipment
CN110442844A (en) * 2019-07-03 2019-11-12 北京达佳互联信息技术有限公司 Data processing method, device, electronic equipment and storage medium
CN110572161A (en) * 2019-09-10 2019-12-13 北京中科寒武纪科技有限公司 Data encoding method and device, computer equipment and readable storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
R. Hashemian et al.: "Memory Efficient and High-Speed Search Huffman Coding", IEEE Transactions on Communications, pages 2576-2581 *
Wu Weixin et al.: "RDF grouping compression based on differential encoding", Computer Engineering (《计算机工程》), pages 117-123 *

Also Published As

Publication number Publication date
CN112199922B (en) 2023-08-22

Similar Documents

Publication Publication Date Title
CN108388598B (en) Electronic device, data storage method, and storage medium
US10212244B2 (en) Information push method, server, user terminal and system
CN110597511B (en) Page automatic generation method, system, terminal equipment and storage medium
US20160335279A1 (en) Method for loading website commenting information, and browser client
CN104823187A (en) Displaying sort results on mobile computing device
CN112348104B (en) Identification method, device, equipment and storage medium for counterfeit program
CN105488125A (en) Page access method and apparatus
US12008604B2 (en) Ad simulator browser extension
CN103345493A (en) Method, device and system for text content displaying on mobile terminal
CN103250172A (en) Information processing apparatus, server, information processing system and information processing method
EP2678809A1 (en) Entity fingerprints
CN113849748A (en) Information display method and device, electronic equipment and readable storage medium
CN114239504A (en) Form configuration method, device, equipment, readable storage medium and program product
CN112199922B (en) Encoding method, apparatus, device, and computer-readable storage medium
CN108810916B (en) Wi-Fi hotspot recommendation method and device and storage medium
US8230335B2 (en) Enhanced visual representations of company related data and generation of virtual business cards
CN117390011A (en) Report data processing method, device, computer equipment and storage medium
CN104572816A (en) Information processing method and electronic equipment
CN111985760A (en) Data content evaluation method and device, electronic equipment and storage medium
CN108052521B (en) Coordinated data display method, application server and storage medium
US12019625B2 (en) Techniques for automated database query generation
CN105243315A (en) Single picture verification code input method, apparatus and system
CN105808628A (en) Webpage transcoding method, apparatus and system
CN104750823B (en) Method and device for inquiring promotion condition data
CN113688899A (en) Data fusion method and device, storage medium and electronic equipment

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant