CN110932822A - Data encoding method, data decoding method, device, equipment and storage medium

Publication number
CN110932822A
Authority
CN
China
Legal status
Granted
Application number
CN201911212378.7A
Other languages
Chinese (zh)
Other versions
CN110932822B (en)
Inventor
程战战
Current Assignee
Taikang Insurance Group Co Ltd
Taikang Online Property Insurance Co Ltd
Original Assignee
Taikang Insurance Group Co Ltd
Taikang Online Property Insurance Co Ltd
Application filed by Taikang Insurance Group Co Ltd and Taikang Online Property Insurance Co Ltd
Priority to CN201911212378.7A
Publication of CN110932822A
Application granted
Publication of CN110932822B
Legal status: Active

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 Arrangements for detecting or preventing errors in the information received
    • H04L1/0078 Avoidance of errors by organising the transmitted data in a format specifically designed to deal with errors, e.g. location
    • H04L1/0084 Formats for payload data
    • H04L1/009 Avoidance of errors by organising the transmitted data in a format specifically designed to deal with errors, e.g. location arrangements specific to transmitters
    • H04L1/0091 Avoidance of errors by organising the transmitted data in a format specifically designed to deal with errors, e.g. location arrangements specific to receivers, e.g. format detection
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services
    • H04L67/565 Conversion or adaptation of application format or content

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Document Processing Apparatus (AREA)

Abstract

The application provides a data encoding method, a data decoding method, a device, equipment and a storage medium. The method comprises: after an original character string to be transmitted is obtained, converting each character in the original character string into coded data in a corresponding preset format to obtain a coded data sequence in the preset format corresponding to the original character string; converting the coded data sequence according to a preset coding rule to obtain a target character string, wherein the characters included in the target character string are all visible characters; and sending the target character string to the server, so that the server decodes the target character string to obtain the original character string. Because the characters included in the target character string are all visible characters, the server can accurately identify the target character string for decoding, so that the original character string is accurately recovered and data transmission efficiency is improved.

Description

Data encoding method, data decoding method, device, equipment and storage medium
Technical Field
The embodiments of the present application relate to the field of computer technologies, and in particular, to a data encoding method, a data decoding method, an apparatus, a device, and a storage medium.
Background
With the development of computer technology, data transmission between a terminal and a server is very common.
In the prior art, after acquiring data to be transmitted, a terminal usually encodes the data by using a specified encoding and decoding rule, and then directly sends the encoded data to a server. The data to be transmitted may include English characters, Chinese characters and/or digits, and the number of bytes each occupies may differ under different encoding and decoding rules. For example, under the 8-bit Unicode Transformation Format (UTF-8) rule, one English character occupies one byte and one Chinese character occupies three bytes; under the Unicode (UTF-16) rule, English characters and commonly used Chinese characters each occupy two bytes; under the GBK Chinese character rule, one Chinese character occupies two bytes and one English character occupies one byte.
In the prior art, a server needs to identify, from a received data stream, which bytes are Chinese characters, which are English characters and which are digits. Because the number of bytes occupied by English characters, digits and Chinese characters may differ under different encoding and decoding rules, the server often cannot identify which bytes are Chinese characters, and garbled text appears on the server.
Disclosure of Invention
The embodiments of the application provide a data encoding method, a data decoding method, a device, equipment and a storage medium, which solve the prior-art problem that the server cannot identify which bytes are Chinese characters and therefore produces garbled text.
In a first aspect, an embodiment of the present application provides a data encoding method, including:
acquiring an original character string to be transmitted;
converting each character in the original character string into corresponding coded data in a preset format respectively to obtain a coded data sequence in the preset format corresponding to the original character string;
converting the coded data sequence according to a preset coding rule to obtain a target character string; wherein the characters included in the target character string are all visible characters;
and sending the target character string to a server so that the server decodes the target character string to obtain the original character string.
In a possible implementation manner, the converting each character in the original character string into the encoded data in the corresponding preset format respectively to obtain the encoded data sequence in the preset format corresponding to the original character string includes:
for each character in the original character string, respectively determining an American Standard Code for Information Interchange (ASCII) value corresponding to the character according to a first preset function;
if the ASCII value corresponding to the character is larger than a preset numerical value, acquiring the character according to a second preset function, and converting the character into coded data in a corresponding preset format according to a third preset function; or if the ASCII value corresponding to the character is not greater than the preset numerical value, converting the ASCII value corresponding to the character into coded data in a preset format corresponding to the character;
and obtaining an encoded data sequence of a preset format corresponding to the original character string according to the encoded data of the preset format corresponding to each character in the original character string.
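The per-character conversion above can be sketched in JavaScript as follows. This is only an illustration under stated assumptions: the threshold 127 stands in for the "preset numerical value", the preset format is taken to be UTF-8, and the name `toUtf8Bytes` is hypothetical. The source does not name the three preset functions, so the sketch uses `codePointAt` and manual bit operations in their place.

```javascript
// Sketch only: ASCII_LIMIT is an assumed value for the "preset numerical value".
const ASCII_LIMIT = 127;

// Hypothetical helper: returns the UTF-8 byte sequence of a string as an array.
function toUtf8Bytes(original) {
  const bytes = [];
  for (const ch of original) {            // for...of iterates by code point
    const cp = ch.codePointAt(0);
    if (cp <= ASCII_LIMIT) {
      bytes.push(cp);                     // ASCII: the value itself is the byte
    } else if (cp <= 0x7ff) {
      bytes.push(0xc0 | (cp >> 6), 0x80 | (cp & 0x3f));
    } else if (cp <= 0xffff) {            // most Chinese characters land here
      bytes.push(0xe0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3f), 0x80 | (cp & 0x3f));
    } else {
      bytes.push(0xf0 | (cp >> 18), 0x80 | ((cp >> 12) & 0x3f),
                 0x80 | ((cp >> 6) & 0x3f), 0x80 | (cp & 0x3f));
    }
  }
  return bytes;                           // unpaired surrogates are not handled
}
```

For the string "a你", this sketch yields the bytes 0x61, 0xE4, 0xBD, 0xA0, so the ASCII character keeps one byte and the Chinese character takes three.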
In one possible implementation, the preset encoding rule includes: a preset coding sub-rule and a preset mapping rule, and the converting the coded data sequence according to the preset coding rule to obtain a target character string includes:
for the coded data of the preset format corresponding to every three bytes in the coded data sequence, respectively converting according to the preset coding sub-rule to obtain intermediate conversion data;
obtaining target conversion data corresponding to each intermediate conversion data according to the preset mapping rule;
and obtaining the target character string according to each target conversion data.
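As an illustration of the three steps above, the following sketch assumes the preset coding sub-rule splits three bytes (24 bits) into four 6-bit values, and the preset mapping rule is the standard base64 alphabet. The names `ALPHABET` and `encodeTriple` are hypothetical and do not come from the source.

```javascript
// Assumed mapping rule: the standard base64 alphabet (index 0..63 -> character).
const ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Hypothetical helper: converts exactly three bytes into four visible characters.
function encodeTriple(b0, b1, b2) {
  // Coding sub-rule (assumed): regroup 3 x 8 bits into 4 x 6 bits.
  const indices = [
    b0 >> 2,
    ((b0 & 0x03) << 4) | (b1 >> 4),
    ((b1 & 0x0f) << 2) | (b2 >> 6),
    b2 & 0x3f,
  ];
  // Mapping rule: look each 6-bit value up in the alphabet.
  return indices.map((i) => ALPHABET[i]).join("");
}
```

With the UTF-8 bytes of "你" (0xE4, 0xBD, 0xA0) as input, the intermediate 6-bit values are 57, 11, 54 and 32, which map to the four characters "5L2g".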
In a possible implementation manner, the preset encoding rule further includes: a preset character replacement rule, wherein if at least one piece of target conversion data includes a first preset character, before the target character string is obtained according to each target conversion data, the method further includes:
replacing a first preset character included in the at least one target conversion data with a second preset character according to the preset character replacement rule;
wherein the first preset character is a character that has a specific meaning in a Uniform Resource Locator (URL) parameter, and the second preset character is a character that has no specific meaning in a URL parameter.
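The text above does not say which characters are replaced. As a hedged sketch, the following assumes the same substitutions as the widely used URL-safe base64 variant: "+" becomes "-" and "/" becomes "_", with the reverse applied on the decoding side. The function names are hypothetical.

```javascript
// Assumed replacement rule (not confirmed by the source): "+" -> "-", "/" -> "_".
function toUrlSafe(target) {
  return target.replace(/\+/g, "-").replace(/\//g, "_");
}

// Reverse replacement, as the decoding side would apply it before mapping.
function fromUrlSafe(target) {
  return target.replace(/-/g, "+").replace(/_/g, "/");
}
```

Because neither "-" nor "_" appears in the standard base64 alphabet, the reverse replacement is unambiguous under this assumption.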
In a possible implementation manner, if the total number of bytes included in the encoded data sequence is not an integer multiple of three, before the encoded data in the preset format corresponding to every three bytes in the encoded data sequence is respectively converted according to the preset coding sub-rule to obtain intermediate conversion data, the method further includes:
performing zero padding at the tail end of the coded data sequence, so that the total number of bytes contained in the coded data sequence after zero padding is an integer multiple of three;
correspondingly, the tail of the target character string is supplemented with a preset number of third preset characters.
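The padding step can be sketched as below, assuming the third preset character is "=" and that, as in standard base64, the trailing characters produced purely by the padding zeros are replaced by that character. `base64Encode` and `ALPHABET` are hypothetical names.

```javascript
const ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Hypothetical helper: encodes an array of byte values (0..255).
function base64Encode(bytes) {
  // Number of zero bytes needed to reach a multiple of three (0, 1 or 2).
  const pad = (3 - (bytes.length % 3)) % 3;
  const padded = bytes.concat(new Array(pad).fill(0));

  let out = "";
  for (let i = 0; i < padded.length; i += 3) {
    const n = (padded[i] << 16) | (padded[i + 1] << 8) | padded[i + 2];
    out += ALPHABET[(n >> 18) & 63] + ALPHABET[(n >> 12) & 63] +
           ALPHABET[(n >> 6) & 63] + ALPHABET[n & 63];
  }
  // The last `pad` characters carry only padding zeros; mark them with "=".
  return pad ? out.slice(0, out.length - pad) + "=".repeat(pad) : out;
}
```

One input byte therefore produces two data characters followed by "==", and two input bytes produce three data characters followed by "=".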
In one possible implementation, the encoded data in the preset format includes: encoded data in the 8-bit Unicode Transformation Format (UTF-8).
In one possible implementation, the preset encoding rule includes: the base64 encoding rule, in which binary data is represented using 64 printable characters.
In one possible implementation, the original string includes: at least one Chinese character.
In a possible implementation manner, the obtaining an original character string to be transmitted includes:
acquiring the original character string by using a preset computer language that supports the World Wide Web (Web).
In a second aspect, an embodiment of the present application provides a data decoding method, including:
receiving a target character string sent by a terminal; wherein the characters included in the target character string are all visible characters;
converting the target character string according to a preset decoding rule to obtain a decoded data sequence in a preset format corresponding to the original character string;
and converting the decoded data in each preset format in the decoded data sequence to obtain each corresponding character in the original character string.
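The last step, turning the decoded data back into characters, can be sketched as follows under the assumption that the preset format is UTF-8 and the input is well formed. `fromUtf8Bytes` is a hypothetical name; the source's server side is Java, so this JavaScript version only illustrates the logic.

```javascript
// Hypothetical helper: rebuilds a string from well-formed UTF-8 bytes.
function fromUtf8Bytes(bytes) {
  let out = "";
  for (let i = 0; i < bytes.length; ) {
    const b = bytes[i];
    let cp, len;
    // The lead byte tells how many bytes belong to this character.
    if (b < 0x80)      { cp = b;        len = 1; }
    else if (b < 0xe0) { cp = b & 0x1f; len = 2; }
    else if (b < 0xf0) { cp = b & 0x0f; len = 3; }
    else               { cp = b & 0x07; len = 4; }
    // Each continuation byte contributes its low six bits.
    for (let k = 1; k < len; k++) cp = (cp << 6) | (bytes[i + k] & 0x3f);
    out += String.fromCodePoint(cp);
    i += len;
  }
  return out;   // malformed input is not validated in this sketch
}
```

Since the lead byte encodes the character length, the decoder does not need to guess which bytes belong to a Chinese character and which to an English one.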
In one possible implementation, the preset decoding rule includes: a preset mapping rule and a preset decoding sub-rule, and the converting the target character string according to the preset decoding rule to obtain a decoded data sequence in a preset format corresponding to the original character string includes:
for every four characters in the target character string, respectively obtaining intermediate conversion data corresponding to the four characters according to the preset mapping rule;
and converting the intermediate conversion data corresponding to each group of four characters according to the preset decoding sub-rule to obtain the decoded data sequence in the preset format corresponding to the original character string.
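A minimal sketch of these two steps, assuming the mapping rule is the standard base64 alphabet and the decoding sub-rule regroups four 6-bit values into three bytes. `decodeQuad` is a hypothetical name, and padding characters are deliberately not handled here.

```javascript
const ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Hypothetical helper: converts four unpadded base64 characters into three bytes.
function decodeQuad(quad) {
  // Mapping rule (assumed): character -> 6-bit value via its alphabet index.
  const v = Array.from(quad, (c) => ALPHABET.indexOf(c));
  // Decoding sub-rule (assumed): regroup 4 x 6 bits into 3 x 8 bits.
  return [
    (v[0] << 2) | (v[1] >> 4),
    ((v[1] & 0x0f) << 4) | (v[2] >> 2),
    ((v[2] & 0x03) << 6) | v[3],
  ];
}
```

Applied to "5L2g", the sketch returns the bytes 0xE4, 0xBD, 0xA0, which are the UTF-8 bytes of "你".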
In one possible implementation manner, the preset decoding rule further includes: a preset character replacement rule, wherein if the target character string includes a second preset character, before the intermediate conversion data corresponding to every four characters in the target character string is obtained according to the preset mapping rule, the method further includes:
replacing a second preset character included in the target character string with a first preset character according to the preset character replacement rule;
wherein the first preset character is a character that has a specific meaning in a Uniform Resource Locator (URL) parameter, and the second preset character is a character that has no specific meaning in a URL parameter.
In one possible implementation manner, the decoded data in the preset format includes: decoded data in the 8-bit Unicode Transformation Format (UTF-8).
In one possible implementation, the preset decoding rule includes: the base64 decoding rule, in which binary data is represented using 64 printable characters.
In one possible implementation, the original string includes: at least one Chinese character.
In a third aspect, an embodiment of the present application provides a data encoding apparatus, including:
the acquisition module is used for acquiring an original character string to be transmitted;
the first conversion module is used for respectively converting each character in the original character string into the corresponding coded data in the preset format to obtain a coded data sequence in the preset format corresponding to the original character string;
the second conversion module is used for converting the coded data sequence according to a preset coding rule to obtain a target character string; wherein the characters included in the target character string are all visible characters;
and the sending module is used for sending the target character string to a server so that the server decodes the target character string to obtain the original character string.
In a possible implementation manner, the first conversion module is specifically configured to:
for each character in the original character string, respectively determining an American Standard Code for Information Interchange (ASCII) value corresponding to the character according to a first preset function;
if the ASCII value corresponding to the character is larger than a preset numerical value, acquiring the character according to a second preset function, and converting the character into coded data in a corresponding preset format according to a third preset function; or if the ASCII value corresponding to the character is not greater than the preset numerical value, converting the ASCII value corresponding to the character into coded data in a preset format corresponding to the character;
and obtaining an encoded data sequence of a preset format corresponding to the original character string according to the encoded data of the preset format corresponding to each character in the original character string.
In one possible implementation, the preset encoding rule includes: a preset coding sub-rule and a preset mapping rule, and the second conversion module is specifically configured to:
for the coded data of the preset format corresponding to every three bytes in the coded data sequence, respectively converting according to the preset coding sub-rule to obtain intermediate conversion data;
obtaining target conversion data corresponding to each intermediate conversion data according to the preset mapping rule;
and obtaining the target character string according to each target conversion data.
In a possible implementation manner, the preset encoding rule further includes: a preset character replacement rule, wherein if at least one of the target conversion data includes a first preset character, the second conversion module is further configured to, before obtaining the target character string according to each of the target conversion data:
replacing a first preset character included in the at least one target conversion data with a second preset character according to the preset character replacement rule;
wherein the first preset character is a character that has a specific meaning in a Uniform Resource Locator (URL) parameter, and the second preset character is a character that has no specific meaning in a URL parameter.
In one possible implementation, the encoded data in the preset format includes: encoded data in the 8-bit Unicode Transformation Format (UTF-8).
In one possible implementation, the preset encoding rule includes: the base64 encoding rule, in which binary data is represented using 64 printable characters.
In one possible implementation, the original string includes: at least one Chinese character.
In a fourth aspect, an embodiment of the present application provides a data decoding apparatus, including:
the receiving module is used for receiving a target character string sent by the terminal; wherein the characters included in the target character string are all visible characters;
the first conversion module is used for converting the target character string according to a preset decoding rule to obtain a decoded data sequence in a preset format corresponding to the original character string;
and the second conversion module is used for converting the decoded data in each preset format in the decoded data sequence to obtain each corresponding character in the original character string.
In one possible implementation, the preset decoding rule includes: a preset mapping rule and a preset decoding sub-rule, and the first conversion module is specifically configured to:
for every four characters in the target character string, respectively obtaining intermediate conversion data corresponding to the four characters according to the preset mapping rule;
and converting the intermediate conversion data corresponding to each group of four characters according to the preset decoding sub-rule to obtain the decoded data sequence in the preset format corresponding to the original character string.
In one possible implementation manner, the preset decoding rule further includes: a preset character replacement rule, wherein if the target character string includes a second preset character, the first conversion module is further configured to:
replacing a second preset character included in the target character string with a first preset character according to the preset character replacement rule;
wherein the first preset character is a character that has a specific meaning in a Uniform Resource Locator (URL) parameter, and the second preset character is a character that has no specific meaning in a URL parameter.
In one possible implementation manner, the decoded data in the preset format includes: decoded data in the 8-bit Unicode Transformation Format (UTF-8).
In one possible implementation, the preset decoding rule includes: the base64 decoding rule, in which binary data is represented using 64 printable characters.
In one possible implementation, the original string includes: at least one Chinese character.
In a fifth aspect, an embodiment of the present application provides an electronic device, including:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the method of any implementation of the first or second aspect described above via execution of the executable instructions.
In a sixth aspect, an embodiment of the present application provides a computer-readable storage medium, on which a computer program is stored, the computer program, when executed by a processor, implementing the method according to any implementation manner of the first aspect or the second aspect.
According to the data encoding method, data decoding method, apparatus, device and storage medium provided in the embodiments of the present application, after an original character string to be transmitted is obtained, each character in the original character string is converted into encoded data in a corresponding preset format to obtain an encoded data sequence in the preset format corresponding to the original character string, and the encoded data sequence is converted according to a preset encoding rule to obtain a target character string, wherein the characters included in the target character string are all visible characters; the target character string is then sent to the server, so that the server decodes the target character string to obtain the original character string. Because the characters included in the target character string are all visible characters, the server can accurately identify the target character string for decoding, so that the original character string is accurately recovered and data transmission efficiency is improved.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic view of an application scenario provided in an embodiment of the present application;
Fig. 2 is a schematic diagram of a user interface in a terminal;
Fig. 3 is a schematic flowchart of a data encoding method according to an embodiment of the present application;
Fig. 4 is a schematic flowchart of a data decoding method according to an embodiment of the present application;
Fig. 5 is a schematic flowchart of a data encoding and decoding method according to an embodiment of the present application;
Fig. 6 is a schematic structural diagram of a data encoding apparatus according to an embodiment of the present application;
Fig. 7 is a schematic structural diagram of a data decoding apparatus according to an embodiment of the present application;
Fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Fig. 1 is a schematic view of an application scenario provided in an embodiment of the present application. As shown in fig. 1, the application scenario diagram may include: a server and at least one terminal (for convenience of description, at least one terminal is shown in fig. 1 as including a terminal 1 and a terminal 2). Of course, the application scenario diagram provided in the embodiment of the present application may further include other devices, which is not limited in the embodiment of the present application.
In the embodiment of the application, the terminal converts the obtained original character string to be transmitted into a target character string and sends the target character string to the server, so that the server decodes the target character string to obtain the original character string. Because the characters included in the target character string are all visible characters, the server can accurately identify the target character string for decoding and thus accurately recover the original character string, which solves the prior-art problem that the server cannot identify which bytes are Chinese characters and therefore produces garbled text.
In the embodiment of the present application, an execution subject for executing the data encoding method may be a terminal, or may be a data encoding device in the terminal (it should be noted that, in the embodiment provided in the present application, description is given by taking the terminal as an example).
For example, the terminal or the data encoding apparatus in the embodiments of the present application may be implemented by software and/or hardware.
The terminal involved in the embodiments of the present application may include, but is not limited to, a mobile phone and/or a computer, and may also be another device with a data encoding function.
In the embodiment of the present application, the execution subject for executing the data decoding method may be a server, or may be a data decoding apparatus in the server (it should be noted that, in the embodiment provided in the present application, description is given taking the server as an example).
In existing computer systems, data is stored internally in binary form (often written in hexadecimal). In other words, everything inside a computer is a number. To display English, one byte (0-255) is sufficient, since English has only 26 letters and a small number of punctuation marks.
Illustratively, the American Standard Code for Information Interchange (ASCII) is currently the most common single-byte coding system and is used to display English and some Western European languages. In the ASCII encoding specification, 48-57 represent the digits 0-9, 65-90 represent the uppercase letters A-Z, and 97-122 represent the lowercase letters a-z.
However, there are thousands of commonly used Chinese characters and tens of thousands in total. One byte can represent only 256 values, which cannot cover all Chinese characters, so multiple bytes are required. Many Chinese character coding specifications exist under various standards; GBK, UTF-8 and Unicode are common. Under different coding specifications, the same Chinese character may correspond to different encoded data. For example, the character "你" ("you") corresponds to the GBK encoded data C4E3, the UTF-8 encoded data E4BDA0, and the Unicode encoded data 4F60.
In general, data to be transmitted may include English characters, Chinese characters and/or digits, and the number of bytes each occupies may differ under different encoding and decoding rules. For example, under the UTF-8 rule, one English character occupies one byte and one Chinese character occupies three bytes; under the Unicode (UTF-16) rule, English characters and commonly used Chinese characters each occupy two bytes; under the GBK rule, one Chinese character occupies two bytes and one English character occupies one byte.
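The byte counts for UTF-8 and UTF-16 can be checked with a short JavaScript sketch. `describe` and `hex` are hypothetical helper names. GBK is left out because the standard `TextEncoder` only produces UTF-8.

```javascript
// Formats a byte sequence as uppercase hexadecimal, two digits per byte.
const hex = (bytes) =>
  Array.from(bytes, (b) => b.toString(16).toUpperCase().padStart(2, "0")).join("");

// Hypothetical helper: reports how a single BMP character is encoded.
function describe(ch) {
  const utf8 = new TextEncoder().encode(ch);   // TextEncoder always emits UTF-8
  return {
    codePoint: ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0"),
    utf8Hex: hex(utf8),
    utf8Bytes: utf8.length,
    utf16Bytes: ch.length * 2,                 // string length counts UTF-16 units
  };
}

const english = describe("a");
const chinese = describe("\u4f60");            // the character 你
```

Here `chinese` reports code point 4F60, UTF-8 bytes E4BDA0 (three bytes) and two bytes in UTF-16, while `english` reports one byte in UTF-8 and two in UTF-16.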
In the prior art, a server needs to identify, from a received data stream, which bytes are Chinese characters, which are English characters and which are digits. Because the number of bytes occupied by English characters, digits and Chinese characters may differ under different encoding and decoding rules, the server often cannot identify which bytes are Chinese characters, and garbled text appears on the server.
Fig. 2 is a schematic diagram of a user interface in a terminal. As shown in fig. 2, when a user inputs a name in the "name" field, inputs an address in the "address" field, and clicks the "submit" button, the terminal acquires the data to be transmitted, for example, the name and the address that were input. When the name and/or address input by the user contains Chinese characters, the server may fail to recognize them correctly, resulting in garbled text.
The following takes a World Wide Web (Web) system as an example to describe the parts that may cause garbled text:
A. Linux operating system parameters: on the Java application server of the Web system, setting the LANG variable changes the language parameters;
B. Default parameters of the Java environment: for example, "-Ddefault.client.encoding=GBK, -Duser.language=Zh";
C. Header of a Java Server Pages (JSP) page: for example, "<%@ page language="java" pageEncoding="GBK" %>";
D. HTML header of the JSP: for example, "<meta http-equiv="Content-Type" content="text/html; charset=GBK">";
E. Parameter acquisition in Java code: for example, the getBytes method of String in Java code can obtain encoded data in a specified format;
F. Internationalization in application frameworks: different application frameworks may also involve internationalization, for example, support for different languages.
If garbled text appears in a Web system, it may be related to the settings of any of the above parts, so the workload of troubleshooting it is very large.
For the garbled-text problem shown in fig. 2, first, following the Java code parameter acquisition approach in part E above, the getBytes method in the Java code was used to try to obtain UTF-8 encoded data, GBK encoded data and so on, but none of the attempts worked.
Second, the approaches in parts C and D above were combined with the approach in part E: the JSP header and the HTML header were modified and used together with the Java code parameter acquisition approach. Although a lot of time was spent on these encoding attempts, they were still ineffective.
In addition, the problem was suspected to be caused by the encoding settings in the jQuery Mobile framework, but the framework has so much source code that it is difficult to find the place to modify.
Then, the default parameters of the Java environment were suspected, but these parameters affect multiple applications on the server and cannot be modified lightly.
Finally, in the present application, the original character string to be transmitted is converted into a target character string whose characters are all visible characters, and these visible characters are transmitted in place of the original Chinese encoded data. The server can therefore accurately identify the target character string and decode it, and then accurately recover each character in the original character string, which solves the prior-art problem that the server cannot identify which bytes are Chinese characters and therefore produces garbled text.
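The overall idea can be illustrated with a round trip. This sketch assumes the preset format is UTF-8 and the preset coding rule is base64. It uses the Node.js `Buffer` API as a stand-in for both the terminal-side and server-side conversions; a browser terminal or a Java server would use different calls. The function names and the sample string are hypothetical.

```javascript
// Node.js-only sketch: Buffer performs UTF-8 and base64 conversion in one step.
function encodeForTransport(original) {
  return Buffer.from(original, "utf8").toString("base64");
}

function decodeFromTransport(target) {
  return Buffer.from(target, "base64").toString("utf8");
}

const original = "\u4f60\u597d, Beijing 100";    // "你好, Beijing 100"
const target = encodeForTransport(original);

// Every character of the target string is a printable (visible) ASCII character.
const onlyVisible = /^[\x21-\x7e]*$/.test(target);
const restored = decodeFromTransport(target);
```

Whatever the original characters are, the transmitted string contains only visible single-byte characters, so the receiving side does not depend on locale or page-encoding settings to split the byte stream correctly.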
The following describes the technical solutions of the present application and how to solve the above technical problems with specific embodiments. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments.
Fig. 3 is a schematic flowchart of a data encoding method according to an embodiment of the present application. In this embodiment, an implementation of the data encoding method is described taking the terminal as the execution subject. As shown in fig. 3, the method of the embodiment of the present application may include:
Step S301, obtaining an original character string to be transmitted.
In this step, the terminal obtains the original character string to be transmitted, which is input by a user; the original character string may include, but is not limited to: numeric characters, English characters, and/or Chinese characters.
Illustratively, the original character string may include at least one Chinese character; of course, other types of characters (e.g., numeric characters and/or English characters) may also be included in the original character string.
For example, the manner in which the user enters the original character string may include, but is not limited to: a keyboard input mode, a voice input mode, or a touch screen input mode.
Alternatively, the terminal may obtain the original character string by using a preset computer language supporting World Wide Web (Web).
For example, taking the JavaScript language as an example, the following code can be used to obtain the original character string:
“var insure_address = $('#address').val();”
Of course, the original character string may also be obtained in other ways, which is not limited in this embodiment of the application.
Step S302, respectively converting each character in the original character string into encoded data in a corresponding preset format, so as to obtain an encoded data sequence in the preset format corresponding to the original character string.
In this step, the terminal converts each character in the original character string obtained in step S301 into the corresponding encoded data in the preset format, so as to obtain the encoded data sequence in the preset format corresponding to the original character string.
Illustratively, the encoded data of the preset format may include: the UTF-8 coded data, that is, the terminal can respectively convert each character in the original character string into the corresponding UTF-8 coded data, so as to obtain the UTF-8 coded data sequence corresponding to the original character string. It can be understood that the encoded data in the preset format may also include encoded data in other formats besides the UTF-8 encoded data, which is not described in detail in this embodiment.
For ease of understanding, the following sections of the embodiments of the present application describe how the above-described transformation process may be implemented.
Optionally, for each character in the original character string, the ASCII value corresponding to the character is determined according to a first preset function. Illustratively, the first preset function may include, but is not limited to, the charCodeAt() function in the JavaScript language (for a character outside the ASCII range, charCodeAt() returns the UTF-16 code unit of the character, which this embodiment also refers to as its ASCII value); it should be understood that the first preset function may also include other functions capable of determining the ASCII value corresponding to a character, which is not described in detail in this embodiment of the application.
Further, if the ASCII value corresponding to the character is greater than a preset value, the character is acquired according to a second preset function and converted into the encoded data in the corresponding preset format according to a third preset function; or, if the ASCII value corresponding to the character is not greater than the preset value, the ASCII value corresponding to the character is converted into the encoded data in the preset format corresponding to the character.
Illustratively, the second preset function may include, but is not limited to, the charAt() function in the JavaScript language; it should be understood that the second preset function may also include other functions capable of acquiring a character, which are not described in detail in this embodiment of the application.
Illustratively, the third preset function may include, but is not limited to, the encodeURIComponent() function in the JavaScript language; it should be understood that the third preset function may also include other functions capable of converting a character into encoded data in the preset format, which is not described in detail in this embodiment of the application.
In the embodiment of the present application, for each character in the original character string, if the ASCII value corresponding to the character is greater than the preset value (e.g., 127), the character is a Chinese character, so the terminal may acquire the character according to the second preset function and convert it into the encoded data in the corresponding preset format (e.g., UTF-8 encoded data) according to the third preset function. Alternatively, if the ASCII value corresponding to the character is not greater than the preset value, the character is a numeric character or an English character, so the terminal can directly convert the ASCII value corresponding to the character into data in a preset number system (for example, hexadecimal data) to obtain the encoded data in the preset format corresponding to the character.
Further, according to the encoded data in the preset format corresponding to each character in the original character string, an encoded data sequence in the preset format corresponding to the original character string is obtained.
For example, the terminal may splice the encoded data in the preset format corresponding to each character in the original character string, so as to obtain an encoded data sequence in the preset format corresponding to the original character string.
For example, assuming that the original character string is "复兴门156号" (Fuxingmen No. 156) and taking UTF-8 encoded data as the encoded data in the preset format, the conversion process of the original character string is as follows:
The first character "复": calling the charCodeAt(0) function gives 22797 as the ASCII value corresponding to the first character "复". Because this value is greater than 127, the character "复" is acquired by calling the charAt(0) function, and calling the encodeURIComponent("复") function returns %E5%A4%8D; after the "%" characters are filtered out, the UTF-8 encoded data corresponding to the first character "复" is obtained: E5 A4 8D.
The conversion of the second character "兴" and the third character "门" follows the conversion of the first character "复", finally giving the UTF-8 encoded data corresponding to the second character "兴": E5 85 B4, and the UTF-8 encoded data corresponding to the third character "门": E9 97 A8.
The fourth character "1": calling the charCodeAt(3) function gives 49 as the ASCII value corresponding to the fourth character "1". Because this value is not greater than 127, it is directly converted into the hexadecimal data 0x31, giving the UTF-8 encoded data corresponding to the fourth character "1": 31.
The conversion of the fifth character "5" and the sixth character "6" follows the conversion of the fourth character "1", finally giving the UTF-8 encoded data corresponding to the fifth character "5": 35, and the UTF-8 encoded data corresponding to the sixth character "6": 36.
The conversion of the seventh character "号" follows the conversion of the first character "复", finally giving the UTF-8 encoded data corresponding to the seventh character "号": E5 8F B7.
Splicing the UTF-8 encoded data corresponding to each character in the original character string "复兴门156号" gives the UTF-8 encoded data sequence corresponding to the original character string: E5 A4 8D E5 85 B4 E9 97 A8 31 35 36 E5 8F B7.
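The per-character conversion of step S302 can be sketched in JavaScript as follows. This is a minimal illustration rather than the implementation of the application: the function name toUtf8HexSequence is hypothetical, and the sketch assumes that every character outside the ASCII range occupies a single UTF-16 code unit (encodeURIComponent throws a URIError when it is given one half of a surrogate pair).

```javascript
// Hypothetical helper illustrating step S302; not taken from the application.
function toUtf8HexSequence(original) {
  const parts = [];
  for (let i = 0; i < original.length; i++) {
    const code = original.charCodeAt(i); // first preset function
    if (code > 127) {
      const ch = original.charAt(i); // second preset function
      // Third preset function: returns e.g. "%E5%A4%8D"; the "%" signs are filtered out.
      parts.push(encodeURIComponent(ch).replace(/%/g, ""));
    } else {
      // Value not greater than 127: write it directly as two hexadecimal digits.
      parts.push(("0" + code.toString(16)).slice(-2).toUpperCase());
    }
  }
  // Splice the per-character results into one encoded data sequence.
  return parts.join("");
}

console.log(toUtf8HexSequence("复兴门156号")); // prints E5A48DE585B4E997A8313536E58FB7
```

For the example string, the sketch returns the byte sequence E5 A4 8D E5 85 B4 E9 97 A8 31 35 36 E5 8F B7 written without spaces.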
Step S303, converting the encoded data sequence according to a preset encoding rule to obtain a target character string.
In this step, the terminal converts the encoded data sequence obtained in step S302 according to the preset encoding rule to obtain the target character string, in which all characters are visible characters. The target character string is then sent to the server, so that the server can accurately identify the target character string for decoding and thereby accurately identify the original character string.
Illustratively, the preset encoding rule may include, but is not limited to, the base64 encoding rule, which represents binary data using 64 printable characters. The base64 encoding rule converts binary data into visible characters.
For ease of understanding, the following sections of the embodiments of the present application describe how the above-described transformation process may be implemented.
Optionally, the preset encoding rule in the embodiment of the present application may include a preset encoding sub-rule and a preset mapping rule. The encoded data in the preset format corresponding to every three bytes in the encoded data sequence is converted according to the preset encoding sub-rule to obtain intermediate conversion data.
For example, the preset encoding sub-rule in the embodiment of the present application may include, but is not limited to: converting the three bytes of data into binary data and splicing them to obtain a 24-bit sequence; dividing the sequence into four groups of 6-bit binary numbers; and padding the upper two bits of each group with 0.
For example, taking UTF-8 encoded data as the encoded data in the preset format and taking the UTF-8 encoded data sequence corresponding to the original character string "复兴门156号", namely E5 A4 8D E5 85 B4 E9 97 A8 31 35 36 E5 8F B7, as an example, the preset encoding sub-rule is illustrated as follows:
The data of the first three bytes (E5 A4 8D) is converted into binary data (11100101 10100100 10001101) and spliced to obtain a sequence (111001011010010010001101); the sequence is then divided into four groups of 6-bit binary numbers (111001 011010 010010 001101), and the upper two bits of each group are padded with 0 to obtain four groups of 8-bit binary numbers (00111001 00011010 00010010 00001101), which are then converted into the corresponding decimal data (57, 26, 18, 13) to obtain the corresponding intermediate conversion data.
The data conversion method of other bytes is the same as the data conversion method of the first three bytes, and is not described herein again.
For example, in the embodiment of the present application, for encoded data in a preset format corresponding to any three bytes, the intermediate conversion data may be obtained by performing conversion according to a preset encoding sub-rule in the following manner.
For example, let ch1, ch2, and ch3 be the encoded data in the preset format corresponding to the three bytes, and let enc1, enc2, enc3, and enc4 be the four pieces of intermediate conversion data obtained after conversion.
enc1 = ch1 >> 2; // the encoded data of the first byte is shifted right by two bits
enc2 = ((ch1 & 0x03) << 4) | (ch2 >> 4); // the last two bits of the first byte are spliced with the first four bits of the second byte
enc3 = ((ch2 & 0x0f) << 2) | (ch3 >> 6); // the last four bits of the second byte are spliced with the first two bits of the third byte
enc4 = ch3 & 0x3f; // the last six bits of the third byte are kept
Of course, for the encoded data in the preset format corresponding to any three bytes, the intermediate conversion data may be obtained by performing conversion in other manners according to the preset encoding sub-rule.
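As a runnable illustration of the preset encoding sub-rule, the four shift-and-mask expressions can be wrapped in a small JavaScript function. The name encodeTriple is hypothetical; the second byte is masked with 0x0f so that only its low four bits enter the third group.

```javascript
// Hypothetical helper: splits three bytes (24 bits) into four 6-bit values.
function encodeTriple(ch1, ch2, ch3) {
  const enc1 = ch1 >> 2;                         // upper six bits of the first byte
  const enc2 = ((ch1 & 0x03) << 4) | (ch2 >> 4); // low two bits of byte 1 + upper four bits of byte 2
  const enc3 = ((ch2 & 0x0f) << 2) | (ch3 >> 6); // low four bits of byte 2 + upper two bits of byte 3
  const enc4 = ch3 & 0x3f;                       // low six bits of the third byte
  return [enc1, enc2, enc3, enc4];
}

console.log(encodeTriple(0xe5, 0xa4, 0x8d).join(", ")); // prints 57, 26, 18, 13
```

The printed values match the intermediate conversion data derived by hand for the bytes E5 A4 8D.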
And further, obtaining target conversion data corresponding to each intermediate conversion data according to a preset mapping rule.
Illustratively, the preset mapping rule in the embodiment of the present application is used to indicate a correspondence relationship between intermediate data and corresponding target conversion data (visible characters). It should be understood that the preset mapping rules may exist in a table form or a sequence form, and of course, the preset mapping rules may also exist in other forms.
For example, if the preset mapping rule exists in table form, the preset mapping rule may be as shown in Table 1; if the preset mapping rule exists in sequence form, the preset mapping rule may be "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/".
Table 1 is a schematic diagram of a preset mapping rule
[Table 1 is provided as images in the original publication: Figure BDA0002298487110000141 and Figure BDA0002298487110000151]
In this embodiment of the application, the terminal may obtain, according to a preset mapping rule, target conversion data corresponding to each intermediate conversion data, for example, target conversion data corresponding to the intermediate conversion data "57" is "5", target conversion data corresponding to the intermediate conversion data "26" is "a", target conversion data corresponding to the intermediate conversion data "18" is "S", and target conversion data corresponding to the intermediate conversion data "13" is "N".
Further, target character strings are obtained according to the target conversion data.
Illustratively, after the terminal acquires the target conversion data corresponding to each intermediate conversion data, each target conversion data may be spliced to obtain the target character string.
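The lookup and splicing can be sketched as follows, assuming that the preset mapping rule is the standard base64 alphabet in sequence form, where each intermediate value is the index of its visible character. The function name mapToTargetCharacters is hypothetical.

```javascript
// Hypothetical helper: maps 6-bit intermediate values to visible characters and splices them.
function mapToTargetCharacters(intermediateData) {
  // Assumed mapping rule in sequence form (standard base64 alphabet).
  const table = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  return intermediateData.map((value) => table.charAt(value)).join("");
}

console.log(mapToTargetCharacters([57, 26, 18, 13])); // prints 5aSN
```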
Optionally, if the total number of bytes included in the encoded data sequence with the preset format is not an integer multiple of three, before the encoded data with the preset format corresponding to every three bytes in the encoded data sequence with the preset format is respectively converted according to the preset coding sub-rule to obtain intermediate conversion data, the terminal may perform zero padding at the end of the encoded data sequence with the preset format, so that the total number of bytes included in the encoded data sequence after the zero padding is an integer multiple of three; correspondingly, the terminal needs to complement a preset number of third preset characters at the tail of the target character string.
For example, if the total number of bytes included in the encoded data sequence in the preset format is not divisible by 3 and the terminal pads one 0x00 byte at the end of the encoded data sequence, the terminal correspondingly needs to add one third preset character (for example, "=") at the end of the target character string; if the terminal pads two 0x00 bytes at the end of the encoded data sequence, the terminal correspondingly needs to add two third preset characters (for example, "=") at the end of the target character string.
In addition, the target conversion data may include a character that has a special meaning in a Uniform Resource Locator (URL) parameter (referred to as a first preset character in this embodiment). To ensure the accuracy of data transmission, the preset encoding rule may optionally further include a preset character replacement rule: if at least one piece of target conversion data includes a first preset character (for example, the character "+", the character "/", and/or the character "="), then before obtaining the target character string from the target conversion data, the terminal may replace the first preset character with a second preset character according to the preset character replacement rule, where the second preset character is a character that has no special meaning in a URL parameter. For example, according to the preset character replacement rule, the terminal may replace the first preset character "+" with the second preset character "_", replace the first preset character "/" with the second preset character "-", and/or replace the first preset character "=" with the second preset character "|".
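A minimal JavaScript sketch of the conversion with zero padding is given below. The function name encodeWithPadding is hypothetical, the mapping table is assumed to be the standard base64 alphabet, and, as in standard base64, the sketch assumes that the output characters produced only by the padded zero bytes are replaced by the same number of "=" characters.

```javascript
// Hypothetical helper: converts an array of byte values into visible characters.
function encodeWithPadding(bytes) {
  const table = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  // Number of 0x00 bytes needed to reach a multiple of three (0, 1 or 2).
  const padCount = (3 - (bytes.length % 3)) % 3;
  const padded = bytes.concat(new Array(padCount).fill(0x00));
  let target = "";
  for (let i = 0; i < padded.length; i += 3) {
    const ch1 = padded[i];
    const ch2 = padded[i + 1];
    const ch3 = padded[i + 2];
    target += table.charAt(ch1 >> 2);
    target += table.charAt(((ch1 & 0x03) << 4) | (ch2 >> 4));
    target += table.charAt(((ch2 & 0x0f) << 2) | (ch3 >> 6));
    target += table.charAt(ch3 & 0x3f);
  }
  // Assumption: characters that stem only from the zero padding become "=".
  return target.slice(0, target.length - padCount) + "=".repeat(padCount);
}

console.log(encodeWithPadding([0x31, 0x35, 0x36])); // prints MTU2
console.log(encodeWithPadding([0x31, 0x35]));       // prints MTU=
console.log(encodeWithPadding([0x31]));             // prints MQ==
```

One padded byte therefore corresponds to one "=" and two padded bytes to two "=", which lets the decoder know how many trailing bytes to discard.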
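The character replacement can be sketched as a single pass over the target conversion data. The function name replaceReservedCharacters is hypothetical; the three replacement pairs are the ones given as examples in this embodiment.

```javascript
// Hypothetical helper: replaces characters that are special in URL parameters.
function replaceReservedCharacters(target) {
  // First preset character -> second preset character, as in the example above.
  const replacements = { "+": "_", "/": "-", "=": "|" };
  return target.replace(/[+\/=]/g, (ch) => replacements[ch]);
}

console.log(replaceReservedCharacters("5aSN5YW06ZeoMTU25Y+3")); // prints 5aSN5YW06ZeoMTU25Y_3
console.log(replaceReservedCharacters("MQ=="));                 // prints MQ||
```

Because "_", "-" and "|" never occur in the output of the mapping rule itself, the server can reverse the replacement without ambiguity.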
Step S304, sending the target character string to a server, so that the server decodes the target character string to obtain the original character string.
In this step, the terminal sends the target character string obtained in step S303 to the server, and since the characters included in the target character string are all visible characters, the server can accurately identify the target character string for decoding processing, so that the original character string can be accurately identified.
In the embodiment of the application, after an original character string to be transmitted is obtained, each character in the original character string is converted into coded data in a corresponding preset format to obtain a coded data sequence in the preset format corresponding to the original character string, and the coded data sequence is converted according to a preset coding rule to obtain a target character string, wherein characters included in the target character string are all visible characters; further, the target character string is sent to the server, so that the server decodes the target character string to obtain the original character string. Because the characters included in the target character string are all visible characters, the server can accurately identify the target character string for decoding processing, so that the original character string can be accurately identified, and the data transmission efficiency is improved.
Fig. 4 is a flowchart illustrating a data decoding method according to an embodiment of the present application. Based on the above example, an implementation of the data decoding method is described in this embodiment taking the server as the execution subject. As shown in fig. 4, the method of the embodiment of the present application may include:
Step S401, receiving the target character string sent by the terminal.
The target character string is obtained by the terminal as follows: the terminal converts each character in the original character string to be transmitted into encoded data in the corresponding preset format to obtain the encoded data sequence in the preset format corresponding to the original character string, and then converts the encoded data sequence according to the preset encoding rule. The characters included in the target character string are all visible characters, so that the server can accurately identify the target character string.
For a specific conversion manner, reference may be made to the relevant description in step S302 and step S303 in the above example of the present application, and details are not described herein again.
The original character string referred to in the embodiments of the present application may include, but is not limited to: numeric characters, English characters, and/or Chinese characters.
Illustratively, the original character string may include at least one Chinese character; of course, other types of characters (e.g., numeric characters and/or English characters) may also be included in the original character string.
Step S402, converting the target character string according to a preset decoding rule to obtain a decoding data sequence with a preset format corresponding to the original character string.
It should be understood that the decoded data sequence of the preset format corresponding to the original character string is a decoded data sequence corresponding to the encoded data sequence of the preset format described above.
Illustratively, the encoded data in the preset format may include UTF-8 encoded data, and the encoded data sequence in the preset format may include a UTF-8 encoded data sequence; correspondingly, the decoded data in the preset format may include UTF-8 decoded data, and the decoded data sequence in the preset format may include a UTF-8 decoded data sequence.
It should be noted that the encoded data in the preset format may also include encoded data in other formats besides the UTF-8 encoded data, and correspondingly, the decoded data in the preset format may also include decoded data in other formats besides the UTF-8 decoded data, which is not described in detail in this embodiment of the application.
In a possible implementation manner, the preset decoding rule in the embodiment of the present application is the decoding rule corresponding to the preset encoding rule involved in step S303. Illustratively, the preset decoding rule may include, but is not limited to, the base64 decoding rule, where base64 represents binary data using 64 printable characters.
Optionally, the preset decoding rule may include a preset mapping rule and a preset decoding sub-rule. In this case, for every four characters in the target character string, the intermediate conversion data corresponding to the four characters is obtained according to the preset mapping rule.
The preset mapping rule in the embodiment of the present application may refer to the related content in step S303, which is not described herein again.
For example, assuming that the target character string is "5aSN5YW06ZeoMTU25Y+3", every four characters in the target character string may be divided into one group, giving the five groups "5aSN", "5YW0", "6Zeo", "MTU2" and "5Y+3". Taking the first four characters "5aSN" in the target character string as an example, the server may obtain, according to the preset mapping rule, the intermediate conversion data corresponding to the four characters: "57", "26", "18", "13".
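The grouping and reverse lookup can be sketched in JavaScript as follows. The function name lookUpIntermediateData is hypothetical, the mapping rule is assumed to be the standard base64 alphabet in sequence form, and the sketch assumes a well-formed target character string whose length is a multiple of four and which contains no padding or replaced characters.

```javascript
// Hypothetical helper: splits the target string into groups of four characters
// and maps each character back to its intermediate (6-bit) value.
function lookUpIntermediateData(target) {
  const table = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  const groups = [];
  for (let i = 0; i < target.length; i += 4) {
    const group = target.slice(i, i + 4);
    // The position of a character in the table is its intermediate value.
    groups.push(Array.from(group, (ch) => table.indexOf(ch)));
  }
  return groups;
}

const groups = lookUpIntermediateData("5aSN5YW06ZeoMTU25Y+3");
console.log(groups.length);         // prints 5
console.log(groups[0].join(", "));  // prints 57, 26, 18, 13
```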
Further, the intermediate conversion data corresponding to each of the four characters is converted according to a preset decoding sub-rule, so as to obtain a decoding data sequence in a preset format corresponding to the original character string.
The preset decoding sub-rule in the embodiment of the present application is the decoding sub-rule corresponding to the preset encoding sub-rule involved in step S303. Illustratively, the preset decoding sub-rule may include, but is not limited to: decoding the 4 pieces of intermediate conversion data to obtain the corresponding three bytes of data.
For example, taking the 4 pieces of intermediate conversion data "57", "26", "18", "13" as an example, the preset decoding sub-rule is explained as follows:
The 4 pieces of intermediate conversion data (57, 26, 18, 13) are converted into four groups of 8-bit binary numbers (00111001 00011010 00010010 00001101), and the two upper 0 bits of each group are removed to obtain four groups of 6-bit binary numbers (111001 011010 010010 001101); the four groups of 6-bit binary numbers are then spliced to obtain a sequence (111001011010010010001101), and the sequence is divided into three groups of 8-bit binary data (11100101 10100100 10001101); the three groups of 8-bit binary data are then converted into the corresponding hexadecimal data (E5 A4 8D).
It should be understood that the three bytes of data obtained by converting the intermediate conversion data corresponding to each of the four characters are spliced to obtain a decoded data sequence in a preset format corresponding to the original character string.
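The decoding sub-rule for one group can be sketched with shifts and masks that mirror the encoding side. The function name decodeQuad is hypothetical, and the inputs are assumed to be four values in the range 0 to 63.

```javascript
// Hypothetical helper: rebuilds three bytes from four 6-bit intermediate values.
function decodeQuad(enc1, enc2, enc3, enc4) {
  const ch1 = (enc1 << 2) | (enc2 >> 4);          // six bits of enc1 + upper two bits of enc2
  const ch2 = ((enc2 & 0x0f) << 4) | (enc3 >> 2); // low four bits of enc2 + upper four bits of enc3
  const ch3 = ((enc3 & 0x03) << 6) | enc4;        // low two bits of enc3 + six bits of enc4
  return [ch1, ch2, ch3];
}

const hex = decodeQuad(57, 26, 18, 13).map((b) => b.toString(16).toUpperCase());
console.log(hex.join(" ")); // prints E5 A4 8D
```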
Optionally, the preset decoding rule further includes a preset character replacement rule. If the target character string includes a second preset character, then before obtaining, for every four characters in the target character string, the intermediate conversion data corresponding to the four characters according to the preset mapping rule, the server replaces the second preset character included in the target character string with the first preset character according to the preset character replacement rule. The first preset character is a character that has a special meaning in a URL parameter (e.g., the character "+", the character "/", and/or the character "="), and the second preset character is a character that has no special meaning in a URL parameter (e.g., the character "_", the character "-", and/or the character "|").
For example, according to the preset character replacement rule, the server may replace the second preset character "_" with the first preset character "+", replace the second preset character "-" with the first preset character "/", and/or replace the second preset character "|" with the first preset character "=".
In addition, if the target character string obtained after the server replaces the second preset characters with the first preset characters according to the preset character replacement rule includes a preset number of third preset characters (for example, "="), the server needs to remove the same preset number of trailing zero bytes when splicing the three bytes of data obtained by converting the intermediate conversion data corresponding to each group of four characters.
For example, if the target character string obtained after the replacement includes one third preset character (for example, "="), the server needs to remove one trailing 0x00 byte when splicing the three bytes of data obtained by converting the intermediate conversion data corresponding to each group of four characters; if the target character string obtained after the replacement includes two third preset characters (for example, "="), the server needs to remove two trailing 0x00 bytes when splicing.
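Putting the replacement, lookup, decoding sub-rule and padding removal together gives the following JavaScript sketch. The function name decodeTargetString is hypothetical, the mapping rule is assumed to be the standard base64 alphabet, and the input is assumed to be a well-formed target character string produced by the encoding side.

```javascript
// Hypothetical helper: decodes a target string into an array of byte values.
function decodeTargetString(target) {
  const table = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  const inverse = { "_": "+", "-": "/", "|": "=" };
  // Replace second preset characters with the corresponding first preset characters.
  const restored = target.replace(/[_\-|]/g, (ch) => inverse[ch]);
  // Count the trailing third preset characters ("=").
  const padCount = (restored.match(/=+$/) || [""])[0].length;
  // Treat "=" as value 0 so that every group of four still yields three bytes.
  const values = Array.from(restored, (ch) => (ch === "=" ? 0 : table.indexOf(ch)));
  const bytes = [];
  for (let i = 0; i < values.length; i += 4) {
    const [e1, e2, e3, e4] = values.slice(i, i + 4);
    bytes.push((e1 << 2) | (e2 >> 4));
    bytes.push(((e2 & 0x0f) << 4) | (e3 >> 2));
    bytes.push(((e3 & 0x03) << 6) | e4);
  }
  // Remove the trailing 0x00 bytes that correspond to the padding characters.
  return bytes.slice(0, bytes.length - padCount);
}

console.log(decodeTargetString("MQ||").join(", ")); // prints 49
console.log(decodeTargetString("5Y_3").join(", ")); // prints 229, 143, 183
```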
In another possible implementation manner, the preset decoding rule involved in the embodiment of the present application may be a first preset decoding function. Illustratively, the first preset decoding function may include, but is not limited to: BASE64Decoder function of sun.
In this embodiment of the application, the server may convert the target character string according to a first preset decoding function to obtain a decoded data sequence in a preset format corresponding to the original character string.
For example, the server may convert the target character string by the following code, which stores the resulting decoded data sequence in the preset format (e.g., E5 A4 8D E5 85 B4 E9 97 A8 31 35 36 E5 8F B7) in the textByte byte array:
“BASE64Decoder decoder = new BASE64Decoder();
byte[] textByte = decoder.decode("5aSN5YW06ZeoMTU25Y+3");”
It should be understood that the first preset decoding function may also include other functions having a decoding function, which is not described in detail in this embodiment.
Optionally, if the target character string includes a second preset character, then before converting the target character string according to the first preset decoding function to obtain the decoded data sequence in the preset format corresponding to the original character string, the server may further replace the second preset character included in the target character string with the first preset character according to the preset character replacement rule. The first preset character is a character that has a special meaning in a URL parameter (e.g., the character "+", the character "/", and/or the character "="), and the second preset character is a character that has no special meaning in a URL parameter (e.g., the character "_", the character "-", and/or the character "|").
For example, according to the preset character replacement rule, the server may replace the second preset character "_" with the first preset character "+", replace the second preset character "-" with the first preset character "/", and/or replace the second preset character "|" with the first preset character "=".
In addition, if the target character string obtained after the server replaces the second preset characters with the first preset characters according to the preset character replacement rule includes a preset number of third preset characters (for example, "="), the server may remove the third preset characters from the target character string before converting the target character string according to the first preset decoding function; alternatively, after converting the target character string according to the first preset decoding function, the server may remove the same preset number of trailing zero bytes, thereby obtaining the decoded data sequence in the preset format corresponding to the original character string.
For example, if the target character string obtained after the replacement includes one third preset character (for example, "="), then after converting the target character string according to the first preset decoding function, the server needs to remove one trailing 0x00 byte from the resulting decoded data sequence; if the target character string obtained after the replacement includes two third preset characters (for example, "="), the server needs to remove two trailing 0x00 bytes from the resulting decoded data sequence.
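For comparison, the same flow with a ready-made decoder can be sketched in JavaScript under Node.js, where Buffer.from(text, "base64") plays the role of the first preset decoding function. This is an assumption made for illustration only, since the embodiment itself uses a Java decoder; the function name decodeWithBuiltIn is hypothetical. The reverse replacement is applied first because Node.js also accepts the URL-safe base64 alphabet, in which "-" and "_" stand for the values 62 and 63, the opposite of the replacement pairs used in this embodiment.

```javascript
// Hypothetical helper (Node.js only): decodes with the built-in base64 decoder.
function decodeWithBuiltIn(target) {
  // Replace second preset characters with first preset characters before decoding.
  const restored = target
    .replace(/_/g, "+")
    .replace(/-/g, "/")
    .replace(/\|/g, "=");
  // The built-in decoder handles the "=" padding itself, so no zero bytes remain.
  return Array.from(Buffer.from(restored, "base64"));
}

console.log(decodeWithBuiltIn("5Y_3").join(", ")); // prints 229, 143, 183
console.log(decodeWithBuiltIn("MQ||").join(", ")); // prints 49
```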
Step S403, converting each decoded data in the preset format in the decoded data sequence to obtain each corresponding character in the original character string.
In this step, the server may convert, according to the second preset decoding function, each decoded data in the decoded data sequence in the preset format to obtain each corresponding character in the original character string, so as to obtain the original character string.
Illustratively, the second preset decoding function may include, but is not limited to, the String() constructor in the Java language.
For example, the server may convert each piece of decoded data in the preset format in the decoded data sequence by the following code, so as to obtain the original character string (e.g., "复兴门156号"):
“String str1 = new String(textByte, "UTF-8");”
It should be understood that the second preset decoding function may also include other functions having a decoding function, which is not described in detail in this embodiment.
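An analogous conversion can be sketched in JavaScript by reversing the percent-encoding used on the terminal side: each byte is written as "%XX" and the result is passed to decodeURIComponent. This is an illustrative alternative to the Java code above, not part of the embodiment; the function name bytesToOriginalString is hypothetical, and decodeURIComponent throws a URIError if the bytes are not valid UTF-8.

```javascript
// Hypothetical helper: turns UTF-8 byte values back into the original string.
function bytesToOriginalString(bytes) {
  const percentEncoded = bytes
    .map((b) => "%" + ("0" + b.toString(16)).slice(-2)) // e.g. 0xE5 -> "%e5"
    .join("");
  return decodeURIComponent(percentEncoded);
}

const textByte = [0xe5, 0xa4, 0x8d, 0xe5, 0x85, 0xb4, 0xe9, 0x97, 0xa8,
                  0x31, 0x35, 0x36, 0xe5, 0x8f, 0xb7];
console.log(bytesToOriginalString(textByte)); // prints 复兴门156号
```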
In the embodiment of the application, after a target character string sent by a terminal is received, the target character string is converted according to a preset decoding rule to obtain a decoding data sequence in a preset format corresponding to an original character string, wherein characters included in the target character string are all visible characters; further, each decoded data in a preset format in the decoded data sequence is converted to obtain each corresponding character in the original character string, so that the original character string is obtained. Because the characters included in the target character string are all visible characters, the server can accurately identify the target character string for decoding processing, so that the original character string can be accurately identified, and the data transmission efficiency is improved.
Fig. 5 is a flowchart illustrating a data encoding and decoding method according to an embodiment of the present application. On the basis of the above example, the data encoding and decoding method is introduced in the embodiment of the present application in combination with the terminal side and the server side. As shown in fig. 5, the method of the embodiment of the present application may include:
Step S501, the terminal obtains an original character string to be transmitted.
For a specific implementation manner, reference may be made to relevant contents in step S301 described above, which is not described in detail in this embodiment of the application.
Step S502, the terminal converts each character in the original character string into the corresponding coded data in the preset format respectively to obtain the coded data sequence in the preset format corresponding to the original character string.
For a specific implementation manner, reference may be made to relevant contents in step S302 described above in this application, which is not described in detail in this embodiment of the application.
Step S503, the terminal converts the coded data sequence according to a preset coding rule to obtain a target character string.
For a specific implementation manner, reference may be made to relevant contents in step S303 described above in this application, which is not described in detail in this embodiment of the application.
Step S504, the terminal sends the target character string to the server.
Illustratively, the characters included in the target character string are all visible characters, so that the server can accurately identify the target character string for decoding processing, and further accurately identify the original character string.
Step S505, the server receives the target character string transmitted by the terminal.
The target character string is obtained as follows: the terminal converts each character in the original character string to be transmitted into coded data in the corresponding preset format to obtain a coded data sequence in the preset format corresponding to the original character string, and then converts the coded data sequence according to the preset coding rule.
Step S506, the server converts the target character string according to a preset decoding rule to obtain a decoding data sequence with a preset format corresponding to the original character string.
For a specific implementation manner, reference may be made to relevant contents in step S402 described above in this application, which is not described in detail in this embodiment of the application.
Step S507, the server converts each decoded data in the preset format in the decoded data sequence to obtain each corresponding character in the original character string.
For a specific implementation manner, reference may be made to relevant contents in step S403 described above, which is not described in detail in this embodiment of the application.
In the embodiment of the application, after an original character string to be transmitted is obtained, a terminal converts each character in the original character string into coded data in a corresponding preset format to obtain a coded data sequence in the preset format corresponding to the original character string, and converts the coded data sequence according to a preset coding rule to obtain a target character string, wherein characters included in the target character string are all visible characters; further, the terminal sends the target character string to the server, and the characters included in the target character string are all visible characters, so that the server can accurately identify the target character string for decoding processing, the original character string can be accurately identified, and the data transmission efficiency is improved.
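By way of illustration only, steps S501 to S507 can be sketched end to end in Java using the standard java.util.Base64 class. The sketch assumes that the preset format is UTF-8 and that the preset coding rule corresponds to the URL-safe Base64 alphabet of RFC 4648 (which uses "-" and "_"); the replacement characters actually chosen in an implementation may differ. The names RoundTrip, encodeOnTerminal and decodeOnServer are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class RoundTrip {
    // Terminal side (S501-S503): characters -> UTF-8 bytes -> visible characters.
    static String encodeOnTerminal(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return Base64.getUrlEncoder().encodeToString(utf8);
    }

    // Server side (S505-S507): visible characters -> UTF-8 bytes -> characters.
    static String decodeOnServer(String target) {
        byte[] utf8 = Base64.getUrlDecoder().decode(target);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u590d\u5174\u95e8156\u53f7"; // Chinese characters mixed with digits
        String target = encodeOnTerminal(original);      // what is sent in step S504
        System.out.println(target);
        System.out.println(decodeOnServer(target).equals(original));
    }
}
```

The target string produced here contains only letters, digits, "-", "_" and "=", which is consistent with the requirement that all characters of the target character string be visible characters.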
Fig. 6 is a schematic structural diagram of a data encoding device according to an embodiment of the present application. Optionally, the data encoding apparatus provided in this embodiment may be an apparatus in a terminal. As shown in fig. 6, the data encoding apparatus 60 provided in the embodiment of the present application may include: an obtaining module 601, a first converting module 602, a second converting module 603 and a sending module 604.
The obtaining module 601 is configured to obtain an original character string to be transmitted;
a first conversion module 602, configured to convert each character in the original character string into corresponding encoded data in a preset format, so as to obtain an encoded data sequence in the preset format corresponding to the original character string;
a second conversion module 603, configured to convert the encoded data sequence according to a preset encoding rule to obtain a target character string; wherein the characters included in the target character string are all visible characters;
a sending module 604, configured to send the target character string to a server, so that the server decodes the target character string to obtain the original character string.
In a possible implementation manner, the first conversion module 602 is specifically configured to:
for each character in the original character string, respectively determining an American Standard Code for Information Interchange (ASCII) value corresponding to the character according to a first preset function;
if the ASCII value corresponding to the character is larger than a preset numerical value, acquiring the character according to a second preset function, and converting the character into coded data in a corresponding preset format according to a third preset function; or if the ASCII value corresponding to the character is not greater than the preset numerical value, converting the ASCII value corresponding to the character into coded data in a preset format corresponding to the character;
and obtaining an encoded data sequence of a preset format corresponding to the original character string according to the encoded data of the preset format corresponding to each character in the original character string.
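The per-character conversion performed by the first conversion module 602 can be sketched as follows. This is an assumption-laden illustration: the preset numerical value is taken to be 127 (the upper bound of ASCII), the preset format is taken to be UTF-8, and Java library calls stand in for the first, second and third preset functions. The names CharToUtf8, ASCII_LIMIT and toUtf8Sequence are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class CharToUtf8 {
    static final int ASCII_LIMIT = 127; // assumed value of the "preset numerical value"

    static byte[] toUtf8Sequence(String original) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < original.length(); ) {
            int cp = original.codePointAt(i);          // numeric value of the character
            if (cp > ASCII_LIMIT) {
                // Non-ASCII: take the character itself and encode it as UTF-8.
                byte[] b = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
                out.write(b, 0, b.length);
            } else {
                // ASCII: the value is already its own single-byte UTF-8 encoding.
                out.write(cp);
            }
            i += Character.charCount(cp);              // advance past surrogate pairs as one character
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] seq = toUtf8Sequence("a\u4e2d");
        System.out.println(seq.length);
    }
}
```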
In one possible implementation, the preset encoding rule includes a preset encoding sub-rule and a preset mapping rule, and the second conversion module 603 is specifically configured to:
for the coded data of the preset format corresponding to every three bytes in the coded data sequence, respectively converting according to the preset coding sub-rule to obtain intermediate conversion data;
obtaining target conversion data corresponding to each intermediate conversion data according to the preset mapping rule;
and obtaining the target character string according to each target conversion data.
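As a hedged sketch of the two sub-steps above, the following Java code splits one three-byte group into four 6-bit values (standing in for the preset encoding sub-rule) and maps each value to a character (standing in for the preset mapping rule). The standard Base64 alphabet is assumed as the mapping table; the names Base64Group, splitGroup and mapGroup are hypothetical.

```java
final class Base64Group {
    // Standard Base64 alphabet, assumed here as the preset mapping table.
    static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Encoding sub-rule: 3 bytes (24 bits) -> four 6-bit intermediate values.
    static int[] splitGroup(byte b0, byte b1, byte b2) {
        int bits = ((b0 & 0xFF) << 16) | ((b1 & 0xFF) << 8) | (b2 & 0xFF);
        return new int[] {
            (bits >> 18) & 0x3F, (bits >> 12) & 0x3F, (bits >> 6) & 0x3F, bits & 0x3F
        };
    }

    // Mapping rule: each 6-bit value -> one visible character.
    static String mapGroup(int[] sextets) {
        StringBuilder sb = new StringBuilder(4);
        for (int s : sextets) {
            sb.append(ALPHABET.charAt(s));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 0xE4 0xB8 0xAD is the UTF-8 encoding of U+4E2D.
        System.out.println(mapGroup(splitGroup((byte) 0xE4, (byte) 0xB8, (byte) 0xAD))); // prints 5Lit
    }
}
```

Handling of a final group shorter than three bytes (zero-fill plus padding) is omitted from this sketch.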
In a possible implementation manner, the preset encoding rule further includes: a preset character replacement rule, if at least one of the target conversion data includes a first preset character, the second conversion module 603 is further configured to, before obtaining the target character string according to each of the target conversion data:
replacing a first preset character included in the at least one target conversion data with a second preset character according to the preset character replacement rule;
wherein the first preset character is a character that has a specific meaning in a Uniform Resource Locator (URL) parameter, and the second preset character is a character that has no specific meaning in a URL parameter.
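The character replacement can be sketched as below. The concrete character pairs are assumptions made for illustration: "+" and "/" are treated as first preset characters (both are significant in URLs), and "-" and "_" as the corresponding second preset characters, following the URL-safe alphabet of RFC 4648. The names UrlSafeReplace, toUrlSafe and fromUrlSafe are hypothetical.

```java
final class UrlSafeReplace {
    // Terminal side: replace URL-significant characters before sending.
    static String toUrlSafe(String base64) {
        return base64.replace('+', '-').replace('/', '_');
    }

    // Server side: restore the original characters before decoding.
    static String fromUrlSafe(String target) {
        return target.replace('-', '+').replace('_', '/');
    }

    public static void main(String[] args) {
        String safe = toUrlSafe("a+b/c=");
        System.out.println(safe);              // prints a-b_c=
        System.out.println(fromUrlSafe(safe)); // prints a+b/c=
    }
}
```

Because "-" and "_" never occur in the standard Base64 alphabet, the replacement is reversible without ambiguity.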
In one possible implementation, the encoded data in the preset format includes: encoded data in the 8-bit Unicode Transformation Format (UTF-8).
In one possible implementation, the preset encoding rule includes: the Base64 encoding rule, which represents binary data using 64 printable characters.
In one possible implementation, the original character string includes: at least one Chinese character.
The data encoding apparatus provided in this embodiment may be configured to execute the technical solution related to the terminal in the foregoing data encoding method embodiment of the present application, and the implementation principle and the technical effect are similar, which are not described herein again.
Fig. 7 is a schematic structural diagram of a data decoding apparatus according to an embodiment of the present application. Optionally, the data decoding apparatus provided in this embodiment may be an apparatus in a server. As shown in fig. 7, the data decoding apparatus 70 provided in the embodiment of the present application may include: a receiving module 701, a first converting module 702 and a second converting module 703.
The receiving module 701 is configured to receive a target character string sent by a terminal; wherein the characters included in the target character string are all visible characters;
a first conversion module 702, configured to convert the target character string according to a preset decoding rule, so as to obtain a decoded data sequence in a preset format corresponding to the original character string;
a second conversion module 703 is configured to convert each decoded data in the preset format in the decoded data sequence to obtain each corresponding character in the original character string.
In one possible implementation, the preset decoding rule includes a preset mapping rule and a preset decoding sub-rule, and the first conversion module 702 is specifically configured to:
for every four characters in the target character string, obtaining intermediate conversion data corresponding to the four characters according to the preset mapping rule;
and converting the intermediate conversion data corresponding to each group of four characters according to the preset decoding sub-rule to obtain a decoded data sequence in a preset format corresponding to the original character string.
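For illustration, decoding one group of four characters can be sketched as follows. The standard Base64 alphabet is assumed as the preset mapping table, and a padding character "=" is decoded as zero bits so that every group yields exactly three bytes, leaving any trailing 0x00 bytes to be removed afterwards. The names Base64Quad and decodeQuad are hypothetical.

```java
final class Base64Quad {
    // Standard Base64 alphabet, assumed here as the preset mapping table.
    static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    static byte[] decodeQuad(String quad) {
        if (quad.length() != 4) {
            throw new IllegalArgumentException("expected exactly four characters");
        }
        int bits = 0;
        for (int i = 0; i < 4; i++) {
            char c = quad.charAt(i);
            // Mapping rule: character -> 6-bit value; '=' contributes zero bits.
            int v = (c == '=') ? 0 : ALPHABET.indexOf(c);
            if (v < 0) {
                throw new IllegalArgumentException("not a Base64 character: " + c);
            }
            bits = (bits << 6) | v;
        }
        // Decoding sub-rule: 24 bits -> three bytes.
        return new byte[] {(byte) (bits >> 16), (byte) (bits >> 8), (byte) bits};
    }

    public static void main(String[] args) {
        byte[] b = decodeQuad("5Lit");
        System.out.printf("%02X %02X %02X%n", b[0], b[1], b[2]); // prints E4 B8 AD
    }
}
```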
In one possible implementation manner, the preset decoding rule further includes: a preset character replacement rule, and if the target character string includes a second preset character, the first conversion module 702 is further configured to:
replacing a second preset character included in the target character string with a first preset character according to the preset character replacement rule;
wherein the first preset character is a character that has a specific meaning in a Uniform Resource Locator (URL) parameter, and the second preset character is a character that has no specific meaning in a URL parameter.
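A possible server-side sketch of restoring the first preset characters and then decoding is given below. It assumes that "-" and "_" were used as second preset characters in place of "+" and "/"; an implementation may use other pairs. The name RestoreAndDecode is hypothetical, while Base64.getDecoder() is the standard Java decoder.

```java
import java.util.Base64;

final class RestoreAndDecode {
    static byte[] decode(String target) {
        // Undo the character replacement so the string uses the standard alphabet again.
        String standard = target.replace('-', '+').replace('_', '/');
        // The standard decoder consumes the '=' padding itself.
        return Base64.getDecoder().decode(standard);
    }

    public static void main(String[] args) {
        byte[] data = decode("-_8=");
        System.out.printf("%02X %02X%n", data[0], data[1]); // prints FB FF
    }
}
```

With this particular character pair, Base64.getUrlDecoder() would accept the target string directly; the explicit replacement is kept here only to mirror the two-stage description above.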
In one possible implementation, the decoded data in the preset format includes: decoded data in the 8-bit Unicode Transformation Format (UTF-8).
In one possible implementation, the preset decoding rule includes: the Base64 decoding rule, which represents binary data using 64 printable characters.
In one possible implementation, the original character string includes: at least one Chinese character.
The data decoding apparatus provided in this embodiment may be configured to execute the technical solution related to the server in the foregoing data decoding method embodiment of the present application, and the implementation principle and the technical effect are similar, which are not described herein again.
Fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application. The electronic device in the embodiment of the present application may include, but is not limited to: a terminal or a server.
As shown in fig. 8, an electronic device 80 provided in the embodiment of the present application may include: a processor 801 and a memory 802. Optionally, the electronic device 80 may further include a transceiver 803, and the transceiver 803 is used for communication with other devices.
The memory 802 is used for storing executable instructions of the processor 801; the processor 801 is configured to execute the technical solution of the terminal in the foregoing data encoding method embodiment of the present application or the technical solution of the server in the foregoing data decoding method embodiment of the present application by executing the executable instruction, and the implementation principle and the technical effect are similar, and are not described herein again.
It should be understood that, when the electronic device in the embodiment of the present application includes a terminal, the processor 801 is configured to execute, via executing the executable instruction, a technical solution related to the terminal in the above-mentioned data encoding method embodiment of the present application; when the electronic device in the embodiment of the present application includes a server, the processor 801 is configured to execute, by executing the executable instruction, a technical solution about the server in the above-described data decoding method embodiment of the present application.
The embodiments of the present application further provide a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the technical solution of the terminal in the foregoing data encoding method embodiment of the present application or the technical solution of the server in the foregoing data decoding method embodiment of the present application is implemented, and the implementation principle and the technical effect are similar, and are not described herein again.
It should be understood by those of ordinary skill in the art that, in the various embodiments of the present application, the sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of the processes should be determined by their functions and inherent logic, and should not limit the implementation process of the embodiments of the present application.
Those of ordinary skill in the art will understand that: all or a portion of the steps of implementing the above-described method embodiments may be performed by hardware associated with program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs steps comprising the method embodiments described above; and the aforementioned storage medium includes: read-only memory (ROM), RAM, flash memory, hard disk, solid state disk, magnetic tape, floppy disk, optical disk, and any combination thereof.
Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present application.

Claims (11)

1. A method of encoding data, comprising:
acquiring an original character string to be transmitted;
converting each character in the original character string into corresponding coded data in a preset format respectively to obtain a coded data sequence in the preset format corresponding to the original character string;
converting the coded data sequence according to a preset coding rule to obtain a target character string; wherein the characters included in the target character string are all visible characters;
and sending the target character string to a server so that the server decodes the target character string to obtain the original character string.
2. The method according to claim 1, wherein the converting each character in the original character string into the encoded data in the corresponding preset format respectively to obtain the encoded data sequence in the preset format corresponding to the original character string comprises:
for each character in the original character string, respectively determining an American Standard Code for Information Interchange (ASCII) value corresponding to the character according to a first preset function;
if the ASCII value corresponding to the character is larger than a preset numerical value, acquiring the character according to a second preset function, and converting the character into coded data in a corresponding preset format according to a third preset function; or if the ASCII value corresponding to the character is not greater than the preset numerical value, converting the ASCII value corresponding to the character into coded data in a preset format corresponding to the character;
and obtaining an encoded data sequence of a preset format corresponding to the original character string according to the encoded data of the preset format corresponding to each character in the original character string.
3. The method of claim 1, wherein the preset encoding rule comprises a preset encoding sub-rule and a preset mapping rule, and the converting the coded data sequence according to the preset encoding rule to obtain a target character string comprises:
for the coded data of the preset format corresponding to every three bytes in the coded data sequence, respectively converting according to the preset coding sub-rule to obtain intermediate conversion data;
obtaining target conversion data corresponding to each intermediate conversion data according to the preset mapping rule;
and obtaining the target character string according to each target conversion data.
4. The method of claim 3, wherein the preset encoding rule further comprises a preset character replacement rule, and if at least one target conversion data includes a first preset character, before the target character string is obtained according to each target conversion data, the method further comprises:
replacing a first preset character included in the at least one target conversion data with a second preset character according to the preset character replacement rule;
wherein the first preset character is a character that has a specific meaning in a Uniform Resource Locator (URL) parameter, and the second preset character is a character that has no specific meaning in a URL parameter.
5. A method of decoding data, comprising:
receiving a target character string sent by a terminal; wherein the characters included in the target character string are all visible characters;
converting the target character string according to a preset decoding rule to obtain a decoding data sequence in a preset format corresponding to the original character string;
and converting the decoded data in each preset format in the decoded data sequence to obtain each corresponding character in the original character string.
6. The method of claim 5, wherein the preset decoding rule comprises a preset mapping rule and a preset decoding sub-rule, and the converting the target character string according to the preset decoding rule to obtain a decoded data sequence in a preset format corresponding to the original character string comprises:
for every four characters in the target character string, obtaining intermediate conversion data corresponding to the four characters according to the preset mapping rule;
and converting the intermediate conversion data corresponding to each group of four characters according to the preset decoding sub-rule to obtain a decoded data sequence in a preset format corresponding to the original character string.
7. The method of claim 6, wherein the preset decoding rule further comprises a preset character replacement rule, and if the target character string includes a second preset character, before obtaining, for every four characters in the target character string, the intermediate conversion data corresponding to the four characters according to the preset mapping rule, the method further comprises:
replacing a second preset character included in the target character string with a first preset character according to the preset character replacement rule;
wherein the first preset character is a character that has a specific meaning in a Uniform Resource Locator (URL) parameter, and the second preset character is a character that has no specific meaning in a URL parameter.
8. A data encoding apparatus, comprising:
an obtaining module, configured to obtain an original character string to be transmitted;
a first conversion module, configured to convert each character in the original character string into corresponding coded data in a preset format, so as to obtain a coded data sequence in the preset format corresponding to the original character string;
a second conversion module, configured to convert the coded data sequence according to a preset coding rule to obtain a target character string; wherein the characters included in the target character string are all visible characters;
and a sending module, configured to send the target character string to a server, so that the server decodes the target character string to obtain the original character string.
9. A data decoding apparatus, comprising:
a receiving module, configured to receive a target character string sent by a terminal; wherein the characters included in the target character string are all visible characters;
a first conversion module, configured to convert the target character string according to a preset decoding rule to obtain a decoded data sequence in a preset format corresponding to the original character string;
and a second conversion module, configured to convert each decoded data in the preset format in the decoded data sequence to obtain each corresponding character in the original character string.
10. An electronic device, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the method of any of claims 1-4 or 5-7 via execution of the executable instructions.
11. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method of any one of claims 1-7.
CN201911212378.7A 2019-12-02 2019-12-02 Data encoding method, data decoding method, device, equipment and storage medium Active CN110932822B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911212378.7A CN110932822B (en) 2019-12-02 2019-12-02 Data encoding method, data decoding method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911212378.7A CN110932822B (en) 2019-12-02 2019-12-02 Data encoding method, data decoding method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110932822A true CN110932822A (en) 2020-03-27
CN110932822B CN110932822B (en) 2022-06-17

Family

ID=69848048

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911212378.7A Active CN110932822B (en) 2019-12-02 2019-12-02 Data encoding method, data decoding method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110932822B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100266120A1 (en) * 2009-04-20 2010-10-21 Cleversafe, Inc. Dispersed data storage system data encryption and encoding
CN103595415A (en) * 2012-08-16 2014-02-19 中兴通讯股份有限公司 Coding method, decoding method, coding system and decoding system
CN105790853A (en) * 2014-12-26 2016-07-20 北京奇虎科技有限公司 Method and device for transmitting character data through sound wave
CN105808370A (en) * 2014-12-31 2016-07-27 航天信息股份有限公司 Method for discovering half Chinese character in character string
CN106570356A (en) * 2016-11-01 2017-04-19 南京理工大学 Unicode coding-based text watermark embedding method and extraction method
CN109829025A (en) * 2019-01-22 2019-05-31 浙江数链科技有限公司 Route bearing calibration and device, electronic equipment, storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张玮,文福安,李江涛: "J2EE Web应用中URL中文乱码问题的研究", 《计算机信息时代》 *

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111475234B (en) * 2020-04-10 2023-01-10 苏州浪潮智能科技有限公司 Character string transmission method and device, computer and readable storage medium
CN111475234A (en) * 2020-04-10 2020-07-31 苏州浪潮智能科技有限公司 Character string transmission method and device, computer and readable storage medium
CN111597802A (en) * 2020-05-14 2020-08-28 支付宝实验室(新加坡)有限公司 Service processing method and device and electronic equipment
CN111597802B (en) * 2020-05-14 2023-08-22 支付宝实验室(新加坡)有限公司 Service processing method and device and electronic equipment
CN111832067A (en) * 2020-05-26 2020-10-27 华控清交信息科技(北京)有限公司 Data processing method and device and data processing device
CN113746593A (en) * 2020-05-29 2021-12-03 北京沃东天骏信息技术有限公司 Character string data transmission method, system, device, electronic equipment and storage medium thereof
CN112016270A (en) * 2020-09-08 2020-12-01 中国物品编码中心 Chinese-sensible code logistics information coding method, device and equipment
CN112016270B (en) * 2020-09-08 2024-04-02 中国物品编码中心 Logistics information coding method, device and equipment of Chinese-character codes
CN112131162A (en) * 2020-09-16 2020-12-25 广州大学 Data transmission method, system, device and medium based on USB equipment
CN112149445A (en) * 2020-10-22 2020-12-29 北京京东振世信息技术有限公司 Bar code processing method, device, equipment and storage medium
CN112818639A (en) * 2020-12-30 2021-05-18 平安普惠企业管理有限公司 Data encoding method, data encoding device, computer equipment and storage medium
CN113162628A (en) * 2021-04-26 2021-07-23 深圳希施玛数据科技有限公司 Data encoding method, data decoding method, terminal and storage medium
CN113271108A (en) * 2021-05-25 2021-08-17 上海众言网络科技有限公司 Questionnaire answering data transmission method and device
CN113468855A (en) * 2021-06-30 2021-10-01 北京达佳互联信息技术有限公司 Data processing method, device, server and storage medium
CN113836869A (en) * 2021-09-22 2021-12-24 中国农业银行股份有限公司 Method and device for carrying out unified code conversion on mixed multi-code character text
CN113836869B (en) * 2021-09-22 2023-12-08 中国农业银行股份有限公司 Method and device for carrying out unified code conversion on hybrid multi-code character text
CN113987556A (en) * 2021-12-24 2022-01-28 杭州趣链科技有限公司 Data processing method and device, electronic equipment and storage medium
CN114626338A (en) * 2022-03-01 2022-06-14 杭州趣链科技有限公司 Character encoding method, character decoding method, character encoding system, character decoding system, character encoding device, character decoding device, and storage medium
CN115086423A (en) * 2022-05-18 2022-09-20 深圳市科陆电子科技股份有限公司 Data transmission method, data transmission device, computer device, and storage medium
CN115686759A (en) * 2023-01-04 2023-02-03 恒丰银行股份有限公司 Method and system for calculating unique identification code of virtual machine
CN116738471A (en) * 2023-08-10 2023-09-12 陕西昕晟链云信息科技有限公司 Block chain-based decentralization data analysis method
CN116738471B (en) * 2023-08-10 2023-10-20 陕西昕晟链云信息科技有限公司 Block chain-based decentralization data analysis method
CN116915368A (en) * 2023-09-14 2023-10-20 深圳华云信息系统科技股份有限公司 Encoding and decoding method and device for data stream conforming to futures transaction data exchange protocol
CN116915368B (en) * 2023-09-14 2024-03-29 深圳华云信息系统科技股份有限公司 Encoding and decoding method and device for data stream conforming to futures transaction data exchange protocol

Also Published As

Publication number Publication date
CN110932822B (en) 2022-06-17

Similar Documents

Publication Publication Date Title
CN110932822B (en) Data encoding method, data decoding method, device, equipment and storage medium
US8368567B2 (en) Codepage-independent binary encoding method
CN109214196B (en) Data interaction method, device and equipment
US20110219357A1 (en) Compressing source code written in a scripting language
CN104199812B (en) Data system and method supporting multiple languages
US11580314B2 (en) Document translation method and apparatus, storage medium, and electronic device
US11954455B2 (en) Method for translating words in a picture, electronic device, and storage medium
CN113139390A (en) Language conversion method and device applied to code character strings
CN107526742B (en) Method and apparatus for processing multilingual text
CN111708753A (en) Method, device and equipment for evaluating database migration and computer storage medium
US20160335255A1 (en) Innovative method for text encodation in quick response code
US8271263B2 (en) Multi-language text fragment transcoding and featurization
US9684654B2 (en) Performing a code conversion in a smaller target encoding space
CN113051894A (en) Text error correction method and device
WO2006101287A1 (en) System and method for providing translated font image data using multi-language font servers
CN109657244B (en) English long sentence automatic segmentation method and system
CN112487765B (en) Method and device for generating notification text
CN110287147B (en) Character string sorting method and device
US20230214577A1 (en) Character string transmission method and device, computer, and readable storage medium
CN111832288B (en) Text correction method and device, electronic equipment and storage medium
CN114841175A (en) Machine translation method, device, equipment and storage medium
CN106471743B (en) Encoding of plain ASCII data streams
CN111049813B (en) Message assembling method, message analyzing method, message assembling device, message analyzing device and storage medium
TW561360B (en) Method and system for case conversion
CN112749353A (en) Processing method and device of webpage icon

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: Floor 36, Zheshang Building, No. 718 Jianshe Avenue, Jiang'an District, Wuhan, Hubei 430019

Patentee after: TK.CN INSURANCE Co.,Ltd.

Patentee after: TAIKANG INSURANCE GROUP Co.,Ltd.

Address before: Taikang Life Building, 156 Fuxingmennei Street, Xicheng District, Beijing 100031

Patentee before: TAIKANG INSURANCE GROUP Co.,Ltd.

Patentee before: TK.CN INSURANCE Co.,Ltd.