WO2012027932A1 - Method, equipment and terminal for encoding/decoding short message - Google Patents
- Publication number
- WO2012027932A1 (application PCT/CN2010/079351)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- short message
- encoding
- format
- decoding
- module
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W4/00—Services specially adapted for wireless communication networks; Facilities therefor
- H04W4/12—Messaging; Mailboxes; Announcements
- H04W4/14—Short messaging services, e.g. short message services [SMS] or unstructured supplementary service data [USSD]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L51/00—User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
- H04L51/58—Message adaptation for wireless communication
Definitions
- Short messages are a basic service provided by a mobile communication system to a user, which enables mobile terminals to transfer text or multimedia materials to each other. Among them, the short message service that delivers text content is widely used.
- the text content passed between the mobile terminal and the short message center via the mobile communication network must be encoded and decoded.
- the 3rd Generation Partnership Project (3GPP) protocol 23.038 stipulates three formats for encoding SMS text content: Global System for Mobile Communication (GSM) 7-bit encoding, 8-bit encoding, and Universal Multiple-Octet Coded Character Set (UCS) 2 encoding.
- GSM Global System for Mobile Communication
- UCS Universal Multiple-Octet Coded Character Set
- GSM7 encoding uses 7 binary digits to represent a character.
- the maximum number of characters that can be represented is 127. It is used for languages with fewer characters, such as English.
- UCS2 encoding uses 16 binary digits to represent a character. It is a form of Unicode.
- the maximum number of characters that can be represented is 65536. It is used to represent languages with more characters, such as Chinese. Some languages have slightly more than 127 characters and cannot use GSM7, but using UCS2 for them wastes space.
- the maximum length of a single transmission is 140 bytes according to 3GPP protocol 23.040; if UCS2 is used, a single transmission carries less than half the content it would carry with GSM 7-bit encoding, so in this case the message can be sent using the ISO-defined 8-bit encoding.
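The space argument above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming only the 140-byte payload limit cited from 3GPP 23.040 and the per-character widths given in the text; the names `MAX_PAYLOAD_BYTES`, `BITS_PER_CHAR`, and `max_chars` are illustrative and do not come from the patent.

```python
# Capacity of one short message under the 140-byte limit cited in the text.
MAX_PAYLOAD_BYTES = 140

# Bits used per character by each of the three encodings named in the text.
BITS_PER_CHAR = {"GSM7": 7, "8-bit": 8, "UCS2": 16}


def max_chars(encoding: str) -> int:
    """Number of whole characters that fit in a single 140-byte payload."""
    return (MAX_PAYLOAD_BYTES * 8) // BITS_PER_CHAR[encoding]


for name in BITS_PER_CHAR:
    print(name, max_chars(name))
```

The three results are 160, 140, and 70 characters, which is why UCS2 carries less than half of what GSM 7-bit encoding carries, with 8-bit encoding in between.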
- 8-bit encoding is uncommon in SMS text encoding. More commonly, such countries define their own GSM7 code table to replace the protocol's default GSM7 code table and continue to use GSM 7-bit encoding; Greece is one example.
- the 3GPP protocol 23.038 also defines a national language extension mechanism for GSM7, which can be used for GSM7 encoding of languages with a slightly larger number of characters, such as Turkish, Spanish, and Portuguese.
- a short message-enabled terminal (for example, a mobile phone, a fixed station, or a data card)
- a primary object of the present invention is to provide a short message encoding and decoding scheme to solve at least the above problems.
- a short message encoding method including: setting a short message into a universal multi-byte coded character set UCS2 format; and performing encoding format recognition on each character in a short message set to a UCS2 format;
- the short message is encoded using a predetermined encoding format.
- the method further includes: determining, according to the predetermined encoding format, the maximum length of short message text supported by the predetermined encoding format; and splitting the short message into cascaded short messages if the short message exceeds the maximum length.
- the predetermined coding format is one of the following: Global System for Mobile Communication GSM7 coding, 8-bit coding, UCS2 coding, wherein the GSM7 coding is one of the following: GSM7 standard coding, national custom coding, 3rd Generation Partnership Project (3GPP) national language extension coding.
- the short message is saved in the short message text buffer array; the short message is read from the short message text buffer array, and the short message is encoded using a predetermined encoding format.
- a short message decoding method comprising: receiving a short message and determining an encoding format of the short message; and decoding the short message by using a decoding format corresponding to the encoding format.
- determining the encoding format of the short message comprises: obtaining information for indicating an encoding format carried in the short message; and determining an encoding format of the short message according to the information.
- determining the encoding format of the short message comprises: identifying each of the characters in the short message; and determining that the predetermined encoding format is the encoding format of the short message if all the characters in the short message are recognized by the same predetermined encoding format.
- the predetermined coding format is one of the following: global mobile communication GSM7 coding, 8-bit coding, UCS2 coding, wherein the GSM7 coding is one of the following: GSM7 standard coding, national custom coding, 3GPP national language extension coding.
- the method further includes: recording the 8-bit encoding or the country used in this time.
- a short message encoding apparatus comprising: an encoding mode identifying module, configured to perform encoding format recognition on each character in a short message set to a UCS2 format; and an encoding module, configured to encode the short message using a predetermined encoding format in the case where all characters in the short message can be recognized by the same predetermined encoding format.
- the encoding module comprises: a GSM7 standard encoding module for encoding the short message using the GSM7 standard encoding format; a national custom encoding module for encoding the short message using the national custom encoding format; a 3GPP national language extended encoding module for encoding the short message using the 3GPP national language extended coding format; an 8-bit encoding module for encoding the short message using an 8-bit encoding format; and a UCS2 encoding module for encoding the short message using the UCS2 encoding format.
- a short message decoding apparatus comprising: a decoding mode identifying module, configured to determine an encoding format of the received short message; and a decoding module, configured to decode the short message by using a decoding format corresponding to the encoding format.
- the decoding module comprises: a GSM7 standard decoding module, configured to decode a short message by using the GSM7 standard encoding format; a national custom decoding module, configured to decode a short message by using the national custom encoding format; a 3GPP national language extended decoding module for decoding a short message using the 3GPP national language extended coding format; an 8-bit decoding module for decoding a short message using an 8-bit encoding format; and a UCS2 decoding module for decoding a short message using the UCS2 encoding format.
- a terminal is provided, the terminal comprising the above-described short message encoding device and/or the above short message decoding device.
- the short message is set to the UCS2 format; encoding format recognition is performed on each character in the short message, and when all the characters in the short message can be recognized by the same predetermined encoding format, the predetermined encoding format is used to encode the short message. This solves the problems that may arise in the prior art because terminals supporting the short message function support only two language encodings, English and the local language, thereby improving the usability of the terminal.
- FIG. 1 is a flowchart of a short message encoding method according to an embodiment of the present invention;
- FIG. 2 is a flowchart of a short message decoding method according to an embodiment of the present invention;
- FIG. 3 is a structural block diagram of a short message encoding apparatus according to an embodiment of the present invention;
- FIG. 4 is a structural block diagram of a short message decoding apparatus according to an embodiment of the present invention;
- FIG. 5 is a flowchart of a preferred short message encoding method according to an embodiment of the present invention;
- FIG. 6 is a flowchart of converting UCS2 code into GSM7 code according to an embodiment of the present invention;
- FIG. 7 is a flowchart of converting UCS2 code into 8-bit code according to an embodiment of the present invention;
- FIG. 8 is a flowchart of a preferred short message decoding method according to an embodiment of the present invention;
- FIG. 9 is a flowchart of extended new language support according to an embodiment of the present invention;
- FIG. 10 is a structural block diagram of a codec system according to an embodiment of the present invention.
- the embodiments in the present application and the features in the embodiments may be combined with each other without conflict.
- the coding formats mentioned include: GSM7 coding, 8-bit coding, UCS2 coding, wherein the GSM7 coding includes: GSM7 standard coding, national custom coding, 3rd Generation Partnership Project (3GPP) national language extension coding.
- if encoding formats are added or changed relative to those listed above, the following embodiments still apply; only the corresponding codec modules need to be added or modified.
- FIG. 1 is a flowchart of a short message encoding method according to an embodiment of the present invention. As shown in FIG. 1, the process includes the following steps:
- Step S102 setting a short message to a UCS2 format.
- Step S104 performing encoding format recognition on each character in the short message set to the UCS2 format.
- Step S106 in the case where all the characters in the short message can be recognized by the same predetermined encoding format, the short message is encoded using the predetermined encoding format.
- the predetermined coding format is one of the following: Global System for Mobile Communication GSM7 coding, 8-bit coding, UCS2 coding, wherein the GSM7 coding is one of the following: GSM7 standard coding, national custom coding, 3rd Generation Partnership Project (3GPP) national language extension coding.
- an encoding format that can encode all the characters is determined, and then the encoding format is used for encoding.
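The identification rule in steps S104 and S106 can be sketched as a search for one character set that covers every character of the text. This is a minimal sketch, not the patented implementation: the function name `identify_format`, the format labels, and the heavily truncated character sets are all hypothetical, and the fallback to UCS2 follows the description of the code recognition module elsewhere in this document.

```python
from typing import Dict, Set


def identify_format(text: str, tables: Dict[str, Set[str]]) -> str:
    """Return the first format whose character set covers every character.

    Falls back to UCS2 when no single table recognizes all characters.
    """
    for name, charset in tables.items():
        if all(ch in charset for ch in text):
            return name
    return "UCS2"


# Hypothetical, heavily truncated character sets for illustration only.
LATIN = set("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789 .,!?@")
# Greek capitals U+0391..U+03A9 (U+03A2 is unassigned), plus a few shared symbols.
GREEK = {chr(c) for c in range(0x0391, 0x03AA) if c != 0x03A2} | set(" 0123456789.,!?")

TABLES = {"GSM7-standard": LATIN, "national-custom": GREEK}

print(identify_format("Hello", TABLES))
print(identify_format("\u039a\u0391\u039b\u0397", TABLES))  # Greek capitals
print(identify_format("\u4f60\u597d", TABLES))              # Chinese text
```

The dictionary order determines which format wins when several tables match, so the more compact formats would be listed first.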
- the short message can be encoded and then transmitted according to the supported encoding format, which can solve the problem that the terminal supporting the short message function in the prior art only supports the English and local language encoding.
- this also overcomes the defect in the prior art that SMS encoding mechanisms differ between countries, so that one set of software can hardly suit all countries and the SMS-related parts must be modified and reconfigured for each software release; the scheme adapts automatically to the SMS encodings of different national languages.
- the maximum length of short message text supported by the predetermined encoding format may be determined according to the predetermined encoding format; if the encoded short message exceeds the maximum length, a prompt may be given, or the short message may be split into cascaded short messages.
- an array may be used for buffering, and of course, other buffering methods may be used. The following takes an array as an example for explanation.
- the encoded short message may be carried with information for indicating the short message encoding mode, so that the receiving party performs decoding.
- FIG. 2 is a flowchart of a short message decoding method according to an embodiment of the present invention. As shown in FIG. 2, the process includes the following steps: Step S202: receive a short message and determine an encoding format of the short message; Step S204: decode the short message by using a decoding format corresponding to the encoding format. Through the above steps, the encoding format of the short message can be identified, so that the short message is correctly decoded.
- the short message carries the information indicating the encoding format of the short message
- the information carried in the short message for indicating the encoding format is obtained; and the encoding format of the short message is determined according to the information.
- the sender may also not send information for indicating the encoding format.
- encoding format recognition may be performed on each character in the short message; if all the characters in the short message can be recognized by the same predetermined encoding format, the predetermined encoding format is determined to be the encoding format of the short message.
- FIG. 3 is a structural block diagram of a short message encoding apparatus according to an embodiment of the present invention. As shown in FIG. 3, the apparatus includes an encoding mode identifying module 32 and an encoding module 34, which are described below.
- the encoding mode identification module 32 (or simply the code recognition module) is configured to perform encoding format recognition on each character in the short message set to the UCS2 format; the encoding module 34 is connected to the encoding mode identification module 32 and is configured to encode the short message using a predetermined encoding format in the case where all characters in the short message can be recognized by the same predetermined encoding format.
- the encoding module 34 may include: a GSM7 standard encoding module for encoding a short message using the GSM7 standard encoding format; a national custom encoding module for encoding a short message using the national custom encoding format; a 3GPP national language extended encoding module for encoding a short message using the 3GPP national language extended coding format; an 8-bit encoding module for encoding a short message using an 8-bit encoding format; and a UCS2 encoding module for encoding a short message using the UCS2 encoding format.
- FIG. 4 is a structural block diagram of a short message decoding apparatus according to an embodiment of the present invention. As shown in FIG. 4, the apparatus includes a decoding mode identification module 42 and a decoding module 44, which are described below.
- the decoding mode identification module 42 (or simply the decoding identification module) is configured to determine an encoding format of the received short message; the decoding module 44 is coupled to the decoding mode identifying module 42 and is configured to decode the short message by using a decoding format corresponding to the encoding format.
- the decoding module 44 may include: a GSM7 standard decoding module for decoding a short message using the GSM7 standard encoding format; a national custom decoding module for decoding a short message using the national custom encoding format; a 3GPP national language extended decoding module for decoding a short message using the 3GPP national language extended encoding format; an 8-bit decoding module for decoding a short message using the 8-bit encoding format; and a UCS2 decoding module for decoding a short message using the UCS2 encoding format. It should be noted that the codec modules may be modified or deleted according to actual needs.
- Step S501 Set a code of a short message input text box to UCS2.
- Step S502 the user inputs the text to be sent in the short message input text box.
- the code recognition module is called to judge the destination coding format.
- step S503 after the user clicks the "send" button, an SMS text buffer array is created, holding 160 items of Int type. The reason is that the maximum length of a text message sent each time is 160 characters.
- Step S504 constructing a short message data structure for process control.
- Step S505 Scan the text buffer array to determine whether the short message is a cascaded short message and, if so, the number of segments. The basis for the determination is: if all the characters are GSM7 default code characters or GSM7 extended code characters, a message of more than 160 characters is a cascaded short message, and each segment in the cascade holds at most 153 characters; if any character is a UCS2 code character, a message of more than 70 characters is a cascaded short message, and each segment in the cascade holds at most 67 characters.
- Step S506 if the short message coding type is 7 bits, start conversion according to FIG. 6, and convert the UCS2 code of each character in the text buffer into GSM7 code.
- step S507 if the short message encoding type is UCS2, no conversion is needed and the UCS2 encoding is kept.
- Step S508 if the short message encoding type is 8 bits, start conversion according to FIG. 7, and convert the UCS2 code of each character in the text buffer into an 8-bit code.
- step S509 if the message is a cascaded short message, the content of the text buffer is split according to the maximum length that a single message can carry, and the segments are passed in sequence to the short message sending system.
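The thresholds in step S505 and the splitting in step S509 can be sketched as follows. This is a simplified illustration under stated assumptions: it counts one unit per character (it ignores that a GSM7 extended character occupies two code values), and the names `LIMITS` and `split_cascaded` are hypothetical.

```python
from typing import List

# (single-message limit, per-segment limit once cascaded), taken from step S505.
LIMITS = {"GSM7": (160, 153), "UCS2": (70, 67)}


def split_cascaded(text: str, encoding: str) -> List[str]:
    """Split text into cascaded segments only when it exceeds the single limit."""
    single_limit, segment_limit = LIMITS[encoding]
    if len(text) <= single_limit:
        return [text]  # fits in one message, no cascading needed
    return [text[i:i + segment_limit] for i in range(0, len(text), segment_limit)]


print([len(s) for s in split_cascaded("a" * 160, "GSM7")])
print([len(s) for s in split_cascaded("a" * 161, "GSM7")])
print([len(s) for s in split_cascaded("\u4e2d" * 71, "UCS2")])
```

The per-segment limits are smaller than the single-message limits because each cascaded segment also carries header information describing its position in the cascade.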
- FIG. 6 is a flowchart of converting UCS2 code into GSM7 code according to an embodiment of the present invention. As shown in FIG. 6, the process includes the following steps: Step S601: read a UCS2 character from the text buffer. Step S602: step S502 has already indicated which GSM7 code the UCS2 character is to be encoded into. Step S603: if it is to be encoded into the GSM7 default code, the GSM7 default encoding/decoding module is called.
- Step S604 if it is to be encoded into a GSM7 extension code, the GSM7 extended encoding/decoding module is called.
- Step S605 if it is to be encoded into a national extension code, the 3GPP national language extension encoding/decoding module is called.
- the code value consists of two code values. The first is the fixed flag value 0x1B, and the second is looked up by one of two methods: Single Shift and Lock Shift.
- Step S606 if it is to be encoded into the national custom GSM7 code, the national custom encoding/decoding module is called.
- Step S607 repeat steps S601 to S606 until all the characters in the text buffer are converted.
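The per-character loop of FIG. 6, including the two-value form that starts with 0x1B, can be sketched as below. This is an illustrative sketch only: the two tables are deliberately partial (the full default and extension tables are defined in 3GPP TS 23.038), the function name `ucs2_to_gsm7` is hypothetical, national language tables are omitted, and the output is a list of 7-bit code values rather than a packed PDU.

```python
from typing import List

ESCAPE = 0x1B  # fixed flag value that precedes a code from the extension table

# Partial tables for illustration only.
GSM7_DEFAULT = {"@": 0x00, "$": 0x02, " ": 0x20}
GSM7_DEFAULT.update({chr(c): c for c in range(ord("A"), ord("Z") + 1)})
GSM7_DEFAULT.update({chr(c): c for c in range(ord("a"), ord("z") + 1)})
GSM7_DEFAULT.update({chr(c): c for c in range(ord("0"), ord("9") + 1)})
GSM7_EXTENSION = {"{": 0x28, "}": 0x29, "[": 0x3C, "]": 0x3E, "\u20ac": 0x65}


def ucs2_to_gsm7(text: str) -> List[int]:
    """Convert each character to one code value, or to 0x1B plus one code value."""
    codes: List[int] = []
    for ch in text:
        if ch in GSM7_DEFAULT:
            codes.append(GSM7_DEFAULT[ch])
        elif ch in GSM7_EXTENSION:
            codes.extend([ESCAPE, GSM7_EXTENSION[ch]])
        else:
            raise ValueError(f"{ch!r} is not in the tables of this sketch")
    return codes


print([hex(c) for c in ucs2_to_gsm7("A[1]")])
```

Because an extended character costs two code values, a real length check would count code values rather than characters.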
- Step S701 Read a UCS2 character from a text buffer.
- Step S702 From the national language file, read the 8-bit character coding table of the currently defined country.
- the specific logical data structure is a hash table.
- Step S703 the UCS2 character in step S701 is searched in the encoding hash table, and if found, the corresponding hash table value (8-bit encoding) is used instead of the original UCS2 value.
- Step S704 returning to step S701, until all the characters in the text buffer are converted.
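Steps S701 to S704 amount to a hash-table lookup per character. The sketch below is a minimal illustration under stated assumptions: instead of reading a national language file (whose format the text does not specify), it builds the table from Python's built-in `iso8859_9` codec as a stand-in, and the names `load_8bit_table` and `ucs2_to_8bit` are hypothetical. Raising an error for a missing character is also an assumption; the text does not say what happens in that case.

```python
from typing import Dict, List


def load_8bit_table(codec: str = "iso8859_9") -> Dict[str, int]:
    """Build a character -> byte hash table; stands in for the code table file."""
    table: Dict[str, int] = {}
    for byte in range(256):
        try:
            table[bytes([byte]).decode(codec)] = byte
        except UnicodeDecodeError:
            continue  # byte value not assigned in this code table
    return table


def ucs2_to_8bit(text: str, table: Dict[str, int]) -> List[int]:
    """Replace each character with its 8-bit value from the hash table."""
    out: List[int] = []
    for ch in text:
        if ch not in table:
            raise ValueError(f"{ch!r} has no 8-bit code in this table")
        out.append(table[ch])
    return out


table = load_8bit_table()
print(ucs2_to_8bit("Te\u015fekk\u00fcr", table))  # Turkish word with two non-ASCII letters
```

Swapping the codec name, or replacing `load_8bit_table` with a file reader, changes the supported country without touching the conversion loop.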
- Step S801 A short message PDU string to be decoded is transmitted to the system.
- Step S802 invoking a decoding identification module to determine a destination decoding format.
- Step S803 if it is an 8-bit code, the 8-bit coded country identification module is called to continue identification.
- Step S804 if it is a national custom code or standard GSM7 code, the GSM7 standard and national custom extension identification module is called to continue identification.
- Step S805 constructing a short message data structure.
- Step S806 if standard GSM7 decoding is used, the standard GSM7 codec module is called.
- Step S807 if 8-bit code decoding is used, the 8-bit codec module is called.
- Step S808 if UCS2 encoding is used, the UCS2 decoding module is called and the encoding is kept unchanged.
- Step S809 if the 3GPP national language extension coding is used, the 3GPP national language extension codec module is invoked.
- Step S810 if the national custom GSM7 encoding is used, the national custom codec module is called to decode.
- step S811 the decoded text is put into the short message data structure, and the cascaded short message is stitched.
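The stitching in step S811 can be sketched as ordering decoded segments by their position in the cascade and joining them. This is a hypothetical illustration: the function name `stitch` and the `(sequence number, text)` representation are assumptions, since the text does not describe the layout of the short message data structure.

```python
from typing import List, Tuple


def stitch(segments: List[Tuple[int, str]], total: int) -> str:
    """Join decoded segments in sequence order (sequence numbers start at 1)."""
    by_seq = dict(segments)
    missing = [n for n in range(1, total + 1) if n not in by_seq]
    if missing:
        raise ValueError(f"segments not yet received: {missing}")
    return "".join(by_seq[n] for n in range(1, total + 1))


# Segments may arrive out of order.
print(stitch([(2, "world"), (1, "hello ")], total=2))
```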
- FIG. 9 is a flowchart of extended new language support according to an embodiment of the present invention. As shown in FIG. 9, the process includes the following steps: Step S901: obtain information on whether the national language uses 8-bit encoding, 3GPP national language extension coding, or national custom coding. Step S902: if 8-bit encoding is used, an 8-bit encoding/decoding table file of the country is created and placed in the 8-bit code table folder for the 8-bit encoding/decoding module to call.
- Step S903 if the national custom code is used, the custom GSM7 encoding/decoding table file of the country is created and placed in the national custom GSM7 code table folder. A national custom code judgment array file is also created and placed in the judgment array folder for the national language recognition module to call.
- Step S904 if the 3GPP national language extension is used, the Single Shift encoding/decoding table and the Lock Shift encoding/decoding table of the 3GPP national language extension of the country are created and placed in the 3GPP national language extension code table folder.
- a terminal including: the foregoing short message encoding device and/or the short message decoding device.
- the terminal includes: a standard GSM7 codec module, a UCS2 codec module, which can be implemented using existing hardware or software modules.
- the terminal further includes: a 3GPP national language extension codec module, and an 8-bit codec module.
- the two modules can automatically load a new code table to implement support for the new language.
- the short message encoding is divided into two processes: encoding mode identification, encoding.
- the SMS decoding is divided into two processes: decoding mode identification, decoding.
- the functions of each module in the process of short message encoding are as follows:
- the short message text to be encoded is transmitted into the system in UCS2 (Unicode) format.
- the encoding mode identification module reads the judgment arrays and identifies the destination encoding format of the text. Recognition takes a single traversal: only when every character in the short message text is recognized as belonging to one encoding format is that format taken as the destination encoding format of the entire text.
- if the encoding mode identification module recognizes that standard GSM7 coding is required, the standard GSM7 codec module is invoked; if it determines that 3GPP national language extension coding is required, the 3GPP national language extension codec module is invoked; if it determines that the national custom GSM 7-bit code is required, the national custom codec module is called; if it recognizes that 8-bit code is needed, the 8-bit codec module is called; if it recognizes that UCS2 code is needed, the UCS2 codec module is called. After the encoding is completed, a short message PDU string is generated and output to the short message sending software.
- the functions of each module in the process of short message decoding are as follows:
- the PDU string to be decoded is transmitted as input into the system.
- the decoding mode identification module initially determines the target decoding format of the PDU string according to the DCS field of the PDU string and the information elements of the short message header. If the decoding mode identification module recognizes that UCS2 code is used, the UCS2 codec module is called to decode. If the decoding mode identification module recognizes 8-bit code, it further calls the 8-bit coded country identification module, which has an empirical memory and judgment function and identifies which country's 8-bit code is used, and then calls the 8-bit codec module to decode.
- if the decoding mode identification module recognizes 3GPP national language extension coding, the 3GPP national language extension codec module is invoked. If the decoding mode identification module finds no format information in the short message header, the national custom extension identification module is further called; it has an empirical memory and judgment function and identifies whether standard GSM7 code or national custom extended GSM7 code should be used. If the national custom extension identification module recognizes that standard GSM7 code is required, the standard GSM7 codec module is called for decoding. If it recognizes that national custom extended GSM7 code is required, the national custom codec module is called to decode.
- the module manages the short message data structure for the text before encoding and for the encoded result, and records management data such as the short message header and cascade information.
- the module is also responsible for managing the transfer of SMS data structures between modules, as well as the input and output of the entire system. This embodiment has the following technical effects:
- a mechanism for intelligent judgment and memory is proposed, which deals with the 8-bit code, the national custom code and the standard GSM 7-bit code.
- FIG. 10 is a structural diagram of a codec system according to an embodiment of the present invention.
- the codec system will be described below with reference to FIG. 10.
- the system consists of a coding mode identification module, a decoding mode identification module, a GSM7 standard and national custom extension identification module, an 8-bit coded national identification module, a standard GSM7 codec module, an 8-bit codec module, a UCS2 codec module, a 3GPP national language extended codec module, a national custom codec module, and a configuration and support module.
- the code recognition module is configured to identify a destination encoding format of the short message according to the input character.
- the standard GSM7 default character array, the standard GSM7 extended character array, the character array of each country of the 3GPP national language extension, the 8-bit national coded character array, and the country-defined character array are read in sequence.
- if all characters are found in one character array, the destination encoding format can be confirmed. If any character cannot be found in the character arrays, the destination encoding format is judged to be UCS2 encoding.
- Standard GSM7 codec module for conversion between standard GSM7 code and UCS2 code.
- National custom codec module for conversion between national custom GSM7 code and UCS2 code. This module has more than one code table, corresponding to different countries, with different code tables, which can dynamically load the newly added code table.
- UCS2 codec module for verification and transmission of UCS2 coded characters.
- 8-bit encoding/decoding module for conversion between 8-bit code and UCS2 code.
- 8-bit code tables are not unique, and countries that use alphabetic characters usually have their own 8-bit code tables. This module can dynamically load the newly added code table.
- 3GPP National Language Extension Codec module for conversion between 3GPP national language extension codes and UCS2 encoding.
- This module has more than one code table, corresponding to different countries, with different code tables, which can dynamically load the newly added code table.
- a decoding identification module configured to determine a destination decoding format from the input PDU string to be decoded: according to the DCS field and the short message header information of the PDU string, and following the definitions of 3GPP protocol 23.040, it determines whether the destination encoding format is GSM7 encoding, 8-bit encoding, UCS2 encoding, or 3GPP national language extension coding, and then enters the corresponding decoding module.
- the module is not responsible and is handled by the latter two modules.
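The first-pass decision made from the DCS field can be sketched as below. This is an assumption-laden illustration rather than the patented logic: the bit layout follows the general data coding group of 3GPP TS 23.038 (top two bits 00, alphabet in bits 3 and 2), other coding groups are not handled, the short message header is not inspected (so 3GPP national language extension coding is not detected), and the name `format_from_dcs` is hypothetical.

```python
def format_from_dcs(dcs: int) -> str:
    """Map a DCS octet in the general data coding group to an alphabet name."""
    if (dcs & 0xC0) != 0x00:
        return "unknown"  # other coding groups are outside this sketch
    alphabet = (dcs >> 2) & 0x03  # bits 3..2 select the alphabet
    return {0: "GSM7", 1: "8-bit", 2: "UCS2", 3: "reserved"}[alphabet]


for octet in (0x00, 0x04, 0x08):
    print(hex(octet), format_from_dcs(octet))
```

A "GSM7" or "8-bit" result here still leaves open which code table applies, which is the question the two identification modules described next are meant to answer.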
- the GSM7 standard and national custom extension identification module is used to decide between standard GSM7 and national custom extended decoding.
- the module has an intelligent judgment and memory function. The first time, it makes a comprehensive judgment, based on the local language setting (such as Locale under the Linux operating system) and the encoding method used when sending text messages, on whether to use standard GSM7 or national custom decoding. The user can adjust the result after obtaining it; the module records the user's adjustment and uses it to determine which decoding method to use for the next decoding.
- the 8-bit coded national identification module is used to judge which country's 8-bit code table is used.
- the module uses intelligent judgment and memory. The first time, according to the local language setting (such as Locale under Linux) and the encoding format used when sending text messages, it comprehensively determines which country's 8-bit code table should be used for decoding. The user can adjust the result after obtaining it, and the module records the user's adjustment to determine which decoding method to use for the next decoding.
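The "judgment and memory" behaviour shared by the two identification modules can be sketched as a small class that guesses from the locale and the last sending format, then prefers any adjustment the user has made. Everything here is hypothetical: the class name `CodeTableMemory`, the table labels, and the priority order (remembered adjustment, then last sending format, then locale default) are assumptions, because the text does not specify how the factors are weighed.

```python
from typing import Dict, Optional


class CodeTableMemory:
    """Guess a code table from the locale and remember user corrections."""

    def __init__(self, locale_defaults: Dict[str, str]) -> None:
        self._defaults = locale_defaults
        self._remembered: Dict[str, str] = {}

    def guess(self, locale: str, last_sent: Optional[str] = None) -> str:
        if locale in self._remembered:       # a user adjustment wins
            return self._remembered[locale]
        if last_sent is not None:            # then the format used when sending
            return last_sent
        return self._defaults.get(locale, "standard-gsm7")

    def record_adjustment(self, locale: str, table: str) -> None:
        """Store the user's correction for use on the next decoding."""
        self._remembered[locale] = table


memory = CodeTableMemory({"el_GR": "greek-custom-gsm7"})
print(memory.guess("el_GR"))
memory.record_adjustment("el_GR", "standard-gsm7")
print(memory.guess("el_GR"))
```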
- the configuration and support module is configured to define a short message data structure, receive the short message content, fill in the short message data structure, and interact with the short message receiving/transmitting system. The short message data structure corresponds to the 3GPP short message format protocol.
- this short message data structure is implemented so that it can accommodate an entire cascaded short message.
- multi-language short message encoding and decoding can be simultaneously supported.
- the system supports all currently known encoding formats, and can automatically match the encoding format of the current national language according to user input and received short message content, and call the corresponding code table and codec.
- the system has good scalability and can quickly support the expansion of language codes that are not currently supported. For customers who frequently go abroad, using the terminal products of the above embodiments, it is not necessary to replace and upgrade the software, and the local language text messages can be correctly received and transmitted all over the world.
- the same set of short message modules can be used for terminal products shipped to different countries, which reduces the difficulty of software customization, especially language coding module customization, and can also reduce the probability of errors and help ensure both the quality of software products and their schedule milestones.
- the above modules or steps of the present invention can be implemented by a general-purpose computing device; they can be concentrated on a single computing device or distributed over a network composed of multiple computing devices. Alternatively, they may be implemented by program code executable by the computing device, so that they may be stored in a storage device and executed by the computing device; in some cases, the steps may be performed in an order different from the order described herein.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Mobile Radio Communication Systems (AREA)
Abstract
The present invention discloses a method, an apparatus, and a terminal for encoding/decoding a short message. The method includes: setting the short message to the Universal Multiple-Octet Coded Character Set (UCS2) format (S102); performing encoding format identification on each character of the short message in UCS2 format (S104); and, if all characters in the short message can be identified by the same predetermined encoding format, encoding the short message using that predetermined encoding format (S106). The invention improves the usability of the terminal.
Description
Short message encoding and decoding method, apparatus, and terminal

TECHNICAL FIELD

The present invention relates to the field of communications, and in particular to a short message encoding and decoding method, apparatus, and terminal.

BACKGROUND OF THE INVENTION

Short messaging is a basic service that a mobile communication system provides to users; it lets mobile terminals transfer text or multimedia material to one another. Of these, the short message service that delivers text content is the most widely used. Text content passed between a mobile terminal and the short message center over the mobile communication network must be encoded and decoded. 3GPP (3rd Generation Partnership Project) specification 23.038 stipulates three formats for encoding short message text: Global System for Mobile Communication (GSM) 7-bit encoding, 8-bit encoding, and Universal Multiple-Octet Coded Character Set (UCS) 2 encoding.
GSM7 encoding uses 7 bits to represent a character, so it can represent at most 128 characters; it is used for languages with few characters, such as English. UCS2 encoding uses 16 bits per character; it is a form of Unicode, can represent at most 65536 characters, and is used for languages with many characters, such as Chinese. Some languages have slightly more characters than GSM7 can represent, so GSM7 cannot be used, yet UCS2 wastes space: according to 3GPP specification 23.040, a single transmission carries at most 140 bytes, so a message encoded in UCS2 carries less than half as many characters as one encoded in GSM 7-bit. In that case an ISO-defined 8-bit encoding can be used instead. 8-bit encoding is uncommon for short message text, however. More commonly, such countries define their own GSM7 code table to replace the default GSM7 code table of the specification and continue to use GSM 7-bit encoding; Greece is an example. In addition, 3GPP specification 23.038 defines a national language extension mechanism for GSM7, which handles GSM7 encoding for languages with slightly more characters, such as Turkish, Spanish, and Portuguese.

In the prior art, terminals that support short messages (for example, mobile phones, fixed wireless terminals, or data cards) support only two language encodings: English and the local language. This can cause problems. For example, when a terminal product is to be shipped to a particular country, the short message module of the software must be changed to support the local language encoding.
To support the local language encoding, the code table must be modified and the software recompiled, or the configuration must be modified. This makes software customization harder, increases the chance of errors, and requires more time for customization and testing. As another example, for users who frequently travel abroad and move between language environments, a terminal that supports only two languages cannot properly send and receive short messages in the local language while roaming.

SUMMARY OF THE INVENTION

The main object of the present invention is to provide a short message encoding and decoding scheme that solves at least the above problems.

According to one aspect of the present invention, a short message encoding method is provided, including: setting the short message to the Universal Multiple-Octet Coded Character Set UCS2 format; performing encoding format identification on each character of the short message in UCS2 format; and, when all characters in the short message can be identified by the same predetermined encoding format, encoding the short message using that predetermined encoding format.

Preferably, after the short message is encoded using the predetermined encoding format, the method further includes: determining, according to the predetermined encoding format, the maximum length of short message text that the format supports; and, when the short message exceeds the maximum length, splitting it into concatenated (cascaded) short messages.

Preferably, the predetermined encoding format is one of: GSM7 encoding, 8-bit encoding, or UCS2 encoding, where GSM7 encoding is one of: GSM7 standard encoding, national custom encoding, or 3GPP national language extension encoding.
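The optional length check and splitting step can be illustrated with a little arithmetic. The sketch below is not taken from the patent: it assumes the 140-byte payload cited in the background and, as a further assumption, a 6-byte concatenation header in every part of a split message. The function names are hypothetical. Under those assumptions it reproduces the 160/153 and 70/67 character limits quoted in step S505 of the detailed description; the 155/149 limits given there for national extension characters are not modeled.

```python
import math

# Payload of one short message in bytes (the 140-byte limit cited in the background).
PAYLOAD_BYTES = 140
# Assumption: each part of a concatenated message spends 6 bytes on a concatenation header.
CONCAT_HEADER_BYTES = 6


def single_limit(bits_per_char: int) -> int:
    """Code units that fit in one unsplit message."""
    return PAYLOAD_BYTES * 8 // bits_per_char


def part_limit(bits_per_char: int) -> int:
    """Code units that fit in each part of a concatenated message."""
    return (PAYLOAD_BYTES - CONCAT_HEADER_BYTES) * 8 // bits_per_char


def count_parts(num_units: int, bits_per_char: int) -> int:
    """Number of messages needed for num_units code units (7, 8 or 16 bits each)."""
    if num_units <= single_limit(bits_per_char):
        return 1
    return math.ceil(num_units / part_limit(bits_per_char))


if __name__ == "__main__":
    for name, bits in (("GSM7", 7), ("8-bit", 8), ("UCS2", 16)):
        print(name, single_limit(bits), part_limit(bits))
```

Lengths are counted in code units rather than user-visible characters, so a GSM7 character that needs an escape sequence counts as two units.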
Preferably, after encoding format identification is performed on each character in the short message, the short message is saved in a short message text buffer array; the short message is then read from the buffer array and encoded using the predetermined encoding format.

According to another aspect of the present invention, a short message decoding method is provided, including: receiving a short message and determining its encoding format; and decoding the short message using the decoding format corresponding to that encoding format.

Preferably, determining the encoding format of the short message includes: obtaining the information carried in the short message that indicates the encoding format; and determining the encoding format according to that information.

Preferably, determining the encoding format of the short message includes: performing encoding format identification on each character in the short message; and, when all characters in the short message can be identified by the same predetermined encoding format, determining that predetermined encoding format to be the encoding format of the short message.

Preferably, the predetermined encoding format is one of: GSM7 encoding, 8-bit encoding, or UCS2 encoding, where GSM7 encoding is one of: GSM7 standard encoding, national custom encoding, or 3GPP national language extension encoding.

Preferably, when the encoding format of the short message is determined to be 8-bit encoding or national custom encoding, after the short message is decoded using the corresponding decoding format, the method further includes: recording the national decoding format corresponding to the 8-bit encoding or national custom encoding used this time, and, the next time a short message encoded with 8-bit encoding or national custom encoding is received, decoding it using the recorded national decoding format.

According to another aspect of the present invention, a short message encoding apparatus is provided, including: an encoding mode identification module, configured to perform encoding format identification on each character of a short message set to the UCS2 format; and an encoding module, configured to encode the short message using a predetermined encoding format when all characters in the short message can be identified by that same format.

Preferably, the encoding module includes: a GSM7 standard encoding module, configured to encode the short message using the GSM7 standard encoding format; a national custom encoding module, configured to encode the short message using a national custom encoding format; a 3GPP national language extension encoding module, configured to encode the short message using the 3GPP national language extension encoding format; an 8-bit encoding module, configured to encode the short message using an 8-bit encoding format; and a UCS2 encoding module, configured to encode the short message using the UCS2 encoding format.

According to another aspect of the present invention, a short message decoding apparatus is provided, including: a decoding mode identification module, configured to determine the encoding format of a received short message; and a decoding module, configured to decode the short message using the decoding format corresponding to the encoding format.
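The two decoding-side ideas in this summary, reading the format from information carried in the message and remembering the national format that worked last time, can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: it assumes the carried information is the data coding scheme octet of 3GPP 23.038 and handles only its general data coding group; `Alphabet`, `alphabet_from_dcs`, and `NationalFormatMemory` are hypothetical names.

```python
from enum import Enum
from typing import Optional


class Alphabet(Enum):
    GSM7 = "gsm7"
    EIGHT_BIT = "8bit"
    UCS2 = "ucs2"


def alphabet_from_dcs(dcs: int) -> Optional[Alphabet]:
    """Read the character-set bits of a data coding scheme octet.

    Assumption: only the general data coding group (bits 7..6 == 00) is
    handled; any other group, and the reserved value, yields None.
    """
    if (dcs & 0xC0) != 0x00:
        return None
    charset_bits = (dcs >> 2) & 0x03
    mapping = {0: Alphabet.GSM7, 1: Alphabet.EIGHT_BIT, 2: Alphabet.UCS2}
    return mapping.get(charset_bits)


class NationalFormatMemory:
    """Remembers which national table decoded the last 8-bit or custom message."""

    def __init__(self) -> None:
        self._last: Optional[str] = None

    def record(self, table_name: str) -> None:
        self._last = table_name

    def suggest(self) -> Optional[str]:
        """Table to try first for the next 8-bit or custom message, if any."""
        return self._last
```

The octet alone cannot say which national 8-bit or custom table applies, which is why the method records the table that was used and tries it first next time.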
Preferably, the decoding module includes: a GSM7 standard decoding module, configured to decode the short message using the GSM7 standard encoding format; a national custom decoding module, configured to decode the short message using a national custom encoding format; a 3GPP national language extension decoding module, configured to decode the short message using the 3GPP national language extension encoding format; an 8-bit decoding module, configured to decode the short message using an 8-bit encoding format; and a UCS2 decoding module, configured to decode the short message using the UCS2 encoding format.

According to another aspect of the present invention, a terminal is provided, which includes the above short message encoding apparatus and/or the above short message decoding apparatus.

With the present invention, the short message is set to the UCS2 format, encoding format identification is performed on each character in the short message, and, when all characters can be identified by the same predetermined encoding format, the short message is encoded using that format. This solves the problems that can arise in the prior art because terminals supporting short messages support only English and local-language encodings, and thereby improves the usability of the terminal.

BRIEF DESCRIPTION OF THE DRAWINGS

The drawings described here provide a further understanding of the present invention and form part of this application. The illustrative embodiments of the invention and their descriptions explain the invention and do not unduly limit it. In the drawings:

FIG. 1 is a flowchart of a short message encoding method according to an embodiment of the present invention;
FIG. 2 is a flowchart of a short message decoding method according to an embodiment of the present invention;
FIG. 3 is a structural block diagram of a short message encoding apparatus according to an embodiment of the present invention;
FIG. 4 is a structural block diagram of a short message decoding apparatus according to an embodiment of the present invention;
FIG. 5 is a flowchart of a preferred short message encoding method according to an embodiment of the present invention;
FIG. 6 is a flowchart of converting UCS2 encoding to GSM7 encoding according to an embodiment of the present invention;
FIG. 7 is a flowchart of converting UCS2 encoding to 8-bit encoding according to an embodiment of the present invention;
FIG. 8 is a flowchart of a preferred short message decoding method according to an embodiment of the present invention;
FIG. 9 is a flowchart of extending support for a new language according to an embodiment of the present invention;
FIG. 10 is a structural block diagram of a codec system according to an embodiment of the present invention.

DETAILED DESCRIPTION OF THE EMBODIMENTS

The present invention is described in detail below with reference to the drawings and in conjunction with embodiments. Note that, where no conflict arises, the embodiments in this application and the features in the embodiments may be combined with one another.

In the following embodiments, the encoding formats mentioned include GSM7 encoding, 8-bit encoding, and UCS2 encoding, where GSM7 encoding includes GSM7 standard encoding, national custom encoding, and 3GPP national language extension encoding. If encoding formats are added or changed relative to these, the following embodiments still apply; only the corresponding codec modules need to be added or modified.

FIG. 1 is a flowchart of a short message encoding method according to an embodiment of the present invention. As shown in FIG. 1, the process includes the following steps:

Step S102: set the short message to the UCS2 format.
Step S104: perform encoding format identification on each character of the short message in UCS2 format.
Step S106: when all characters in the short message can be identified by the same predetermined encoding format, encode the short message using that predetermined encoding format.
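Steps S102 to S106 can be sketched as a loop over candidate formats. This is a minimal illustration, not the patent's implementation. The character sets are deliberately small subsets (the real GSM7 default table has 128 entries), the "national" set is hypothetical, and trying the most compact format first is an assumption of this sketch rather than something the steps specify.

```python
import string
from typing import Callable, List, Tuple

# Illustrative subset only, not the full GSM7 default table.
GSM7_SUBSET = set(string.ascii_letters + string.digits + " .,!?")
# Hypothetical national table: the subset above plus a few Turkish letters.
NATIONAL_SUBSET = GSM7_SUBSET | set("ğşıİ")

# Candidate formats, ordered from most compact to least compact (an assumption).
CANDIDATES: List[Tuple[str, Callable[[str], bool]]] = [
    ("GSM7 standard", lambda ch: ch in GSM7_SUBSET),
    ("GSM7 national", lambda ch: ch in NATIONAL_SUBSET),
    ("UCS2", lambda ch: ord(ch) <= 0xFFFF),
]


def choose_format(text: str) -> str:
    """Return the first format that identifies every character of the text."""
    for name, recognizes in CANDIDATES:
        if all(recognizes(ch) for ch in text):
            return name
    raise ValueError("no candidate format covers every character")


if __name__ == "__main__":
    for sample in ("Hello", "ışık", "你好"):
        print(sample, "->", choose_format(sample))
```

A single character outside a candidate's set rules that candidate out for the whole message, which mirrors the "all characters identified by the same format" condition of step S106.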
The predetermined encoding format is one of: GSM7 encoding, 8-bit encoding, or UCS2 encoding, where GSM7 encoding is one of: GSM7 standard encoding, national custom encoding, or 3GPP national language extension encoding.

In the above steps, the characters in the short message are identified one by one to determine an encoding format that can encode all of them, and that format is then used for encoding. With this automatic identification, the short message can be encoded in a supported format and then sent, which solves the problems that arise in the prior art because terminals supporting short messages support only English and local-language encodings. For example, it overcomes the problem that short message encoding mechanisms differ between countries, so that one set of software cannot easily fit all countries and the short-message-related parts must be modified and configured every time software is released. Short message encodings for the various national languages are adapted to automatically.

Preferably, in implementation, after the encoding format to be used for the short message is determined, the maximum length of short message text supported by that format can be determined according to the format. If the encoded short message exceeds the maximum length, the user may be prompted, or the short message may be split into concatenated (cascaded) short messages.

Preferably, in implementation, an array may be used for buffering, although other buffering methods may also be used. An array is used as the example below.
After encoding format identification is performed on each character in the short message, the short message is saved in a short message text buffer array; the short message is then read from the buffer array and encoded using the predetermined encoding format.

Preferably, after the short message is encoded, the encoded short message may carry information indicating its encoding mode, so that the receiver can decode it.

FIG. 2 is a flowchart of a short message decoding method according to an embodiment of the present invention. As shown in FIG. 2, the process includes the following steps:

Step S202: receive a short message and determine its encoding format.
Step S204: decode the short message using the decoding format corresponding to the encoding format.

Through the above steps, the encoding format of the short message can be identified so that the short message is decoded correctly.

In implementation, if the short message carries information indicating its encoding format, that information is obtained and the encoding format is determined from it. The sender may also not send information indicating the encoding format. In that case, encoding format identification can be performed on each character in the short message, and, when all characters can be identified by the same predetermined encoding format, that format is determined to be the encoding format of the short message.

Preferably, in implementation, when the encoding format of the short message is determined to be 8-bit encoding or national custom encoding, after the short message is decoded using the corresponding decoding format, the national decoding format corresponding to the 8-bit encoding or national custom encoding used this time may be recorded. The next time a short message encoded with 8-bit encoding or national custom encoding is received, it is decoded using the recorded national decoding format.

FIG. 3 is a structural block diagram of a short message encoding apparatus according to an embodiment of the present invention. As shown in FIG. 3, the apparatus includes an encoding mode identification module 32 and an encoding module 34, which are described below.

The encoding mode identification module 32 (or simply the encoding identification module) is configured to perform encoding format identification on each character of a short message set to the UCS2 format. The encoding module 34 is connected to the encoding mode identification module 32 and is configured to encode the short message using a predetermined encoding format when all characters in the short message can be identified by that same format.

In implementation, the encoding module 34 may include: a GSM7 standard encoding module, configured to encode the short message using the GSM7 standard encoding format; a national custom encoding module, configured to encode the short message using a national custom encoding format; a 3GPP national language extension encoding module, configured to encode the short message using the 3GPP national language extension encoding format; an 8-bit encoding module, configured to encode the short message using an 8-bit encoding format; and a UCS2 encoding module, configured to encode the short message using the UCS2 encoding format.

FIG. 4 is a structural block diagram of a short message decoding apparatus according to an embodiment of the present invention. As shown in FIG. 4, the apparatus includes a decoding mode identification module 42 and a decoding module 44, which are described below.
The decoding mode identification module 42 (or simply the decoding identification module) is configured to determine the encoding format of the received short message. The decoding module 44 is connected to the decoding mode identification module 42 and is configured to decode the short message using the decoding format corresponding to the encoding format.

In implementation, the decoding module 44 may include: a GSM7 standard decoding module, configured to decode the short message using the GSM7 standard encoding format; a national custom decoding module, configured to decode the short message using a national custom encoding format; a 3GPP national language extension decoding module, configured to decode the short message using the 3GPP national language extension encoding format; an 8-bit decoding module, configured to decode the short message using an 8-bit encoding format; and a UCS2 decoding module, configured to decode the short message using the UCS2 encoding format.

Note that codec modules may be added, modified, or removed as actually needed.

FIG. 5 is a flowchart of a preferred short message encoding method according to an embodiment of the present invention. As shown in FIG. 5, the process includes the following steps:

Step S501: set the encoding of the short message input text box to UCS2.
Step S502: the user enters the text to be sent in the short message input text box. To show the user how many characters have been entered and how many short messages will be sent, the encoding identification module is called after each character is entered to determine the target encoding format.
Step S503: after the user clicks the "Send" button, a short message text buffer array is created with a length of 160 Int-type elements, because the maximum length of a single short message is 160 characters. The characters in the text box are obtained and copied into the text buffer array.
Step S504: construct a short message data structure for process control.
Step S505: scan the text buffer array to determine whether the short message is a concatenated (cascaded) short message and how many parts it has. The criteria are as follows. If all characters are GSM7 default code characters or GSM7 extension code characters, a message of more than 160 characters is a concatenated short message, and each part holds 153 characters. If any character is a UCS2 code character, a message of more than 70 characters is a concatenated short message, and each part holds 67 characters. If no character is a UCS2 code character but national extension code characters are present, a message of more than 155 characters is a concatenated short message, and each part holds 149 characters. These two values (whether the message is concatenated, and the number of parts) are written into the short message data structure.
Step S506: if the short message encoding type is 7-bit, convert as shown in FIG. 6: the UCS2 code of each character in the text buffer is converted to a GSM7 code.
Step S507: if the short message encoding type is UCS2, no conversion is needed; the UCS2 encoding is kept.
Step S508: if the short message encoding type is 8-bit, convert as shown in FIG. 7: the UCS2 code of each character in the text buffer is converted to an 8-bit code.
Step S509: split the concatenated short message: divide the message in the text buffer according to the maximum length that a single message can carry, and pass the parts in order to the short message sending system.

FIG. 6 is a flowchart of converting UCS2 encoding to GSM7 encoding according to an embodiment of the present invention. As shown in FIG. 6, the process includes the following steps:

Step S601: read a UCS2 character from the text buffer.
Step S602: the identification performed in step S502 indicates which GSM 7-bit encoding this UCS2 character is to be converted to.
Step S603: if the character is to be converted to the GSM7 default code, the GSM7 default encoding/decoding module is called.
Step S604: if the character is to be converted to a GSM7 extension code, the GSM7 extension encoding/decoding module is called.
Step S605: if the character is to be converted to a national extension code, the 3GPP national language extension encoding/decoding module is called. The code consists of two code values: the first is the fixed escape value 0x1B, and the second is looked up by one of two methods, Single Shift or Lock Shift.
Step S606: if the character is to be converted to a national custom GSM7 code, the national custom encoding/decoding module is called.
Step S607: repeat steps S601 to S606 until all characters in the text buffer have been converted.

FIG. 7 is a flowchart of converting UCS2 encoding to 8-bit encoding according to an embodiment of the present invention. As shown in FIG. 7, the process includes the following steps:

Step S701: read a UCS2 character from the text buffer.
Step S702: from the national language files, read the encoding tables of the countries currently defined as using 8-bit characters. The logical data structure is a hash table.
Step S703: look up the UCS2 character from step S701 in the encoding hash table; if it is found, replace the original UCS2 value with the corresponding hash table value (the 8-bit code).
Step S704: return to step S701 until all characters in the text buffer have been converted.

FIG. 8 is a flowchart of a preferred short message decoding method according to an embodiment of the present invention. As shown in FIG. 8, the process includes the following steps:

Step S801: the short message PDU string to be decoded is passed into the system.
Step S802: the decoding identification module is called to determine the target decoding format.
Step S803: if the format is 8-bit encoding, the 8-bit encoding country identification module is called to further identify which country's encoding it is.
Step S804: if the format is national custom encoding or standard GSM7 encoding, the GSM7 standard and national custom extension identification module is called for further identification.
Step S805: construct the short message data structure.
Step S806: if standard GSM7 decoding is used, the standard GSM7 codec module is called.
Step S807: if 8-bit decoding is used, the 8-bit codec module is called.
Step S808: if UCS2 encoding is used, the UCS2 decoding module is called and the encoding is left unchanged.
Step S809: if 3GPP national language extension encoding is used, the 3GPP national language extension codec module is called.
Step S810: if national custom GSM7 encoding is used, the national custom codec module is called to decode.
Step S811: the decoded text is placed in the short message data structure; for a concatenated short message, the parts are joined.

FIG. 9 is a flowchart of extending support for a new language according to an embodiment of the present invention. As shown in FIG. 9, the process includes the following steps:

Step S901: obtain information on whether the national language uses 8-bit encoding, 3GPP national language extension encoding, or national custom encoding.
Step S902, if an 8-bit encoding is used, an 8-bit encoding/decoding table file of the country is created and placed in an 8-bit code table folder for the 8-bit encoding/decoding module to call. Make an 8-bit code judgment array file, put it into the judgment array folder, and call it for the national language recognition module. Step S903, if the national custom code is used, the custom GSM7 code/decode table file of the country is created and placed in the national custom GSM7 code table folder. Make a national custom code judgment array file, put it into the judgment array folder, and call it for the national language recognition module. Step S904, if the 3GPP national language extension is used, the Single Shift encoding/decoding table and the Lock Shift encoding/decoding table of the 3GPP national language extension of the country are created and placed in the 3GPP national language extension code table folder. Make a 3GPP national language extension judgment array, put it into the judgment array folder, and call it for the national language recognition module. In another embodiment, a terminal is provided, including: the foregoing short message encoding device and/or the short message decoding device. A preferred terminal embodiment will now be described. The terminal includes: a standard GSM7 codec module, a UCS2 codec module, which can be implemented using existing hardware or software modules. The terminal further includes: a 3GPP national language extension codec module, and an 8-bit codec module. In this embodiment, the two modules can automatically load a new code table to implement support for the new language. In the following description, the short message encoding is divided into two processes: encoding mode identification, encoding. The SMS decoding is divided into two processes: decoding mode identification, decoding. 
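The folder-based extension mechanism of steps S902 to S904 can be illustrated with a minimal Python sketch. The patent does not specify a file format, so the layout used here (one JSON file per country, mapping a character to an integer code) and the names `load_code_tables` and `demo` are assumptions made purely for illustration.

```python
import json
import tempfile
from pathlib import Path


def load_code_tables(folder):
    """Load every code table file found in `folder`.

    Hypothetical layout (not from the patent): one JSON file per country,
    named <country>.json, mapping a single character to its integer code.
    Returns {country: {character: code}}.
    """
    tables = {}
    for path in sorted(Path(folder).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            raw = json.load(handle)
        tables[path.stem] = {char: int(code) for char, code in raw.items()}
    return tables


def demo():
    """Add a language by dropping a new file into the folder and reloading."""
    with tempfile.TemporaryDirectory() as folder:
        before = load_code_tables(folder)
        # Illustrative one-entry table: 0xCF is the ISO-8859-5 code of "Я".
        Path(folder, "ru.json").write_text(
            json.dumps({"Я": 0xCF}), encoding="utf-8"
        )
        after = load_code_tables(folder)
    return before, after
```

Because the tables are discovered at run time, supporting a new language in this sketch only requires adding a file; no code changes or recompilation are involved, which is the property the embodiment relies on.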
In this embodiment, the modules operate as follows during short message encoding:
The short message text to be encoded is passed into the system in UCS2 (Unicode) format.
The encoding mode identification module reads in the judgment arrays and identifies the target encoding format of the text. The identification is done in a single traversal: the target encoding format of the whole text is taken to be a given format only when every character in the short message text is recognized as belonging to that format.
If the encoding mode identification module determines that standard GSM7 encoding is required, the standard GSM7 codec module is called; if it determines that 3GPP national language extension encoding is required, the 3GPP national language extension codec module is called; if it determines that national custom GSM 7-bit encoding is required, the national custom codec module is called; if it determines that 8-bit encoding is required, the 8-bit codec module is called; if it determines that UCS2 encoding is required, the UCS2 codec module is called.
After encoding is complete, the short message PDU string is generated and output to the short message sending software.

In this embodiment, the modules operate as follows during short message decoding:
The PDU string to be decoded is passed into the system as input.
The decoding mode identification module makes a preliminary determination of the target decoding format of the PDU string from the DCS field of the PDU string and the information elements of the short message header. If it identifies UCS2 code, the UCS2 codec module is called to decode. If it identifies 8-bit code, the 8-bit encoding country identification module is called next; this module has experience memory and judgment functions and identifies which country's 8-bit encoding is in use, after which the 8-bit codec module is called to decode. If it identifies 3GPP national language extension GSM7 code, the 3GPP national language extension codec module is called. If the decoding mode identification module finds no format indication in the short message header information, the national custom extension identification module is called next; this module also has experience memory and judgment functions and identifies whether standard GSM7 code or national custom extension GSM7 code should be used. If the national custom extension identification module determines that standard GSM7 code is required, the standard GSM7 codec module is called to decode. If it determines that national custom extension GSM7 code is required, the national custom codec module is called to decode.
In addition, a configuration and support module is available to the encoding and decoding processes. It manages the short message data structure, which carries the text before encoding and the code after encoding, and records management data such as the short message header and concatenation information. The module also manages the passing of the short message data structure between modules, as well as the input and output of the whole system.
This embodiment has the following technical effects:
1. Multi-language short message encoding is adapted automatically, without recompiling the software or modifying any configuration item.
2. The system is highly extensible: support for other national languages can be added quickly without recompiling the program.
3. An intelligent judgment and memory mechanism is proposed for distinguishing among 8-bit encodings, and between national custom encoding and standard GSM 7-bit encoding.
4. Software customization time for shipments to different countries is saved, software reliability is improved, and the end-user experience is improved.

In another embodiment, the above modules may also exist as a system. FIG. 10 is a structural block diagram of a codec system according to an embodiment of the present invention, and the codec system is described below with reference to FIG. 10. As shown in FIG. 10, the system consists of an encoding mode identification module, a decoding mode identification module, a GSM7 standard and national custom extension identification module, an 8-bit encoding country identification module, a standard GSM7 codec module, an 8-bit codec module, a UCS2 codec module, a 3GPP national language extension codec module, a national custom codec module, and a configuration and support module.
The encoding identification module identifies the target encoding format of the short message from the input characters. For example, it reads in, in order, the standard GSM7 default character array, the standard GSM7 extension character array, the character arrays of each country's 3GPP national language extension, the 8-bit national encoding character arrays, and the national custom character arrays. When all the characters are found to fall within one character array, the target encoding format is confirmed; if any character cannot be found in the character arrays, the target encoding format is judged to be UCS2.
The standard GSM7 codec module converts between standard GSM7 code and UCS2 code.
The national custom codec module converts between national custom GSM7 code and UCS2 code. This module has more than one code table: different countries have different code tables, and newly added code tables can be loaded dynamically.
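The identification rule of the encoding identification module (every character must fall within the same character array, otherwise UCS2 is chosen) can be sketched in Python as follows. The format names and the tiny character sets are hypothetical stand-ins for the judgment arrays, which in the embodiment would be loaded from files; this is a sketch under those assumptions, not the patent's implementation.

```python
# Tiny illustrative subsets (assumptions); real judgment arrays would be
# loaded from the judgment array folder.
BASIC = set("ABCabc 0123456789.,!?")
CANDIDATE_FORMATS = [
    ("GSM7_DEFAULT", BASIC),
    ("GSM7_EXTENSION", BASIC | set("[]{}^~|\\€")),
    ("8BIT_RU", BASIC | set("ПриветЯ")),
]


def identify_target_format(text, candidates=CANDIDATE_FORMATS):
    """Return the first candidate format that covers every character.

    The text is traversed once; a candidate is dropped as soon as one
    character falls outside its character set. If no candidate survives,
    the text must be sent as UCS2.
    """
    charsets = dict(candidates)
    alive = [name for name, _ in candidates]
    for char in text:
        alive = [name for name in alive if char in charsets[name]]
        if not alive:
            return "UCS2"
    return alive[0]
```

Candidates are listed in the same order in which the module is described as reading its arrays, so a text that fits several formats is assigned the earliest one.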
The UCS2 codec module checks UCS2-encoded characters and passes them through.
The 8-bit encoding/decoding module converts between 8-bit code and UCS2 code. Under ISO-8859, the 8-bit code table is not unique: countries that use alphabetic scripts usually have their own 8-bit code table. This module can dynamically load newly added code tables.
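The hash-table conversion of FIG. 7 (steps S701 to S704) can be sketched in Python as follows. Python's built-in `iso8859_5` codec is used here only as a convenient stand-in for a country's code table file, and raising an error for a character missing from the table is an assumption, since the text only describes the case where the character is found.

```python
def build_encoding_table(codec_name):
    """Build the {character: 8-bit code} hash table for one code page."""
    table = {}
    for code in range(256):
        try:
            char = bytes([code]).decode(codec_name)
        except UnicodeDecodeError:
            continue  # this code value is undefined in the code page
        table[char] = code
    return table


def ucs2_to_8bit(text, table):
    """Replace each character by its 8-bit code via a hash table lookup."""
    out = bytearray()
    for char in text:               # S701: read one character
        code = table.get(char)      # S703: look it up in the hash table
        if code is None:
            # Assumption: a missing character is treated as an error here.
            raise ValueError(f"{char!r} is not in this 8-bit code table")
        out.append(code)
    return bytes(out)               # S704: loop ends when the text is exhausted
```

For example, `ucs2_to_8bit("Привет", build_encoding_table("iso8859_5"))` yields the same bytes as `"Привет".encode("iso8859_5")`.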
The 3GPP national language extension codec module converts between 3GPP national language extension code and UCS2 code. This module also has more than one code table: different countries have different code tables, and newly added code tables can be loaded dynamically.
The decoding identification module determines the target decoding format from the input PDU string to be decoded. Based on the DCS field and the short message header information of the PDU string, and following the definitions in 3GPP protocol 23.040, it determines whether the target encoding format is GSM7 encoding, 8-bit encoding, UCS2 encoding, or 3GPP national extension encoding, and then enters the corresponding decoding module. This module is not responsible for determining which country's 8-bit code table is used, or whether standard GSM7 or a national custom extension applies; those decisions are left to the following two modules.
The GSM7 standard and national custom extension identification module decides between standard GSM7 and a national custom extension. The module has intelligent judgment and memory functions. The first time, it combines the device's language setting (for example, the locale under the Linux operating system) with the encoding mode used when sending short messages to judge whether to use standard GSM7 or national custom decoding. The user can adjust the result after seeing it; the module records the user's adjustment and uses it to determine which decoding mode to apply next time.
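The first-pass decision made by the decoding identification module from the DCS field and the short message header can be sketched in Python as follows. The sketch handles only the general data coding group of the DCS octet (bits 7..6 equal to 00), where bits 3..2 select the character set, and it treats the national language single shift and locking shift header elements (identifiers 0x24 and 0x25) as the marker for the 3GPP national language extension. The function name and the returned labels are illustrative assumptions.

```python
NATIONAL_SINGLE_SHIFT_IEI = 0x24   # header element: national language single shift
NATIONAL_LOCKING_SHIFT_IEI = 0x25  # header element: national language locking shift


def identify_decoding_format(dcs, header_ieis=()):
    """First-pass format decision from the DCS octet and header element ids.

    Only the general data coding group (bits 7..6 == 00) is sketched.
    """
    if (dcs & 0xC0) != 0x00:
        raise ValueError("only the general data coding group is sketched")
    charset = (dcs >> 2) & 0x03
    if charset == 0x02:
        return "UCS2"
    if charset == 0x01:
        return "8BIT"  # which country's table is decided by a later module
    if charset == 0x00:
        if (NATIONAL_SINGLE_SHIFT_IEI in header_ieis
                or NATIONAL_LOCKING_SHIFT_IEI in header_ieis):
            return "3GPP_NATIONAL_EXTENSION"
        return "GSM7"  # standard vs. national custom is decided later
    raise ValueError("reserved character set value")
```

As in the description, the function stops at the coarse format: choosing a specific country's 8-bit table, or choosing between standard GSM7 and a national custom extension, is left to the two identification modules that follow.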
The 8-bit encoding country identification module determines which country's 8-bit code table is used. The module uses intelligent judgment and memory. The first time, it combines the device's language setting (for example, the locale under Linux) with the encoding format used when sending short messages to judge which country's 8-bit code table should be used for decoding. The user can adjust the result after seeing it; the module records the user's adjustment and uses it to determine which decoding mode to apply next time.
The configuration and support module defines the short message data structure, receives short message content, fills in the short message data structure, and interacts with the short message receiving/sending system. The short message data structure it defines corresponds to the 3GPP short message format and also includes other data fields needed by the implementation. The data structure is implemented so that it can hold an entire concatenated short message.

In summary, the above embodiments support short message encoding and decoding for multiple national languages at the same time. The system supports all currently known encoding formats, automatically matches the encoding of the current national language based on user input and received short message content, and loads the corresponding code table and codec routine. The system is also highly extensible: support for language encodings that are not yet covered can be added quickly. For customers who frequently travel abroad, a terminal product using the above embodiments can correctly receive and send short messages in the local language anywhere in the world, without replacing or upgrading software. In addition, terminal products shipped to different countries can use the same set of short message modules, which reduces the difficulty of software customization, especially customization of the language encoding module, lowers the probability of errors, and helps ensure software product quality and schedule milestones.
Obviously, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented by general-purpose computing devices. They can be concentrated on a single computing device or distributed over a network of multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; in some cases, the steps shown or described can be performed in an order different from the one given here. Alternatively, they can each be fabricated as an individual integrated circuit module, or several of the modules or steps can be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above are only preferred embodiments of the present invention and are not intended to limit it; for those skilled in the art, the present invention may have various modifications and changes. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Claims
1. A short message encoding method, characterized by comprising:
setting a short message to the Universal Multiple-Octet Coded Character Set (UCS2) format;
performing encoding format identification on each character of the short message set to the UCS2 format; and
where all characters in the short message can be identified by the same predetermined encoding format, encoding the short message using the predetermined encoding format.
2. The method according to claim 1, characterized in that, after encoding the short message using the predetermined encoding format, the method further comprises:
determining, according to the predetermined encoding format, the maximum length of short message text supported by the predetermined encoding format; and
where the short message exceeds the maximum length, splitting the short message into concatenated short messages.
3. The method according to claim 1, characterized in that the predetermined encoding format is one of the following: Global System for Mobile Communications GSM7 encoding, 8-bit encoding, and UCS2 encoding, wherein the GSM7 encoding is one of the following: GSM7 standard encoding, national custom encoding, and 3rd Generation Partnership Project (3GPP) national language extension encoding.
4. The method according to any one of claims 1 to 3, characterized in that:
after encoding format identification is performed on each character in the short message, the short message is saved in a short message text buffer array; and
the short message is read from the short message text buffer array and encoded using the predetermined encoding format.
5. A short message decoding method, characterized by comprising:
receiving a short message and determining the encoding format of the short message; and
decoding the short message using a decoding format corresponding to the encoding format.
6. The method according to claim 5, characterized in that determining the encoding format of the short message comprises:
acquiring information carried in the short message that indicates the encoding format; and
determining the encoding format of the short message according to the information.
7. The method according to claim 5, characterized in that determining the encoding format of the short message comprises:
performing encoding format identification on each character in the short message; and
where all characters in the short message can be identified by the same predetermined encoding format, determining that the predetermined encoding format is the encoding format of the short message.
8. The method according to any one of claims 5 to 7, characterized in that the predetermined encoding format is one of the following: Global System for Mobile Communications GSM7 encoding, 8-bit encoding, and UCS2 encoding, wherein the GSM7 encoding is one of the following: GSM7 standard encoding, national custom encoding, and 3GPP national language extension encoding.
9. The method according to claim 8, characterized in that, where the encoding format of the short message is determined to be 8-bit encoding or national custom encoding, after decoding the short message using the decoding format corresponding to the encoding format, the method further comprises:
recording the national decoding format corresponding to the 8-bit encoding or national custom encoding used this time, and, the next time a short message encoded using 8-bit encoding or national custom encoding is received, decoding that short message using the recorded national decoding format.
10. A short message encoding device, characterized by comprising:
an encoding mode identification module, configured to perform encoding format identification on each character of a short message set to the UCS2 format; and
an encoding module, configured to encode the short message using a predetermined encoding format where all characters in the short message can be identified by the same predetermined encoding format.
11. The device according to claim 10, characterized in that the encoding module comprises:
a GSM7 standard encoding module, configured to encode the short message using the GSM7 standard encoding format;
a national custom encoding module, configured to encode the short message using a national custom encoding format;
a 3GPP national language extension encoding module, configured to encode the short message using the 3GPP national language extension encoding format;
an 8-bit encoding module, configured to encode the short message using an 8-bit encoding format; and
a UCS2 encoding module, configured to encode the short message using a UCS encoding module.
12. A short message decoding device, characterized by comprising:
a decoding mode identification module, configured to determine the encoding format of a received short message; and
a decoding module, configured to decode the short message using a decoding format corresponding to the encoding format.
13. The device according to claim 12, characterized in that the decoding module comprises:
a GSM7 standard decoding module, configured to decode the short message using the GSM7 standard encoding format;
a national custom decoding module, configured to decode the short message using a national custom encoding format;
a 3GPP national language extension decoding module, configured to decode the short message using the 3GPP national language extension encoding format;
an 8-bit decoding module, configured to decode the short message using an 8-bit encoding format; and
a UCS2 decoding module, configured to decode the short message using a UCS encoding module.
14. A terminal, characterized by comprising: the short message encoding device according to any one of claims 10 to 11 and/or the short message decoding device according to any one of claims 12 to 13.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2010102758186A CN101938719A (en) | 2010-09-03 | 2010-09-03 | Method for coding and decoding short messages (SMS), device and terminal |
CN201010275818.6 | 2010-09-03 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2012027932A1 (en) | 2012-03-08 |
Family
ID=43391804
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2010/079351 WO2012027932A1 (en) | 2010-09-03 | 2010-12-01 | Method, equipment and terminal for encoding/decoding short message |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN101938719A (en) |
WO (1) | WO2012027932A1 (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102158832A (en) * | 2011-04-22 | 2011-08-17 | 中兴通讯股份有限公司 | Method and device for self-adaptively coding short message |
CN103327459A (en) * | 2012-03-23 | 2013-09-25 | 宇龙计算机通信科技(深圳)有限公司 | Method and system for sharing short messages and multimedia messages, and mobile terminal |
CN102665184B (en) * | 2012-04-24 | 2014-12-10 | 中兴通讯股份有限公司 | Coding method and device for increasing short message utilization rate |
CN105472107A (en) * | 2014-08-26 | 2016-04-06 | 中兴通讯股份有限公司 | Terminal information processing method and device |
CN104994486B (en) * | 2015-06-23 | 2018-09-11 | 中国联合网络通信集团有限公司 | A kind of point-to-point note receiving/transmission method and system |
CN105634674A (en) * | 2016-01-12 | 2016-06-01 | 青岛海信移动通信技术股份有限公司 | Short message processing method and device |
CN106604246B (en) * | 2016-12-09 | 2020-08-11 | 惠州Tcl移动通信有限公司 | Method, system and mobile terminal for setting short message coding range based on country code |
CN108091354A (en) * | 2017-12-13 | 2018-05-29 | 深圳市沃特沃德股份有限公司 | Onboard system lyrics analysis method and device |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2004062167A2 (en) * | 2002-12-20 | 2004-07-22 | Motorola, Inc. | Apparatus and method for a coding scheme selection |
CN1816170A (en) * | 2005-11-08 | 2006-08-09 | 杭州华为三康技术有限公司 | Code-conversion method for shortmessage receiving and transmitting and network apparatus used thereof |
WO2007052264A2 (en) * | 2005-10-31 | 2007-05-10 | Myfont Ltd. | Sending and receiving text messages using a variety of fonts |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101047733B (en) * | 2006-06-16 | 2010-09-29 | 华为技术有限公司 | Short message processing method and device |
US20080013712A1 (en) * | 2006-07-11 | 2008-01-17 | Karsten Gopinath | Unified Communication Directory Service |
-
2010
- 2010-09-03 CN CN2010102758186A patent/CN101938719A/en active Pending
- 2010-12-01 WO PCT/CN2010/079351 patent/WO2012027932A1/en active Application Filing
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2004062167A2 (en) * | 2002-12-20 | 2004-07-22 | Motorola, Inc. | Apparatus and method for a coding scheme selection |
WO2007052264A2 (en) * | 2005-10-31 | 2007-05-10 | Myfont Ltd. | Sending and receiving text messages using a variety of fonts |
CN1816170A (en) * | 2005-11-08 | 2006-08-09 | 杭州华为三康技术有限公司 | Code-conversion method for shortmessage receiving and transmitting and network apparatus used thereof |
Also Published As
Publication number | Publication date |
---|---|
CN101938719A (en) | 2011-01-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2012027932A1 (en) | Method, equipment and terminal for encoding/decoding short message | |
US20170034149A1 (en) | Intelligent Communications Method, Terminal, and System | |
US20070005807A1 (en) | Delta code messaging | |
CN100425081C (en) | Code-conversion method for shortmessage receiving and transmitting and network apparatus used thereof | |
WO2010031329A1 (en) | Method, system and device for receiving and sending instant message | |
WO2013127108A1 (en) | Method and device for saving short message | |
US9294125B2 (en) | Leveraging language structure to dynamically compress a short message service (SMS) message | |
WO2019149006A1 (en) | Method and device for obtaining and providing access information of wireless access point, and medium | |
CN106817689B (en) | High-reliability data subscription and release method and system | |
WO2015117407A1 (en) | Processing method and device for terminal information | |
CN103688558A (en) | Interface between 3gpp networks and 3gpp2 networks for wap text messaging | |
WO2011022980A1 (en) | Method for encoding and decoding message content uniformly and integrated short message center system | |
WO2013182079A1 (en) | Short message transcoding method and device | |
CN101040541B (en) | Adaptive method for transmitting multimedia message between terminals | |
CN103843292B (en) | Networking component and mobile device | |
WO2011017927A1 (en) | Short message encoding method, device and system | |
EP2566292A1 (en) | Method, system and mobile terminal for configuring access point and application information | |
US7224990B2 (en) | Method for transferring a message in a predetermined sending time and related communication system thereof | |
WO2014101530A1 (en) | Method and device for transmitting messages | |
KR100956793B1 (en) | Method and Server for Transforming Message Format for Interworking between Different Message Standards | |
WO2017054496A1 (en) | Method and apparatus for sending and acquiring text information | |
CN105491544A (en) | Short message compression communication method and short message compression communication system | |
CN107734345B (en) | Picture datamation transmission method and system | |
CN112383888A (en) | Short message system, method and equipment | |
CN113490165B (en) | 4G module short message receiving and transmitting method for embedded system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 10856618; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 10856618; Country of ref document: EP; Kind code of ref document: A1 |