CN111339735B - Character string length calculating method and device and computer storage medium - Google Patents


Info

Publication number
CN111339735B
Authority
CN
China
Prior art keywords
code point
state
code
state machine
determining
Prior art date
Legal status
Active
Application number
CN202010152674.9A
Other languages
Chinese (zh)
Other versions
CN111339735A (en)
Inventor
张宇
Current Assignee
Guangzhou Cubesili Information Technology Co Ltd
Original Assignee
Guangzhou Cubesili Information Technology Co Ltd
Application filed by Guangzhou Cubesili Information Technology Co Ltd
Priority to CN202010152674.9A
Publication of CN111339735A
Application granted
Publication of CN111339735B
Legal status: Active

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Document Processing Apparatus (AREA)
  • Image Generation (AREA)

Abstract

The application discloses a character string length calculation method and device and a computer storage medium, belonging to the field of electronic technologies. The method comprises: acquiring all code points corresponding to a target character string; dividing all the code points to obtain one or more target code point sets; and determining the number of target code point sets as the length of the target character string. Each target code point set corresponds to one character in the target character string, and that character may be a pictogram or a non-pictogram. Determining the number of target code point sets as the length of the target character string therefore gives every pictogram and every non-pictogram a length of 1, which can improve the accuracy of character display.

Description

Character string length calculating method and device and computer storage medium
Technical Field
The present invention relates to the field of electronic technologies, and in particular, to a method and apparatus for calculating a character string length, and a computer storage medium.
Background
Emoji are visual emotion symbols used in text. They are encoded in Unicode and are typically rendered as icons on a computer device. In instant messaging software, the small yellow-face emoji are widely used to enrich chat content. Instant messaging software can transmit and display character strings containing emoji, mixing text and icons in the same layout.
Calculating the length of a character string is a common and frequently used operation. For example, some software specifies that a user's custom nickname cannot exceed 10 characters, that a password cannot be shorter than 6 characters, or that a message cannot exceed 140 characters. These application scenarios all involve calculating the string length. In the related art, when the length of a character string containing emoji is calculated, the number of Unicode code points constituting an emoji is generally taken as the character length of that emoji.
However, since one emoji may be composed of a plurality of Unicode code points, the character length of an emoji calculated in the related art may be greater than 1. Although the emoji is rendered as a single icon, the computer device judges the character length of that icon to be greater than 1, so the character length of the displayed content does not match the calculated character length, and display accuracy is low.
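The mismatch described above can be reproduced in a few lines of Python, whose built-in len() counts Unicode code points in the same way as the related art. This is only an illustration: the nickname, the limit constant and the variable names are hypothetical, and the limit of 10 is borrowed from the nickname example in the background.

```python
# Hypothetical nickname check that counts code points, as the related art does.
MAX_NICKNAME_LENGTH = 10  # limit borrowed from the nickname example in the text

flag = "\U0001F1E8\U0001F1F3"   # one flag icon made of two regional indicator code points
nickname = "Zhang" + flag * 3   # displays as 5 letters + 3 flag icons = 8 visible characters

code_point_count = len(nickname)            # Python's len() counts code points
accepted = code_point_count <= MAX_NICKNAME_LENGTH

print(code_point_count)  # 11
print(accepted)          # False, although only 8 characters are displayed
```

A length rule based on code points therefore rejects a nickname that a user perceives as 8 characters long.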
Disclosure of Invention
The application provides a character string length calculation method and device and a computer storage medium, which can solve the problem of low display accuracy in the related art. The technical solution is as follows:
In a first aspect, a method for calculating a length of a character string is provided, the method comprising:
acquiring all code points corresponding to the target character string;
dividing all the code points to obtain one or more target code point sets, wherein each target code point set comprises one or more code points, each target code point set corresponds to a character, the character is non-pictogram or pictogram, each target code point set corresponding to the non-pictogram comprises one code point, and each target code point set corresponding to the pictogram comprises one or more code points;
and determining the number of the target code point sets as the length of the target character string.
Optionally, all the code points include one or more of a digital code point, an expression code point, a country region code point, a modifier code point, and a connector code point.
Optionally, the number of all code points is n, n is a positive integer, and the dividing all code points to obtain one or more target code point sets includes:
acquiring an initial code point set, wherein the initial code point set comprises the j-th code point in all code points, and the initial value of j is 1;
executing a code point set dividing process on the initial code point set, wherein the code point set dividing process comprises the following steps:
reading the state machine state corresponding to the j+1th code point in all the code points;
determining whether the j+1th code point belongs to the initial code point set according to the state machine state corresponding to the j-th code point and the state machine state corresponding to the j+1th code point;
when the j+1th code point belongs to the initial code point set, adding the j+1th code point into the initial code point set to obtain an updated initial code point set,
if j+1<n and the j+1th code point is not the country region code point, making j=j+1, and executing the code point set dividing flow again on the updated initial code point set,
if j+1<n and the j+1th code point is a country region code point, taking the updated initial code point set as a target code point set, generating a new initial code point set, wherein the new initial code point set comprises the j+2th code point, making j=j+2, and executing the code point set dividing flow on the new initial code point set,
if j+1=n, taking the updated initial code point set as a target code point set;
when the j+1th code point does not belong to the initial code point set, taking the initial code point set as a target code point set, generating a new initial code point set, wherein the new initial code point set comprises the j+1th code point,
and if j+1<n, making j=j+1 and executing the code point set dividing flow on the new initial code point set, and if j+1=n, taking the new initial code point set as a target code point set.
Optionally, the reading the state machine state corresponding to the j+1th code point in all the code points includes:
if the state of the state machine corresponding to the jth code point is a default state, then:
when the j+1th code point is an expression code point, determining that the state machine state corresponding to the j+1th code point is a pictographic state,
when the j+1th code point is a digital code point, determining that the state machine state corresponding to the j+1th code point is a quasi-pictographic state,
when the j+1th code point is a country region code point, determining that the state machine state corresponding to the j+1th code point is a country region state,
when the j+1th code point is not any one of the expression code point, the digital code point and the country region code point, determining that the state machine state corresponding to the j+1th code point is a default state;
if the state machine state corresponding to the jth code point is a pictographic state or a modifier state, then:
when the j+1th code point is a modifier code point, determining the state machine state corresponding to the j+1th code point as a modifier state,
when the j+1th code point is a connector code point, determining the state machine state corresponding to the j+1th code point as a connector state,
when the j+1th code point is not any one of the modifier code point and the connector code point, reading the state machine state corresponding to the j+1th code point by taking the state machine state corresponding to the j-th code point as a default state;
if the state machine state corresponding to the j-th code point is a quasi-pictographic state, then:
when the j+1th code point is a modifier code point, determining the state machine state corresponding to the j+1th code point as a modifier state,
when the j+1th code point is not the modifier code point, reading the state machine state corresponding to the j+1th code point by taking the state machine state corresponding to the j-th code point as a default state;
if the state machine state corresponding to the j-th code point is a country region state, then:
when the j+1th code point is a country region code point, determining that the state machine state corresponding to the j+1th code point is a country region state,
when the j+1th code point is not the country region code point, reading the state machine state corresponding to the j+1th code point by taking the state machine state corresponding to the j-th code point as a default state;
if the state machine state corresponding to the j-th code point is a connector state, then:
when the j+1th code point is an expression code point, determining that the state machine state corresponding to the j+1th code point is a pictographic state,
when the j+1th code point is a digital code point, determining that the state machine state corresponding to the j+1th code point is a quasi-pictographic state,
and when the j+1th code point is not any one of the expression code point and the digital code point, taking the state machine state corresponding to the (j-1)th code point as a default state, reading the state machine state corresponding to the (j-1)th code point again, and then reading the state machine state corresponding to the j+1th code point.
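The state reading rules above can be viewed as a lookup table keyed by the current state and the type of the next code point. The Python sketch below is one possible encoding and not the patented implementation: the names Kind, State, ENTER_FROM_DEFAULT, TRANSITIONS and next_state are hypothetical, and the fallback from the connector state is simplified to a plain re-read from the default state.

```python
from enum import Enum

# Hypothetical names for the code point types and state machine states in the text.
Kind = Enum("Kind", "DIGITAL EXPRESSION COUNTRY_REGION MODIFIER CONNECTOR OTHER")
State = Enum("State", "DEFAULT PICTOGRAPHIC QUASI_PICTOGRAPHIC COUNTRY_REGION MODIFIER CONNECTOR")

# States reachable when the previous state is (or is treated as) the default state.
ENTER_FROM_DEFAULT = {
    Kind.EXPRESSION: State.PICTOGRAPHIC,
    Kind.DIGITAL: State.QUASI_PICTOGRAPHIC,
    Kind.COUNTRY_REGION: State.COUNTRY_REGION,
}

AFTER_PICTOGRAM = {Kind.MODIFIER: State.MODIFIER, Kind.CONNECTOR: State.CONNECTOR}

# Direct transitions listed for each non-default state.
TRANSITIONS = {
    State.PICTOGRAPHIC: AFTER_PICTOGRAM,
    State.MODIFIER: AFTER_PICTOGRAM,
    State.QUASI_PICTOGRAPHIC: {Kind.MODIFIER: State.MODIFIER},
    State.COUNTRY_REGION: {Kind.COUNTRY_REGION: State.COUNTRY_REGION},
    State.CONNECTOR: {
        Kind.EXPRESSION: State.PICTOGRAPHIC,
        Kind.DIGITAL: State.QUASI_PICTOGRAPHIC,
    },
}


def next_state(current: State, kind: Kind) -> State:
    """Return the state for the next code point, given the current state."""
    direct = TRANSITIONS.get(current, {}).get(kind)
    if direct is not None:
        return direct
    # No direct rule: treat the current state as the default state and read again.
    # Simplifying assumption: the connector state falls back the same way.
    return ENTER_FROM_DEFAULT.get(kind, State.DEFAULT)


print(next_state(State.PICTOGRAPHIC, Kind.MODIFIER).name)   # MODIFIER
print(next_state(State.QUASI_PICTOGRAPHIC, Kind.OTHER).name)  # DEFAULT
```

Keeping the rules in dictionaries means that a change to the code point types or transitions only touches the tables, not the lookup function.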
Optionally, the determining whether the j+1th code point belongs to the initial code point set according to the state machine state corresponding to the j-th code point and the state machine state corresponding to the j+1th code point includes:
if the state of the state machine corresponding to the jth code point is a default state, then:
when the j+1th code point is an expression code point, a digital code point or a country region code point, determining that the j+1th code point belongs to the initial code point set,
when the j+1th code point is not any one of an expression code point, a digital code point and a country region code point, determining that the j+1th code point is not in the initial code point set;
if the state machine state corresponding to the jth code point is a pictographic state or a modifier state, then:
when the j+1th code point is a modifier code point or a connector code point, determining that the j+1th code point belongs to the initial code point set,
when the j+1th code point is not any one of a modifier code point and a connector code point, determining that the j+1th code point does not belong to the initial code point set;
if the state machine state corresponding to the j-th code point is a quasi-pictographic state, then:
when the j+1th code point is a modifier code point, determining that the j+1th code point belongs to the initial code point set,
when the j+1th code point is not a modifier code point, determining that the j+1th code point is not in the initial code point set;
if the state machine state corresponding to the jth code point is a country region state, then:
when the j+1th code point is a country region code point, determining that the j+1th code point belongs to the initial code point set,
when the j+1th code point is not a country region code point, determining that the j+1th code point is not in the initial code point set;
if the state machine state corresponding to the j-th code point is a connector state, then:
when the j+1th code point is an expression code point or a digital code point, determining that the j+1th code point belongs to the initial code point set,
and when the j+1th code point is not any one of the expression code point and the digital code point, determining that the j+1th code point does not belong to the initial code point set.
Optionally, the state machine initial state is a default state, and the method further includes:
when the first code point in all the code points is the expression code point, determining the state machine state corresponding to the first code point as the pictographic state,
when the first code point is a digital code point, determining that the state machine state corresponding to the first code point is a quasi-pictographic state,
when the first code point is a country region code point, determining that the state machine state corresponding to the first code point is a country region state,
and when the first code point is not any one of the expression code point, the digital code point and the country region code point, determining that the state machine state corresponding to the first code point is a default state.
Optionally, after dividing all the code points to obtain one or more target code point sets, the method further includes:
and determining the character type corresponding to the target code point set according to the code points in the target code point set.
In a second aspect, there is provided a character string length calculation apparatus, the apparatus comprising:
the acquisition module is used for acquiring all code points corresponding to the target character string;
the dividing module is used for dividing all the code points to obtain one or more target code point sets, wherein each target code point set comprises one or more code points, each target code point set corresponds to a character, the character is non-pictogram or pictogram, each target code point set corresponding to the non-pictogram comprises one code point, and each target code point set corresponding to the pictogram comprises one or more code points;
and the first determining module is used for determining the number of the target code point sets as the length of the target character string.
Optionally, all the code points include one or more of a digital code point, an expression code point, a country region code point, a modifier code point, and a connector code point.
Optionally, the number of all code points is n, n is a positive integer, and the dividing module is configured to:
acquiring an initial code point set, wherein the initial code point set comprises the j-th code point in all code points, and the initial value of j is 1;
executing a code point set dividing process on the initial code point set, wherein the code point set dividing process comprises the following steps:
reading the state machine state corresponding to the j+1th code point in all the code points;
determining whether the j+1th code point belongs to the initial code point set according to the state machine state corresponding to the j-th code point and the state machine state corresponding to the j+1th code point;
when the j+1th code point belongs to the initial code point set, adding the j+1th code point into the initial code point set to obtain an updated initial code point set,
if j+1<n and the j+1th code point is not the country region code point, making j=j+1, and executing the code point set dividing flow again on the updated initial code point set,
if j+1<n and the j+1th code point is a country region code point, taking the updated initial code point set as a target code point set, generating a new initial code point set, wherein the new initial code point set comprises the j+2th code point, making j=j+2, and executing the code point set dividing flow on the new initial code point set,
if j+1=n, taking the updated initial code point set as a target code point set;
when the j+1th code point does not belong to the initial code point set, taking the initial code point set as a target code point set, generating a new initial code point set, wherein the new initial code point set comprises the j+1th code point,
and if j+1<n, making j=j+1 and executing the code point set dividing flow on the new initial code point set, and if j+1=n, taking the new initial code point set as a target code point set.
Optionally, the dividing module is configured to:
if the state of the state machine corresponding to the jth code point is a default state, then:
when the j+1th code point is an expression code point, determining that the state machine state corresponding to the j+1th code point is a pictographic state,
when the j+1th code point is a digital code point, determining that the state machine state corresponding to the j+1th code point is a quasi-pictographic state,
when the j+1th code point is a country region code point, determining that the state machine state corresponding to the j+1th code point is a country region state,
when the j+1th code point is not any one of the expression code point, the digital code point and the country region code point, determining that the state machine state corresponding to the j+1th code point is a default state;
if the state machine state corresponding to the jth code point is a pictographic state or a modifier state, then:
when the j+1th code point is a modifier code point, determining the state machine state corresponding to the j+1th code point as a modifier state,
when the j+1th code point is a connector code point, determining the state machine state corresponding to the j+1th code point as a connector state,
when the j+1th code point is not any one of the modifier code point and the connector code point, reading the state machine state corresponding to the j+1th code point by taking the state machine state corresponding to the j-th code point as a default state;
if the state machine state corresponding to the j-th code point is a quasi-pictographic state, then:
when the j+1th code point is a modifier code point, determining the state machine state corresponding to the j+1th code point as a modifier state,
when the j+1th code point is not the modifier code point, reading the state machine state corresponding to the j+1th code point by taking the state machine state corresponding to the j-th code point as a default state;
if the state machine state corresponding to the j-th code point is a country region state, then:
when the j+1th code point is a country region code point, determining that the state machine state corresponding to the j+1th code point is a country region state,
when the j+1th code point is not the country region code point, reading the state machine state corresponding to the j+1th code point by taking the state machine state corresponding to the j-th code point as a default state;
if the state machine state corresponding to the j-th code point is a connector state, then:
when the j+1th code point is an expression code point, determining that the state machine state corresponding to the j+1th code point is a pictographic state,
when the j+1th code point is a digital code point, determining that the state machine state corresponding to the j+1th code point is a quasi-pictographic state,
and when the j+1th code point is not any one of the expression code point and the digital code point, taking the state machine state corresponding to the (j-1)th code point as a default state, reading the state machine state corresponding to the (j-1)th code point again, and then reading the state machine state corresponding to the j+1th code point.
Optionally, the dividing module is configured to:
if the state of the state machine corresponding to the jth code point is a default state, then:
when the j+1th code point is an expression code point, a digital code point or a country region code point, determining that the j+1th code point belongs to the initial code point set,
when the j+1th code point is not any one of an expression code point, a digital code point and a country region code point, determining that the j+1th code point is not in the initial code point set;
if the state machine state corresponding to the jth code point is a pictographic state or a modifier state, then:
when the j+1th code point is a modifier code point or a connector code point, determining that the j+1th code point belongs to the initial code point set,
when the j+1th code point is not any one of a modifier code point and a connector code point, determining that the j+1th code point does not belong to the initial code point set;
if the state machine state corresponding to the j-th code point is a quasi-pictographic state, then:
when the j+1th code point is a modifier code point, determining that the j+1th code point belongs to the initial code point set,
when the j+1th code point is not a modifier code point, determining that the j+1th code point is not in the initial code point set;
if the state machine state corresponding to the jth code point is a country region state, then:
when the j+1th code point is a country region code point, determining that the j+1th code point belongs to the initial code point set,
when the j+1th code point is not a country region code point, determining that the j+1th code point is not in the initial code point set;
if the state machine state corresponding to the j-th code point is a connector state, then:
when the j+1th code point is an expression code point or a digital code point, determining that the j+1th code point belongs to the initial code point set,
and when the j+1th code point is not any one of the expression code point and the digital code point, determining that the j+1th code point does not belong to the initial code point set.
Optionally, the initial state of the state machine is a default state, and the apparatus further includes a second determining module, configured to:
when the first code point in all the code points is the expression code point, determining the state machine state corresponding to the first code point as the pictographic state,
when the first code point is a digital code point, determining that the state machine state corresponding to the first code point is a quasi-pictographic state,
when the first code point is a country region code point, determining that the state machine state corresponding to the first code point is a country region state,
and when the first code point is not any one of the expression code point, the digital code point and the country region code point, determining that the state machine state corresponding to the first code point is a default state.
Optionally, the apparatus further comprises:
and the third determining module is used for determining the character type corresponding to the target code point set according to the code points in the target code point set.
In a third aspect, there is provided a character string length calculation apparatus including: a processor and a memory.
The memory is used for storing a computer program, and the computer program comprises program instructions;
the processor is configured to invoke the computer program to implement the method for calculating a string length according to any one of the first aspect.
In a fourth aspect, there is provided a computer storage medium having instructions stored thereon which, when executed by a processor, implement the method of string length calculation according to any of the first aspects.
The beneficial effects of the technical solution provided by this application include:
All code points corresponding to the acquired target character string are divided to obtain one or more target code point sets, and the number of target code point sets is determined as the length of the target character string. Each target code point set corresponds to one character in the target character string, and that character may be a pictogram or a non-pictogram. Determining the number of target code point sets as the length of the target character string therefore gives every pictogram and every non-pictogram a length of 1, which can improve the accuracy of character display.
In addition, after the Unicode encoding specification is revised, calculating the string length with the method provided by this application only requires updating the code point types in the state machine and the state machine state corresponding to each code point in each case, which simplifies maintenance. The state machine can run on a variety of operating systems and therefore has a wide range of application.
Drawings
Fig. 1 is a flowchart of a method for calculating a character string length according to an embodiment of the present application;
Fig. 2 is a flowchart of another method for calculating a character string length according to an embodiment of the present application;
Fig. 3 is a schematic diagram of the state switching performed by the state machine according to the code points it reads, according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of a character string length calculating device according to an embodiment of the present application;
Fig. 5 is a schematic diagram of another character string length calculating device according to an embodiment of the present application;
Fig. 6 is a schematic diagram of a character string length calculating device according to an embodiment of the present application;
Fig. 7 is a block diagram of a character string length calculating device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
Fig. 1 is a flowchart of a method for calculating a string length according to an embodiment of the present application. The method may be applied to a computer device. As shown in fig. 1, the method includes:
Step 101, obtaining all code points corresponding to the target character string.
Step 102, dividing all code points to obtain one or more target code point sets.
Each target code point set comprises one or more code points and corresponds to one character, which is either a non-pictogram or a pictogram. The target code point set corresponding to each non-pictogram comprises one code point, and the target code point set corresponding to each pictogram comprises one or more code points.
Step 103, determining the number of the target code point sets as the length of the target character string.
In summary, according to the method for calculating the length of the character string provided in the embodiment of the present application, by dividing all code points corresponding to the obtained target character string, one or more target code point sets are obtained, and then the number of the target code point sets is determined as the length of the target character string. Because each target code point set corresponds to one character in the target character string, the character can be pictogram or non-pictogram, the number of the target code point sets is determined as the length of the target character string, that is, the lengths of pictogram and non-pictogram are both determined as 1, and the accuracy of character display can be improved.
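As an illustration of steps 101 to 103, the Python sketch below divides a string into code point sets with a small state machine and returns the number of sets. It is a sketch under stated assumptions rather than the implementation of the embodiment: all names are hypothetical, the code point ranges in classify are illustrative and incomplete, a code point joins the current set only when a non-default state lists a direct transition for it, and a code point read from the default state is assumed to open a new set so that every non-pictogram stays in a set of its own.

```python
from enum import Enum

Kind = Enum("Kind", "DIGITAL EXPRESSION COUNTRY_REGION MODIFIER CONNECTOR OTHER")
State = Enum("State", "DEFAULT PICTOGRAPHIC QUASI_PICTOGRAPHIC COUNTRY_REGION MODIFIER CONNECTOR")

# Assumed, deliberately incomplete code point ranges (not taken from the patent).
MODIFIERS = {0xFE0F, 0x20E3} | set(range(0x1F3FB, 0x1F400))
EXPRESSION_RANGES = [(0x2600, 0x27BF), (0x1F300, 0x1F6FF), (0x1F900, 0x1F9FF)]


def classify(cp: int) -> Kind:
    if 0x1F1E6 <= cp <= 0x1F1FF:          # regional indicator symbols
        return Kind.COUNTRY_REGION
    if cp == 0x200D:                      # zero width joiner
        return Kind.CONNECTOR
    if cp in MODIFIERS:
        return Kind.MODIFIER
    if chr(cp) in "0123456789#*":
        return Kind.DIGITAL
    if any(lo <= cp <= hi for lo, hi in EXPRESSION_RANGES):
        return Kind.EXPRESSION
    return Kind.OTHER


ENTER = {Kind.EXPRESSION: State.PICTOGRAPHIC,
         Kind.DIGITAL: State.QUASI_PICTOGRAPHIC,
         Kind.COUNTRY_REGION: State.COUNTRY_REGION}
AFTER_PICTOGRAM = {Kind.MODIFIER: State.MODIFIER, Kind.CONNECTOR: State.CONNECTOR}
EXTEND = {
    State.PICTOGRAPHIC: AFTER_PICTOGRAM,
    State.MODIFIER: AFTER_PICTOGRAM,
    State.QUASI_PICTOGRAPHIC: {Kind.MODIFIER: State.MODIFIER},
    State.COUNTRY_REGION: {Kind.COUNTRY_REGION: State.COUNTRY_REGION},
    State.CONNECTOR: {Kind.EXPRESSION: State.PICTOGRAPHIC,
                      Kind.DIGITAL: State.QUASI_PICTOGRAPHIC},
}


def split_into_code_point_sets(text: str) -> list[str]:
    sets: list[str] = []
    current, state = "", State.DEFAULT
    for ch in text:                        # Python iterates a str by code point
        kind = classify(ord(ch))
        joined = EXTEND.get(state, {}).get(kind)
        if current and joined is not None:
            current += ch                  # the code point belongs to the current set
            state = joined
            if kind is Kind.COUNTRY_REGION:
                sets.append(current)       # a flag is complete after two code points
                current, state = "", State.DEFAULT
        else:
            if current:
                sets.append(current)       # close the current set
            current, state = ch, ENTER.get(kind, State.DEFAULT)
    if current:
        sets.append(current)
    return sets


def string_length(text: str) -> int:
    return len(split_into_code_point_sets(text))


print(string_length("Zhang" + "\U0001F1E8\U0001F1F3" * 3))  # 8
```

With these assumptions, five letters followed by three flags yield eight sets, matching the eight characters that are displayed.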
Fig. 2 is a flowchart of another method for calculating a string length according to an embodiment of the present application. The method may be applied to a computer device. As shown in fig. 2, the method includes:
Step 201, obtaining a target character string.
The target character string in the embodiment of the application comprises one or more characters, which include non-pictograms and/or pictograms. Optionally, the target character string may be a character string entered in an input box, a character string entered through speech recognition, or any character string stored in a computer device; the embodiment of the present application does not limit how the target character string is obtained.
Step 202, obtaining all code points corresponding to the target character string.
A code point in the embodiment of the application refers to the code point corresponding to each character in Unicode. All code points corresponding to the target character string comprise one or more of digital code points, expression code points, country region code points, modifier code points and connector code points. Each non-pictogram in the target character string corresponds to one code point, and each pictogram corresponds to one or more code points.
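The five code point types can be told apart by the numeric value of the code point. The Python sketch below shows one possible classification; the type name CodePointType, the function classify and all ranges except the regional indicator block and the zero width joiner are assumptions made for illustration, and the expression ranges are far from a complete emoji list.

```python
from enum import Enum, auto


class CodePointType(Enum):
    DIGITAL = auto()
    EXPRESSION = auto()
    COUNTRY_REGION = auto()
    MODIFIER = auto()
    CONNECTOR = auto()
    OTHER = auto()          # any code point of a non-pictogram


ZERO_WIDTH_JOINER = 0x200D
# Assumed modifier set: variation selector-16, enclosing keycap, skin tone modifiers.
MODIFIERS = {0xFE0F, 0x20E3} | set(range(0x1F3FB, 0x1F400))
# Assumed, incomplete ranges of code points that are rendered as icons.
EXPRESSION_RANGES = [(0x2600, 0x27BF), (0x1F300, 0x1F6FF), (0x1F900, 0x1F9FF)]


def classify(code_point: int) -> CodePointType:
    """Map a Unicode code point to one of the types named in the text."""
    if 0x1F1E6 <= code_point <= 0x1F1FF:      # regional indicator symbols A to Z
        return CodePointType.COUNTRY_REGION
    if code_point == ZERO_WIDTH_JOINER:
        return CodePointType.CONNECTOR
    if code_point in MODIFIERS:               # checked before the expression ranges
        return CodePointType.MODIFIER
    if chr(code_point) in "0123456789#*":     # characters that can start a keycap
        return CodePointType.DIGITAL
    if any(lo <= code_point <= hi for lo, hi in EXPRESSION_RANGES):
        return CodePointType.EXPRESSION
    return CodePointType.OTHER


for ch in "A7\U0001F600\U0001F1E8\U0001F3FD\u200D":
    print(f"U+{ord(ch):04X}", classify(ord(ch)).name)
```

Because the classification lives in a handful of constants, a revised Unicode release would only require updating those constants.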
Illustratively, a pictogram corresponding to a plurality of code points may take one of the following forms:
First case: the pictogram is composed of country region code point + country region code point, i.e., two consecutive country region code points correspond to one pictogram. The pictogram is displayed as a national flag on the computer device. Different combinations of country region code points display the flags of different countries.
Second case: the pictogram is composed of an expression code point and one or more modifier code points, i.e., one expression code point followed by one or more modifier code points corresponds to one pictogram. Illustratively, the expression code point a is displayed as an engineer avatar on a computer device. When the expression code point a is followed by a modifier code point b expressing female gender, the avatar is displayed as a female engineer avatar. When the code points a+b are followed by a modifier code point c expressing brown, the female engineer avatar is displayed with brown skin. The code points corresponding to these three pictograms are, respectively: a for the engineer avatar; a+b for the female engineer avatar; and a+b+c for the brown-skinned female engineer avatar.
Third case: the pictogram is composed of code points corresponding to a pictogram + connector code point + code points corresponding to a pictogram, i.e., when the code points corresponding to a plurality of pictograms are joined by connector code points, they are displayed as one new pictogram on the computer device. Referring to the example in the second case, the code points a+b+connector code point+a+b represent a pictogram showing two female engineers together.
Fourth case: the pictogram is composed of a digital code point and modifier code points, i.e., one digital code point followed by one or more modifier code points corresponds to one pictogram.
In the embodiment of the present application, "+" indicates that two code points located before and after it are consecutive code points.
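As a small illustration of the four cases, the Python snippet below builds one pictogram of each kind and counts its code points. A Python `str` is a sequence of Unicode code points, so `len` returns the code point count. The specific sequences are assumptions chosen for illustration, and treating U+FE0F, U+20E3 and U+1F3FD as modifier code points is likewise an assumption of this sketch.

```python
# Each value is intended to render as a single pictogram although it
# consists of several code points (illustrative sequences, not from the source).
samples = {
    "flag: country region + country region": "\U0001F1E8\U0001F1F3",
    "expression + modifier": "\U0001F44D\U0001F3FD",
    "expression + connector + expression + modifier": "\U0001F93E\u200D\u2640\uFE0F",
    "digit + modifiers": "1\uFE0F\u20E3",
}

# len() on a str counts code points, not displayed characters.
counts = {name: len(text) for name, text in samples.items()}
for name, n in counts.items():
    print(f"{name}: {n} code points")
```

Every entry is displayed as one character, yet none of the code point counts equals 1, which is the mismatch the method sets out to remove.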
Step 203, dividing all code points corresponding to the target character string to obtain one or more target code point sets.
The number of all code points corresponding to the target character string is n, and n is a positive integer. Optionally, the implementation of step 203 includes: acquiring an initial code point set, wherein the initial code point set comprises the j-th code point among all code points, and the initial value of j is 1; and executing a code point set dividing flow on the initial code point set.
The code point set dividing process includes the following steps S1 to S4:
In step S1, the state machine state corresponding to the j+1st code point among all code points corresponding to the target character string is read.
In the embodiment of the application, the code point corresponding to the target character string is read through the state machine, and the state of the state machine is switched according to the read code point. The state machine is essentially a piece of code. Optionally, the state machine states include the following six states:
The pictographic state is used for indicating that the character corresponding to the read code point is a pictogram and the read code point is an expression code point.
The quasi-pictographic state is used for indicating that the character corresponding to the read code point may be a pictogram and the read code point is a digital code point.
The country region state is used for indicating that the character corresponding to the read code point may be a pictogram and the read code point is a country region code point.
The modifier state is used for indicating that the character corresponding to the read code point is a pictogram and the read code point is a modifier code point.
The connector state is used for indicating that the character corresponding to the read code point is a pictogram and the read code point is a connector code point.
The default state is used for indicating that the character corresponding to the read code point is a non-pictogram.
In this embodiment, the initial state of the state machine is the default state. The state machine first reads the first code point among all code points corresponding to the target character string. When the first code point is an expression code point, the state machine state corresponding to the first code point is determined to be the pictographic state. When the first code point is a digital code point, the state machine state corresponding to the first code point is determined to be the quasi-pictographic state. When the first code point is a country region code point, the state machine state corresponding to the first code point is determined to be the country region state. When the first code point is not any one of an expression code point, a digital code point and a country region code point, the state machine state corresponding to the first code point is determined to be the default state.
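The six states and the choice of the first state can be sketched in Python as follows. The source does not list which code points fall into each type, so the ranges in `kind_of` are rough assumptions made only for illustration; the names `State`, `kind_of` and `first_state` are hypothetical as well.

```python
from enum import Enum


class State(Enum):
    DEFAULT = "default"
    PICTOGRAPHIC = "pictographic"
    QUASI_PICTOGRAPHIC = "quasi-pictographic"
    COUNTRY_REGION = "country region"
    MODIFIER = "modifier"
    CONNECTOR = "connector"


def kind_of(cp: int) -> str:
    """Assumed, simplified classification of a code point (not from the source)."""
    if 0x30 <= cp <= 0x39:                      # ASCII digits
        return "digital"
    if 0x1F1E6 <= cp <= 0x1F1FF:                # regional indicator symbols
        return "region"
    if cp == 0x200D:                            # zero width joiner
        return "connector"
    if cp in (0xFE0F, 0x20E3) or 0x1F3FB <= cp <= 0x1F3FF:
        return "modifier"
    if 0x2600 <= cp <= 0x27BF or 0x1F300 <= cp <= 0x1FAFF:
        return "expression"                     # coarse emoji blocks only
    return "other"


def first_state(cp: int) -> State:
    """State after reading the first code point, starting from the default state."""
    return {
        "expression": State.PICTOGRAPHIC,
        "digital": State.QUASI_PICTOGRAPHIC,
        "region": State.COUNTRY_REGION,
    }.get(kind_of(cp), State.DEFAULT)
```

A modifier or connector code point read first maps to the default state, since only expression, digital and country region code points leave the default state.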
Optionally, the implementation of step S1 may be divided into the following five cases:
In the first case, if the state machine state corresponding to the jth code point is the default state, then:
When the j+1th code point is an expression code point, the state machine state corresponding to the j+1th code point is determined to be the pictographic state.
When the j+1th code point is a digital code point, the state machine state corresponding to the j+1th code point is determined to be the quasi-pictographic state.
When the j+1th code point is a country region code point, the state machine state corresponding to the j+1th code point is determined to be the country region state.
When the j+1th code point is not any one of an expression code point, a digital code point and a country region code point, the state machine state corresponding to the j+1th code point is determined to be the default state.
In the second case, if the state machine state corresponding to the jth code point is the pictographic state or the modifier state, then:
When the j+1th code point is a modifier code point, the state machine state corresponding to the j+1th code point is determined to be the modifier state.
When the j+1th code point is a connector code point, the state machine state corresponding to the j+1th code point is determined to be the connector state.
When the j+1th code point is neither a modifier code point nor a connector code point, the state machine state corresponding to the jth code point is taken as the default state, and the state machine state corresponding to the j+1th code point is then read.
In the third case, if the state machine state corresponding to the jth code point is the quasi-pictographic state, then:
When the j+1th code point is a modifier code point, the state machine state corresponding to the j+1th code point is determined to be the modifier state.
When the j+1th code point is not a modifier code point, the state machine state corresponding to the jth code point is taken as the default state, and the state machine state corresponding to the j+1th code point is then read.
In the fourth case, if the state machine state corresponding to the jth code point is the country region state, then:
When the j+1th code point is a country region code point, the state machine state corresponding to the j+1th code point is determined to be the country region state.
When the j+1th code point is not a country region code point, the state machine state corresponding to the jth code point is taken as the default state, and the state machine state corresponding to the j+1th code point is then read.
In the fifth case, if the state machine state corresponding to the jth code point is the connector state, then:
When the j+1th code point is an expression code point, the state machine state corresponding to the j+1th code point is determined to be the pictographic state.
When the j+1th code point is a digital code point, the state machine state corresponding to the j+1th code point is determined to be the quasi-pictographic state.
When the j+1th code point is neither an expression code point nor a digital code point, the state machine state corresponding to the j-1th code point is taken as the default state, and after the state machine state corresponding to the j-1th code point is read again, the state machine state corresponding to the j+1th code point is read.
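The five cases of step S1 can be written as a lookup table. The Python sketch below works on code point types given as strings (`"expression"`, `"digital"`, `"region"`, `"modifier"`, `"connector"`, `"other"`); these labels and the names `FROM_DEFAULT`, `CONTINUE` and `next_state` are hypothetical. The re-reading described for the connector state is collapsed into the same "treat the previous state as default" fallback used by the other cases, which is a simplification assumed here.

```python
DEFAULT, PICTO, QUASI, REGION, MODIFIER, CONNECTOR = (
    "default", "pictographic", "quasi-pictographic",
    "country region", "modifier", "connector",
)

# First case: state reached from the default state, keyed by code point type.
FROM_DEFAULT = {"expression": PICTO, "digital": QUASI, "region": REGION}

# Second to fifth cases: types that continue the current pictogram.
CONTINUE = {
    PICTO: {"modifier": MODIFIER, "connector": CONNECTOR},
    MODIFIER: {"modifier": MODIFIER, "connector": CONNECTOR},
    QUASI: {"modifier": MODIFIER},
    REGION: {"region": REGION},
    CONNECTOR: {"expression": PICTO, "digital": QUASI},
}


def next_state(prev: str, kind: str) -> str:
    """State for the (j+1)th code point given the state of the jth."""
    allowed = CONTINUE.get(prev, {})
    if kind in allowed:
        return allowed[kind]
    # Fallback: take the previous state as default and read the code point again.
    return FROM_DEFAULT.get(kind, DEFAULT)
```

For instance, `next_state(QUASI, "expression")` returns the pictographic state, because a digit followed by an expression code point falls back to the default state before the expression code point is read.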
For example, fig. 3 is a schematic diagram of the state machine switching states according to the code points it reads, according to an embodiment of the present application. As shown in fig. 3, the state pointed to by each arrow is the state machine state corresponding to the currently read code point, determined from the state machine state corresponding to the previous code point. The initial state of the state machine is the default state.
In step S2, it is determined whether the j+1th code point belongs to the initial code point set according to the state machine state corresponding to the j-th code point and the state machine state corresponding to the j+1th code point.
Optionally, the implementation of step S2 may be divided into the following five cases:
In the first case, if the state machine state corresponding to the jth code point is the default state, then:
When the j+1th code point is an expression code point, a digital code point or a country region code point, it is determined that the j+1th code point belongs to the initial code point set.
When the j+1th code point is not any one of an expression code point, a digital code point and a country region code point, it is determined that the j+1th code point does not belong to the initial code point set.
In the second case, if the state machine state corresponding to the jth code point is the pictographic state or the modifier state, then:
When the j+1th code point is a modifier code point or a connector code point, it is determined that the j+1th code point belongs to the initial code point set.
When the j+1th code point is neither a modifier code point nor a connector code point, it is determined that the j+1th code point does not belong to the initial code point set.
In the third case, if the state machine state corresponding to the jth code point is the quasi-pictographic state, then:
When the j+1th code point is a modifier code point, it is determined that the j+1th code point belongs to the initial code point set.
When the j+1th code point is not a modifier code point, it is determined that the j+1th code point does not belong to the initial code point set.
In the fourth case, if the state machine state corresponding to the jth code point is the country region state, then:
When the j+1th code point is a country region code point, it is determined that the j+1th code point belongs to the initial code point set.
When the j+1th code point is not a country region code point, it is determined that the j+1th code point does not belong to the initial code point set.
In the fifth case, if the state machine state corresponding to the jth code point is the connector state, then:
When the j+1th code point is an expression code point or a digital code point, it is determined that the j+1th code point belongs to the initial code point set.
When the j+1th code point is neither an expression code point nor a digital code point, it is determined that the j+1th code point does not belong to the initial code point set.
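The membership test of step S2 reduces to asking whether the type of the j+1th code point is allowed after the state of the jth code point. The Python sketch below encodes the second to fifth cases as stated. For the default state it departs from the literal wording and assumes that a code point following a non-pictogram always starts a new set, because a target code point set for a non-pictogram is described elsewhere in the document as containing exactly one code point. That choice, the string labels and the names `ALLOWED_NEXT` and `belongs` are all assumptions of this sketch.

```python
# Code point types that may join the current set, keyed by the state of the
# previous (jth) code point.
ALLOWED_NEXT = {
    "default": set(),  # assumption: a non-pictogram is always a set of its own
    "pictographic": {"modifier", "connector"},
    "modifier": {"modifier", "connector"},
    "quasi-pictographic": {"modifier"},
    "country region": {"region"},
    "connector": {"expression", "digital"},
}


def belongs(prev_state: str, kind: str) -> bool:
    """True if a code point of type `kind` joins the set of the previous code point."""
    return kind in ALLOWED_NEXT[prev_state]


print(belongs("quasi-pictographic", "expression"))  # digit then emoji: new set
print(belongs("connector", "expression"))           # joiner then emoji: same set
```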
In step S3, when the j+1th code point belongs to the initial code point set, the j+1th code point is added to the initial code point set to obtain an updated initial code point set. If j+1<n and the j+1th code point is not a country region code point, let j=j+1 and execute the code point set dividing flow again on the updated initial code point set. If j+1<n and the j+1th code point is a country region code point, take the updated initial code point set as a target code point set, which corresponds to a pictogram, and generate a new initial code point set that includes the j+2th code point; let j=j+2 and execute the code point set dividing flow on the new initial code point set. If j+1=n, take the updated initial code point set as a target code point set, which corresponds to a pictogram.
In step S4, when the j+1th code point does not belong to the initial code point set, the initial code point set is taken as a target code point set. This target code point set may correspond to a pictogram or a non-pictogram, and the character type corresponding to it may be determined according to the code points it contains. A new initial code point set that includes the j+1th code point is then generated. If j+1<n, let j=j+1 and execute the code point set dividing flow on the new initial code point set. If j+1=n, take the new initial code point set as a target code point set, which corresponds to a non-pictogram.
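Steps S1 to S4 above can be combined into a single loop, sketched below in Python under stated assumptions. The classification ranges in `kind_of` are rough guesses, since the source does not enumerate them. A code point that follows a non-pictogram is assumed to always start a new set. Closing a flag set after two country region code points follows step S3. All names (`kind_of`, `START`, `JOIN`, `divide`) are hypothetical.

```python
def kind_of(cp):
    """Assumed, simplified code point classification (not from the source)."""
    if 0x30 <= cp <= 0x39:
        return "digital"
    if 0x1F1E6 <= cp <= 0x1F1FF:
        return "region"
    if cp == 0x200D:
        return "connector"
    if cp in (0xFE0F, 0x20E3) or 0x1F3FB <= cp <= 0x1F3FF:
        return "modifier"
    if 0x2600 <= cp <= 0x27BF or 0x1F300 <= cp <= 0x1FAFF:
        return "expression"
    return "other"


# State entered when a code point starts a new set (read from the default state).
START = {"expression": "pictographic", "digital": "quasi", "region": "region"}
# State entered when a code point joins the current set.
JOIN = {
    "pictographic": {"modifier": "modifier", "connector": "connector"},
    "modifier": {"modifier": "modifier", "connector": "connector"},
    "quasi": {"modifier": "modifier"},
    "region": {"region": "region"},
    "connector": {"expression": "pictographic", "digital": "quasi"},
}


def divide(text):
    """Split the code points of `text` into target code point sets."""
    sets, current, state = [], [], "default"
    for ch in text:
        cp = ord(ch)
        kind = kind_of(cp)
        nxt = JOIN.get(state, {}).get(kind) if current else None
        if nxt is None:
            # Step S4: close the current set and start a new one.
            if current:
                sets.append(current)
            current, state = [cp], START.get(kind, "default")
        else:
            # Step S3: the code point joins the current set.
            current.append(cp)
            state = nxt
            if kind == "region":
                # Two country region code points form one flag: close the set.
                sets.append(current)
                current, state = [], "default"
    if current:
        sets.append(current)
    return sets


example = "1\U0001F93E\u200D\u2640\uFE0F1"
print([[f"U+{cp:04X}" for cp in s] for s in divide(example)])
```

The usage line applies `divide` to a digit, a four-code-point emoji sequence and another digit, which yields three sets.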
For example, the implementation of step 203 is described below taking as an example a target character string whose code points are U+0031 U+1F93E U+200D U+2640 U+FE0F U+0031, so that the number of all code points corresponding to the target character string is 6. Here U+0031 is a digital code point, U+1F93E is an expression code point, U+200D is a connector code point, U+2640 is an expression code point, U+FE0F is a modifier code point, and U+0031 is a digital code point.
1. The first code point U+0031, which is a digital code point, is read. After the state machine reads the code point, the state machine state is determined to be the quasi-pictographic state.
2. An initial code point set is obtained, the initial code point set including the first code point.
3. The second code point U+1F93E, which is an expression code point, is read. Since the code point is not a modifier code point, after the state machine reads the code point, the state machine state corresponding to the first code point is taken as the default state, and the state machine state corresponding to the second code point is determined to be the pictographic state. The second code point does not belong to the initial code point set containing the first code point. At this time, the initial code point set including the first code point is taken as the first target code point set, which corresponds to a non-pictogram, and a new initial code point set including the second code point is generated.
4. The third code point U+200D, which is a connector code point, is read. After the state machine reads the code point, the state machine state is determined to be the connector state. At this time, it is determined that the code point belongs to the initial code point set, and the third code point is added to the initial code point set to obtain an updated initial code point set, which includes the second code point and the third code point.
5. The fourth code point U+2640, which is an expression code point, is read. After the state machine reads the code point, the state machine state is determined to be the pictographic state. At this time, it is determined that the code point belongs to the initial code point set, and the fourth code point is added to the initial code point set to obtain an updated initial code point set, which includes the second, third and fourth code points.
6. The fifth code point U+FE0F, which is a modifier code point, is read. After the state machine reads the code point, the state machine state is determined to be the modifier state. At this time, it is determined that the code point belongs to the initial code point set, and the fifth code point is added to the initial code point set to obtain an updated initial code point set, which includes the second, third, fourth and fifth code points.
7. The sixth code point U+0031, which is a digital code point, is read. Since the code point is neither a modifier code point nor a connector code point, after the state machine reads the code point, the state machine state corresponding to the fifth code point is taken as the default state, and the sixth code point does not belong to the initial code point set containing the fifth code point. At this time, the initial code point set including the second, third, fourth and fifth code points is taken as the second target code point set, which corresponds to one pictogram, and a new initial code point set including the sixth code point is generated. Since the target character string includes 6 code points in total, the initial code point set including the sixth code point is taken as the third target code point set, which corresponds to a non-pictogram.
In this embodiment of the present application, a character type corresponding to the target code point set may be determined according to the code points in the target code point set, where the character type is a pictogram or a non-pictogram.
Referring to the above example, the code points corresponding to the target character string are U+0031 U+1F93E U+200D U+2640 U+FE0F U+0031, and three target code point sets are obtained by dividing all the code points corresponding to the target character string. The first target code point set corresponds to a non-pictogram and includes one code point. The second target code point set corresponds to a pictogram and includes four code points. The third target code point set corresponds to a non-pictogram and includes one code point.
Step 204, determining the number of the target code point sets as the length of the target character string.
One or more target code point sets are obtained by dividing the code points corresponding to the characters in the target character string, where each target code point set corresponds to one pictogram or one non-pictogram. The number of target code point sets is taken as the length of the target character string, that is, the length of each character is 1. Referring to the example in step 203, the code points corresponding to the target character string are U+0031 U+1F93E U+200D U+2640 U+FE0F U+0031; dividing all of these code points yields three target code point sets, so the length of the target character string is 3.
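To make the difference concrete, the Python snippet below compares common ways of measuring the example string with the length obtained in step 204. The three target code point sets are copied from the worked example rather than computed, and the variable names are illustrative only.

```python
text = "1\U0001F93E\u200D\u2640\uFE0F1"

code_points = len(text)                           # number of code points
utf16_units = len(text.encode("utf-16-le")) // 2  # U+1F93E needs a surrogate pair
utf8_bytes = len(text.encode("utf-8"))

# Target code point sets taken from the worked example of step 203.
target_sets = [[0x0031], [0x1F93E, 0x200D, 0x2640, 0xFE0F], [0x0031]]
length = len(target_sets)                         # step 204: one per character

print(code_points, utf16_units, utf8_bytes, length)
```

Only the last value matches the three characters a user sees; the other counts depend on how the pictogram happens to be encoded.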
It should be noted that, the sequence of the steps of the method for calculating the length of the character string provided in the embodiment of the present application may be appropriately adjusted, the steps may also be increased or decreased accordingly according to the situation, and any method that is easily conceivable to be changed by those skilled in the art within the technical scope disclosed in the present application should be covered within the protection scope of the present application, so that no further description is provided.
In summary, according to the method for calculating the length of the character string provided in the embodiment of the present application, by dividing all code points corresponding to the obtained target character string, one or more target code point sets are obtained, and then the number of the target code point sets is determined as the length of the target character string. Because each target code point set corresponds to one character in the target character string, the character can be pictogram or non-pictogram, the number of the target code point sets is determined as the length of the target character string, that is, the lengths of pictogram and non-pictogram are both determined as 1, and the accuracy of character display can be improved.
In addition, after the Unicode encoding specification is modified, when the character string length is calculated by using the character string length calculation method provided by the embodiment of the application, only the code point types in the state machine and the state machine state corresponding to each code point in each case need to be updated, so that maintenance is simplified. The state machine can run on various operating systems and has a wide application range.
Fig. 4 is a schematic structural diagram of a character string length calculating device according to an embodiment of the present application. The apparatus may be applied to a computer device. As shown in fig. 4, the apparatus 40 includes:
An obtaining module 401, configured to obtain all code points corresponding to the target character string.
A dividing module 402, configured to divide all the code points to obtain one or more target code point sets, where each target code point set includes one or more code points and corresponds to one character, the character being a non-pictogram or a pictogram; each target code point set corresponding to a non-pictogram includes one code point, and each target code point set corresponding to a pictogram includes one or more code points.
A first determining module 403, configured to determine the number of target code point sets as the length of the target character string.
In summary, according to the character string length calculating device provided in the embodiment of the present application, all code points corresponding to the target character string obtained by the obtaining module are divided by the dividing module, so as to obtain one or more target code point sets, and then the number of the target code point sets is determined as the length of the target character string by the first determining module. Because each target code point set corresponds to one character in the target character string, the character can be pictogram or non-pictogram, the number of the target code point sets is determined as the length of the target character string, that is, the lengths of pictogram and non-pictogram are both determined as 1, and the accuracy of character display can be improved.
Optionally, all code points include one or more of a digital code point, an expressive code point, a country region code point, a modifier code point, and a connector code point.
Optionally, the number of all code points is n, n is a positive integer, and the dividing module 402 is configured to:
Acquire an initial code point set, wherein the initial code point set comprises the j-th code point among all code points, and the initial value of j is 1.
Execute a code point set dividing flow on the initial code point set, wherein the code point set dividing flow comprises:
Reading the state machine state corresponding to the j+1th code point among all the code points, and determining whether the j+1th code point belongs to the initial code point set according to the state machine state corresponding to the jth code point and the state machine state corresponding to the j+1th code point.
When the j+1th code point belongs to the initial code point set, adding the j+1th code point to the initial code point set to obtain an updated initial code point set.
If j+1<n and the j+1th code point is not a country region code point, let j=j+1 and execute the code point set dividing flow again on the updated initial code point set. If j+1<n and the j+1th code point is a country region code point, take the updated initial code point set as a target code point set and generate a new initial code point set that includes the j+2th code point; let j=j+2 and execute the code point set dividing flow on the new initial code point set. If j+1=n, take the updated initial code point set as a target code point set.
When the j+1th code point does not belong to the initial code point set, taking the initial code point set as a target code point set and generating a new initial code point set that includes the j+1th code point.
If j+1<n, let j=j+1 and execute the code point set dividing flow on the new initial code point set. If j+1=n, take the new initial code point set as a target code point set.
Optionally, the dividing module 402 is configured to:
If the state machine state corresponding to the jth code point is the default state, then:
When the j+1th code point is an expression code point, determine that the state machine state corresponding to the j+1th code point is the pictographic state. When the j+1th code point is a digital code point, determine that the state machine state corresponding to the j+1th code point is the quasi-pictographic state. When the j+1th code point is a country region code point, determine that the state machine state corresponding to the j+1th code point is the country region state. When the j+1th code point is not any one of an expression code point, a digital code point and a country region code point, determine that the state machine state corresponding to the j+1th code point is the default state.
If the state machine state corresponding to the jth code point is the pictographic state or the modifier state, then:
When the j+1th code point is a modifier code point, determine that the state machine state corresponding to the j+1th code point is the modifier state. When the j+1th code point is a connector code point, determine that the state machine state corresponding to the j+1th code point is the connector state. When the j+1th code point is neither a modifier code point nor a connector code point, take the state machine state corresponding to the jth code point as the default state and then read the state machine state corresponding to the j+1th code point.
If the state machine state corresponding to the jth code point is the quasi-pictographic state, then:
When the j+1th code point is a modifier code point, determine that the state machine state corresponding to the j+1th code point is the modifier state. When the j+1th code point is not a modifier code point, take the state machine state corresponding to the jth code point as the default state and then read the state machine state corresponding to the j+1th code point.
If the state machine state corresponding to the jth code point is the country region state, then:
When the j+1th code point is a country region code point, determine that the state machine state corresponding to the j+1th code point is the country region state. When the j+1th code point is not a country region code point, take the state machine state corresponding to the jth code point as the default state and then read the state machine state corresponding to the j+1th code point.
If the state machine state corresponding to the jth code point is the connector state, then:
When the j+1th code point is an expression code point, determine that the state machine state corresponding to the j+1th code point is the pictographic state. When the j+1th code point is a digital code point, determine that the state machine state corresponding to the j+1th code point is the quasi-pictographic state. When the j+1th code point is neither an expression code point nor a digital code point, take the state machine state corresponding to the j-1th code point as the default state, and after the state machine state corresponding to the j-1th code point is read again, read the state machine state corresponding to the j+1th code point.
Optionally, the dividing module 402 is configured to:
If the state machine state corresponding to the jth code point is the default state, then:
When the j+1th code point is an expression code point, a digital code point or a country region code point, it is determined that the j+1th code point belongs to the initial code point set. When the j+1th code point is not any one of an expression code point, a digital code point and a country region code point, it is determined that the j+1th code point does not belong to the initial code point set.
If the state machine state corresponding to the jth code point is the pictographic state or the modifier state, then:
When the j+1th code point is a modifier code point or a connector code point, it is determined that the j+1th code point belongs to the initial code point set. When the j+1th code point is neither a modifier code point nor a connector code point, it is determined that the j+1th code point does not belong to the initial code point set.
If the state machine state corresponding to the jth code point is the quasi-pictographic state, then:
When the j+1th code point is a modifier code point, it is determined that the j+1th code point belongs to the initial code point set. When the j+1th code point is not a modifier code point, it is determined that the j+1th code point does not belong to the initial code point set.
If the state machine state corresponding to the jth code point is the country region state, then:
When the j+1th code point is a country region code point, it is determined that the j+1th code point belongs to the initial code point set. When the j+1th code point is not a country region code point, it is determined that the j+1th code point does not belong to the initial code point set.
If the state machine state corresponding to the jth code point is the connector state, then:
When the j+1th code point is an expression code point or a digital code point, it is determined that the j+1th code point belongs to the initial code point set. When the j+1th code point is neither an expression code point nor a digital code point, it is determined that the j+1th code point does not belong to the initial code point set.
Optionally, the state machine initial state is a default state, as shown in fig. 5, and the apparatus 40 further includes a second determining module 404, where the second determining module 404 is configured to:
When the first code point among all the code points is an expression code point, determine that the state machine state corresponding to the first code point is the pictographic state. When the first code point is a digital code point, determine that the state machine state corresponding to the first code point is the quasi-pictographic state. When the first code point is a country region code point, determine that the state machine state corresponding to the first code point is the country region state. When the first code point is not any one of an expression code point, a digital code point and a country region code point, determine that the state machine state corresponding to the first code point is the default state.
Optionally, as shown in fig. 6, the apparatus 40 further includes:
A third determining module 405, configured to determine the character type corresponding to the target code point set according to the code points in the target code point set.
In summary, according to the character string length calculating device provided in the embodiment of the present application, all code points corresponding to the target character string obtained by the obtaining module are divided by the dividing module, so as to obtain one or more target code point sets, and then the number of the target code point sets is determined as the length of the target character string by the first determining module. Because each target code point set corresponds to one character in the target character string, the character can be pictogram or non-pictogram, the number of the target code point sets is determined as the length of the target character string, that is, the lengths of pictogram and non-pictogram are both determined as 1, and the accuracy of character display can be improved.
In addition, after the Unicode encoding specification is modified, when the character string length calculating device provided by the embodiment of the application is used for calculating the character string length, only the code point types in the state machine and the state machine state corresponding to each code point in each case need to be updated, so that maintenance is simplified. The state machine can run on various operating systems and has a wide application range.
The specific manner in which the various modules of the apparatus in the above embodiments perform their operations has been described in detail in connection with the method embodiments, and will not be described in detail herein.
The embodiment of the application provides a character string length calculating device, which comprises: a processor and a memory.
A memory for storing a computer program, the computer program comprising program instructions; and the processor is used for calling the computer program to realize the character string length calculation method shown in fig. 1 or 2.
The embodiment of the application also provides a computer storage medium, wherein the computer storage medium stores instructions which, when executed by a processor, implement the method for calculating the length of a character string shown in fig. 1 or fig. 2.
Fig. 7 is a block diagram of a character string length calculating device according to an embodiment of the present application. The character string length calculating device may be the computer device 700.
In general, the computer device 700 includes: a processor 701 and a memory 702.
Processor 701 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 701 may be implemented in at least one hardware form of DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), and PLA (Programmable Logic Array). The processor 701 may also include a main processor and a coprocessor: the main processor, also referred to as a CPU (Central Processing Unit), is a processor for processing data in an awake state; the coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 701 may integrate a GPU (Graphics Processing Unit) for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 701 may also include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
Memory 702 may include one or more computer-readable storage media, which may be non-transitory. The memory 702 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 702 is used to store at least one instruction for execution by processor 701 to implement the string length calculation method provided by the method embodiments herein.
In some embodiments, the computer device 700 may further optionally include: a peripheral interface 703 and at least one peripheral. The processor 701, the memory 702, and the peripheral interface 703 may be connected by a bus or signal lines. The individual peripheral devices may be connected to the peripheral device interface 703 via buses, signal lines or a circuit board. Specifically, the peripheral device includes: at least one of radio frequency circuitry 704, a display 705, a camera assembly 706, audio circuitry 707, a positioning assembly 708, and a power supply 709.
The peripheral interface 703 may be used to connect at least one I/O (Input/Output) related peripheral device to the processor 701 and the memory 702. In some embodiments, the processor 701, the memory 702, and the peripheral interface 703 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 701, the memory 702, and the peripheral interface 703 may be implemented on separate chips or circuit boards, which is not limited in this embodiment.
The radio frequency circuit 704 is configured to receive and transmit RF (Radio Frequency) signals, also referred to as electromagnetic signals. The radio frequency circuit 704 communicates with a communication network and other communication devices via electromagnetic signals. The radio frequency circuit 704 converts an electrical signal into an electromagnetic signal for transmission, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 704 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 704 may communicate with other computer devices via at least one wireless communication protocol. The wireless communication protocol includes, but is not limited to: the world wide web, metropolitan area networks, intranets, mobile communication networks of each generation (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 704 may also include NFC (Near Field Communication) related circuitry, which is not limited in this application.
The display screen 705 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 705 is a touch display screen, the display screen 705 also has the ability to collect touch signals at or above its surface. The touch signal may be input to the processor 701 as a control signal for processing. In this case, the display screen 705 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 705, disposed on the front panel of the computer device 700; in other embodiments, there may be at least two display screens 705, disposed on different surfaces of the computer device 700 or in a folded design; in still other embodiments, the display screen 705 may be a flexible display screen disposed on a curved surface or a folded surface of the computer device 700. The display screen 705 may even be arranged in a non-rectangular irregular shape, that is, as a special-shaped screen. The display screen 705 may be made of materials such as LCD (Liquid Crystal Display) or OLED (Organic Light-Emitting Diode).
The camera assembly 706 is used to capture images or video. Optionally, the camera assembly 706 includes a front camera and a rear camera. Typically, the front camera is disposed on the front panel of the computer device 700 and the rear camera is disposed on the back of the computer device 700. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth camera can be fused to realize a background blurring function, and the main camera and the wide-angle camera can be fused to realize panoramic shooting and Virtual Reality (VR) shooting functions or other fusion shooting functions. In some embodiments, the camera assembly 706 may also include a flash. The flash may be a single color temperature flash or a dual color temperature flash. A dual color temperature flash is a combination of a warm light flash and a cold light flash and can be used for light compensation at different color temperatures.
The audio circuit 707 may include a microphone and a speaker. The microphone is used for collecting sound waves of users and environments, converting the sound waves into electric signals, and inputting the electric signals to the processor 701 for processing, or inputting the electric signals to the radio frequency circuit 704 for voice communication. The microphone may be provided in a plurality of different locations of the computer device 700 for stereo acquisition or noise reduction purposes. The microphone may also be an array microphone or an omni-directional pickup microphone. The speaker is used to convert electrical signals from the processor 701 or the radio frequency circuit 704 into sound waves. The speaker may be a conventional thin film speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, not only the electric signal can be converted into a sound wave audible to humans, but also the electric signal can be converted into a sound wave inaudible to humans for ranging and other purposes. In some embodiments, the audio circuit 707 may also include a headphone jack.
The positioning component 708 is used to locate the current geographic location of the computer device 700 for navigation or LBS (Location Based Service). The positioning component 708 may be a positioning component based on the GPS (Global Positioning System) of the United States, the Beidou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
The power supply 709 is used to power the various components in the computer device 700. The power supply 709 may be an alternating current, a direct current, a disposable battery, or a rechargeable battery. When the power supply 709 includes a rechargeable battery, the rechargeable battery may be a wired rechargeable battery or a wireless rechargeable battery. The wired rechargeable battery is a battery charged through a wired line, and the wireless rechargeable battery is a battery charged through a wireless coil. The rechargeable battery may also be used to support fast charge technology.
In some embodiments, the computer device 700 also includes one or more sensors 710. The one or more sensors 710 include, but are not limited to: acceleration sensor 711, gyroscope sensor 712, pressure sensor 713, fingerprint sensor 714, optical sensor 715, and proximity sensor 716.
The acceleration sensor 711 can detect the magnitudes of accelerations on three coordinate axes of the coordinate system established with the computer device 700. For example, the acceleration sensor 711 may be used to detect the components of the gravitational acceleration in three coordinate axes. The processor 701 may control the touch display screen 705 to display a user interface in a landscape view or a portrait view according to the gravitational acceleration signal acquired by the acceleration sensor 711. The acceleration sensor 711 may also be used for the acquisition of motion data of a game or a user.
The gyro sensor 712 may detect a body direction and a rotation angle of the computer device 700, and the gyro sensor 712 may collect a 3D motion of the user on the computer device 700 in cooperation with the acceleration sensor 711. The processor 701 may implement the following functions based on the data collected by the gyro sensor 712: motion sensing (e.g., changing UI according to a tilting operation by a user), image stabilization at shooting, game control, and inertial navigation.
The pressure sensor 713 may be disposed on a side frame of the computer device 700 and/or on an underlying layer of the touch display screen 705. When the pressure sensor 713 is disposed at a side frame of the computer device 700, a grip signal of the computer device 700 by a user may be detected, and the processor 701 performs left-right hand recognition or quick operation according to the grip signal collected by the pressure sensor 713. When the pressure sensor 713 is disposed at the lower layer of the touch display screen 705, the processor 701 controls the operability control on the UI interface according to the pressure operation of the user on the touch display screen 705. The operability controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.
The fingerprint sensor 714 is used to collect a fingerprint of the user, and the processor 701 identifies the identity of the user according to the fingerprint collected by the fingerprint sensor 714, or the fingerprint sensor 714 identifies the identity of the user according to the collected fingerprint. Upon recognizing that the user's identity is a trusted identity, the processor 701 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, and changing settings. The fingerprint sensor 714 may be provided on the front, back, or side of the computer device 700. When a physical key or vendor logo is provided on the computer device 700, the fingerprint sensor 714 may be integrated with the physical key or vendor logo.
The optical sensor 715 is used to collect the ambient light intensity. In one embodiment, the processor 701 may control the display brightness of the touch display 705 based on the ambient light intensity collected by the optical sensor 715. Specifically, when the intensity of the ambient light is high, the display brightness of the touch display screen 705 is turned up; when the ambient light intensity is low, the display brightness of the touch display screen 705 is turned down. In another embodiment, the processor 701 may also dynamically adjust the shooting parameters of the camera assembly 706 based on the ambient light intensity collected by the optical sensor 715.
The proximity sensor 716, also referred to as a distance sensor, is typically provided on the front panel of the computer device 700. The proximity sensor 716 is used to measure the distance between the user and the front of the computer device 700. In one embodiment, when the proximity sensor 716 detects that the distance between the user and the front of the computer device 700 gradually decreases, the processor 701 controls the touch display screen 705 to switch from the screen-on state to the screen-off state; when the proximity sensor 716 detects that the distance between the user and the front of the computer device 700 gradually increases, the processor 701 controls the touch display screen 705 to switch from the screen-off state to the screen-on state.
Those skilled in the art will appreciate that the architecture shown in fig. 7 does not limit the computer device 700, which may include more or fewer components than shown, combine certain components, or employ a different arrangement of components.
The foregoing describes only optional embodiments of the present application and is not intended to limit the present application. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within the scope of protection of the present application.

Claims (10)

1. A method for calculating a string length, the method comprising:
acquiring all code points corresponding to a target character string, wherein the code points are the code points corresponding to each character in Unicode encoding, one non-pictogram corresponds to one code point, and one pictogram corresponds to one code point or a plurality of code points;
dividing all the code points to obtain one or more target code point sets, wherein each target code point set comprises one or more code points, each target code point set corresponds to a character, the character is non-pictogram or pictogram, each target code point set corresponding to the non-pictogram comprises one code point, and each target code point set corresponding to the pictogram comprises one or more code points;
and determining the number of the target code point sets as the length of the target character string.
2. The method of claim 1, wherein all the code points include one or more of a digital code point, an expression code point, a country region code point, a modifier code point, and a connector code point.
3. The method according to claim 1 or 2, wherein the number of all code points is n, n is a positive integer, and the dividing all code points to obtain one or more target code point sets includes:
acquiring an initial code point set, wherein the initial code point set comprises the j-th code point in all code points, and the initial value of j is 1;
executing a code point set dividing process on the initial code point set, wherein the code point set dividing process comprises the following steps:
reading the state machine state corresponding to the j+1th code point in all the code points;
determining whether the j+1th code point belongs to the initial code point set according to the state machine state corresponding to the j th code point and the state machine state corresponding to the j+1th code point;
when the j+1th code point belongs to the initial code point set, adding the j+1th code point into the initial code point set to obtain an updated initial code point set,
if j+1<n and the j+1th code point is not a country region code point, letting j=j+1, and executing the code point set dividing flow again on the updated initial code point set,
if j+1<n and the j+1th code point is a country region code point, taking the updated initial code point set as a target code point set, generating a new initial code point set, wherein the new initial code point set comprises the j+2th code point, letting j=j+2, and executing the code point set dividing flow on the new initial code point set,
if j+1=n, taking the updated initial code point set as a target code point set;
when the j+1th code point does not belong to the initial code point set, taking the initial code point set as a target code point set, generating a new initial code point set, wherein the new initial code point set comprises the j+1th code point,
if j+1<n, letting j=j+1, and executing the code point set dividing flow on the new initial code point set,
and if j+1=n, taking the new initial code point set as a target code point set.
4. A method according to claim 3, wherein said reading the state machine state corresponding to the j+1th code point of said all code points comprises:
if the state machine state corresponding to the jth code point is a default state, then:
when the j+1th code point is an expression code point, determining that the state machine state corresponding to the j+1th code point is a pictographic state,
when the j+1th code point is a digital code point, determining that the state machine state corresponding to the j+1th code point is a quasi-pictographic state,
when the j+1th code point is a country region code point, determining that the state machine state corresponding to the j+1th code point is a country region state,
when the j+1th code point is not any one of the expression code point, the digital code point and the country region code point, determining that the state machine state corresponding to the j+1th code point is a default state;
if the state machine state corresponding to the jth code point is a pictographic state or a modifier state, then:
when the j+1th code point is a modifier code point, determining the state machine state corresponding to the j+1th code point as a modifier state,
when the j+1th code point is a connector code point, determining the state machine state corresponding to the j+1th code point as a connector state,
when the j+1th code point is not any one of the modifier code point and the connector code point, reading the state machine state corresponding to the j+1th code point by taking the state machine state corresponding to the jth code point as a default state;
if the state machine state corresponding to the jth code point is a quasi-pictographic state, then:
when the j+1th code point is a modifier code point, determining the state machine state corresponding to the j+1th code point as a modifier state,
when the j+1th code point is not the modifier code point, reading the state machine state corresponding to the j+1th code point by taking the state machine state corresponding to the jth code point as a default state;
if the state machine state corresponding to the jth code point is a country region state, then:
when the j+1th code point is a country region code point, determining that the state machine state corresponding to the j+1th code point is a country region state,
when the j+1th code point is not the country region code point, reading the state machine state corresponding to the j+1th code point by taking the state machine state corresponding to the jth code point as a default state;
if the state machine state corresponding to the j-th code point is a connector state, then:
when the j+1th code point is an expression code point, determining that the state machine state corresponding to the j+1th code point is a pictographic state,
when the j+1th code point is a digital code point, determining that the state machine state corresponding to the j+1th code point is a quasi-pictographic state,
and when the j+1th code point is not any one of the expression code point and the digital code point, taking the state machine state corresponding to the j-1th code point as a default state, and reading the state machine state corresponding to the j+1th code point after reading the state machine state corresponding to the j-1th code point again.
5. The method of claim 4, wherein determining whether the j+1 th code point belongs to the initial set of code points based on the state machine state corresponding to the j-th code point and the state machine state corresponding to the j+1 th code point comprises:
if the state of the state machine corresponding to the jth code point is a default state, then:
when the j+1th code point is an expression code point, a digital code point or a country region code point, determining that the j+1th code point belongs to the initial code point set,
when the j+1th code point is not any one of an expression code point, a digital code point and a country region code point, determining that the j+1th code point is not in the initial code point set;
if the state machine state corresponding to the jth code point is a pictographic state or a modifier state, then:
when the j+1th code point is a modifier code point or a connector code point, determining that the j+1th code point belongs to the initial code point set,
when the j+1th code point is not any one of a modifier code point and a connector code point, determining that the j+1th code point does not belong to the initial code point set;
if the state machine state corresponding to the j-th code point is a quasi-pictographic state, then:
when the j+1th code point is a modifier code point, determining that the j+1th code point belongs to the initial code point set,
when the j+1th code point is not a modifier code point, determining that the j+1th code point is not in the initial code point set;
if the state machine state corresponding to the jth code point is a country region state, then:
when the j+1th code point is a country region code point, determining that the j+1th code point belongs to the initial code point set,
when the j+1th code point is not a country region code point, determining that the j+1th code point is not in the initial code point set;
if the state machine state corresponding to the j-th code point is a connector state, then:
when the j+1th code point is an expression code point or a digital code point, determining that the j+1th code point belongs to the initial code point set,
and when the j+1th code point is not any one of the expression code point and the digital code point, determining that the j+1th code point does not belong to the initial code point set.
6. The method of claim 4 or 5, wherein the state machine initial state is a default state, the method further comprising:
when the first code point in all the code points is the expression code point, determining the state machine state corresponding to the first code point as the pictographic state,
when the first code point is a digital code point, determining that the state machine state corresponding to the first code point is a quasi-pictographic state,
when the first code point is a country region code point, determining that the state machine state corresponding to the first code point is a country region state,
and when the first code point is not any one of the expression code point, the digital code point and the national region code point, determining that the state machine state corresponding to the first code point is a default state.
7. The method according to claim 4 or 5, wherein after dividing all code points to obtain one or more target code point sets, the method further comprises:
and determining the character type corresponding to the target code point set according to the code points in the target code point set.
8. A character string length calculation apparatus, the apparatus comprising:
an acquisition module, configured to acquire all code points corresponding to a target character string, wherein the code points are the code points corresponding to each character in Unicode encoding, one non-pictogram corresponds to one code point, and one pictogram corresponds to one code point or a plurality of code points;
the dividing module is used for dividing all the code points to obtain one or more target code point sets, wherein each target code point set comprises one or more code points, each target code point set corresponds to a character, the character is non-pictogram or pictogram, each target code point set corresponding to the non-pictogram comprises one code point, and each target code point set corresponding to the pictogram comprises one or more code points;
and the first determining module is used for determining the number of the target code point sets as the length of the target character string.
9. A character string length calculation apparatus, comprising: a processor and a memory;
the memory is used for storing a computer program, and the computer program comprises program instructions;
the processor is configured to invoke the computer program to implement the method for calculating a string length according to any one of claims 1 to 7.
10. A computer storage medium having instructions stored thereon which, when executed by a processor, implement the method of string length calculation of any of claims 1 to 7.
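The state machine recited in claims 3 to 6 can be approximated by the following minimal Python sketch. It is a simplification under stated assumptions, not the claimed implementation: the code point ranges in `classify` are illustrative and incomplete, the names `START`, `CONTINUE`, `split`, and `string_length` are hypothetical, a code point joins the current set only when it continues a sequence, and a dangling connector is not rolled back.

```python
DEFAULT, PICTO, QUASI, REGION, MODIFIER, CONNECTOR = range(6)


def classify(cp):
    """Coarse code point type; the ranges are illustrative assumptions."""
    if 0x30 <= cp <= 0x39:
        return "digit"
    if 0x1F1E6 <= cp <= 0x1F1FF:
        return "region"
    if cp in (0xFE0F, 0x20E3) or 0x1F3FB <= cp <= 0x1F3FF:
        return "modifier"
    if cp == 0x200D:
        return "connector"
    if 0x1F300 <= cp <= 0x1FAFF or 0x2600 <= cp <= 0x27BF:
        return "emoji"
    return "other"


# State entered when a code point starts a new set; anything else is DEFAULT.
START = {"emoji": PICTO, "digit": QUASI, "region": REGION}

# (state of previous code point, type of next code point) -> new state.
# Only pairs where the next code point stays in the current set are listed.
CONTINUE = {
    (PICTO, "modifier"): MODIFIER,
    (PICTO, "connector"): CONNECTOR,
    (MODIFIER, "modifier"): MODIFIER,
    (MODIFIER, "connector"): CONNECTOR,
    (QUASI, "modifier"): MODIFIER,
    (REGION, "region"): DEFAULT,  # a pair of region code points closes the set
    (CONNECTOR, "emoji"): PICTO,
    (CONNECTOR, "digit"): QUASI,
}


def split(text):
    """Divide the code points of text into target code point sets."""
    sets = []
    state = DEFAULT
    for ch in text:
        cp = ord(ch)
        kind = classify(cp)
        next_state = CONTINUE.get((state, kind))
        if next_state is not None:
            sets[-1].append(cp)  # continues the current set
            state = next_state
        else:
            sets.append([cp])    # starts a new set
            state = START.get(kind, DEFAULT)
    return sets


def string_length(text):
    """Length of text as the number of target code point sets."""
    return len(split(text))
```

For example, `string_length("1\uFE0F\u20E3")` treats a keycap sequence as one character, and two consecutive flags, each made of two country region code points, are counted as two characters.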
CN202010152674.9A 2020-03-06 2020-03-06 Character string length calculating method and device and computer storage medium Active CN111339735B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010152674.9A CN111339735B (en) 2020-03-06 2020-03-06 Character string length calculating method and device and computer storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010152674.9A CN111339735B (en) 2020-03-06 2020-03-06 Character string length calculating method and device and computer storage medium

Publications (2)

Publication Number Publication Date
CN111339735A CN111339735A (en) 2020-06-26
CN111339735B true CN111339735B (en) 2023-06-20

Family

ID=71185957

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010152674.9A Active CN111339735B (en) 2020-03-06 2020-03-06 Character string length calculating method and device and computer storage medium

Country Status (1)

Country Link
CN (1) CN111339735B (en)

Also Published As

Publication number Publication date
CN111339735A (en) 2020-06-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20210113

Address after: 511442 3108, 79 Wanbo 2nd Road, Nancun Town, Panyu District, Guangzhou City, Guangdong Province

Applicant after: GUANGZHOU CUBESILI INFORMATION TECHNOLOGY Co.,Ltd.

Address before: 511446 24 / F, building B-1, Wanda Plaza, Panyu District, Guangzhou City, Guangdong Province

Applicant before: GUANGZHOU HUADUO NETWORK TECHNOLOGY Co.,Ltd.

EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20200626

Assignee: GUANGZHOU HUADUO NETWORK TECHNOLOGY Co.,Ltd.

Assignor: GUANGZHOU CUBESILI INFORMATION TECHNOLOGY Co.,Ltd.

Contract record no.: X2021440000054

Denomination of invention: String length calculation method and device, computer storage medium

License type: Common License

Record date: 20210208

GR01 Patent grant