CN107783956A - Typesetting method for text information, electronic device, and computer storage medium - Google Patents
- Publication number: CN107783956A
- Application number: CN201711182001.2A
- Authority: CN (China)
- Prior art keywords: letters, Chinese character, alphabetical, character, group
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/189—Automatic justification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
Abstract
The invention discloses a typesetting method for text information, an electronic device, and a computer storage medium. The method includes: recognizing the multiple Chinese characters contained in the text information and the multiple pinyin letters corresponding to those Chinese characters, to obtain a Chinese character set corresponding to the Chinese characters and a letter set corresponding to the pinyin letters; dividing the letters in the letter set into multiple letter groups according to the spacing between adjacent letters; adjusting the division of the letter groups according to a preset adjustment rule, so that each letter group in the letter set corresponds to one Chinese character in the Chinese character set; and typesetting each letter group in association with the Chinese character that corresponds to it. The invention performs information recognition and format conversion during typesetting, and keeps each Chinese character and its pinyin in one-to-one correspondence after typesetting.
Description
Technical field
The present invention relates to the field of computers, and in particular to a typesetting method for text information, an electronic device, and a computer storage medium.
Background
As e-books grow in popularity, more and more original book material is being converted into e-book documents for the convenience of readers. During conversion, the text information contained in the original material must be recognized and then re-typeset according to the recognition result. For example, a fixed-layout file (such as a PDF file) has a fixed page layout: it is always displayed in its original layout during reading, is not automatically re-typeset to fit the page width after zooming, is hard to modify, is relatively secure, and is not tied to a particular operating system platform. Much original book material is therefore stored as fixed-layout files. Accordingly, when a user needs to edit a fixed-layout file, the file must first be converted into a reflowable (flow-layout) file, for example by converting a PDF file into a WORD file.
However, in the course of implementing the present invention, the inventor found at least the following problem in the prior art: during format conversion, the rows or columns of the text information often become disordered, which makes recognition difficult. In particular, when the text information contains both Chinese characters and the pinyin corresponding to them, misalignment between pinyin and characters often causes recognition errors, and the user must proofread manually before typesetting can proceed. Existing typesetting methods therefore cannot accurately recognize text information that contains both Chinese characters and pinyin.
Summary of the invention
In view of the above problems, the present invention is proposed to provide a typesetting method for text information, an electronic device, and a computer storage medium that overcome the above problems or at least partially solve them.
According to one aspect of the present invention, a typesetting method for text information is provided, including: recognizing the multiple Chinese characters contained in the text information and the multiple pinyin letters corresponding to those Chinese characters, to obtain a Chinese character set corresponding to the Chinese characters and a letter set corresponding to the pinyin letters; dividing the letters in the letter set into multiple letter groups according to the spacing between adjacent letters; adjusting the division of the letter groups in the letter set according to a preset adjustment rule, so that each letter group in the letter set corresponds to one Chinese character in the Chinese character set; and typesetting each letter group in the letter set in association with the Chinese character in the Chinese character set that corresponds to that letter group.
According to another aspect of the present invention, an electronic device is provided, including a processor, a memory, a communication interface, and a communication bus, where the processor, the memory, and the communication interface communicate with one another over the communication bus. The memory stores at least one executable instruction, and the executable instruction causes the processor to perform the following operations: recognizing the multiple Chinese characters contained in the text information and the multiple pinyin letters corresponding to those Chinese characters, to obtain a Chinese character set corresponding to the Chinese characters and a letter set corresponding to the pinyin letters; dividing the letters in the letter set into multiple letter groups according to the spacing between adjacent letters; adjusting the division of the letter groups in the letter set according to a preset adjustment rule, so that each letter group in the letter set corresponds to one Chinese character in the Chinese character set; and typesetting each letter group in the letter set in association with the Chinese character in the Chinese character set that corresponds to that letter group.
According to yet another aspect of the present invention, a computer storage medium is provided. At least one executable instruction is stored in the storage medium, and the executable instruction causes a processor to perform the following operations: recognizing the multiple Chinese characters contained in the text information and the multiple pinyin letters corresponding to those Chinese characters, to obtain a Chinese character set corresponding to the Chinese characters and a letter set corresponding to the pinyin letters; dividing the letters in the letter set into multiple letter groups according to the spacing between adjacent letters; adjusting the division of the letter groups in the letter set according to a preset adjustment rule, so that each letter group in the letter set corresponds to one Chinese character in the Chinese character set; and typesetting each letter group in the letter set in association with the Chinese character in the Chinese character set that corresponds to that letter group.
According to the typesetting method for text information, the electronic device, and the computer storage medium provided by the present invention, the multiple Chinese characters contained in the text information and the multiple pinyin letters corresponding to them are recognized, yielding a Chinese character set corresponding to the Chinese characters and a letter set corresponding to the pinyin letters, and the letters in the letter set are divided into multiple letter groups according to the spacing between adjacent letters. The division of the letter groups is then adjusted according to a preset adjustment rule, so that each letter group in the letter set corresponds to one Chinese character in the Chinese character set. Finally, each letter group is typeset in association with its corresponding Chinese character, producing a reflowable file. The solution of the present invention performs information recognition and format conversion during typesetting without disordering rows and columns, keeps each Chinese character and its pinyin in one-to-one correspondence after typesetting, removes the need for manual proofreading, and accurately recognizes text information that contains both Chinese characters and pinyin.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and practiced according to the content of the specification, and so that the above and other objects, features, and advantages of the invention are more apparent, specific embodiments of the invention are set out below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art from reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered limiting of the present invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 shows a flowchart of the typesetting method for text information provided by one embodiment of the present invention;
Fig. 2 shows a flowchart of the typesetting method for text information provided by another embodiment of the present invention;
Fig. 3 shows a schematic structural diagram of an electronic device provided according to a further embodiment of the present invention.
Detailed description of the embodiments
Exemplary embodiments of the present disclosure are described in more detail below with reference to the drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope can be fully conveyed to those skilled in the art.
Fig. 1 shows a flowchart of the typesetting method for text information provided by one embodiment of the present invention. As shown in Fig. 1, the method includes the following steps:
Step S110: Recognize the multiple Chinese characters contained in the text information and the multiple pinyin letters corresponding to those Chinese characters, to obtain a Chinese character set corresponding to the Chinese characters and a letter set corresponding to the pinyin letters.
The text information may be fixed-layout information or information typeset in some other way. The multiple Chinese characters contained in the text information are characters located in the same row or the same column, and the pinyin letters corresponding to them are likewise located in the same row or the same column. In this step, the Chinese characters located in one row of the text information are recognized first, and then the pinyin letters in the row corresponding to those characters are recognized; Chinese characters located in one column, and the pinyin letters in the corresponding column, are recognized in the same way. This yields the Chinese character set corresponding to the Chinese characters and the letter set corresponding to the pinyin letters. The Chinese character set may be a row of Chinese characters or a column of Chinese characters; any form that can store a group of Chinese characters will do, and the present invention does not limit its specific implementation. Similarly, the letter set (pinyin set) may be a pinyin row or a pinyin column.
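As a rough illustration of the two sets produced by step S110, the sketch below partitions already-recognized glyphs from one row into Chinese characters and pinyin letters. It assumes the glyphs arrive as a Python string and that a simple Unicode range test is good enough to tell the two apart; the function name `split_characters_and_letters` is hypothetical, and the patent does not specify how recognition itself is performed.

```python
def split_characters_and_letters(glyphs):
    """Partition recognized glyphs into (Chinese characters, pinyin letters).

    Assumption: a glyph in the basic CJK Unified Ideographs block is a
    Chinese character; any other alphabetic glyph is a pinyin letter.
    """
    hanzi, letters = [], []
    for g in glyphs:
        if "\u4e00" <= g <= "\u9fff":   # basic CJK Unified Ideographs block
            hanzi.append(g)
        elif g.isalpha():               # Latin letters, incl. precomposed tone-marked vowels
            letters.append(g)
        # spaces, punctuation, and other glyphs are ignored in this sketch
    return hanzi, letters
```

The check for Chinese characters comes first because `str.isalpha()` is also true for CJK ideographs.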
Step S120: Divide the letters in the letter set into multiple letter groups according to the spacing between adjacent letters.
For the pinyin of two adjacent Chinese characters, the spacing between the last letter of the preceding character's pinyin and the first letter of the following character's pinyin is larger than the average spacing between adjacent letters. Based on this feature, a spacing threshold can be set in advance: when the spacing between two adjacent letters exceeds the preset spacing threshold, a separator is inserted between them, and the inserted separators divide the letters in the letter set into multiple letter groups. The preset spacing threshold is determined from the average width of one letter and/or from the average spacing between the letters in the letter set. The average letter width can be calculated from the total width of all letters and the number of letters, or determined by other methods. Optionally, the preset spacing threshold may equal the average spacing between adjacent letters, may be slightly smaller or larger than that average, or may be set according to other values; no limitation is imposed here.
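The spacing-based division can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Letter` record, the function name `split_by_spacing`, and the `factor` multiplier applied to the mean spacing are all assumptions chosen for the example, and starting a new group stands in for "inserting a separator".

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Letter:
    char: str
    x: float       # left edge of the letter in the source layout
    width: float

def split_by_spacing(letters, factor=1.5):
    """Split letters into groups wherever the gap exceeds the threshold.

    The threshold here is `factor` times the mean gap between adjacent
    letters (an assumed choice; the text allows other definitions).
    """
    if not letters:
        return []
    gaps = [b.x - (a.x + a.width) for a, b in zip(letters, letters[1:])]
    threshold = factor * mean(gaps) if gaps else 0.0
    groups = [[letters[0].char]]
    for gap, letter in zip(gaps, letters[1:]):
        if gap > threshold:          # equivalent to inserting a separator here
            groups.append([])
        groups[-1].append(letter.char)
    return ["".join(g) for g in groups]
```

With letters laid out so that the gap between "n" and "z" in "duan zhuang" is several times the gap inside a syllable, the function returns the two groups "duan" and "zhuang".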
Step S130: Adjust the division of the letter groups in the letter set according to a preset adjustment rule, so that each letter group in the letter set corresponds to one Chinese character in the Chinese character set.
Because each pinyin must line up with its Chinese character, the letter groups obtained by dividing the letter set need to be checked, and their division adjusted according to the preset adjustment rule, so that each letter group in the letter set corresponds to one Chinese character in the Chinese character set. On the one hand, in fixed-layout typesetting, when the pinyin of two Chinese characters is long and the spacing between two adjacent letters in that pinyin is small, a single letter group obtained by the division may actually contain the pinyin letters of two Chinese characters. The adjustment rule in this case may be: for each letter group, judge whether the number of letters in the group exceeds a preset pinyin letter count; if so, split the letter group into at least two letter groups, so that each resulting group corresponds to one Chinese character in the Chinese character set. By the rules of pinyin itself, the pinyin of one character contains no more than 6 letters, so the preset pinyin letter count in this embodiment can be set to 6. On the other hand, some Chinese characters have no pinyin of their own, for example the character "儿" in rhotacized (erhua) speech. In the letter groups obtained, the pinyin letter group of another character may then land at the position of the Chinese character that has no pinyin, and the division of the letter groups in the letter set must be adjusted. The preset adjustment rule may therefore include: judge whether the Chinese characters in the Chinese character set include a silent Chinese character (one with no pinyin of its own); if so, insert a blank letter group into the letter set so that the blank letter group corresponds to the silent character, and each letter group in the letter set again corresponds to one Chinese character in the Chinese character set.
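The two adjustment rules can be sketched as below. This is an illustration under stated assumptions: the names `needs_split` and `insert_blank_groups` are hypothetical, and the positions of silent characters are passed in as a set of indices because the text does not say how a silent character is detected.

```python
MAX_PINYIN_LEN = 6  # preset pinyin letter count given in the text

def needs_split(group, limit=MAX_PINYIN_LEN):
    """Rule 1: a group longer than the limit must hold more than one syllable."""
    return len(group) > limit

def insert_blank_groups(groups, chars, silent_positions):
    """Rule 2: align groups with chars, using "" for silent characters.

    `silent_positions` is an assumed input: indices into `chars` of the
    characters that carry no pinyin of their own.
    """
    aligned = []
    remaining = iter(groups)
    for i, _ in enumerate(chars):
        if i in silent_positions:
            aligned.append("")                   # blank letter group
        else:
            aligned.append(next(remaining, ""))  # next real letter group, if any
    return aligned
```

For the characters "小孩儿" with letter groups `["xiao", "hair"]` and position 2 marked silent, the aligned result has one entry per character, the last of them blank.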
Besides the adjustment rules above, those skilled in the art may flexibly define other adjustment rules; the present invention does not limit this.
Step S140: Typeset each letter group in the letter set in association with the Chinese character in the Chinese character set that corresponds to that letter group.
Once each letter group in the letter set corresponds to one Chinese character in the Chinese character set, each letter group is typeset in association with its corresponding Chinese character, producing a file that the user can edit and modify. Associated typesetting means that each letter group in the letter set is associated with the Chinese character that corresponds to it, and the associated letter group and Chinese character are then typeset one above the other or side by side.
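A minimal sketch of the association step, assuming the letter groups and Chinese characters are already one-to-one. The function names and the "character followed by pinyin in parentheses" output are illustrative choices for a side-by-side arrangement; the text does not prescribe any output format.

```python
def associate(groups, chars):
    """Pair each Chinese character with its letter group (one-to-one)."""
    if len(groups) != len(chars):
        raise ValueError("letter groups and Chinese characters are not one-to-one")
    return list(zip(chars, groups))

def render_side_by_side(pairs):
    """Render each character followed by its pinyin in parentheses.

    Characters paired with a blank letter group are rendered alone.
    """
    return "".join(f"{ch}({py})" if py else ch for ch, py in pairs)
```

Because the association is kept as explicit pairs, a reflowable renderer can move each pair as a unit when the line width changes, which is the property the fixed-layout source lacks.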
According to the typesetting method for text information provided by this embodiment, the multiple Chinese characters contained in the text information and the multiple pinyin letters corresponding to them are recognized, yielding a Chinese character set corresponding to the Chinese characters and a letter set corresponding to the pinyin letters, and the letters in the letter set are divided into multiple letter groups according to the spacing between adjacent letters. The division of the letter groups is then adjusted according to a preset adjustment rule, so that each letter group in the letter set corresponds to one Chinese character in the Chinese character set. Finally, each letter group is typeset in association with its corresponding Chinese character, producing a reflowable file. The method of this embodiment performs information recognition and format conversion during typesetting without disordering rows and columns, keeps each Chinese character and its pinyin in one-to-one correspondence after typesetting, removes the need for manual proofreading, and accurately recognizes text information that contains both Chinese characters and pinyin.
Fig. 2 shows a flowchart of the typesetting method for text information provided by another embodiment of the present invention. As shown in Fig. 2, the method includes the following steps:
Step S210: Recognize the multiple Chinese characters contained in the text information and the multiple pinyin letters corresponding to those Chinese characters, to obtain a Chinese character set corresponding to the Chinese characters and a letter set corresponding to the pinyin letters.
In this embodiment, the text information is fixed-layout information. The multiple Chinese characters contained in the text information are characters located in the same row or the same column, and the pinyin letters corresponding to them are likewise located in the same row or the same column. In this step, the Chinese characters located in one row of the text information are recognized first, and then the pinyin letters in the row corresponding to those characters are recognized, yielding the Chinese character set corresponding to the Chinese characters and the letter set corresponding to the pinyin letters. The Chinese character set can be implemented in various ways, such as a Chinese character row, a Chinese character column, or a Chinese character storage unit, and the present invention does not limit this. Similarly, the letter set (pinyin set) can be implemented as a pinyin row, a pinyin column, a pinyin storage unit, and so on.
Step S220: Judge whether the spacing between each pair of adjacent letters is greater than the preset spacing threshold.
For the pinyin of two adjacent Chinese characters, the spacing between the last letter of the preceding character's pinyin and the first letter of the following character's pinyin is larger than the average spacing between adjacent letters. Based on this feature, a spacing threshold can be set in advance, and each pair of adjacent letters is checked to see whether the spacing between them exceeds the preset spacing threshold. The preset spacing threshold is determined from the average width of one letter and/or from the average spacing between the letters in the letter set. The average letter width can be calculated from the total width of all letters and the number of letters, or determined by other methods. Optionally, the preset spacing threshold may equal the average spacing between adjacent letters, may be slightly smaller or larger than that average, or may be set according to other values; no limitation is imposed here.
Step S230: If so, insert a separator between the two adjacent letters; the separators divide the letters in the letter set into multiple letter groups.
If not, no split is made between the two letters. If so, a separator is inserted between the two adjacent letters, and the inserted separators divide the letters in the letter set into multiple letter groups. A letter group is the combination of letters delimited by separators.
While performing steps S220 and S230, the following situation must also be considered: because of errors in the fixed-layout typesetting, the spacing between two letters of the pinyin of a single Chinese character may be large. If a separator were inserted between two letters that belong to the same Chinese character, the pinyin and the Chinese characters would become misaligned. Therefore, whenever the spacing between two adjacent letters is judged to be greater than the preset spacing threshold, the positional relationship between the Chinese characters and their corresponding pinyin letters is further determined from the text information, and that positional relationship is used to judge whether the two adjacent letters correspond to the same Chinese character. If not, a separator is inserted between the two adjacent letters; if so, no separator is inserted between them. Here, the positional relationship refers to the relative positions of each Chinese character and letter in the fixed-layout text information (that is, in the original document before conversion). In a specific implementation, the position coordinates of each Chinese character and each letter in the fixed-layout text information can be obtained, and the positional relationship between Chinese characters and letters analyzed from them. When two adjacent letters are both located above, or both located below, the same Chinese character, the two adjacent letters are determined to correspond to the same Chinese character.
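The position check can be sketched as follows, assuming each glyph comes with a horizontal extent taken from the fixed-layout source and that pinyin is set above or below its character in a horizontal row. The `Box` record, the "letter center falls within the character's horizontal extent" test, and the function names are assumptions for illustration only.

```python
from dataclasses import dataclass

@dataclass
class Box:
    text: str
    x: float       # left edge in the fixed-layout source
    width: float

    @property
    def center(self):
        return self.x + self.width / 2

def owner_index(letter, chars):
    """Index of the Chinese character whose horizontal extent covers the letter."""
    for i, c in enumerate(chars):
        if c.x <= letter.center <= c.x + c.width:
            return i
    return None

def should_insert_separator(a, b, chars, threshold):
    """Separator only if the gap is large AND the letters do not share a character."""
    gap = b.x - (a.x + a.width)
    if gap <= threshold:
        return False
    ia, ib = owner_index(a, chars), owner_index(b, chars)
    same_character = ia is not None and ia == ib
    return not same_character
```

A large gap between two letters that both sit over the same character is thus treated as a layout error rather than a syllable boundary.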
Step S240: Adjust the division of the letter groups in the letter set according to a preset adjustment rule, so that each letter group in the letter set corresponds to one Chinese character in the Chinese character set.
Because each pinyin must line up with its Chinese character, the letter groups obtained by dividing the letter set need to be checked, and their division adjusted according to the preset adjustment rule, so that each letter group in the letter set corresponds to one Chinese character in the Chinese character set. The preset adjustment rule mainly includes rules of the following two kinds:
The first kind of rule is a rule for splitting a letter group that is made up of the pinyin of several Chinese characters. For example, when the letter groups corresponding to two Chinese characters are long and the spacing between two adjacent letters is small, a single letter group obtained by the division may actually be the pinyin letters of two Chinese characters. In the pinyin "duan zhuang" of the word "端庄", for instance, the spacing between the adjacent letters "n" and "z" may be small, so the two letter groups "duan" and "zhuang" may be merged into one letter group "duanzhuang", which then needs to be adjusted. The preset adjustment rule may therefore include: for each letter group, judge whether the number of letters in the group exceeds a preset pinyin letter count; if so, split the letter group into at least two letter groups, so that each resulting group corresponds to one Chinese character in the Chinese character set. The preset pinyin letter count can be determined from the maximum length and/or the average length of Chinese pinyin. In this embodiment, the inventor found in the course of implementing the invention that the pinyin of any Chinese character contains no more than 6 letters, so the preset pinyin letter count in this embodiment is set to 6. The letter group "duanzhuang", for example, contains 10 letters and must be split into two letter groups so that each resulting group corresponds to one Chinese character in the Chinese character set.
Further, the step of splitting the letter group into at least two letter groups, so that each resulting group corresponds to one Chinese character in the Chinese character set, specifically includes: determining the first Chinese character in the Chinese character set that corresponds to the letter group, and looking up the pinyin of that first Chinese character; then splitting the letter group into at least two letter groups according to the pinyin of the first Chinese character. For example, for the misaligned letter group "duanzhuang" corresponding to the two characters "端庄", the first Chinese character "端" in the Chinese character set is determined first, and its pinyin "duan" is looked up, for example in a Microsoft pinyin library. The pinyin "duan" is then matched against the letter group "duanzhuang"; after a successful match, a separator is inserted after "duan" in the letter group, splitting "duanzhuang" into the two letter groups "duan" and "zhuang", which correspond to the characters "端" and "庄" in the Chinese character set respectively.
When the pinyin letter groups of more than two Chinese characters are joined together, after the pinyin of the first Chinese character has been matched against the letter group, the Chinese characters that follow are also matched one by one, which improves accuracy. For example, if the pinyin of "端", "庄", and "贤" in the Chinese character set are all joined together, then after the first character "端" has been matched against the letter group "duanzhuangxian", the pinyin of "庄" and "贤" are each matched in turn, so that "duanzhuangxian" is split into at least two letter groups and each letter group corresponds to one Chinese character in the Chinese character set.
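The dictionary-based split can be sketched as below. The tiny `PINYIN` table is a hypothetical stand-in for the pinyin library mentioned in the text, each character is assumed to have exactly one reading, and `split_merged_group` is an illustrative name.

```python
# Hypothetical stand-in for a pinyin lookup library: one reading per character.
PINYIN = {"端": "duan", "庄": "zhuang", "贤": "xian"}

def split_merged_group(group, chars, table=PINYIN):
    """Split a merged letter group by matching each character's pinyin in turn."""
    parts, rest = [], group
    for ch in chars:
        reading = table[ch]                 # look up the character's pinyin
        if not rest.startswith(reading):    # match against the remaining letters
            raise ValueError(f"{ch!r} ({reading}) does not match {rest!r}")
        parts.append(reading)               # a separator goes after this reading
        rest = rest[len(reading):]
    if rest:
        raise ValueError(f"unmatched letters left over: {rest!r}")
    return parts
```

Matching every character in turn, rather than only the first, mirrors the "duanzhuangxian" case: each successful match consumes a prefix and leaves the remainder for the next character.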
Because some Chinese characters are polyphonic (they have more than one reading), a further check is needed when the letter group is split into at least two letter groups according to the pinyin of the first Chinese character: whether the pinyin of that first Chinese character is unique. If it is, the pinyin of the character is matched against the letter group, and the letter group is split into at least two letter groups according to the match result. If it is not, it is determined whether the several readings of the first Chinese character have the same length (number of letters). If they do, the letter group can be matched directly by pinyin length, which saves matching time. For example, the readings "zhang" and "chang" of the polyphonic character "长" have the same length, so the match can be made on that length alone. When the readings of the polyphonic character are found to differ in length, as with "du" and "dou" for the polyphonic character "都", each reading of the first Chinese character must be matched against the letter group, and the letter group is split into at least two letter groups according to the match result. Of course, even when the readings have the same length, each reading of the first Chinese character may still be matched against the letter group, and the letter group split according to the match result, in order to improve matching accuracy. Those skilled in the art can choose the specific matching approach according to the actual situation; no limitation is imposed here.
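A minimal sketch of this decision, assuming the readings of the first character have already been looked up. The function name `split_off_first` and the sample letter groups "changcheng" and "doushi" are illustrative assumptions, not taken from the patent.

```python
def split_off_first(group, readings):
    """Return (first_part, rest) for the first character, or None if nothing fits.

    `readings` lists the candidate pinyin of the first Chinese character.
    """
    lengths = {len(r) for r in readings}
    if len(readings) > 1 and len(lengths) == 1:
        # Polyphonic, but every reading has the same letter count:
        # cut by length alone, without comparing letters.
        n = lengths.pop()
        return group[:n], group[n:]
    # Unique reading, or readings of different lengths: match each in turn.
    for r in readings:
        if group.startswith(r):
            return r, group[len(r):]
    return None


print(split_off_first("changcheng", ["zhang", "chang"]))  # same length: cut at 5
print(split_off_first("doushi", ["du", "dou"]))           # lengths differ: match
```

The length shortcut trades verification for speed: it never checks that the letters actually spell one of the readings, which is why the text notes that full matching can still be used when accuracy matters more.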
Further, the first Chinese character in the character set that corresponds to the letter group can be determined in either of the following two ways.
Mode one: determine, from the text information, the positional relationship between the Chinese characters and their corresponding pinyin letters; based on that positional relationship, determine the first Chinese character in the character set that corresponds to the letter group. Here the positional relationship means the relative positions of each Chinese character and each letter in the fixed-layout (format) typeset text information, that is, in the original file before conversion. In practice, the position coordinates of each Chinese character and each letter in the fixed-layout text information can be obtained, and the positional relationship between characters and letters analysed from them. When two adjacent letters both lie above, or both lie below, the same Chinese character, the two adjacent letters are determined to correspond to that Chinese character. For example, the letters "u" and "n" in the letter group "duanzhuang" both lie above the character "端", so the first Chinese character of the word "端庄" corresponding to the letter group "duanzhuang" is determined to be "端".
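Mode one can be sketched as a horizontal overlap test. The `Box` type, the coordinate values and the use of the letter's horizontal centre are all assumptions made for illustration; the patent only states that position coordinates are compared.

```python
from dataclasses import dataclass


@dataclass
class Box:
    text: str
    x0: float  # left edge
    x1: float  # right edge


def char_under(letter, chars):
    """Return the Chinese character whose horizontal span contains the letter."""
    cx = (letter.x0 + letter.x1) / 2
    for ch in chars:
        if ch.x0 <= cx <= ch.x1:
            return ch
    return None


def first_char_for_group(letters, chars):
    """First character of a letter group: the one under its earliest placed letter."""
    for letter in letters:
        ch = char_under(letter, chars)
        if ch is not None:
            return ch.text
    return None


# Hypothetical layout: two characters, with ten letters in a row above them.
chars = [Box("端", 0, 18), Box("庄", 22, 44)]
letters = [Box(c, 2 + 4 * i, 6 + 4 * i) for i, c in enumerate("duanzhuang")]
print(first_char_for_group(letters, chars))
```

In this layout the letters "u" and "n" both resolve to the same character, which is the condition the text uses to decide that two adjacent letters belong together.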
Mode two: determine the Chinese character in the character set that corresponds to the letter group preceding this letter group, and take the Chinese character that follows it in the character set as the first Chinese character corresponding to this letter group. The purpose of this mode is to cover files whose fixed-layout (format) typesetting is badly misaligned: when the positional relationship of mode one cannot determine the first Chinese character corresponding to the letter group, mode two can be used instead, which improves accuracy. Specifically, the Chinese character corresponding to the already verified letter group that precedes the letter group to be verified is determined, and the Chinese character following it in the character set is taken as the first Chinese character corresponding to the letter group to be verified. For example, suppose the character set obtained is "she is a dignified and virtuous woman". To determine the first Chinese character for the misaligned letter group "duanzhuang", the character "端", which follows the character "个" corresponding to the verified letter group "ge", is taken as the first Chinese character of the letter group "duanzhuang" that needs adjusting.
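Mode two reduces to an index lookup once the previous letter group has been verified. The sketch below assumes the character set is held as an ordered list and that the index of the previously verified character is known; both the function name and that bookkeeping are hypothetical.

```python
def first_char_by_previous(chars, prev_char_index):
    """Return the character that follows the one matched by the previous group.

    `prev_char_index` is the position, in `chars`, of the Chinese character
    that corresponds to the already verified preceding letter group.
    """
    nxt = prev_char_index + 1
    return chars[nxt] if nxt < len(chars) else None


# "个" was matched by the verified letter group "ge"; the next character
# becomes the first character of the following letter group.
chars = ["个", "端", "庄"]
print(first_char_by_previous(chars, chars.index("个")))
```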
In addition, when the first Chinese character is polyphonic, the following special case may arise. Suppose the first Chinese character has the readings pinyin one and pinyin two, the next Chinese character has the reading pinyin three, and the correct combination is pinyin one followed by pinyin three. Because some of the letters of pinyin two coincide with pinyin one, and some of the letters of pinyin three combined with some of the letters of pinyin two happen to form a valid pinyin, the reading of the first Chinese character may be wrongly identified as pinyin two, which in turn causes errors. To prevent this, the pinyin of the Chinese character following the first Chinese character must also be matched against the letter group. That is, not only the first Chinese character but also each Chinese character after it is matched, which improves accuracy.
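One way to realise this look-ahead is a small backtracking search: a reading of the first character is accepted only if the following characters can also be matched. This is a sketch under that assumption; the patent does not prescribe backtracking, and the readings "fan", "fang" and "an" below are hypothetical placeholders chosen to reproduce the ambiguity.

```python
def split_with_lookahead(group, readings_per_char):
    """Split `group` into one pinyin per character, checking later characters too.

    `readings_per_char[i]` lists the candidate pinyin of the i-th character.
    Returns the list of chosen readings, or None if no combination fits.
    """
    def solve(pos, idx):
        if idx == len(readings_per_char):
            return [] if pos == len(group) else None
        for r in readings_per_char[idx]:
            if group.startswith(r, pos):
                tail = solve(pos + len(r), idx + 1)
                if tail is not None:  # the following characters also match
                    return [r] + tail
        return None

    return solve(0, 0)


# The first candidate "fan" fits the start of the run but leaves "gan",
# which the next character cannot match, so the search falls back to "fang".
print(split_with_lookahead("fangan", [["fan", "fang"], ["an"]]))
```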
The rule of the second aspect is a rule for identifying tone-off Chinese characters. Specifically, in a sentence made up of Chinese characters, some characters have no corresponding pinyin; such characters may be called tone-off Chinese characters. An example is the character "儿" in erhua (r-coloured) pronunciation. If this special case is not handled, the Chinese characters and the pinyin become misaligned. The division of the letter groups in the letter set therefore has to be adjusted for this case. Accordingly, the preset adjustment rule may include: determining whether the Chinese characters in the character set include a tone-off Chinese character; if so, inserting a blank letter group into the letter set so that the blank letter group corresponds to the tone-off Chinese character, and each letter group in the letter set thus corresponds to one Chinese character in the character set. Further, whether the Chinese characters in the character set include a tone-off Chinese character can be determined in either of the following two ways:
Mode one: determine, from the text information, the positional relationship between the Chinese characters and their corresponding pinyin letters; based on that positional relationship, determine for each Chinese character whether a pinyin letter exists at the position vertically aligned with it; if not, the Chinese character is determined to be a tone-off Chinese character. Specifically, the positional relationship means the relative positions of each Chinese character and each letter in the fixed-layout (format) typeset text information, that is, in the original file before conversion. In practice, the position coordinates of each Chinese character and each letter in the fixed-layout text information can be obtained, and the positional relationship between characters and letters analysed from them. When two adjacent letters both lie above, or both lie below, the same Chinese character, the two adjacent letters are determined to correspond to that Chinese character. Based on this positional relationship, when there is no corresponding pinyin letter above a Chinese character, that Chinese character can be determined to be a tone-off Chinese character.
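The positional test for tone-off characters can be sketched as below. The tuple layouts and the coordinates are assumptions for illustration; only the horizontal axis is checked here, on the assumption that the letters have already been restricted to the pinyin row above the characters.

```python
def find_tone_off_chars(chars, letters):
    """Return the characters that have no pinyin letter vertically aligned with them.

    chars:   list of (text, x0, x1) horizontal spans of Chinese characters
    letters: list of (letter, x_center) for the pinyin row above them
    """
    tone_off = []
    for text, x0, x1 in chars:
        if not any(x0 <= cx <= x1 for _, cx in letters):
            tone_off.append(text)
    return tone_off


# Hypothetical layout: all letters sit above the first character,
# and nothing sits above "儿".
chars = [("花", 0, 20), ("儿", 24, 44)]
letters = [("h", 3), ("u", 8), ("a", 13), ("r", 18)]
print(find_tone_off_chars(chars, letters))
```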
Mode two: for each Chinese character in the character set, look up the pinyin corresponding to that character; determine whether the letter set contains a letter group that matches that pinyin; if not, the Chinese character is determined to be a tone-off Chinese character. The purpose of this mode is to cover cases where, for typesetting reasons, the Chinese characters and their corresponding pinyin letters in the fixed-layout (format) typeset file are badly misaligned: when a tone-off Chinese character cannot be determined from position as in mode one, it can be determined by mode two instead, which improves accuracy. Specifically, the pinyin of each Chinese character in the character set can be matched against the letter groups in the letter set, in order to determine whether the letter set contains a letter group that matches that pinyin.
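A sketch of the lookup-based test follows. The table contents and function name are hypothetical. One detail goes beyond the text and is an assumption of this sketch: a letter group is consumed once it has matched a character, so that a repeated syllable is not counted twice.

```python
def tone_off_by_lookup(chars, groups, table):
    """Return characters whose pinyin matches no remaining letter group.

    table: hypothetical dictionary, character -> list of candidate pinyin
    """
    remaining = list(groups)
    tone_off = []
    for ch in chars:
        match = next((g for g in remaining if g in table.get(ch, [])), None)
        if match is None:
            tone_off.append(ch)
        else:
            remaining.remove(match)  # assumption: each group serves one character
    return tone_off


table = {"花": ["hua"], "儿": ["er"]}
print(tone_off_by_lookup(["花", "儿"], ["hua"], table))
```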
Further, the step described above of inserting a blank letter group into the letter set, so that the blank letter group corresponds to the tone-off Chinese character, specifically includes: finding the Chinese character adjacent to the tone-off Chinese character, and determining the adjacent letter group in the letter set that corresponds to that adjacent Chinese character; and inserting the blank letter group at the position in the letter set adjacent to that adjacent letter group. The blank letter group is represented by a preset blank marker, which may be a symbol such as "*" or "#", or another specific symbol. After the blank letter group has been inserted, a separator is also inserted between it and the adjacent letter group, so that the blank marker corresponding to the tone-off Chinese character is kept apart from the pinyin letters corresponding to the adjacent Chinese character.
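The insertion step can be sketched with a list insert. This assumes the letter groups already line up with every character except the tone-off one, so the group of the preceding adjacent character sits at `tone_off_index - 1`; the function name, the sample data and the space used as separator are illustrative assumptions.

```python
BLANK = "*"  # preset blank marker; "#" or another symbol would work equally


def insert_blank_group(groups, tone_off_index, blank=BLANK):
    """Insert a blank letter group next to the group of the adjacent character.

    `tone_off_index` is the position of the tone-off character in the
    Chinese character set.
    """
    aligned = list(groups)
    aligned.insert(tone_off_index, blank)
    return aligned


chars = ["花", "儿", "开"]
groups = insert_blank_group(["hua", "kai"], chars.index("儿"))
print(groups)
print(" ".join(groups))  # separator keeps the marker apart from its neighbours
```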
Step S250: typeset each letter group in the letter set in association with the Chinese character in the character set that corresponds to that letter group.
Once each letter group in the letter set corresponds to one Chinese character in the character set, each letter group is typeset, by a reflowable (streaming) typesetting mode, in association with its corresponding Chinese character in the character set, producing a file that the user can edit and modify. Here, reflowable typesetting means that the text, numbers, tables and graphic images contained in a document are processed in a specific typesetting mode and the saved content consists of the original editing elements; the user can view the edited layout through OCR software, and the display adapts to the page size at different zoom ratios.
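As one possible illustration of associated typesetting in a reflowable form, the sketch below pairs each character with its letter group and emits HTML ruby markup. HTML ruby is an assumption of this example, chosen only because it is a widely known reflowable representation; the patent does not name an output format.

```python
from html import escape


def to_ruby(chars, groups, blank="*"):
    """Pair each Chinese character with its letter group as HTML ruby text."""
    if len(chars) != len(groups):
        raise ValueError("each character needs exactly one letter group")
    parts = []
    for ch, g in zip(chars, groups):
        rt = "" if g == blank else escape(g)  # blank group: empty annotation
        parts.append(f"<ruby>{escape(ch)}<rt>{rt}</rt></ruby>")
    return "".join(parts)


print(to_ruby("端庄", ["duan", "zhuang"]))
```

Because every character carries its own annotation, the pairing survives reflow at any zoom ratio, which is the property the text asks of the converted file.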
According to the text information typesetting method provided by this embodiment, the Chinese characters contained in the text information and the pinyin letters corresponding to them are identified, yielding a Chinese character set corresponding to the Chinese characters and a letter set corresponding to the pinyin letters. It is then determined whether the spacing between each pair of adjacent letters is greater than a preset spacing threshold; if so, a separator is inserted between the two adjacent letters, and the letters in the letter set are divided into multiple letter groups by the separators. The division of the letter groups in the letter set is then adjusted according to a preset adjustment rule. On the one hand, this solves the problem that the pinyin of several Chinese characters runs together in fixed-layout (format) typesetting. On the other hand, it solves the problem that tone-off characters among the Chinese characters prevent the characters from corresponding one to one with their pinyin after reflowable (streaming) typesetting. Each letter group in the letter set thus corresponds to one Chinese character in the character set, and finally each letter group is typeset in association with its corresponding Chinese character to obtain a reflowable file. With the method provided by this embodiment, a fixed-layout file can be converted into a reflowable file in which each Chinese character and its pinyin correspond one to one, which makes editing convenient for the user and saves time and effort. The method achieves accurate identification of text information that contains pinyin without manual proofreading, improving the efficiency and accuracy of pinyin identification.
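The initial spacing-based division summarised above can be sketched as follows. The threshold is derived here from the average letter width multiplied by a `factor`; the text only says the threshold is determined from the average letter width and/or the average spacing, so the factor of 0.5 and the coordinates are assumptions.

```python
def split_by_gap(letters, factor=0.5):
    """Divide a row of letters into letter groups wherever the gap is large.

    letters: list of (char, x0, x1), sorted left to right
    factor:  assumed multiplier turning average letter width into a threshold
    """
    if not letters:
        return []
    avg_width = sum(x1 - x0 for _, x0, x1 in letters) / len(letters)
    threshold = factor * avg_width
    groups = [letters[0][0]]
    for prev, cur in zip(letters, letters[1:]):
        gap = cur[1] - prev[2]  # left edge of this letter minus right edge of last
        if gap > threshold:
            groups.append(cur[0])   # a separator falls here: start a new group
        else:
            groups[-1] += cur[0]
    return groups


row = [("g", 0, 4), ("e", 4.5, 8.5), ("d", 14, 18),
       ("u", 18.5, 22.5), ("a", 23, 27), ("n", 27.5, 31.5)]
print(split_by_gap(row))
```

Groups produced this way are only a first pass; runs that stay joined, such as "duanzhuangxian", are what the adjustment rules then split.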
Another embodiment of the present application provides a non-volatile computer storage medium. The computer storage medium stores at least one executable instruction, and the computer-executable instruction can perform the text information typesetting method of any of the method embodiments described above.
The executable instructions may specifically cause the processor to perform the following operations: identify the Chinese characters contained in the text information and the pinyin letters corresponding to them, obtaining a Chinese character set corresponding to the Chinese characters and a letter set corresponding to the pinyin letters; divide the letters in the letter set into multiple letter groups according to the spacing between adjacent letters; adjust the division of the letter groups in the letter set according to a preset adjustment rule, so that each letter group in the letter set corresponds to one Chinese character in the character set; and typeset each letter group in the letter set in association with its corresponding Chinese character in the character set.
In an optional mode, the executable instructions further cause the processor to perform the following operations: determine whether the spacing between each pair of adjacent letters is greater than a preset spacing threshold;
if so, insert a separator between the two adjacent letters, and divide the letters in the letter set into multiple letter groups by the separators;
wherein the preset spacing threshold is determined from the average width of a letter, and/or from the average spacing between the letters in the letter set.
In an optional mode, the executable instructions further cause the processor to perform the following operations: whenever the spacing between two adjacent letters is found to be greater than the preset spacing threshold, determine from the text information the positional relationship between the Chinese characters and their corresponding pinyin letters;
determine from the positional relationship whether the two adjacent letters correspond to the same Chinese character, and if not, insert a separator between the two adjacent letters.
In an optional mode, the preset adjustment rule includes:
for each letter group, determining whether the number of letters in the letter group is greater than a preset pinyin letter count;
if so, splitting the letter group into at least two letter groups, so that each letter group after splitting corresponds to one Chinese character in the character set.
In an optional mode, the executable instructions further cause the processor to perform the following operations: determine the first Chinese character in the character set that corresponds to the letter group, and look up the pinyin corresponding to the first Chinese character;
split the letter group into at least two letter groups according to the pinyin of the first Chinese character, so that each letter group after splitting corresponds to one Chinese character in the character set.
In an optional mode, the executable instructions further cause the processor to perform the following operations:
determine whether the pinyin corresponding to the first Chinese character is unique;
if not, match each pinyin of the first Chinese character against the letter group, and split the letter group into at least two letter groups according to the match result.
In an optional mode, the executable instructions further cause the processor to perform the following operations:
after each pinyin of the first Chinese character has been matched against the letter group, further:
match the pinyin of the Chinese character following the first Chinese character against the letter group.
In an optional mode, the executable instructions further cause the processor to perform the following operations:
determine from the text information the positional relationship between the Chinese characters and their corresponding pinyin letters, and based on that positional relationship determine the first Chinese character in the character set that corresponds to the letter group; and/or
determine the Chinese character in the character set that corresponds to the letter group preceding this letter group, and take the Chinese character following it in the character set as the first Chinese character corresponding to this letter group.
In an optional mode, the preset adjustment rule includes:
determining whether the Chinese characters in the character set include a tone-off Chinese character;
if so, inserting a blank letter group into the letter set, so that the blank letter group corresponds to the tone-off Chinese character.
In an optional mode, the executable instructions further cause the processor to perform the following operations:
determine from the text information the positional relationship between the Chinese characters and their corresponding pinyin letters;
based on that positional relationship, determine for each Chinese character whether a pinyin letter exists at the position vertically aligned with it;
if not, determine the Chinese character to be a tone-off Chinese character.
In an optional mode, the executable instructions further cause the processor to perform the following operations:
for each Chinese character in the character set, look up the pinyin corresponding to that character;
determine whether the letter set contains a letter group that matches that pinyin; if not, determine the Chinese character to be a tone-off Chinese character.
In an optional mode, the executable instructions further cause the processor to perform the following operations:
find the Chinese character adjacent to the tone-off Chinese character, and determine the adjacent letter group in the letter set that corresponds to that adjacent Chinese character;
insert the blank letter group at the position in the letter set adjacent to that adjacent letter group;
wherein the blank letter group is represented by a preset blank marker.
In an optional mode, the Chinese characters contained in the text information are Chinese characters located in the same row or the same column, and the pinyin letters corresponding to them are pinyin letters located in the same row or the same column.
In an optional mode, the text information is information typeset by a fixed-layout (format) typesetting mode;
the executable instructions further cause the processor to perform the following operation: by a reflowable (streaming) typesetting mode, typeset each letter group in the letter set in association with its corresponding Chinese character in the character set.
Fig. 3 shows a schematic structural diagram of an electronic device according to a further embodiment of the present invention. The specific embodiments of the present invention do not limit the specific implementation of the electronic device.
As shown in Fig. 3, the electronic device may include a processor 302, a communications interface 304, a memory 306 and a communication bus 308.
The processor 302, the communications interface 304 and the memory 306 communicate with one another over the communication bus 308. The communications interface 304 is used to communicate with network elements of other devices, such as clients or other servers. The processor 302 is used to execute a program 310, and may specifically perform the relevant steps of the text information typesetting method embodiments described above.
Specifically, the program 310 may include program code, and the program code includes computer operating instructions.
The processor 302 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention. The one or more processors included in the electronic device may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.
The memory 306 is used to store the program 310. The memory 306 may include high-speed RAM, and may also include non-volatile memory, for example at least one disk storage device.
The program 310 may specifically cause the processor 302 to perform the following operations: identify the Chinese characters contained in the text information and the pinyin letters corresponding to them, obtaining a Chinese character set corresponding to the Chinese characters and a letter set corresponding to the pinyin letters; divide the letters in the letter set into multiple letter groups according to the spacing between adjacent letters; adjust the division of the letter groups in the letter set according to a preset adjustment rule, so that each letter group in the letter set corresponds to one Chinese character in the character set; and typeset each letter group in the letter set in association with its corresponding Chinese character in the character set.
In an optional mode, the program 310 further causes the processor 302 to perform the following operations: determine whether the spacing between each pair of adjacent letters is greater than a preset spacing threshold;
if so, insert a separator between the two adjacent letters, and divide the letters in the letter set into multiple letter groups by the separators;
wherein the preset spacing threshold is determined from the average width of a letter, and/or from the average spacing between the letters in the letter set.
In an optional mode, the program 310 further causes the processor 302 to perform the following operations: whenever the spacing between two adjacent letters is found to be greater than the preset spacing threshold, determine from the text information the positional relationship between the Chinese characters and their corresponding pinyin letters;
determine from the positional relationship whether the two adjacent letters correspond to the same Chinese character, and if not, insert a separator between the two adjacent letters.
In an optional mode, the preset adjustment rule includes:
for each letter group, determining whether the number of letters in the letter group is greater than a preset pinyin letter count;
if so, splitting the letter group into at least two letter groups, so that each letter group after splitting corresponds to one Chinese character in the character set.
In an optional mode, the program 310 further causes the processor 302 to perform the following operations: determine the first Chinese character in the character set that corresponds to the letter group, and look up the pinyin corresponding to the first Chinese character;
split the letter group into at least two letter groups according to the pinyin of the first Chinese character, so that each letter group after splitting corresponds to one Chinese character in the character set.
In an optional mode, the program 310 further causes the processor 302 to perform the following operations:
determine whether the pinyin corresponding to the first Chinese character is unique;
if not, match each pinyin of the first Chinese character against the letter group, and split the letter group into at least two letter groups according to the match result.
In an optional mode, the program 310 further causes the processor 302 to perform the following operations: after each pinyin of the first Chinese character has been matched against the letter group, further:
match the pinyin of the Chinese character following the first Chinese character against the letter group.
In an optional mode, the program 310 further causes the processor 302 to perform the following operations: determine from the text information the positional relationship between the Chinese characters and their corresponding pinyin letters, and based on that positional relationship determine the first Chinese character in the character set that corresponds to the letter group; and/or
determine the Chinese character in the character set that corresponds to the letter group preceding this letter group, and take the Chinese character following it in the character set as the first Chinese character corresponding to this letter group.
In an optional mode, the preset adjustment rule includes:
determining whether the Chinese characters in the character set include a tone-off Chinese character;
if so, inserting a blank letter group into the letter set, so that the blank letter group corresponds to the tone-off Chinese character.
In an optional mode, the program 310 further causes the processor 302 to perform the following operations:
determine from the text information the positional relationship between the Chinese characters and their corresponding pinyin letters;
based on that positional relationship, determine for each Chinese character whether a pinyin letter exists at the position vertically aligned with it;
if not, determine the Chinese character to be a tone-off Chinese character.
In an optional mode, the program 310 further causes the processor 302 to perform the following operations: for each Chinese character in the character set, look up the pinyin corresponding to that character;
determine whether the letter set contains a letter group that matches that pinyin; if not, determine the Chinese character to be a tone-off Chinese character.
In an optional mode, the program 310 further causes the processor 302 to perform the following operations: find the Chinese character adjacent to the tone-off Chinese character, and determine the adjacent letter group in the letter set that corresponds to that adjacent Chinese character;
insert the blank letter group at the position in the letter set adjacent to that adjacent letter group;
wherein the blank letter group is represented by a preset blank marker.
In an optional mode, the Chinese characters contained in the text information are Chinese characters located in the same row or the same column, and the pinyin letters corresponding to them are pinyin letters located in the same row or the same column.
In an optional mode, the text information is information typeset by a fixed-layout (format) typesetting mode;
the program 310 further causes the processor 302 to perform the following operation: by a reflowable (streaming) typesetting mode, typeset each letter group in the letter set in association with its corresponding Chinese character in the character set.
In a kind of optional mode, customer attribute information includes at least one in following dimension:Remaining sum dimension, supplement with money
Frequency dimension, recharge amount dimension, consuming frequency dimension, spending amount dimension, read duration dimension, number of activities dimension, with
And the information content dimension pushed.
In a kind of optional mode, when the customer attribute information corresponding with user mark includes multiple dimensions,
Program 310 is further such that processor 302 performs following operation:The customer attribute information of each dimension is directed to respectively, it is determined that with
The corresponding user property subclassification of the customer attribute information of the dimension;With reference to corresponding to the customer attribute information of each dimension
User property subclassification determines the user property classification corresponding with customer attribute information.
In a kind of optional mode, multiple information categories to be pushed include:Book information classification, integration information class
Not, information category in kind, electronic ticket information category, welfare card information classification, supplement favor information classification, action message classification with money
And/or read authority information classification;Then program 310 is further such that processor 302 performs following operation:According to default user
Corresponding relation between each user property classification stored in information MAP table and each information category, it is determined that and user property
The information category that classification matches.
In an optional mode, program 310 further causes processor 302 to perform the following operations: analyze the data range to which the user attribute data contained in the user attribute information belongs and, according to a preset information content mapping table, determine the information content corresponding to that data range from among the multiple information contents contained in the information category that matches the user attribute classification. The information content mapping table stores the multiple information contents contained in each information category and the data range of user attribute data corresponding to each information content.
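The lookup described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the table layout, the half-open ranges, the category name `"book"`, and the function name `pick_content` are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical information content mapping table:
# category -> list of (low, high, content), with low <= value < high.
CONTENT_TABLE: Dict[str, List[Tuple[float, float, str]]] = {
    "book": [
        (0, 10, "introductory book list"),
        (10, 100, "advanced book list"),
    ],
}


def pick_content(category: str, value: float) -> Optional[str]:
    """Return the content whose data range contains `value`, or None."""
    for low, high, content in CONTENT_TABLE.get(category, []):
        if low <= value < high:
            return content
    return None
```

A value that falls outside every stored range, or a category that is missing from the table, yields `None`, so the caller can decide whether to push nothing or fall back to a default.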
In an optional mode, when the user attribute information corresponding to the user identifier includes multiple dimensions, program 310 further causes processor 302 to perform the following operations: determine, for each dimension, the data range to which the user attribute data contained in that dimension's user attribute information belongs; then determine the corresponding information content by combining these data ranges with the weight of the user attribute information of each dimension.
In an optional mode, program 310 further causes processor 302 to perform the following operations: when a push request message triggered through a preset push entry is received, obtain the user identifier contained in the push request message and determine the user attribute information corresponding to that user identifier; and/or, when the monitored user behavior of the current user satisfies a preset typesetting condition of the text information, obtain the user identifier of the current user and determine the user attribute information corresponding to that user identifier. The preset typesetting condition of the text information includes at least one of the following: the user reads an e-book chapter whose abandonment probability exceeds a preset threshold, the user reads a chapter for longer than a preset reading duration, and the user's reading reaches a preset chapter.
The algorithms and displays provided herein are not inherently related to any particular computer, virtual system, or other device. Various general-purpose systems may also be used with the teachings herein. From the description above, the structure required to construct such systems is apparent. Moreover, the present invention is not directed to any particular programming language. It should be understood that the content of the invention described herein may be implemented in a variety of programming languages, and that the above description of a specific language is given to disclose the best mode of the invention.
Numerous specific details are set forth in the specification provided here. It is to be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, to streamline the disclosure and aid understanding of one or more of the various inventive aspects, the features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. However, this method of disclosure should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single embodiment disclosed above. Therefore, the claims following the detailed description are hereby expressly incorporated into that detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will appreciate that the modules in the device of an embodiment may be adaptively changed and arranged in one or more devices different from that embodiment. The modules, units, or components of an embodiment may be combined into a single module, unit, or component, and may furthermore be divided into multiple sub-modules, sub-units, or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, equivalent, or similar purpose.
Furthermore, those skilled in the art will understand that although some embodiments described herein include certain features that are included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art may design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, and third does not indicate any order; these words may be interpreted as names.
The invention discloses: A1. A typesetting method for text information, comprising:
identifying the multiple Chinese characters contained in the text information and the multiple pinyin letters corresponding to the multiple Chinese characters, to obtain a Chinese character set corresponding to the multiple Chinese characters and a letter set corresponding to the multiple pinyin letters;
dividing the multiple letters in the letter set into multiple letter groups according to the spacing between adjacent letters;
adjusting the division of the letter groups in the letter set according to a preset adjustment rule, so that each letter group contained in the letter set corresponds to one Chinese character contained in the Chinese character set;
typesetting each letter group contained in the letter set in association with the Chinese character in the Chinese character set that corresponds to that letter group.
A2. The method according to A1, wherein the step of dividing the multiple letters in the letter set into multiple letter groups according to the spacing between adjacent letters specifically comprises:
judging whether the spacing between every two adjacent letters is greater than a preset spacing threshold;
if so, inserting a separator between the two adjacent letters, and dividing the multiple letters in the letter set into multiple letter groups by means of the separators;
wherein the preset spacing threshold is determined according to the average width of a letter, and/or the preset spacing threshold is determined according to the average spacing between the multiple letters in the letter set.
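The spacing-based division in A2 can be sketched as follows. This is only an illustration under stated assumptions: the `Letter` record with a left edge `x` and a `width`, the function name `split_by_spacing`, and the choice of half the mean letter width as the threshold (`factor = 0.5`) are hypothetical and are not specified by the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Letter:
    char: str
    x: float      # left edge of the letter's bounding box
    width: float  # width of the letter's bounding box


def split_by_spacing(letters: List[Letter], factor: float = 0.5) -> List[str]:
    """Group letters, starting a new group wherever the gap exceeds the threshold."""
    if not letters:
        return []
    # Threshold derived from the mean letter width (one of the options in A2).
    mean_width = sum(l.width for l in letters) / len(letters)
    threshold = factor * mean_width
    groups = [letters[0].char]
    for prev, cur in zip(letters, letters[1:]):
        gap = cur.x - (prev.x + prev.width)
        if gap > threshold:
            groups.append(cur.char)   # acts as the inserted separator
        else:
            groups[-1] += cur.char
    return groups
```

Starting a new list element plays the role of the separator here; an implementation that keeps a flat letter sequence could insert an explicit separator token instead.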
A3. The method according to A2, wherein the step of judging whether the spacing between every two adjacent letters is greater than the preset spacing threshold and, if so, inserting a separator between the two adjacent letters specifically comprises:
when the spacing between two adjacent letters is judged to be greater than the preset spacing threshold, determining, according to the text information, the positional relationship between the multiple Chinese characters and the multiple pinyin letters corresponding to the multiple Chinese characters;
judging, according to the positional relationship, whether the two adjacent letters correspond to the same Chinese character and, if not, inserting a separator between the two adjacent letters.
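One way to read the positional check in A3 is sketched below. It assumes, purely for illustration, that each letter and each Chinese character is reduced to a horizontal extent `(left, right)` and that a letter belongs to the character whose extent contains the letter's centre; the names `owner` and `needs_separator` are hypothetical.

```python
from typing import List, Optional, Tuple

Box = Tuple[float, float]  # (left, right) horizontal extent


def owner(letter: Box, hanzi: List[Box]) -> Optional[int]:
    """Index of the Chinese character whose extent contains the letter's centre."""
    centre = (letter[0] + letter[1]) / 2
    for i, (left, right) in enumerate(hanzi):
        if left <= centre < right:
            return i
    return None


def needs_separator(a: Box, b: Box, hanzi: List[Box]) -> bool:
    """True unless both letters are known to sit over the same character."""
    oa, ob = owner(a, hanzi), owner(b, hanzi)
    return not (oa is not None and oa == ob)
```

When a letter cannot be assigned to any character, this sketch keeps the separator suggested by the spacing test; that fallback is a design choice of the example, not something the text prescribes.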
A4. The method according to any one of A1-A3, wherein the preset adjustment rule comprises:
for each letter group, judging whether the number of letters contained in the letter group is greater than a preset pinyin letter count;
if so, splitting the letter group into at least two letter groups, so that each letter group obtained by the split corresponds to one Chinese character contained in the Chinese character set.
A5. The method according to A4, wherein the step of splitting the letter group into at least two letter groups, so that each letter group obtained by the split corresponds to one Chinese character contained in the Chinese character set, specifically comprises:
determining the first Chinese character in the Chinese character set that corresponds to the letter group, and querying the pinyin corresponding to the first Chinese character;
splitting the letter group into at least two letter groups according to the pinyin corresponding to the first Chinese character, so that each letter group obtained by the split corresponds to one Chinese character contained in the Chinese character set.
A6. The method according to A5, wherein the step of splitting the letter group into at least two letter groups according to the pinyin corresponding to the first Chinese character specifically comprises:
judging whether the pinyin corresponding to the first Chinese character is unique;
if not, matching each pinyin corresponding to the first Chinese character against the letter group, and splitting the letter group into at least two letter groups according to the matching result.
A7. The method according to A6, wherein, after the step of matching each pinyin corresponding to the first Chinese character against the letter group, the method further comprises:
matching the pinyin of the Chinese character that follows the first Chinese character against the letter group.
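The splitting rule of A4 to A7 can be sketched as follows. This is a simplified illustration, assuming a tiny hypothetical toneless pinyin dictionary `PINYIN` and a hypothetical helper `split_group`; a real dictionary would be far larger and may carry tone marks.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical dictionary: Chinese character -> candidate toneless readings.
PINYIN: Dict[str, List[str]] = {
    "长": ["chang", "zhang"],
    "大": ["da", "dai"],
}


def split_group(group: str, first_char: str,
                next_char: str) -> Optional[Tuple[str, str]]:
    """Split an over-long letter group into (first reading, remainder).

    Each reading of the first character is tried as a prefix (A6); a
    candidate is accepted only if the remainder begins with a reading
    of the following character (A7).
    """
    for reading in PINYIN.get(first_char, []):
        if not group.startswith(reading):
            continue
        rest = group[len(reading):]
        if any(rest.startswith(r) for r in PINYIN.get(next_char, [])):
            return reading, rest
    return None
```

For example, `split_group("zhangda", "长", "大")` rejects the reading `chang` because it is not a prefix, accepts `zhang`, and returns the two parts. The remainder may itself still exceed the preset letter count, in which case the same procedure can be applied again with the next pair of characters.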
A8. The method according to any one of A5-A7, wherein the step of determining the first Chinese character in the Chinese character set that corresponds to the letter group specifically comprises:
determining, according to the text information, the positional relationship between the multiple Chinese characters and the multiple pinyin letters corresponding to the multiple Chinese characters, and determining, according to the positional relationship, the first Chinese character in the Chinese character set that corresponds to the letter group; and/or
determining the Chinese character in the Chinese character set that corresponds to the letter group preceding this letter group, and taking the Chinese character that follows it in the Chinese character set as the first Chinese character corresponding to this letter group.
A9. The method according to any one of A1-A8, wherein the preset adjustment rule comprises:
judging whether a silent Chinese character (a Chinese character with no corresponding pinyin) exists among the multiple Chinese characters contained in the Chinese character set;
if so, inserting a blank letter group into the letter set, so that the blank letter group corresponds to the silent Chinese character.
A10. The method according to A9, wherein the step of judging whether a silent Chinese character exists among the multiple Chinese characters contained in the Chinese character set specifically comprises:
determining, according to the text information, the positional relationship between the multiple Chinese characters and the multiple pinyin letters corresponding to the multiple Chinese characters;
judging, according to the positional relationship and for each Chinese character, whether a pinyin letter exists at the position that vertically intersects that Chinese character;
if not, determining that Chinese character to be a silent Chinese character.
A11. The method according to A9 or A10, wherein the step of judging whether a silent Chinese character exists among the multiple Chinese characters contained in the Chinese character set specifically comprises:
for each Chinese character contained in the Chinese character set, querying the pinyin corresponding to that Chinese character;
judging whether the letter set contains a letter group that matches the pinyin and, if not, determining that Chinese character to be a silent Chinese character.
A12. The method according to any one of A9-A11, wherein the step of inserting a blank letter group into the letter set, so that the blank letter group corresponds to the silent Chinese character, specifically comprises:
searching for a Chinese character adjacent to the silent Chinese character, and determining the adjacent letter group in the letter set that corresponds to the adjacent Chinese character;
inserting the blank letter group into the letter set at a position adjacent to the adjacent letter group;
wherein the blank letter group is represented by a preset blank identifier.
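The blank-group rule of A9 to A12 can be sketched as follows. This is a simplified sequential variant of the dictionary-matching test in A11, written only for illustration: the empty string used as the blank identifier, the `pinyin` dictionary argument, and the function name `align_with_blanks` are all assumptions.

```python
from typing import Dict, List

BLANK = ""  # hypothetical preset blank identifier


def align_with_blanks(chars: List[str], groups: List[str],
                      pinyin: Dict[str, List[str]]) -> List[str]:
    """Return one letter group per character, inserting BLANK for
    characters whose readings do not match the next unused group."""
    aligned: List[str] = []
    i = 0  # index of the next unused letter group
    for ch in chars:
        if i < len(groups) and groups[i] in pinyin.get(ch, []):
            aligned.append(groups[i])
            i += 1
        else:
            # No matching group: treat the character as having no pinyin,
            # and place the blank right after the previous character's group.
            aligned.append(BLANK)
    return aligned
```

Because the blank is appended immediately after the group of the preceding character, it lands adjacent to the neighbouring letter group, as A12 requires. Letter groups left over at the end are ignored in this sketch.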
A13. The method according to any one of A1-A12, wherein the multiple Chinese characters contained in the text information are multiple Chinese characters located in the same row or the same column, and the multiple pinyin letters corresponding to the multiple Chinese characters are multiple pinyin letters located in the same row or the same column.
A14. The method according to any one of A1-A13, wherein the text information is information typeset in a fixed-layout typesetting mode;
the step of typesetting each letter group contained in the letter set in association with the Chinese character in the Chinese character set that corresponds to that letter group specifically comprises:
typesetting, in a streaming (reflowable) typesetting mode, each letter group contained in the letter set in association with the Chinese character in the Chinese character set that corresponds to that letter group.
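As an illustration of associating each letter group with its Chinese character in a reflowable output, the sketch below emits HTML ruby markup. The choice of HTML `<ruby>`/`<rt>` as the target format and the function name `to_ruby` are assumptions for this example; the text does not name an output format.

```python
from html import escape
from typing import List


def to_ruby(chars: List[str], groups: List[str]) -> str:
    """Pair each Chinese character with its letter group as ruby annotation."""
    if len(chars) != len(groups):
        raise ValueError("each character needs exactly one letter group")
    return "".join(
        f"<ruby>{escape(c)}<rt>{escape(g)}</rt></ruby>"
        for c, g in zip(chars, groups)
    )
```

Since every character carries its own annotation, a renderer can reflow the line at any character boundary without separating a character from its pinyin.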
B15. An electronic device, comprising: a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with one another via the communication bus;
the memory is configured to store at least one executable instruction, and the executable instruction causes the processor to perform the following operations:
identifying the multiple Chinese characters contained in the text information and the multiple pinyin letters corresponding to the multiple Chinese characters, to obtain a Chinese character set corresponding to the multiple Chinese characters and a letter set corresponding to the multiple pinyin letters;
dividing the multiple letters in the letter set into multiple letter groups according to the spacing between adjacent letters;
adjusting the division of the letter groups in the letter set according to a preset adjustment rule, so that each letter group contained in the letter set corresponds to one Chinese character contained in the Chinese character set;
typesetting each letter group contained in the letter set in association with the Chinese character in the Chinese character set that corresponds to that letter group.
B16. The electronic device according to B15, wherein the executable instruction further causes the processor to perform the following operations:
judging whether the spacing between every two adjacent letters is greater than a preset spacing threshold;
if so, inserting a separator between the two adjacent letters, and dividing the multiple letters in the letter set into multiple letter groups by means of the separators;
wherein the preset spacing threshold is determined according to the average width of a letter, and/or the preset spacing threshold is determined according to the average spacing between the multiple letters in the letter set.
B17. The electronic device according to B16, wherein the executable instruction further causes the processor to perform the following operations:
when the spacing between two adjacent letters is judged to be greater than the preset spacing threshold, determining, according to the text information, the positional relationship between the multiple Chinese characters and the multiple pinyin letters corresponding to the multiple Chinese characters;
judging, according to the positional relationship, whether the two adjacent letters correspond to the same Chinese character and, if not, inserting a separator between the two adjacent letters.
B18. The electronic device according to any one of B15-B17, wherein the preset adjustment rule comprises:
for each letter group, judging whether the number of letters contained in the letter group is greater than a preset pinyin letter count;
if so, splitting the letter group into at least two letter groups, so that each letter group obtained by the split corresponds to one Chinese character contained in the Chinese character set.
B19. The electronic device according to B18, wherein the executable instruction further causes the processor to perform the following operations:
determining the first Chinese character in the Chinese character set that corresponds to the letter group, and querying the pinyin corresponding to the first Chinese character;
splitting the letter group into at least two letter groups according to the pinyin corresponding to the first Chinese character, so that each letter group obtained by the split corresponds to one Chinese character contained in the Chinese character set.
B20. The electronic device according to B19, wherein the executable instruction further causes the processor to perform the following operations:
judging whether the pinyin corresponding to the first Chinese character is unique;
if not, matching each pinyin corresponding to the first Chinese character against the letter group, and splitting the letter group into at least two letter groups according to the matching result.
B21. The electronic device according to B20, wherein the executable instruction further causes the processor to perform the following operation: after matching each pinyin corresponding to the first Chinese character against the letter group,
matching the pinyin of the Chinese character that follows the first Chinese character against the letter group.
B22. The electronic device according to any one of B19-B21, wherein the executable instruction further causes the processor to perform the following operations:
determining, according to the text information, the positional relationship between the multiple Chinese characters and the multiple pinyin letters corresponding to the multiple Chinese characters, and determining, according to the positional relationship, the first Chinese character in the Chinese character set that corresponds to the letter group; and/or
determining the Chinese character in the Chinese character set that corresponds to the letter group preceding this letter group, and taking the Chinese character that follows it in the Chinese character set as the first Chinese character corresponding to this letter group.
B23. The electronic device according to any one of B15-B22, wherein the preset adjustment rule comprises:
judging whether a silent Chinese character (a Chinese character with no corresponding pinyin) exists among the multiple Chinese characters contained in the Chinese character set;
if so, inserting a blank letter group into the letter set, so that the blank letter group corresponds to the silent Chinese character.
B24. The electronic device according to B23, wherein the executable instruction further causes the processor to perform the following operations:
determining, according to the text information, the positional relationship between the multiple Chinese characters and the multiple pinyin letters corresponding to the multiple Chinese characters;
judging, according to the positional relationship and for each Chinese character, whether a pinyin letter exists at the position that vertically intersects that Chinese character;
if not, determining that Chinese character to be a silent Chinese character.
B25. The electronic device according to B23 or B24, wherein the executable instruction further causes the processor to perform the following operations:
for each Chinese character contained in the Chinese character set, querying the pinyin corresponding to that Chinese character;
judging whether the letter set contains a letter group that matches the pinyin and, if not, determining that Chinese character to be a silent Chinese character.
B26. The electronic device according to any one of B23-B25, wherein the executable instruction further causes the processor to perform the following operations:
searching for a Chinese character adjacent to the silent Chinese character, and determining the adjacent letter group in the letter set that corresponds to the adjacent Chinese character;
inserting the blank letter group into the letter set at a position adjacent to the adjacent letter group;
wherein the blank letter group is represented by a preset blank identifier.
B27. The electronic device according to any one of B15-B26, wherein the multiple Chinese characters contained in the text information are multiple Chinese characters located in the same row or the same column, and the multiple pinyin letters corresponding to the multiple Chinese characters are multiple pinyin letters located in the same row or the same column.
B28. The electronic device according to any one of B15-B27, wherein the text information is information typeset in a fixed-layout typesetting mode;
the executable instruction further causes the processor to perform the following operation:
typesetting, in a streaming (reflowable) typesetting mode, each letter group contained in the letter set in association with the Chinese character in the Chinese character set that corresponds to that letter group.
C29. A computer-readable storage medium, wherein at least one executable instruction is stored in the storage medium, and the executable instruction causes a processor to perform the following operations:
identifying the multiple Chinese characters contained in the text information and the multiple pinyin letters corresponding to the multiple Chinese characters, to obtain a Chinese character set corresponding to the multiple Chinese characters and a letter set corresponding to the multiple pinyin letters;
dividing the multiple letters in the letter set into multiple letter groups according to the spacing between adjacent letters;
adjusting the division of the letter groups in the letter set according to a preset adjustment rule, so that each letter group contained in the letter set corresponds to one Chinese character contained in the Chinese character set;
typesetting each letter group contained in the letter set in association with the Chinese character in the Chinese character set that corresponds to that letter group.
C30. The computer-readable storage medium according to C29, wherein the executable instruction further causes the processor to perform the following operations:
judging whether the spacing between every two adjacent letters is greater than a preset spacing threshold;
if so, inserting a separator between the two adjacent letters, and dividing the multiple letters in the letter set into multiple letter groups by means of the separators;
wherein the preset spacing threshold is determined according to the average width of a letter, and/or the preset spacing threshold is determined according to the average spacing between the multiple letters in the letter set.
C31. The computer-readable storage medium according to C30, wherein the executable instruction further causes the processor to perform the following operations:
when the spacing between two adjacent letters is judged to be greater than the preset spacing threshold, determining, according to the text information, the positional relationship between the multiple Chinese characters and the multiple pinyin letters corresponding to the multiple Chinese characters;
judging, according to the positional relationship, whether the two adjacent letters correspond to the same Chinese character and, if not, inserting a separator between the two adjacent letters.
C32. The computer-readable storage medium according to any one of C29-C31, wherein the preset adjustment rule comprises:
for each letter group, judging whether the number of letters contained in the letter group is greater than a preset pinyin letter count;
if so, splitting the letter group into at least two letter groups, so that each letter group obtained by the split corresponds to one Chinese character contained in the Chinese character set.
C33. The computer-readable storage medium according to C32, wherein the executable instruction further causes the processor to perform the following operations:
determining the first Chinese character in the Chinese character set that corresponds to the letter group, and querying the pinyin corresponding to the first Chinese character;
splitting the letter group into at least two letter groups according to the pinyin corresponding to the first Chinese character, so that each letter group obtained by the split corresponds to one Chinese character contained in the Chinese character set.
C34. The computer-readable storage medium according to C33, wherein the executable instruction further causes the processor to perform the following operations:
judging whether the pinyin corresponding to the first Chinese character is unique;
if not, matching each pinyin corresponding to the first Chinese character against the letter group, and splitting the letter group into at least two letter groups according to the matching result.
C35. The computer-readable storage medium according to C34, wherein the executable instruction further causes the processor to perform the following operation: after matching each pinyin corresponding to the first Chinese character against the letter group,
matching the pinyin of the Chinese character that follows the first Chinese character against the letter group.
C36. The computer-readable storage medium according to any one of C33-C35, wherein the executable instruction further causes the processor to perform the following operations:
determining, according to the text information, the positional relationship between the multiple Chinese characters and the multiple pinyin letters corresponding to the multiple Chinese characters, and determining, according to the positional relationship, the first Chinese character in the Chinese character set that corresponds to the letter group; and/or
determining the Chinese character in the Chinese character set that corresponds to the letter group preceding this letter group, and taking the Chinese character that follows it in the Chinese character set as the first Chinese character corresponding to this letter group.
C37. The computer-readable storage medium according to any one of C29-C36, wherein the preset adjustment rule comprises:
judging whether a silent Chinese character (a Chinese character with no corresponding pinyin) exists among the multiple Chinese characters contained in the Chinese character set;
if so, inserting a blank letter group into the letter set, so that the blank letter group corresponds to the silent Chinese character.
C38. The computer-readable storage medium according to C37, wherein the executable instruction further causes the processor to perform the following operations:
determining, according to the text information, the positional relationship between the multiple Chinese characters and the multiple pinyin letters corresponding to the multiple Chinese characters;
judging, according to the positional relationship and for each Chinese character, whether a pinyin letter exists at the position that vertically intersects that Chinese character;
if not, determining that Chinese character to be a silent Chinese character.
C39. The computer-readable storage medium according to C37 or C38, wherein the executable instruction further causes the processor to perform the following operations:
for each Chinese character contained in the Chinese character set, querying the pinyin corresponding to that Chinese character;
judging whether the letter set contains a letter group that matches the pinyin and, if not, determining that Chinese character to be a silent Chinese character.
C40. The computer-readable storage medium according to any one of C37-C39, wherein the executable instruction further causes the processor to perform the following operations:
searching for a Chinese character adjacent to the silent Chinese character, and determining the adjacent letter group in the letter set that corresponds to the adjacent Chinese character;
inserting the blank letter group into the letter set at a position adjacent to the adjacent letter group;
wherein the blank letter group is represented by a preset blank identifier.
C41. according to any described computer-readable storage mediums of C39-C40, wherein, what is included in the text information is multiple
Chinese character is multiple Chinese characters being located at a line or same row, and the corresponding multiple phonetic alphabet of the multiple Chinese character are multiple positions
In the phonetic alphabet of same a line or same row.
C42. The computer-readable storage medium according to any one of C29-C41, wherein the text information is information typeset in a fixed-layout (format) typesetting mode;
The executable instructions further cause the processor to perform the following operations:
By means of a reflowable (streaming) typesetting mode, typesetting each letter group included in the set of letters in association with the Chinese character in the character set that corresponds to that letter group.
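One way to picture the associated typesetting in a reflowable output is HTML ruby markup. The patent text does not name an output format, so the use of `<ruby>`/`<rt>`, the function name `to_ruby`, and the `BLANK` identifier below are all assumptions made for illustration.

```python
from html import escape
from typing import List

BLANK = "<blank>"  # hypothetical identifier for a blank letter group


def to_ruby(characters: List[str], groups: List[str]) -> str:
    """Pair each character with its letter group and emit HTML ruby markup."""
    parts = []
    for ch, group in zip(characters, groups):
        if group == BLANK:
            # Tone-off character: emit it without an annotation.
            parts.append(escape(ch))
        else:
            parts.append(f"<ruby>{escape(ch)}<rt>{escape(group)}</rt></ruby>")
    return "".join(parts)
```

Because each annotation travels with its base character, the pairing survives when a renderer reflows the line, which is the property the reflowable mode is meant to provide.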
Claims (10)
1. A typesetting method for text information, comprising:
Recognizing the multiple Chinese characters included in the text information and the multiple pinyin letters corresponding to the multiple Chinese characters, respectively, to obtain a character set corresponding to the multiple Chinese characters and a set of letters corresponding to the multiple pinyin letters;
Dividing the multiple letters in the set of letters into multiple letter groups according to the spacing between adjacent letters;
Adjusting the division of the letter groups in the set of letters according to a preset adjustment rule, so that each letter group included in the set of letters corresponds to one Chinese character included in the character set;
Typesetting each letter group included in the set of letters in association with the Chinese character in the character set that corresponds to that letter group.
2. The method according to claim 1, wherein the step of dividing the multiple letters in the set of letters into multiple letter groups according to the spacing between adjacent letters specifically comprises:
Determining whether the spacing between every two adjacent letters is greater than a preset spacing threshold;
If so, inserting a separator between the two adjacent letters, and dividing the multiple letters in the set of letters into multiple letter groups by means of the separators;
Wherein the preset spacing threshold is determined according to the average width of a letter, and/or the preset spacing threshold is determined according to the average spacing between the multiple letters in the set of letters.
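The spacing-based division can be sketched as below. The `Letter` record, the `factor` of 0.5 applied to the average letter width, and the `|` separator are assumptions chosen for illustration; the claim only states that the threshold is derived from the average letter width and/or the average spacing.

```python
from dataclasses import dataclass
from typing import List

SEPARATOR = "|"  # assumed separator; must not occur among the letters


@dataclass
class Letter:
    char: str
    x: float      # left edge of the letter's bounding box
    width: float  # width of the letter's bounding box


def split_by_spacing(letters: List[Letter], factor: float = 0.5) -> List[str]:
    """Divide letters (ordered left to right) into groups at wide gaps."""
    if not letters:
        return []
    avg_width = sum(l.width for l in letters) / len(letters)
    threshold = factor * avg_width  # threshold derived from average letter width
    parts = [letters[0].char]
    for prev, cur in zip(letters, letters[1:]):
        gap = cur.x - (prev.x + prev.width)
        if gap > threshold:
            parts.append(SEPARATOR)  # wide gap: insert a separator
        parts.append(cur.char)
    return "".join(parts).split(SEPARATOR)
```

With five letters of width 5 where only the gap before `h` is 10 units, the threshold is 2.5 and the letters divide into `ni` and `hao`.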
3. The method according to claim 2, wherein the step of determining whether the spacing between every two adjacent letters is greater than the preset spacing threshold and, if so, inserting a separator between the two adjacent letters specifically comprises:
When the spacing between two adjacent letters is determined to be greater than the preset spacing threshold, determining, according to the text information, the positional relationship between the multiple Chinese characters and the multiple pinyin letters corresponding to the multiple Chinese characters;
Determining, according to the positional relationship, whether the two adjacent letters correspond to the same Chinese character, and if not, inserting a separator between the two adjacent letters.
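The positional check can be illustrated with a nearest-center rule. How the positional relationship is actually evaluated is not specified here, so assigning each letter to the Chinese character with the closest horizontal center is an assumption, as are the function names.

```python
from typing import List


def owner(letter_center: float, hanzi_centers: List[float]) -> int:
    """Index of the Chinese character whose center is closest to the letter."""
    return min(
        range(len(hanzi_centers)),
        key=lambda i: abs(hanzi_centers[i] - letter_center),
    )


def should_separate(left_center: float, right_center: float,
                    hanzi_centers: List[float]) -> bool:
    """True when two adjacent letters belong to different Chinese characters."""
    return owner(left_center, hanzi_centers) != owner(right_center, hanzi_centers)
```

A wide gap between two letters that both sit above the same character would therefore not produce a separator, which guards against splitting a loosely spaced syllable.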
4. The method according to any one of claims 1-3, wherein the preset adjustment rule comprises:
For each letter group, determining whether the number of letters included in the letter group is greater than a preset pinyin quantity;
If so, splitting the letter group into at least two letter groups, so that each letter group after splitting corresponds to one Chinese character included in the character set.
5. The method according to claim 4, wherein the step of splitting the letter group into at least two letter groups, so that each letter group after splitting corresponds to one Chinese character included in the character set, specifically comprises:
Determining the first Chinese character in the character set that corresponds to the letter group, and querying the pinyin corresponding to the first Chinese character;
Splitting the letter group into at least two letter groups according to the pinyin corresponding to the first Chinese character, so that each letter group after splitting corresponds to one Chinese character included in the character set.
6. The method according to claim 5, wherein the step of splitting the letter group into at least two letter groups according to the pinyin corresponding to the first Chinese character specifically comprises:
Determining whether the pinyin corresponding to the first Chinese character is unique;
If not, matching each pinyin corresponding to the first Chinese character against the letter group, and splitting the letter group into at least two letter groups according to the matching result.
7. The method according to claim 6, wherein, after the step of matching each pinyin corresponding to the first Chinese character against the letter group, the method further comprises:
Matching the pinyin of the Chinese character that follows the first Chinese character against the letter group.
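The splitting rule, including the handling of characters with more than one reading, can be sketched as a small backtracking match. `MAX_PINYIN_LEN = 6` is an assumed value for the preset pinyin quantity (six letters is the length of syllables such as "zhuang"), and `PINYIN_DICT`, `needs_split`, and `split_group` are hypothetical names.

```python
from typing import Dict, List, Optional, Set

MAX_PINYIN_LEN = 6  # assumed preset pinyin quantity

# Hypothetical toy dictionary; some characters have several readings.
PINYIN_DICT: Dict[str, Set[str]] = {
    "长": {"chang", "zhang"},
    "大": {"da", "dai"},
    "好": {"hao"},
}


def needs_split(group: str) -> bool:
    """A group longer than the preset quantity must span several characters."""
    return len(group) > MAX_PINYIN_LEN


def split_group(group: str, chars: List[str]) -> Optional[List[str]]:
    """Split group into syllables, one per character, starting at chars[0].

    Tries every reading of the first character, then continues with the
    next character; returns None when no consistent split exists.
    """
    if not group:
        return []          # all letters consumed
    if not chars:
        return None        # letters left over but no characters remain
    for reading in sorted(PINYIN_DICT.get(chars[0], ())):
        if group.startswith(reading):
            rest = split_group(group[len(reading):], chars[1:])
            if rest is not None:
                return [reading] + rest
    return None
```

For the over-long group `zhangda` starting at 长, the reading `chang` fails to match, `zhang` matches, and the remainder `da` is matched against the next character 大, giving the two groups `zhang` and `da`.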
8. The method according to any one of claims 5-7, wherein the step of determining the first Chinese character in the character set that corresponds to the letter group specifically comprises:
Determining, according to the text information, the positional relationship between the multiple Chinese characters and the multiple pinyin letters corresponding to the multiple Chinese characters; and determining, according to the positional relationship, the first Chinese character in the character set that corresponds to the letter group; and/or
Determining the Chinese character in the character set that corresponds to the letter group preceding the letter group, and determining the Chinese character that follows that Chinese character in the character set to be the first Chinese character corresponding to the letter group.
9. An electronic device, comprising: a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with one another via the communication bus;
The memory is configured to store at least one executable instruction, and the executable instruction causes the processor to perform the following operations:
Recognizing the multiple Chinese characters included in the text information and the multiple pinyin letters corresponding to the multiple Chinese characters, respectively, to obtain a character set corresponding to the multiple Chinese characters and a set of letters corresponding to the multiple pinyin letters;
Dividing the multiple letters in the set of letters into multiple letter groups according to the spacing between adjacent letters;
Adjusting the division of the letter groups in the set of letters according to a preset adjustment rule, so that each letter group included in the set of letters corresponds to one Chinese character included in the character set;
Typesetting each letter group included in the set of letters in association with the Chinese character in the character set that corresponds to that letter group.
10. A computer-readable storage medium storing at least one executable instruction, wherein the executable instruction causes a processor to perform the following operations:
Recognizing the multiple Chinese characters included in the text information and the multiple pinyin letters corresponding to the multiple Chinese characters, respectively, to obtain a character set corresponding to the multiple Chinese characters and a set of letters corresponding to the multiple pinyin letters;
Dividing the multiple letters in the set of letters into multiple letter groups according to the spacing between adjacent letters;
Adjusting the division of the letter groups in the set of letters according to a preset adjustment rule, so that each letter group included in the set of letters corresponds to one Chinese character included in the character set;
Typesetting each letter group included in the set of letters in association with the Chinese character in the character set that corresponds to that letter group.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711182001.2A CN107783956B (en) | 2017-11-23 | 2017-11-23 | Composition method, electronic equipment and the computer storage medium of text information |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711182001.2A CN107783956B (en) | 2017-11-23 | 2017-11-23 | Composition method, electronic equipment and the computer storage medium of text information |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107783956A (en) | 2018-03-09 |
CN107783956B (en) | 2019-03-15 |
Family
ID=61430627
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711182001.2A Active CN107783956B (en) | 2017-11-23 | 2017-11-23 | Composition method, electronic equipment and the computer storage medium of text information |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107783956B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101876967A (en) * | 2010-03-25 | 2010-11-03 | 深圳市万兴软件有限公司 | Method for generating PDF text paragraphs |
CN103136186A (en) * | 2011-12-05 | 2013-06-05 | 北大方正集团有限公司 | Method and device of pinyin type setting |
CN103150300A (en) * | 2011-12-06 | 2013-06-12 | 北大方正集团有限公司 | Pinyin typesetting method and device |
CN106598934A (en) * | 2016-12-14 | 2017-04-26 | 掌阅科技股份有限公司 | Electronic book data display method and device, and terminal equipment |
CN106940596A (en) * | 2016-01-04 | 2017-07-11 | 北京峰盛博远科技股份有限公司 | A kind of recognition methods of multiple characters of handwriting input and system |
CN107025215A (en) * | 2017-02-13 | 2017-08-08 | 阿里巴巴集团控股有限公司 | A kind of picture and text composition method and device |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109325215A (en) * | 2018-12-04 | 2019-02-12 | 万兴科技股份有限公司 | The output method and device of Word text |
CN109325215B (en) * | 2018-12-04 | 2023-02-10 | 万兴科技股份有限公司 | Word text output method and device |
CN113052179A (en) * | 2021-03-09 | 2021-06-29 | 安徽淘云科技股份有限公司 | Polyphone processing method and device, electronic equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN107783956B (en) | 2019-03-15 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||