CN107783956A - Typesetting method for text information, electronic device, and computer storage medium - Google Patents

Info

Publication number
CN107783956A
Authority
CN
China
Prior art keywords
letters
chinese character
alphabetical
character
group
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711182001.2A
Other languages
Chinese (zh)
Other versions
CN107783956B (en)
Inventor
张恒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhangyue Technology Co Ltd
Original Assignee
Zhangyue Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Zhangyue Technology Co Ltd
Priority to CN201711182001.2A
Publication of CN107783956A
Application granted
Publication of CN107783956B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/189 Automatic justification
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities

Abstract

The invention discloses a typesetting method for text information, an electronic device, and a computer storage medium. The method includes: separately recognizing the multiple Chinese characters contained in the text information and the multiple pinyin letters corresponding to those characters, to obtain a Chinese-character set corresponding to the characters and a letter set corresponding to the pinyin letters; dividing the letters in the letter set into multiple letter groups according to the spacing between adjacent letters; adjusting the division of the letter groups according to a preset adjustment rule, so that each letter group in the letter set corresponds to one Chinese character in the Chinese-character set; and typesetting each letter group in the letter set in association with its corresponding Chinese character in the Chinese-character set. The invention performs information recognition and format conversion during typesetting, and yields a one-to-one correspondence between each Chinese character and its pinyin after typesetting.

Description

Typesetting method for text information, electronic device, and computer storage medium
Technical field
The present invention relates to the computer field, and in particular to a typesetting method for text information, an electronic device, and a computer storage medium.
Background
With the growing popularity of e-books, more and more original book material is being converted into e-book documents for the convenience of readers. During conversion, the text information contained in the original material must be recognized and then re-typeset according to the recognition result. For example, a fixed-layout file (such as a PDF file) has a fixed page layout: it is always displayed in its original layout during reading, is not automatically re-typeset to the page width after zooming, is hard to modify, is relatively secure, and is not tied to an operating-system platform. Many original book materials are therefore fixed-layout files. Accordingly, when a user needs to edit a fixed-layout file, the file must first be converted into a reflowable file, for example by converting a PDF file into a Word file.
However, in the course of realizing the present invention, the inventor found at least the following problem in the prior art: during format conversion, the rows or columns of the text information often become disordered, which makes recognition difficult. In particular, when the text information contains both Chinese characters and their corresponding pinyin, misalignment between the pinyin and the characters often causes recognition errors, and the user must proofread manually before typesetting can proceed. Existing typesetting approaches therefore cannot accurately recognize text information that contains both Chinese characters and pinyin.
Summary of the invention
In view of the above problems, the present invention is proposed to provide a typesetting method for text information, an electronic device, and a computer storage medium that overcome the above problems or at least partially solve them.
According to one aspect of the invention, a typesetting method for text information is provided, including: separately recognizing the multiple Chinese characters contained in the text information and the multiple pinyin letters corresponding to those characters, to obtain a Chinese-character set corresponding to the characters and a letter set corresponding to the pinyin letters; dividing the letters in the letter set into multiple letter groups according to the spacing between adjacent letters; adjusting the division of the letter groups in the letter set according to a preset adjustment rule, so that each letter group in the letter set corresponds to one Chinese character in the Chinese-character set; and typesetting each letter group in the letter set in association with its corresponding Chinese character in the Chinese-character set.
According to another aspect of the invention, an electronic device is provided, including a processor, a memory, a communication interface, and a communication bus, where the processor, the memory, and the communication interface communicate with one another over the communication bus. The memory stores at least one executable instruction, and the executable instruction causes the processor to perform the following operations: separately recognizing the multiple Chinese characters contained in the text information and the multiple pinyin letters corresponding to those characters, to obtain a Chinese-character set corresponding to the characters and a letter set corresponding to the pinyin letters; dividing the letters in the letter set into multiple letter groups according to the spacing between adjacent letters; adjusting the division of the letter groups in the letter set according to a preset adjustment rule, so that each letter group in the letter set corresponds to one Chinese character in the Chinese-character set; and typesetting each letter group in the letter set in association with its corresponding Chinese character in the Chinese-character set.
According to a further aspect of the invention, a computer storage medium is provided. The storage medium stores at least one executable instruction, and the executable instruction causes a processor to perform the following operations: separately recognizing the multiple Chinese characters contained in the text information and the multiple pinyin letters corresponding to those characters, to obtain a Chinese-character set corresponding to the characters and a letter set corresponding to the pinyin letters; dividing the letters in the letter set into multiple letter groups according to the spacing between adjacent letters; adjusting the division of the letter groups in the letter set according to a preset adjustment rule, so that each letter group in the letter set corresponds to one Chinese character in the Chinese-character set; and typesetting each letter group in the letter set in association with its corresponding Chinese character in the Chinese-character set.
According to the typesetting method, electronic device, and computer storage medium provided by the invention, the multiple Chinese characters contained in the text information and the corresponding pinyin letters are recognized separately, yielding a Chinese-character set and a letter set, and the letters in the letter set are divided into multiple letter groups according to the spacing between adjacent letters. The division of the letter groups is then adjusted according to a preset adjustment rule, so that each letter group in the letter set corresponds to one Chinese character in the Chinese-character set. Finally, each letter group is typeset in association with its corresponding Chinese character, producing a reflowable file. The solution of the invention performs information recognition and format conversion during typesetting without disordering rows and columns, yields a one-to-one correspondence between each Chinese character and its pinyin after typesetting, removes the need for manual proofreading, and accurately recognizes text information that contains both Chinese characters and pinyin.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the specification, and so that the above and other objects, features, and advantages of the invention are more apparent, specific embodiments of the invention are set out below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art from reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered limiting of the invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 shows a flow chart of the typesetting method for text information provided by one embodiment of the invention;
Fig. 2 shows a flow chart of the typesetting method for text information provided by another embodiment of the invention;
Fig. 3 shows a schematic structural diagram of an electronic device provided according to a further embodiment of the invention.
Detailed description
Exemplary embodiments of the disclosure are described in more detail below with reference to the drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be realized in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and so that its scope can be conveyed completely to those skilled in the art.
Fig. 1 shows a flow chart of the typesetting method for text information provided by one embodiment of the invention. As shown in Fig. 1, the method includes the following steps:
Step S110: Separately recognize the multiple Chinese characters contained in the text information and the multiple pinyin letters corresponding to those characters, to obtain a Chinese-character set corresponding to the characters and a letter set corresponding to the pinyin letters.
The text information may be fixed-layout information or information typeset in another way. The multiple Chinese characters contained in the text information are characters located in the same row or the same column, and the corresponding pinyin letters are likewise located in one row or one column. In this step, the Chinese characters located in one row of the text information are recognized first, and then the pinyin letters in the row corresponding to those characters are recognized; Chinese characters located in one column, and the pinyin letters in the corresponding column, are recognized in the same way. This yields the Chinese-character set corresponding to the characters and the letter set corresponding to the pinyin letters. The Chinese-character set may be a row of characters or a column of characters; any form that can store a group of Chinese characters will do, and the invention does not limit its specific implementation. Similarly, the letter set may be a row of pinyin or a column of pinyin.
Step S120: According to the spacing between adjacent letters, divide the letters in the letter set into multiple letter groups.
Among the pinyin of two adjacent Chinese characters, the spacing between the last letter of the preceding character's pinyin and the first letter of the following character's pinyin is larger than the average spacing between adjacent letters. Based on this property, a spacing threshold can be preset: whenever the spacing between two adjacent letters exceeds the preset spacing threshold, a separator is inserted between them, and the inserted separators divide the letters of the letter set into multiple letter groups. The preset spacing threshold is determined from the average width of a letter and/or from the average spacing between the letters in the letter set. The average letter width can be computed from the total width of all letters and the number of letters, or determined by other methods. Optionally, the preset spacing threshold may equal the average spacing between adjacent letters, may be slightly smaller or larger than that average, or may be set from other values; no limitation is imposed here.
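The gap-based division of step S120 can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Glyph` record, the function names, and the choice of the mean gap as the threshold are all assumptions made for the example (the text allows other threshold choices).

```python
from dataclasses import dataclass

@dataclass
class Glyph:
    """Hypothetical record for one recognized pinyin letter."""
    char: str
    x: float      # left edge in the source page
    width: float

def gaps(letters):
    """Horizontal gap between each pair of adjacent letters."""
    return [b.x - (a.x + a.width) for a, b in zip(letters, letters[1:])]

def mean_gap(letters):
    """One possible threshold: the average gap between adjacent letters."""
    g = gaps(letters)
    return sum(g) / len(g) if g else 0.0

def split_by_gap(letters, threshold):
    """Start a new letter group wherever the gap exceeds the threshold."""
    if not letters:
        return []
    groups, current = [], [letters[0].char]
    for gap, letter in zip(gaps(letters), letters[1:]):
        if gap > threshold:          # acts as the inserted separator
            groups.append("".join(current))
            current = []
        current.append(letter.char)
    groups.append("".join(current))
    return groups

# Made-up coordinates for the letters of "ni hao".
letters = [Glyph("n", 0, 5), Glyph("i", 6, 5), Glyph("h", 20, 5),
           Glyph("a", 26, 5), Glyph("o", 32, 5)]
print(split_by_gap(letters, mean_gap(letters)))
```

With these made-up coordinates the only gap above the mean lies between "i" and "h", so the letters fall into the groups "ni" and "hao".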
Step S130: Adjust the division of the letter groups in the letter set according to a preset adjustment rule, so that each letter group in the letter set corresponds to one Chinese character in the Chinese-character set.
Because each pinyin must be matched against its Chinese character, the division needs to be verified after the letters in the letter set have been divided into letter groups, and adjusted according to the preset adjustment rule so that each letter group in the letter set corresponds to one Chinese character in the Chinese-character set. On the one hand, in fixed-layout typesetting, when the pinyin of two Chinese characters is long and the spacing between two adjacent letters of those pinyin is small, a single letter group obtained by division may actually contain the pinyin letters of two characters. The adjustment rule in this case can be: for each letter group, judge whether the number of letters it contains exceeds a preset pinyin length; if so, split the group into at least two letter groups, so that each resulting group corresponds to one Chinese character in the Chinese-character set. By the rules of pinyin itself, a single pinyin contains no more than 6 letters, so the preset pinyin length in this embodiment can be set to 6. On the other hand, some Chinese characters have no pinyin of their own, such as the character "儿" in erhua (r-colored) pronunciation. In that case, the letter group belonging to another character may end up at the position of the character that has no pinyin, and the division of the letter groups must be adjusted. The preset adjustment rule can then include: judge whether the Chinese characters in the Chinese-character set include a silent character; if so, insert a blank letter group into the letter set so that the blank letter group corresponds to the silent character, and each letter group in the letter set thus corresponds to one Chinese character in the Chinese-character set.
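The two adjustment rules just described can be sketched as below. This is an illustration under stated assumptions: how silent characters are detected is not specified in the text, so the example simply takes their indices as input, and the names `needs_split` and `insert_blank_groups` are hypothetical.

```python
MAX_PINYIN_LEN = 6  # the preset pinyin length used in this embodiment

def needs_split(group, max_len=MAX_PINYIN_LEN):
    """A group longer than one pinyin can be must cover several characters."""
    return len(group) > max_len

def insert_blank_groups(chars, groups, silent_indices):
    """Align groups with chars, giving each silent character a blank group.

    silent_indices is an assumed input: positions in chars of characters
    that carry no pinyin of their own.
    """
    remaining = iter(groups)
    aligned = []
    for i, _ in enumerate(chars):
        if i in silent_indices:
            aligned.append("")               # blank letter group
        else:
            aligned.append(next(remaining, ""))
    return aligned

chars = ["花", "儿", "开"]
groups = ["huar", "kai"]
print(insert_blank_groups(chars, groups, {1}))
print(needs_split("duanzhuang"), needs_split("zhuang"))
```

After alignment the list of groups has the same length as the list of characters, which is the precondition for the association step that follows.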
Beyond the adjustment rules above, those skilled in the art may flexibly define other adjustment rules; the invention does not limit this.
Step S140: Typeset each letter group in the letter set in association with its corresponding Chinese character in the Chinese-character set.
Once each letter group in the letter set corresponds to one Chinese character in the Chinese-character set, each letter group is typeset in association with its corresponding character, producing a file that the user can edit and modify. Association typesetting means that each letter group in the letter set is associated with its corresponding Chinese character in the Chinese-character set, and the associated letter group and character are then typeset facing each other, either vertically (one above the other) or horizontally (side by side).
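As one way to picture association typesetting, the sketch below pairs each character with its letter group and emits HTML `<ruby>` markup, which renders the pinyin above the character in a reflowable page. The choice of HTML as the output format and the function name `to_ruby` are assumptions for illustration; the text does not name a target format.

```python
import html

def to_ruby(chars, groups):
    """Emit one <ruby> element per (character, letter group) pair."""
    if len(chars) != len(groups):
        raise ValueError("each character needs exactly one letter group")
    parts = []
    for ch, group in zip(chars, groups):
        # A blank group (silent character) produces an empty annotation.
        parts.append(f"<ruby>{html.escape(ch)}<rt>{html.escape(group)}</rt></ruby>")
    return "".join(parts)

print(to_ruby(["端", "庄"], ["duan", "zhuang"]))
```

The length check makes the one-to-one requirement explicit: if the earlier adjustment step left the two lists unaligned, the function fails rather than silently mispairing characters and pinyin.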
According to the typesetting method provided by this embodiment, the multiple Chinese characters contained in the text information and the corresponding pinyin letters are recognized separately, yielding a Chinese-character set and a letter set, and the letters in the letter set are divided into multiple letter groups according to the spacing between adjacent letters. The division of the letter groups is then adjusted according to a preset adjustment rule, so that each letter group in the letter set corresponds to one Chinese character in the Chinese-character set. Finally, each letter group is typeset in association with its corresponding Chinese character, producing a reflowable file. The method of this embodiment performs information recognition and format conversion during typesetting without disordering rows and columns, yields a one-to-one correspondence between each Chinese character and its pinyin after typesetting, removes the need for manual proofreading, and accurately recognizes text information that contains both Chinese characters and pinyin.
Fig. 2 shows a flow chart of the typesetting method for text information provided by another embodiment of the invention. As shown in Fig. 2, the method includes the following steps:
Step S210: Separately recognize the multiple Chinese characters contained in the text information and the multiple pinyin letters corresponding to those characters, to obtain a Chinese-character set corresponding to the characters and a letter set corresponding to the pinyin letters.
In this embodiment, the text information is fixed-layout information. The multiple Chinese characters contained in the text information are characters located in the same row or the same column, and the corresponding pinyin letters are likewise located in one row or one column. In this step, the Chinese characters located in one row of the text information are recognized first, and then the pinyin letters in the row corresponding to those characters are recognized, yielding the Chinese-character set corresponding to the characters and the letter set corresponding to the pinyin letters. The Chinese-character set can be realized as a row of characters, a column of characters, a character storage unit, or in other ways, and the invention does not limit this; likewise, the letter set can be realized as a row of pinyin, a column of pinyin, a pinyin storage unit, or in other ways.
Step S220: Judge whether the spacing between every two adjacent letters exceeds a preset spacing threshold.
Among the pinyin of two adjacent Chinese characters, the spacing between the last letter of the preceding character's pinyin and the first letter of the following character's pinyin is larger than the average spacing between adjacent letters. Based on this property, a spacing threshold can be preset, and the spacing between every two adjacent letters is compared against it. The preset spacing threshold is determined from the average width of a letter and/or from the average spacing between the letters in the letter set. The average letter width can be computed from the total width of all letters and the number of letters, or determined by other methods. Optionally, the preset spacing threshold may equal the average spacing between adjacent letters, may be slightly smaller or larger than that average, or may be set from other values; no limitation is imposed here.
Step S230: If so, insert a separator between the two adjacent letters; the separators divide the letters in the letter set into multiple letter groups.
If not, the two letters are not separated. If so, a separator is inserted between the two adjacent letters, and the inserted separators divide the letters in the letter set into multiple letter groups. A letter group is the letter combination formed by the letters that lie between separators.
While steps S220 and S230 are performed, the following situation must also be considered: because of errors in the fixed-layout typesetting, the spacing between two letters of the pinyin of one and the same Chinese character may be large. If a separator were inserted between two letters that belong to the same character, the pinyin and the characters would become misaligned. Therefore, whenever the spacing between two adjacent letters is judged to exceed the preset spacing threshold, the positional relationship between the Chinese characters and the corresponding pinyin letters is additionally determined from the text information, and this positional relationship is used to judge whether the two adjacent letters correspond to the same Chinese character. If not, a separator is inserted between the two adjacent letters; if so, no separator is inserted between them. The positional relationship here means the relative positions of each Chinese character and each letter in the fixed-layout text information (that is, in the original file before conversion). In a specific implementation, the position coordinates of each Chinese character and each letter in the fixed-layout text information can be obtained, and the positional relationship between characters and letters analyzed from them. When two adjacent letters both lie above, or both lie below, the same Chinese character, the two letters are determined to correspond to that same character.
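The additional position check can be sketched as follows. The `Box` record, the use of a letter's horizontal center to decide which character it sits above, and the fallback of splitting when a letter has no owner are all assumptions made for this example; the text only requires that coordinates be compared.

```python
from dataclasses import dataclass

@dataclass
class Box:
    """Hypothetical horizontal extent of a letter or a Chinese character."""
    char: str
    x: float
    width: float

def owner(letter, hanzi):
    """Index of the character whose horizontal span contains the letter's center."""
    center = letter.x + letter.width / 2
    for i, box in enumerate(hanzi):
        if box.x <= center < box.x + box.width:
            return i
    return None

def should_split(prev, cur, hanzi, threshold):
    """Insert a separator only if the gap is large AND the owners differ."""
    gap = cur.x - (prev.x + prev.width)
    if gap <= threshold:
        return False
    a, b = owner(prev, hanzi), owner(cur, hanzi)
    # Assumed fallback: with no positional evidence, trust the gap.
    return a is None or b is None or a != b

hanzi = [Box("端", 0, 40), Box("庄", 40, 40)]
u, a = Box("u", 6, 5), Box("a", 20, 5)    # wide gap, same character
n, z = Box("n", 26, 5), Box("z", 45, 5)   # wide gap, different characters
print(should_split(u, a, hanzi, 3.0), should_split(n, z, hanzi, 3.0))
```

In the made-up layout above, the wide gap between "u" and "a" is ignored because both letters sit over "端", while the gap between "n" and "z" produces a separator.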
Step S240: Adjust the division of the letter groups in the letter set according to a preset adjustment rule, so that each letter group in the letter set corresponds to one Chinese character in the Chinese-character set.
Because each pinyin must be matched against its Chinese character, the division needs to be verified after the letters in the letter set have been divided into letter groups, and adjusted according to the preset adjustment rule so that each letter group in the letter set corresponds to one Chinese character in the Chinese-character set.
The preset adjustment rule mainly includes rules of the following two kinds:
The first kind of rule is for splitting a letter group that is made up of the pinyin of several Chinese characters. When the pinyin of two Chinese characters is long and the spacing between two adjacent letters is small, a single letter group obtained by division may actually be the pinyin of two characters. For example, in the pinyin "duan zhuang" of the word "端庄" (dignified), the spacing between the adjacent letters "n" and "z" is small, so the two letter groups "duan" and "zhuang" may be merged into one letter group "duanzhuang", which then needs to be adjusted. The preset adjustment rule can therefore include: for each letter group, judge whether the number of letters it contains exceeds a preset pinyin length; if so, split the group into at least two letter groups, so that each resulting group corresponds to one Chinese character in the Chinese-character set. The preset pinyin length can be determined from the maximum length and/or the average length of Chinese pinyin. In realizing the present invention, the inventor found that the pinyin of any Chinese character contains no more than 6 letters, so the preset pinyin length in this embodiment is set to 6. For example, the letter group "duanzhuang" contains 10 letters and must be split into two letter groups, so that each resulting group corresponds to one Chinese character in the Chinese-character set. Further, the step of splitting the letter group into at least two letter groups specifically includes: determining the first Chinese character in the Chinese-character set that corresponds to the letter group, and looking up the pinyin of that first character; then splitting the letter group into at least two letter groups according to the pinyin of the first character, so that each resulting group corresponds to one Chinese character in the Chinese-character set. For example, for the misaligned letter group "duanzhuang" corresponding to the two characters "端庄", the first character "端" is determined in the Chinese-character set, and its pinyin "duan" is looked up; the lookup can use, for instance, Microsoft's pinyin library. The pinyin "duan" of the first character is then matched against the letter group "duanzhuang". After a successful match, a separator is inserted after "duan" in "duanzhuang", splitting it into the two letter groups "duan" and "zhuang", which correspond to the characters "端" and "庄" in the Chinese-character set. When the pinyin of more than two Chinese characters has run together, after the pinyin of the first character has been matched against the letter group, the characters that follow the first character are also matched against it one by one, which improves accuracy. For example, if the pinyin of "端", "庄", and "贤" in the Chinese-character set has all run together, then after the first character "端" has been matched against the letter group "duanzhuangxian", the pinyin of "庄" and "贤" is matched against the letter group in turn, so that "duanzhuangxian" is split into at least two letter groups and each letter group corresponds to one Chinese character in the Chinese-character set.
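The lookup-and-match splitting described above can be sketched as below. The small `PINYIN` dictionary is a hypothetical stand-in for a real pinyin library, and `split_merged` is an illustrative name; the sketch simply peels off one known reading per character from the front of the merged group.

```python
# Hypothetical lookup table standing in for a real pinyin library.
PINYIN = {"端": ["duan"], "庄": ["zhuang"], "贤": ["xian"]}

def split_merged(group, chars, table=PINYIN):
    """Split a merged letter group using the pinyin of the given characters."""
    parts, rest = [], group
    for ch in chars:
        for reading in table.get(ch, []):
            if rest.startswith(reading):
                parts.append(reading)        # separator goes after this reading
                rest = rest[len(reading):]
                break
        else:
            break  # no reading matched; stop and keep the remainder intact
    if rest:
        parts.append(rest)
    return parts

print(split_merged("duanzhuang", "端"))
print(split_merged("duanzhuangxian", "端庄贤"))
```

Passing only the first character splits once and leaves the remainder as a single group; passing the following characters as well splits the remainder further, mirroring the one-by-one matching the text recommends when several pinyin have run together.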
Because some Chinese characters are polyphonic, splitting the letter group into at least two letter groups according to the pinyin of the first Chinese character further requires judging whether that pinyin is unique. If it is, the pinyin of the Chinese character is matched against the letter group, and the letter group is split into at least two letter groups according to the matching result. If it is not, it is judged whether the multiple pinyin readings of the first Chinese character have the same length (number of letters). If they do, matching can be performed directly according to the length of the pinyin and the letter group, which saves matching time. For example, the readings "zhang" and "chang" of the polyphonic character "长" have the same length, so matching can be performed directly according to that length. When the readings are found to differ in length, as with the readings "du" and "dou" of the polyphonic character "都", each pinyin reading of the first Chinese character must be matched against the letter group, and the letter group is split into at least two letter groups according to the matching result. Of course, even when the readings have the same length, each reading may still be matched against the letter group and the letter group split according to the matching result, so as to improve matching accuracy. Those skilled in the art may choose the specific matching method according to the actual situation; it is not limited here.
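The polyphone handling described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the dictionary `PINYIN` and the function `split_by_first_char` are hypothetical names, and cutting by length alone when all readings have equal length is one assumed reading of the shortcut in the text.

```python
# Hypothetical excerpt of a pinyin dictionary; a real system would query a full one.
PINYIN = {"长": ["zhang", "chang"], "都": ["du", "dou"], "端": ["duan"]}

def split_by_first_char(group, first_char):
    """Return candidate (head, rest) splits of a letter group."""
    readings = PINYIN.get(first_char, [])
    if not readings:
        return []
    lengths = {len(r) for r in readings}
    if len(lengths) == 1:
        # Unique reading, or several readings of equal length:
        # cut by length alone (assumed interpretation; the head is not verified).
        n = lengths.pop()
        return [(group[:n], group[n:])] if len(group) > n else []
    # Readings differ in length: try each reading as a prefix of the group.
    return [(r, group[len(r):]) for r in readings
            if group.startswith(r) and len(group) > len(r)]
```

For a run-together group such as "duanzhuang" with first character "端", the sketch yields the single split ("duan", "zhuang").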
Further, the first Chinese character in the Chinese character set that corresponds to the letter group can be determined in the following two ways.
Mode one: Determine, from the text information, the position relationship between the multiple Chinese characters and the multiple pinyin letters corresponding to them; according to the position relationship, determine the first Chinese character in the Chinese character set that corresponds to the letter group. Here, the position relationship refers to the relative positions of each Chinese character and each letter in the fixed-layout text information (that is, in the original file before conversion). In a specific implementation, the position coordinates of each Chinese character and each letter in the fixed-layout text information can be obtained, and the position relationship between Chinese characters and letters analyzed from them. When two adjacent letters are both located above or below the same Chinese character, the two adjacent letters are determined to correspond to that Chinese character. For example, the letters "u" and "n" in the letter group "duanzhuang" are both located above the Chinese character "端", so the first Chinese character of the word "端庄" that corresponds to the pinyin letter group "duanzhuang" is determined to be "端".
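A minimal sketch of this position-based lookup is given below. It assumes each glyph is described only by its horizontal extent and that a letter belongs to the Chinese character whose span contains the letter's centre; the `Glyph` class and both function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Glyph:
    text: str
    left: float   # left x coordinate in the fixed-layout page
    right: float  # right x coordinate

def char_aligned_with(letter, hanzi):
    """Index of the Chinese character whose horizontal span contains the letter's centre."""
    cx = (letter.left + letter.right) / 2
    for i, h in enumerate(hanzi):
        if h.left <= cx <= h.right:
            return i
    return None

def first_char_of_group(letters, hanzi):
    """First character that two adjacent letters of the group both sit above or below."""
    for a, b in zip(letters, letters[1:]):
        ia, ib = char_aligned_with(a, hanzi), char_aligned_with(b, hanzi)
        if ia is not None and ia == ib:
            return hanzi[ia].text
    return None
```

Vertical coordinates are ignored here for brevity; a fuller version would also check that the letter row really is the annotation row of the character row.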
Mode two: Determine the Chinese character in the Chinese character set that corresponds to the letter group preceding this letter group, and take the Chinese character that follows it in the Chinese character set as the first Chinese character corresponding to this letter group. The purpose of this mode is as follows: when misalignment in the fixed-layout file is severe and the position relationship of mode one cannot determine the first Chinese character corresponding to the letter group, mode two can be used instead, which improves accuracy. Specifically, the Chinese character corresponding to the already verified letter group that immediately precedes the letter group to be verified is determined, and the next Chinese character after it in the Chinese character set is taken as the first Chinese character corresponding to the letter group to be verified. For example, when the obtained Chinese character set is the sentence "she is a dignified and virtuous woman" and the first Chinese character corresponding to the misaligned letter group "duanzhuang" needs to be determined, the Chinese character "端" that follows "个", the character corresponding to the verified letter group "ge", is taken as the first Chinese character of the letter group "duanzhuang" to be adjusted.
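Mode two amounts to an index lookup, sketched below. The function name is hypothetical, and the sample sentence is an illustrative stand-in rather than the sentence used in the original filing.

```python
def first_char_by_previous(chars, prev_char_index):
    """Return the character following the one matched by the previous, verified letter group.

    chars           -- the Chinese character set, in reading order
    prev_char_index -- index of the character matched by the previous letter group
    """
    nxt = prev_char_index + 1
    return chars[nxt] if nxt < len(chars) else None

# Illustrative sentence (assumed): the verified group "ge" matches "个".
sentence = "她是一个端庄的女人"
first = first_char_by_previous(sentence, sentence.index("个"))
```

Because it relies only on reading order, this fallback does not depend on glyph coordinates at all, which is why it remains usable when the layout is badly misaligned.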
In addition, when the first Chinese character is polyphonic, the following special case may arise. Suppose the first Chinese character has pinyin one and pinyin two, the pinyin of the Chinese character that follows it is pinyin three, and the correct reading is the combination of pinyin one and pinyin three. If some letters of pinyin two are the same as pinyin one, and some letters of pinyin three happen to combine with some letters of pinyin two to form a valid pinyin, the pinyin of the first Chinese character may be wrongly identified as pinyin two, causing an error. To prevent such errors, the pinyin of the Chinese character that follows the first Chinese character must also be matched against the letter group; that is, not only the first Chinese character but also each Chinese character after it is matched, which improves accuracy.
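The look-ahead check can be sketched as below, under the assumption that a candidate split is accepted only when the remaining letters begin with a reading of the following character. The table and function name are hypothetical; the example uses "南" (readings "na" and "nan") followed by "安" ("an"), where matching the first character alone would wrongly accept "na".

```python
# Hypothetical excerpt of a pinyin dictionary.
PINYIN = {"南": ["na", "nan"], "安": ["an"]}

def split_with_lookahead(group, first_char, next_char):
    """Pick the reading of first_char whose remainder starts with a reading of next_char."""
    for reading in PINYIN[first_char]:
        if not group.startswith(reading):
            continue
        rest = group[len(reading):]
        # Verify the split against the following character before accepting it.
        if any(rest.startswith(p) for p in PINYIN[next_char]):
            return reading, rest
    return None
```

With the group "nanan", the reading "na" is rejected because the remainder "nan" does not start with "an", and the split ("nan", "an") is returned instead.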
The rule of the second aspect is a rule for identifying silent Chinese characters. Specifically, in a sentence made up of Chinese characters, some Chinese characters have no corresponding pinyin; such characters may be called silent Chinese characters. An example is the character "儿" in erhua (r-coloring) pronunciation. If this special case is not handled, the Chinese characters and the pinyin become misaligned. The division of the letter groups in the letter set therefore needs to be adjusted for this case. Accordingly, the preset adjustment rule may include: judging whether a silent Chinese character exists among the multiple Chinese characters in the Chinese character set; if so, inserting a blank letter group into the letter set so that the blank letter group corresponds to the silent Chinese character, whereby each letter group in the letter set corresponds to one Chinese character in the Chinese character set. Further, whether a silent Chinese character exists among the multiple Chinese characters in the Chinese character set can be judged in the following two ways:
Mode one: Determine, from the text information, the position relationship between the multiple Chinese characters and the multiple pinyin letters corresponding to them; according to the position relationship, judge for each Chinese character whether a pinyin letter exists at the position vertically aligned with it; if not, the Chinese character is determined to be a silent Chinese character. Specifically, the position relationship refers to the relative positions of each Chinese character and each letter in the fixed-layout text information (that is, in the original file before conversion). In a specific implementation, the position coordinates of each Chinese character and each letter in the fixed-layout text information can be obtained, and the position relationship between Chinese characters and letters analyzed from them. When two adjacent letters are both located above or below the same Chinese character, the two adjacent letters are determined to correspond to that Chinese character. According to this position relationship, when no pinyin letter is found above a Chinese character, that Chinese character can be determined to be a silent Chinese character.
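The position-based test for silent characters can be sketched as follows, assuming glyphs are given as (text, left, right) tuples and that "vertically aligned" means the letter's centre falls inside the character's horizontal span. The function name and the coordinates are hypothetical.

```python
def silent_chars_by_position(hanzi, letters):
    """Return the Chinese characters that have no pinyin letter vertically aligned with them.

    hanzi, letters -- lists of (text, left, right) tuples from the fixed-layout page
    """
    silent = []
    for text, left, right in hanzi:
        covered = any(left <= (l_left + l_right) / 2 <= right
                      for _, l_left, l_right in letters)
        if not covered:
            silent.append(text)
    return silent

# Illustrative layout (assumed coordinates): "hua" sits above "花", nothing above "儿".
hanzi = [("花", 0, 20), ("儿", 20, 40)]
letters = [("h", 2, 6), ("u", 7, 11), ("a", 12, 16)]
result = silent_chars_by_position(hanzi, letters)
```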
Mode two: For each Chinese character in the Chinese character set, query the pinyin corresponding to that Chinese character and judge whether the letter set contains a letter group matching that pinyin; if not, the Chinese character is determined to be a silent Chinese character. The purpose of this mode is as follows: when, for typesetting reasons, the misalignment between the Chinese characters and their pinyin letters in the fixed-layout file is severe and the position-based mode one cannot identify silent Chinese characters, mode two can be used instead, which improves accuracy. Specifically, the pinyin of each Chinese character in the Chinese character set can be matched against the letter groups in the letter set to judge whether the letter set contains a letter group matching that pinyin.
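A minimal sketch of the dictionary-based test follows. The pinyin table and function name are hypothetical, and the sketch checks plain membership only; it does not track how many times a given letter group has already been consumed, which a fuller version would need for repeated syllables.

```python
# Hypothetical excerpt of a pinyin dictionary.
PINYIN = {"花": ["hua"], "儿": ["er"], "开": ["kai"]}

def silent_chars_by_lookup(chars, groups):
    """Return characters none of whose readings appears among the letter groups."""
    present = set(groups)
    return [c for c in chars
            if not any(p in present for p in PINYIN.get(c, []))]

result = silent_chars_by_lookup("花儿开", ["hua", "kai"])
```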
Further, the step of inserting a blank letter group into the letter set so that the blank letter group corresponds to the silent Chinese character specifically includes: finding a Chinese character adjacent to the silent Chinese character and determining, in the letter set, the adjacent letter group corresponding to that adjacent Chinese character; and inserting the blank letter group at the position in the letter set adjacent to that adjacent letter group. The blank letter group is represented by a preset blank identifier, which may be a symbol such as "*" or "#" or another specific symbol. After the blank letter group is inserted, a separator is also inserted between it and the adjacent letter group, so that the blank identifier corresponding to the silent Chinese character is separated from the pinyin letters corresponding to the adjacent Chinese character.
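The insertion step can be sketched as below. It assumes the letter groups are held in a list that is already aligned with the characters preceding the silent one, and it uses "*" as the blank identifier and "|" as the separator; the function name and the choice of "|" are assumptions.

```python
def insert_blank_group(groups, silent_index, blank="*"):
    """Insert a blank letter group so that it lines up with the silent character.

    groups       -- letter groups, aligned with all characters before silent_index
    silent_index -- index of the silent character in the Chinese character set
    """
    out = list(groups)
    if silent_index > 0:
        neighbour = silent_index - 1      # group of the preceding (adjacent) character
        out.insert(neighbour + 1, blank)  # place the blank right after that group
    else:
        out.insert(0, blank)              # no preceding character: place it before the next group
    return out

aligned = insert_blank_group(["hua", "kai"], 1)
joined = "|".join(aligned)  # separators keep the blank apart from its neighbours
```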
Step S250: Associate each letter group in the letter set with the Chinese character in the Chinese character set that corresponds to it, and typeset them together.
After each letter group in the letter set has been made to correspond to one Chinese character in the Chinese character set, each letter group is associated and typeset with its corresponding Chinese character by streaming (reflowable) typesetting, producing a file that the user can edit and modify. Streaming typesetting means processing the text, numbers, tables, and graphic images contained in a document in a specific typesetting manner such that the saved content remains the original editable elements; the user can view the edited layout in reading software, and the display adapts to the page size at different zoom ratios.
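One way to picture the association step is sketched below. The source does not name an output format; rendering each pair as an HTML `ruby` element is purely an assumption chosen because ruby markup reflows with the text. The function names and the "*" blank identifier are likewise illustrative.

```python
def associate(chars, groups, blank="*"):
    """Pair each Chinese character with its letter group; blanks become empty pinyin."""
    if len(chars) != len(groups):
        raise ValueError("each letter group must correspond to exactly one character")
    return [(c, "" if g == blank else g) for c, g in zip(chars, groups)]

def to_ruby_html(pairs):
    """Render (character, pinyin) pairs as reflowable ruby annotations (assumed format)."""
    return "".join(f"<ruby>{c}<rt>{p}</rt></ruby>" for c, p in pairs)

pairs = associate("花儿", ["hua", "*"])
html = to_ruby_html(pairs)
```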
According to the text-information typesetting method provided by this embodiment, the multiple Chinese characters contained in the text information and the multiple pinyin letters corresponding to them are recognized separately, yielding a Chinese character set corresponding to the Chinese characters and a letter set corresponding to the pinyin letters. It is then judged whether the spacing between every two adjacent letters is greater than a preset spacing threshold; where it is, a separator is inserted between the two adjacent letters, and the separators divide the letters in the letter set into multiple letter groups. The division of the letter groups in the letter set is then adjusted according to a preset adjustment rule. On the one hand, this solves the problem that the pinyin of several Chinese characters runs together in fixed-layout typesetting; on the other hand, it solves the problem that, because of silent characters, the Chinese characters cannot be put in one-to-one correspondence with their pinyin after streaming typesetting. Each letter group in the letter set thus corresponds to one Chinese character in the Chinese character set, and finally each letter group is associated and typeset with its corresponding Chinese character to obtain a streaming-typeset file. The method provided by the embodiment can convert a fixed-layout file into a streaming-typeset file in which every Chinese character corresponds one-to-one with its pinyin, which makes editing easier for the user and saves time and effort. It achieves accurate recognition of text information containing pinyin without manual proofreading and improves the efficiency and accuracy of pinyin recognition.
Another embodiment of the present application provides a non-volatile computer storage medium. The computer storage medium stores at least one executable instruction, and the executable instruction can perform the text-information typesetting method of any of the above method embodiments.
The executable instruction may specifically be used to cause a processor to perform the following operations: recognize the multiple Chinese characters contained in the text information and the multiple pinyin letters corresponding to them, obtaining a Chinese character set corresponding to the Chinese characters and a letter set corresponding to the pinyin letters; divide the letters in the letter set into multiple letter groups according to the spacing between adjacent letters; adjust the division of the letter groups in the letter set according to a preset adjustment rule, so that each letter group in the letter set corresponds to one Chinese character in the Chinese character set; and associate and typeset each letter group in the letter set with the Chinese character in the Chinese character set that corresponds to it.
In an optional mode, the executable instruction further causes the processor to perform the following operations: judge whether the spacing between every two adjacent letters is greater than a preset spacing threshold;
if so, insert a separator between the two adjacent letters, the separators dividing the letters in the letter set into multiple letter groups;
wherein the preset spacing threshold is determined according to the average width of a letter, and/or according to the average spacing between the multiple letters in the letter set.
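The spacing-based division can be sketched as follows. Letters are assumed to arrive as (text, left, right) tuples in reading order, and the default threshold is taken to be the average letter width, which is only one of the options the text allows; the function name is hypothetical.

```python
def split_by_spacing(letters, threshold=None):
    """Divide letters into groups wherever the gap to the next letter exceeds the threshold.

    letters -- list of (text, left, right) tuples in reading order
    """
    if not letters:
        return []
    if threshold is None:
        # Assumed default: the average letter width.
        threshold = sum(right - left for _, left, right in letters) / len(letters)
    groups, current = [], letters[0][0]
    for (_, _, prev_right), (text, left, _) in zip(letters, letters[1:]):
        if left - prev_right > threshold:
            groups.append(current)  # a separator falls here
            current = text
        else:
            current += text
    groups.append(current)
    return groups

letters = [("t", 0, 4), ("a", 5, 9), ("s", 20, 24), ("h", 25, 29), ("i", 30, 34)]
groups = split_by_spacing(letters)
```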
In an optional mode, the executable instruction further causes the processor to perform the following operations: whenever the spacing between two adjacent letters is judged to be greater than the preset spacing threshold, determine, from the text information, the position relationship between the multiple Chinese characters and the multiple pinyin letters corresponding to them;
judge, according to the position relationship, whether the two adjacent letters correspond to the same Chinese character, and if not, insert a separator between the two adjacent letters.
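The additional position check before inserting a separator can be sketched as below, again assuming (text, left, right) tuples and centre-in-span alignment; `owner_index` and `needs_separator` are hypothetical names.

```python
def owner_index(letter, hanzi):
    """Index of the Chinese character whose horizontal span contains the letter's centre."""
    centre = (letter[1] + letter[2]) / 2
    for i, (_, left, right) in enumerate(hanzi):
        if left <= centre <= right:
            return i
    return None

def needs_separator(a, b, hanzi, threshold):
    """A wide gap yields a separator only if the two letters belong to different characters."""
    if b[1] - a[2] <= threshold:
        return False
    ia, ib = owner_index(a, hanzi), owner_index(b, hanzi)
    return not (ia is not None and ia == ib)
```

The check guards against loosely spaced letters of a single syllable being split merely because the gap between them happens to exceed the threshold.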
In an optional mode, the preset adjustment rule includes:
for each letter group, judging whether the number of letters contained in the letter group is greater than a preset pinyin quantity;
if so, splitting the letter group into at least two letter groups, so that each letter group after splitting corresponds to one Chinese character in the Chinese character set.
In an optional mode, the executable instruction further causes the processor to perform the following operations: determine the first Chinese character in the Chinese character set that corresponds to the letter group, and query the pinyin of the first Chinese character;
split the letter group into at least two letter groups according to the pinyin of the first Chinese character, so that each letter group after splitting corresponds to one Chinese character in the Chinese character set.
In an optional mode, the executable instruction further causes the processor to perform the following operations:
judge whether the pinyin of the first Chinese character is unique;
if not, match each pinyin reading of the first Chinese character against the letter group, and split the letter group into at least two letter groups according to the matching result.
In an optional mode, the executable instruction further causes the processor to perform the following operations:
after each pinyin reading of the first Chinese character has been matched against the letter group,
match the pinyin of the Chinese character that follows the first Chinese character against the letter group.
In an optional mode, the executable instruction further causes the processor to perform the following operations:
determine, from the text information, the position relationship between the multiple Chinese characters and the multiple pinyin letters corresponding to them, and determine, according to the position relationship, the first Chinese character in the Chinese character set that corresponds to the letter group; and/or
determine the Chinese character in the Chinese character set that corresponds to the letter group preceding this letter group, and take the Chinese character that follows it in the Chinese character set as the first Chinese character corresponding to this letter group.
In an optional mode, the preset adjustment rule includes:
judging whether a silent Chinese character exists among the multiple Chinese characters in the Chinese character set;
if so, inserting a blank letter group into the letter set so that the blank letter group corresponds to the silent Chinese character.
In an optional mode, the executable instruction further causes the processor to perform the following operations:
determine, from the text information, the position relationship between the multiple Chinese characters and the multiple pinyin letters corresponding to them;
according to the position relationship, judge for each Chinese character whether a pinyin letter exists at the position vertically aligned with it;
if not, determine that the Chinese character is a silent Chinese character.
In an optional mode, the executable instruction further causes the processor to perform the following operations:
for each Chinese character in the Chinese character set, query the pinyin corresponding to that Chinese character;
judge whether the letter set contains a letter group matching that pinyin; if not, determine that the Chinese character is a silent Chinese character.
In an optional mode, the executable instruction further causes the processor to perform the following operations:
find a Chinese character adjacent to the silent Chinese character, and determine in the letter set the adjacent letter group corresponding to that adjacent Chinese character;
insert the blank letter group at the position in the letter set adjacent to that adjacent letter group;
wherein the blank letter group is represented by a preset blank identifier.
In an optional mode, the multiple Chinese characters contained in the text information are Chinese characters located in the same row or the same column, and the multiple pinyin letters corresponding to them are pinyin letters located in the same row or the same column.
In an optional mode, the text information is information typeset by fixed-layout typesetting;
the executable instruction further causes the processor to perform the following operation: by streaming typesetting, associate and typeset each letter group in the letter set with the Chinese character in the Chinese character set that corresponds to it.
Fig. 3 shows a schematic structural diagram of an electronic device according to a further embodiment of the present invention; the specific embodiments of the present invention do not limit the specific implementation of the electronic device.
As shown in Fig. 3, the electronic device may include a processor 302, a communications interface 304, a memory 306, and a communication bus 308.
The processor 302, the communications interface 304, and the memory 306 communicate with one another through the communication bus 308. The communications interface 304 is used to communicate with network elements of other devices, such as clients or other servers. The processor 302 is used to execute a program 310 and may specifically perform the relevant steps of the above embodiments of the text-information typesetting method.
Specifically, the program 310 may include program code, and the program code includes computer operation instructions.
The processor 302 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention. The one or more processors included in the electronic device may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.
The memory 306 is used to store the program 310. The memory 306 may include high-speed RAM and may also include non-volatile memory, for example at least one disk memory.
The program 310 may specifically be used to cause the processor 302 to perform the following operations: recognize the multiple Chinese characters contained in the text information and the multiple pinyin letters corresponding to them, obtaining a Chinese character set corresponding to the Chinese characters and a letter set corresponding to the pinyin letters; divide the letters in the letter set into multiple letter groups according to the spacing between adjacent letters; adjust the division of the letter groups in the letter set according to a preset adjustment rule, so that each letter group in the letter set corresponds to one Chinese character in the Chinese character set; and associate and typeset each letter group in the letter set with the Chinese character in the Chinese character set that corresponds to it.
In an optional mode, the program 310 further causes the processor 302 to perform the following operations: judge whether the spacing between every two adjacent letters is greater than a preset spacing threshold;
if so, insert a separator between the two adjacent letters, the separators dividing the letters in the letter set into multiple letter groups;
wherein the preset spacing threshold is determined according to the average width of a letter, and/or according to the average spacing between the multiple letters in the letter set.
In an optional mode, the program 310 further causes the processor 302 to perform the following operations: whenever the spacing between two adjacent letters is judged to be greater than the preset spacing threshold, determine, from the text information, the position relationship between the multiple Chinese characters and the multiple pinyin letters corresponding to them;
judge, according to the position relationship, whether the two adjacent letters correspond to the same Chinese character, and if not, insert a separator between the two adjacent letters.
In an optional mode, the preset adjustment rule includes:
for each letter group, judging whether the number of letters contained in the letter group is greater than a preset pinyin quantity;
if so, splitting the letter group into at least two letter groups, so that each letter group after splitting corresponds to one Chinese character in the Chinese character set.
In an optional mode, the program 310 further causes the processor 302 to perform the following operations: determine the first Chinese character in the Chinese character set that corresponds to the letter group, and query the pinyin of the first Chinese character;
split the letter group into at least two letter groups according to the pinyin of the first Chinese character, so that each letter group after splitting corresponds to one Chinese character in the Chinese character set.
In an optional mode, the program 310 further causes the processor 302 to perform the following operations:
judge whether the pinyin of the first Chinese character is unique;
if not, match each pinyin reading of the first Chinese character against the letter group, and split the letter group into at least two letter groups according to the matching result.
In an optional mode, the program 310 further causes the processor 302 to perform the following operation: after each pinyin reading of the first Chinese character has been matched against the letter group,
match the pinyin of the Chinese character that follows the first Chinese character against the letter group.
In an optional mode, the program 310 further causes the processor 302 to perform the following operations: determine, from the text information, the position relationship between the multiple Chinese characters and the multiple pinyin letters corresponding to them, and determine, according to the position relationship, the first Chinese character in the Chinese character set that corresponds to the letter group; and/or
determine the Chinese character in the Chinese character set that corresponds to the letter group preceding this letter group, and take the Chinese character that follows it in the Chinese character set as the first Chinese character corresponding to this letter group.
In an optional mode, the preset adjustment rule includes:
judging whether a silent Chinese character exists among the multiple Chinese characters in the Chinese character set;
if so, inserting a blank letter group into the letter set so that the blank letter group corresponds to the silent Chinese character.
In an optional mode, the program 310 further causes the processor 302 to perform the following operations:
determine, from the text information, the position relationship between the multiple Chinese characters and the multiple pinyin letters corresponding to them;
according to the position relationship, judge for each Chinese character whether a pinyin letter exists at the position vertically aligned with it;
if not, determine that the Chinese character is a silent Chinese character.
In an optional mode, the program 310 further causes the processor 302 to perform the following operations: for each Chinese character in the Chinese character set, query the pinyin corresponding to that Chinese character;
judge whether the letter set contains a letter group matching that pinyin; if not, determine that the Chinese character is a silent Chinese character.
In an optional mode, the program 310 further causes the processor 302 to perform the following operations: find a Chinese character adjacent to the silent Chinese character, and determine in the letter set the adjacent letter group corresponding to that adjacent Chinese character;
insert the blank letter group at the position in the letter set adjacent to that adjacent letter group;
wherein the blank letter group is represented by a preset blank identifier.
In an optional mode, the multiple Chinese characters contained in the text information are Chinese characters located in the same row or the same column, and the multiple pinyin letters corresponding to them are pinyin letters located in the same row or the same column.
In an optional mode, the text information is information typeset by fixed-layout typesetting;
the program 310 further causes the processor 302 to perform the following operation: by streaming typesetting, associate and typeset each letter group in the letter set with the Chinese character in the Chinese character set that corresponds to it.
In an optional mode, the user attribute information includes at least one of the following dimensions: a balance dimension, a recharge frequency dimension, a recharge amount dimension, a consumption frequency dimension, a consumption amount dimension, a reading duration dimension, an activity count dimension, and a pushed information content dimension.
In an optional mode, when the user attribute information corresponding to the user identifier includes multiple dimensions, the program 310 further causes the processor 302 to perform the following operations: for the user attribute information of each dimension, determine the user attribute sub-classification corresponding to the user attribute information of that dimension; and determine the user attribute classification corresponding to the user attribute information by combining the user attribute sub-classifications of all dimensions.
In an optional mode, the multiple information categories to be pushed include: a book information category, a points information category, a physical goods information category, an electronic coupon information category, a welfare card information category, a recharge discount information category, an activity information category, and/or a reading permission information category. The program 310 then further causes the processor 302 to perform the following operation: determine the information category matching the user attribute classification according to the correspondence between user attribute classifications and information categories stored in a preset user information mapping table.
In an optional mode, the program 310 further causes the processor 302 to perform the following operations: analyze the data range to which the user attribute data contained in the user attribute information belongs, and, according to a preset information content mapping table, determine from the multiple information contents contained in the information category matching the user attribute classification the information content corresponding to that data range; wherein the information content mapping table stores the multiple information contents contained in each information category and the data range of user attribute data corresponding to each information content.
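The range lookup in an information content mapping table might be pictured as follows. Everything in this sketch is hypothetical: the table layout, the category key, the ranges, and the content strings are invented solely to illustrate a lookup by data range and do not come from the source.

```python
# Hypothetical mapping table: category -> list of ((low, high), content).
CONTENT_TABLE = {
    "recharge_discount": [
        ((0, 10), "small top-up offer"),
        ((10, 100), "standard top-up offer"),
    ],
}

def pick_content(category, value):
    """Return the content whose half-open range [low, high) contains the attribute value."""
    for (low, high), content in CONTENT_TABLE.get(category, []):
        if low <= value < high:
            return content
    return None
```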
In a kind of optional mode, when the customer attribute information corresponding with user mark includes multiple dimensions, Program 310 is further such that processor 302 performs following operation:Determine what is included in the customer attribute information of each dimension respectively Data area belonging to user attribute data;With reference to belonging to the user attribute data included in the customer attribute information of each dimension Data area, and the weight of the customer attribute information of each dimension determines the corresponding information content.
In an optional implementation, program 310 further causes processor 302 to perform the following operations: when a push request message triggered through a preset push entry is received, obtaining the user identifier contained in the push request message and determining the user attribute information corresponding to that user identifier; and/or, when the monitored behavior of the current user meets a preset typesetting condition of text information, obtaining the user identifier of the current user and determining the user attribute information corresponding to that user identifier. The preset typesetting condition of text information includes at least one of the following: the user reads an e-book chapter whose abandonment probability exceeds a preset threshold, the user reads a chapter for longer than a preset reading duration, and the user reaches a preset chapter.
The algorithms and displays provided herein are not inherently related to any particular computer, virtual system or other device. Various general-purpose systems may also be used with the teachings herein. From the description above, the structure required to construct such systems is apparent. Moreover, the present invention is not directed to any particular programming language. It should be understood that the content of the invention described herein can be implemented in a variety of programming languages, and that the above description of a specific language is given to disclose the best mode of the invention.
Numerous specific details are set forth in the specification provided here. It is to be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques are not shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid understanding of one or more of the various inventive aspects, the features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. This method of disclosure, however, should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single embodiment disclosed above. The claims following the detailed description are therefore expressly incorporated into that detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will appreciate that the modules in the device of an embodiment can be adaptively changed and arranged in one or more devices different from that embodiment. The modules, units or components of an embodiment may be combined into one module, unit or component, and may furthermore be divided into multiple sub-modules, sub-units or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract and drawings) may be replaced by an alternative feature serving the same, an equivalent or a similar purpose.
In addition, those skilled in the art will understand that although some embodiments described herein include certain features that are included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference sign placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices can be embodied by one and the same item of hardware. The use of the words first, second and third does not indicate any order; these words may be interpreted as names.
The invention discloses: A1. A typesetting method for text information, comprising:
Recognizing the multiple Chinese characters contained in the text information and the multiple pinyin letters corresponding to those Chinese characters, to obtain a Chinese character set corresponding to the Chinese characters and a letter set corresponding to the pinyin letters;
Dividing the letters in the letter set into multiple letter groups according to the spacing between adjacent letters;
Adjusting the division of the letter groups in the letter set according to a preset adjustment rule, so that each letter group in the letter set corresponds to one Chinese character in the Chinese character set;
Typesetting each letter group in the letter set in association with the Chinese character in the Chinese character set to which that letter group corresponds.
A2. The method according to A1, wherein the step of dividing the letters in the letter set into multiple letter groups according to the spacing between adjacent letters specifically comprises:
Judging whether the spacing between every two adjacent letters is greater than a preset spacing threshold;
If so, inserting a separator between the two adjacent letters, the separators dividing the letters in the letter set into multiple letter groups;
Wherein the preset spacing threshold is determined according to the average width of a letter, and/or according to the average spacing between the letters in the letter set.
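The spacing-based division described in A2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation disclosed by the patent: the `Letter` record, its `x` and `width` fields, and the `factor` multiplier applied to the average letter width are hypothetical names introduced here, and the separator is modeled implicitly by starting a new group.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Letter:
    char: str     # the recognized pinyin letter
    x: float      # left edge of the letter (assumed coordinate)
    width: float  # rendered width of the letter


def split_into_groups(letters, factor=0.5):
    """Split letters (sorted left to right) into groups wherever the gap
    between two neighbors exceeds a threshold derived from the average
    letter width. `factor` is an illustrative tuning parameter."""
    if not letters:
        return []
    threshold = factor * mean(letter.width for letter in letters)
    groups = [[letters[0]]]
    for prev, cur in zip(letters, letters[1:]):
        gap = cur.x - (prev.x + prev.width)
        if gap > threshold:
            groups.append([cur])      # gap too wide: start a new letter group
        else:
            groups[-1].append(cur)    # same letter group
    return ["".join(letter.char for letter in group) for group in groups]


row = [Letter("n", 0, 5), Letter("i", 5.5, 5),
       Letter("h", 20, 5), Letter("a", 25.5, 5), Letter("o", 31, 5)]
groups = split_into_groups(row)  # -> ['ni', 'hao']
```

Deriving the threshold from the average letter width keeps the split independent of font size; a threshold based on the average spacing, the other option A2 names, would follow the same pattern.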
A3. The method according to A2, wherein the step of judging whether the spacing between every two adjacent letters is greater than the preset spacing threshold and, if so, inserting a separator between the two adjacent letters specifically comprises:
When the spacing between two adjacent letters is judged to be greater than the preset spacing threshold, determining, according to the text information, the positional relationship between the multiple Chinese characters and the multiple pinyin letters corresponding to them;
Judging, according to the positional relationship, whether the two adjacent letters correspond to the same Chinese character and, if not, inserting a separator between the two adjacent letters.
A4. The method according to any one of A1-A3, wherein the preset adjustment rule comprises:
For each letter group, judging whether the number of letters contained in the letter group is greater than a preset pinyin letter count;
If so, splitting the letter group into at least two letter groups, so that each letter group obtained by the split corresponds to one Chinese character in the Chinese character set.
A5. The method according to A4, wherein the step of splitting the letter group into at least two letter groups, so that each letter group obtained by the split corresponds to one Chinese character in the Chinese character set, specifically comprises:
Determining the first Chinese character in the Chinese character set that corresponds to the letter group, and querying the pinyin of that first Chinese character;
Splitting the letter group into at least two letter groups according to the pinyin of the first Chinese character, so that each letter group obtained by the split corresponds to one Chinese character in the Chinese character set.
A6. The method according to A5, wherein the step of splitting the letter group into at least two letter groups according to the pinyin of the first Chinese character specifically comprises:
Judging whether the first Chinese character has a unique pinyin;
If not, matching each pinyin of the first Chinese character against the letter group, and splitting the letter group into at least two letter groups according to the matching result.
A7. The method according to A6, further comprising, after the step of matching each pinyin of the first Chinese character against the letter group:
Matching the pinyin of the Chinese character that follows the first Chinese character against the letter group.
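The splitting rule of A4 to A7 can be illustrated with a small backtracking matcher. This is a sketch under assumptions rather than the patented procedure: the `PINYIN` table is a hypothetical toy dictionary with tone marks omitted, `MAX_PINYIN_LEN = 6` is an assumed value for the preset pinyin letter count, and the first Chinese character of a group is taken to be the one following the characters already matched (the second option described in A8).

```python
MAX_PINYIN_LEN = 6  # assumed preset limit on letters per pinyin syllable

# Hypothetical toy lookup: character -> possible toneless pinyin readings.
PINYIN = {
    "你": ["ni"],
    "好": ["hao"],
    "长": ["chang", "zhang"],  # polyphonic character
    "大": ["da", "dai"],       # polyphonic character
}


def split_group(group, chars, lookup):
    """Split `group` into one reading per character, consuming characters
    from `chars` in order until the group is used up. Every candidate
    reading of a polyphonic character is tried. Returns None on failure."""
    if group == "":
        return []
    if not chars:
        return None
    for reading in lookup.get(chars[0], ()):
        if group.startswith(reading):
            rest = split_group(group[len(reading):], chars[1:], lookup)
            if rest is not None:
                return [reading] + rest
    return None


def adjust_groups(groups, chars, lookup=PINYIN, max_len=MAX_PINYIN_LEN):
    """Split every over-long letter group; leave the others untouched."""
    result = []
    for group in groups:
        if len(group) > max_len:
            # The first character for this group follows those already matched.
            pieces = split_group(group, chars[len(result):], lookup)
            result.extend(pieces if pieces is not None else [group])
        else:
            result.append(group)
    return result


adjusted = adjust_groups(["ni", "zhangda", "hao"], list("你长大好"))
# -> ['ni', 'zhang', 'da', 'hao']
```

Trying every reading and then checking the following character against the remainder is what lets the matcher reject `chang` for 长 here and settle on `zhang`.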
A8. The method according to any one of A5-A7, wherein the step of determining the first Chinese character in the Chinese character set that corresponds to the letter group specifically comprises:
Determining, according to the text information, the positional relationship between the multiple Chinese characters and the multiple pinyin letters corresponding to them, and determining, according to the positional relationship, the first Chinese character in the Chinese character set that corresponds to the letter group; and/or
Determining the Chinese character in the Chinese character set that corresponds to the letter group preceding this letter group, and taking the Chinese character that follows it in the Chinese character set as the first Chinese character corresponding to this letter group.
A9. The method according to any one of A1-A8, wherein the preset adjustment rule comprises:
Judging whether the multiple Chinese characters in the Chinese character set include a silent Chinese character, that is, one with no corresponding pinyin;
If so, inserting a blank letter group into the letter set, so that the blank letter group corresponds to the silent Chinese character.
A10. The method according to A9, wherein the step of judging whether the multiple Chinese characters in the Chinese character set include a silent Chinese character specifically comprises:
Determining, according to the text information, the positional relationship between the multiple Chinese characters and the multiple pinyin letters corresponding to them;
Judging, according to the positional relationship, whether a pinyin letter exists at the position vertically aligned with each Chinese character;
If not, determining that Chinese character to be a silent Chinese character.
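A position-based check of the kind described in A10 might look like the sketch below. It assumes, purely for illustration, that the pinyin is laid out in a row above a horizontal row of Chinese characters, so that "vertically aligned" reduces to an overlap of horizontal extents; the `Box` record and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Box:
    text: str
    left: float   # left edge of the glyph (assumed coordinate)
    right: float  # right edge of the glyph


def find_silent(chars, letters):
    """Return the indices of characters that have no pinyin letter whose
    horizontal extent overlaps the character's own extent."""
    silent = []
    for index, ch in enumerate(chars):
        has_pinyin = any(
            letter.left < ch.right and letter.right > ch.left
            for letter in letters
        )
        if not has_pinyin:
            silent.append(index)
    return silent


chars = [Box("你", 0, 20), Box("的", 20, 40), Box("好", 40, 60)]
letters = [Box("n", 2, 8), Box("i", 8, 12),
           Box("h", 42, 47), Box("a", 47, 52), Box("o", 52, 57)]
silent_indices = find_silent(chars, letters)  # -> [1]
```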
A11. The method according to A9 or A10, wherein the step of judging whether the multiple Chinese characters in the Chinese character set include a silent Chinese character specifically comprises:
For each Chinese character in the Chinese character set, querying the pinyin of that Chinese character;
Judging whether the letter set contains a letter group that matches that pinyin and, if not, determining that Chinese character to be a silent Chinese character.
A12. The method according to any one of A9-A11, wherein the step of inserting a blank letter group into the letter set, so that the blank letter group corresponds to the silent Chinese character, specifically comprises:
Finding the Chinese character adjacent to the silent Chinese character, and determining the adjacent letter group in the letter set that corresponds to that adjacent Chinese character;
Inserting the blank letter group into the letter set at a position adjacent to the adjacent letter group;
Wherein the blank letter group is represented by a preset blank identifier.
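The insertion step of A12 can be sketched as a list operation. The example assumes that the silent characters have already been identified by index and that the empty string serves as the preset blank identifier; both choices, like the function name, are illustrative assumptions.

```python
BLANK = ""  # assumed preset blank identifier


def insert_blank_groups(groups, silent_indices, blank=BLANK):
    """Return a copy of `groups` with a blank letter group inserted for
    each silent character, so that group i lines up with character i.

    Processing indices in ascending order means the group of the
    preceding neighbor already sits at position index - 1, so inserting
    at `index` places the blank group right after it. For index 0 the
    blank group lands right before the group of the following neighbor."""
    result = list(groups)
    for index in sorted(silent_indices):
        result.insert(index, blank)
    return result


# Characters 你 的 好, where 的 (index 1) carries no pinyin.
aligned = insert_blank_groups(["ni", "hao"], [1])  # -> ['ni', '', 'hao']
```

After insertion the letter set has the same length as the Chinese character set, which is what allows each letter group to be paired with exactly one character.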
A13. The method according to any one of A1-A12, wherein the multiple Chinese characters contained in the text information are Chinese characters located in the same row or the same column, and the pinyin letters corresponding to those Chinese characters are pinyin letters located in the same row or the same column.
A14. The method according to any one of A1-A13, wherein the text information is information typeset in a fixed-layout typesetting mode;
And the step of typesetting each letter group in the letter set in association with the Chinese character in the Chinese character set to which that letter group corresponds specifically comprises:
Typesetting, in a reflowable (streaming) typesetting mode, each letter group in the letter set in association with the Chinese character in the Chinese character set to which that letter group corresponds.
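One way to picture the associated typesetting in a reflowable format is to emit HTML ruby annotations once the letter groups and Chinese characters are aligned one to one. The patent does not name an output format, so the choice of HTML `<ruby>`/`<rt>` markup, the function name `to_ruby`, and the use of an empty string for a blank letter group are all assumptions made for this sketch.

```python
from html import escape


def to_ruby(chars, groups):
    """Pair each Chinese character with its letter group and render the
    pair as an HTML ruby annotation. Characters whose letter group is
    blank are emitted without an annotation."""
    if len(chars) != len(groups):
        raise ValueError("each Chinese character needs exactly one letter group")
    parts = []
    for ch, group in zip(chars, groups):
        if group:
            parts.append(f"<ruby>{escape(ch)}<rt>{escape(group)}</rt></ruby>")
        else:
            parts.append(escape(ch))
    return "".join(parts)


markup = to_ruby("你好", ["ni", "hao"])
# -> '<ruby>你<rt>ni</rt></ruby><ruby>好<rt>hao</rt></ruby>'
```

Because each annotation travels with its character, the pairing survives re-wrapping when the page width or font size changes, which is the property a reflowable layout needs.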
B15. An electronic device, comprising: a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with one another via the communication bus;
The memory is configured to store at least one executable instruction, and the executable instruction causes the processor to perform the following operations:
Recognizing the multiple Chinese characters contained in the text information and the multiple pinyin letters corresponding to those Chinese characters, to obtain a Chinese character set corresponding to the Chinese characters and a letter set corresponding to the pinyin letters;
Dividing the letters in the letter set into multiple letter groups according to the spacing between adjacent letters;
Adjusting the division of the letter groups in the letter set according to a preset adjustment rule, so that each letter group in the letter set corresponds to one Chinese character in the Chinese character set;
Typesetting each letter group in the letter set in association with the Chinese character in the Chinese character set to which that letter group corresponds.
B16. The electronic device according to B15, wherein the executable instruction further causes the processor to perform the following operations:
Judging whether the spacing between every two adjacent letters is greater than a preset spacing threshold;
If so, inserting a separator between the two adjacent letters, the separators dividing the letters in the letter set into multiple letter groups;
Wherein the preset spacing threshold is determined according to the average width of a letter, and/or according to the average spacing between the letters in the letter set.
B17. The electronic device according to B16, wherein the executable instruction further causes the processor to perform the following operations:
When the spacing between two adjacent letters is judged to be greater than the preset spacing threshold, determining, according to the text information, the positional relationship between the multiple Chinese characters and the multiple pinyin letters corresponding to them;
Judging, according to the positional relationship, whether the two adjacent letters correspond to the same Chinese character and, if not, inserting a separator between the two adjacent letters.
B18. The electronic device according to any one of B15-B17, wherein the preset adjustment rule comprises:
For each letter group, judging whether the number of letters contained in the letter group is greater than a preset pinyin letter count;
If so, splitting the letter group into at least two letter groups, so that each letter group obtained by the split corresponds to one Chinese character in the Chinese character set.
B19. The electronic device according to B18, wherein the executable instruction further causes the processor to perform the following operations:
Determining the first Chinese character in the Chinese character set that corresponds to the letter group, and querying the pinyin of that first Chinese character;
Splitting the letter group into at least two letter groups according to the pinyin of the first Chinese character, so that each letter group obtained by the split corresponds to one Chinese character in the Chinese character set.
B20. The electronic device according to B19, wherein the executable instruction further causes the processor to perform the following operations:
Judging whether the first Chinese character has a unique pinyin;
If not, matching each pinyin of the first Chinese character against the letter group, and splitting the letter group into at least two letter groups according to the matching result.
B21. The electronic device according to B20, wherein the executable instruction further causes the processor to perform the following operation after matching each pinyin of the first Chinese character against the letter group:
Matching the pinyin of the Chinese character that follows the first Chinese character against the letter group.
B22. The electronic device according to any one of B19-B21, wherein the executable instruction further causes the processor to perform the following operations:
Determining, according to the text information, the positional relationship between the multiple Chinese characters and the multiple pinyin letters corresponding to them, and determining, according to the positional relationship, the first Chinese character in the Chinese character set that corresponds to the letter group; and/or
Determining the Chinese character in the Chinese character set that corresponds to the letter group preceding this letter group, and taking the Chinese character that follows it in the Chinese character set as the first Chinese character corresponding to this letter group.
B23. The electronic device according to any one of B15-B22, wherein the preset adjustment rule comprises:
Judging whether the multiple Chinese characters in the Chinese character set include a silent Chinese character, that is, one with no corresponding pinyin;
If so, inserting a blank letter group into the letter set, so that the blank letter group corresponds to the silent Chinese character.
B24. The electronic device according to B23, wherein the executable instruction further causes the processor to perform the following operations:
Determining, according to the text information, the positional relationship between the multiple Chinese characters and the multiple pinyin letters corresponding to them;
Judging, according to the positional relationship, whether a pinyin letter exists at the position vertically aligned with each Chinese character;
If not, determining that Chinese character to be a silent Chinese character.
B25. The electronic device according to B23 or B24, wherein the executable instruction further causes the processor to perform the following operations:
For each Chinese character in the Chinese character set, querying the pinyin of that Chinese character;
Judging whether the letter set contains a letter group that matches that pinyin and, if not, determining that Chinese character to be a silent Chinese character.
B26. The electronic device according to any one of B23-B25, wherein the executable instruction further causes the processor to perform the following operations:
Finding the Chinese character adjacent to the silent Chinese character, and determining the adjacent letter group in the letter set that corresponds to that adjacent Chinese character;
Inserting the blank letter group into the letter set at a position adjacent to the adjacent letter group;
Wherein the blank letter group is represented by a preset blank identifier.
B27. The electronic device according to any one of B15-B26, wherein the multiple Chinese characters contained in the text information are Chinese characters located in the same row or the same column, and the pinyin letters corresponding to those Chinese characters are pinyin letters located in the same row or the same column.
B28. The electronic device according to any one of B15-B27, wherein the text information is information typeset in a fixed-layout typesetting mode;
The executable instruction further causes the processor to perform the following operation:
Typesetting, in a reflowable (streaming) typesetting mode, each letter group in the letter set in association with the Chinese character in the Chinese character set to which that letter group corresponds.
C29. A computer-readable storage medium storing at least one executable instruction, the executable instruction causing a processor to perform the following operations:
Recognizing the multiple Chinese characters contained in the text information and the multiple pinyin letters corresponding to those Chinese characters, to obtain a Chinese character set corresponding to the Chinese characters and a letter set corresponding to the pinyin letters;
Dividing the letters in the letter set into multiple letter groups according to the spacing between adjacent letters;
Adjusting the division of the letter groups in the letter set according to a preset adjustment rule, so that each letter group in the letter set corresponds to one Chinese character in the Chinese character set;
Typesetting each letter group in the letter set in association with the Chinese character in the Chinese character set to which that letter group corresponds.
C30. The computer-readable storage medium according to C29, wherein the executable instruction further causes the processor to perform the following operations:
Judging whether the spacing between every two adjacent letters is greater than a preset spacing threshold;
If so, inserting a separator between the two adjacent letters, the separators dividing the letters in the letter set into multiple letter groups;
Wherein the preset spacing threshold is determined according to the average width of a letter, and/or according to the average spacing between the letters in the letter set.
C31. The computer-readable storage medium according to C30, wherein the executable instruction further causes the processor to perform the following operations:
When the spacing between two adjacent letters is judged to be greater than the preset spacing threshold, determining, according to the text information, the positional relationship between the multiple Chinese characters and the multiple pinyin letters corresponding to them;
Judging, according to the positional relationship, whether the two adjacent letters correspond to the same Chinese character and, if not, inserting a separator between the two adjacent letters.
C32. The computer-readable storage medium according to any one of C29-C31, wherein the preset adjustment rule comprises:
For each letter group, judging whether the number of letters contained in the letter group is greater than a preset pinyin letter count;
If so, splitting the letter group into at least two letter groups, so that each letter group obtained by the split corresponds to one Chinese character in the Chinese character set.
C33. The computer-readable storage medium according to C32, wherein the executable instruction further causes the processor to perform the following operations:
Determining the first Chinese character in the Chinese character set that corresponds to the letter group, and querying the pinyin of that first Chinese character;
Splitting the letter group into at least two letter groups according to the pinyin of the first Chinese character, so that each letter group obtained by the split corresponds to one Chinese character in the Chinese character set.
C34. The computer-readable storage medium according to C33, wherein the executable instruction further causes the processor to perform the following operations:
Judging whether the first Chinese character has a unique pinyin;
If not, matching each pinyin of the first Chinese character against the letter group, and splitting the letter group into at least two letter groups according to the matching result.
C35. The computer-readable storage medium according to C34, wherein the executable instruction further causes the processor to perform the following operation after matching each pinyin of the first Chinese character against the letter group:
Matching the pinyin of the Chinese character that follows the first Chinese character against the letter group.
C36. The computer-readable storage medium according to any one of C33-C35, wherein the executable instruction further causes the processor to perform the following operations:
Determining, according to the text information, the positional relationship between the multiple Chinese characters and the multiple pinyin letters corresponding to them, and determining, according to the positional relationship, the first Chinese character in the Chinese character set that corresponds to the letter group; and/or
Determining the Chinese character in the Chinese character set that corresponds to the letter group preceding this letter group, and taking the Chinese character that follows it in the Chinese character set as the first Chinese character corresponding to this letter group.
C37. The computer-readable storage medium according to any one of C29-C36, wherein the preset adjustment rule comprises:
Judging whether the multiple Chinese characters in the Chinese character set include a silent Chinese character, that is, one with no corresponding pinyin;
If so, inserting a blank letter group into the letter set, so that the blank letter group corresponds to the silent Chinese character.
C38. The computer-readable storage medium according to C37, wherein the executable instruction further causes the processor to perform the following operations:
Determining, according to the text information, the positional relationship between the multiple Chinese characters and the multiple pinyin letters corresponding to them;
Judging, according to the positional relationship, whether a pinyin letter exists at the position vertically aligned with each Chinese character;
If not, determining that Chinese character to be a silent Chinese character.
C39. The computer-readable storage medium according to C37 or C38, wherein the executable instruction further causes the processor to perform the following operations:
For each Chinese character in the Chinese character set, querying the pinyin of that Chinese character;
Judging whether the letter set contains a letter group that matches that pinyin and, if not, determining that Chinese character to be a silent Chinese character.
C40. The computer-readable storage medium according to any one of C37-C39, wherein the executable instruction further causes the processor to perform the following operations:
Finding the Chinese character adjacent to the silent Chinese character, and determining the adjacent letter group in the letter set that corresponds to that adjacent Chinese character;
Inserting the blank letter group into the letter set at a position adjacent to the adjacent letter group;
Wherein the blank letter group is represented by a preset blank identifier.
C41. The computer-readable storage medium according to any one of C39-C40, wherein the multiple Chinese characters contained in the text information are Chinese characters located in the same row or the same column, and the pinyin letters corresponding to those Chinese characters are pinyin letters located in the same row or the same column.
C42. The computer-readable storage medium according to any one of C29-C41, wherein the text information is information typeset in a fixed-layout typesetting mode;
The executable instruction further causes the processor to perform the following operation:
Typesetting, in a reflowable (streaming) typesetting mode, each letter group in the letter set in association with the Chinese character in the Chinese character set to which that letter group corresponds.

Claims (10)

1. A typesetting method for text information, comprising:
Recognizing the multiple Chinese characters contained in the text information and the multiple pinyin letters corresponding to those Chinese characters, to obtain a Chinese character set corresponding to the Chinese characters and a letter set corresponding to the pinyin letters;
Dividing the letters in the letter set into multiple letter groups according to the spacing between adjacent letters;
Adjusting the division of the letter groups in the letter set according to a preset adjustment rule, so that each letter group in the letter set corresponds to one Chinese character in the Chinese character set;
Typesetting each letter group in the letter set in association with the Chinese character in the Chinese character set to which that letter group corresponds.
2. The method according to claim 1, wherein the step of dividing the letters in the letter set into a plurality of letter groups according to the spacing between adjacent letters specifically comprises:
determining whether the spacing between every two adjacent letters is greater than a preset spacing threshold;
if so, inserting a separator between the two adjacent letters, so that the separators divide the letters in the letter set into a plurality of letter groups;
wherein the preset spacing threshold is determined according to the average width of a letter, and/or according to the average spacing between the letters in the letter set.
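The spacing-based division can be sketched as follows. The sketch assumes each recognized letter carries a left edge and a width, and it derives the threshold as a fraction of the average letter width. The `Letter` type, the function name `split_by_gap`, and the factor 0.5 are hypothetical choices for illustration and are not values given in the patent.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Letter:
    char: str
    x: float      # left edge of the letter on the line
    width: float  # rendered width of the letter


def split_by_gap(letters, factor=0.5):
    """Group letters into strings, starting a new group wherever the gap
    to the previous letter exceeds factor * average letter width."""
    if not letters:
        return []
    threshold = factor * mean(l.width for l in letters)
    groups = [[letters[0]]]
    for prev, cur in zip(letters, letters[1:]):
        gap = cur.x - (prev.x + prev.width)
        if gap > threshold:
            groups.append([cur])    # acts as the inserted separator
        else:
            groups[-1].append(cur)
    return ["".join(l.char for l in g) for g in groups]
```

A threshold based on the average spacing between letters, the other option the claim allows, would only change how `threshold` is computed.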
3. The method according to claim 2, wherein the step of determining whether the spacing between every two adjacent letters is greater than the preset spacing threshold and, if so, inserting a separator between the two adjacent letters specifically comprises:
when the spacing between two adjacent letters is determined to be greater than the preset spacing threshold, determining, according to the text information, the positional relationship between the plurality of Chinese characters and the plurality of pinyin letters corresponding to the plurality of Chinese characters;
determining, according to the positional relationship, whether the two adjacent letters correspond to the same Chinese character, and if not, inserting a separator between the two adjacent letters.
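The positional check can be sketched under the assumption that the text is horizontal and that each Chinese character occupies a known horizontal span, so a letter belongs to the character whose span contains the letter's center. The function names and the span representation are hypothetical, and treating an unmatched letter as a separate case is a design choice of this sketch rather than something the patent specifies.

```python
def char_index_at(x_center, char_spans):
    """Return the index of the character whose (left, right) span
    contains x_center, or None if no span contains it."""
    for i, (left, right) in enumerate(char_spans):
        if left <= x_center < right:
            return i
    return None


def needs_separator(prev_center, cur_center, char_spans):
    """After the gap test has passed, confirm the separator only when the
    two letters do not sit over the same Chinese character."""
    a = char_index_at(prev_center, char_spans)
    b = char_index_at(cur_center, char_spans)
    if a is None or b is None:
        return True  # position unknown: fall back to the gap test result
    return a != b
```

This second test guards against wide gaps inside a single syllable, for example when the pinyin is stretched to match the width of the character beneath it.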
4. The method according to any one of claims 1-3, wherein the preset adjustment rule comprises:
for each letter group, determining whether the number of letters included in the letter group is greater than a preset pinyin letter count;
if so, splitting the letter group into at least two letter groups, so that each letter group after splitting corresponds to one Chinese character included in the Chinese character set.
5. The method according to claim 4, wherein the step of splitting the letter group into at least two letter groups, so that each letter group after splitting corresponds to one Chinese character included in the Chinese character set, specifically comprises:
determining the first Chinese character in the Chinese character set that corresponds to the letter group, and looking up the pinyin corresponding to the first Chinese character;
splitting the letter group into at least two letter groups according to the pinyin corresponding to the first Chinese character, so that each letter group after splitting corresponds to one Chinese character included in the Chinese character set.
6. The method according to claim 5, wherein the step of splitting the letter group into at least two letter groups according to the pinyin corresponding to the first Chinese character specifically comprises:
determining whether the pinyin corresponding to the first Chinese character is unique;
if not, matching each pinyin corresponding to the first Chinese character against the letter group, and splitting the letter group into at least two letter groups according to the matching result.
7. The method according to claim 6, wherein, after the step of matching each pinyin corresponding to the first Chinese character against the letter group, the method further comprises:
matching the pinyin of the Chinese character that follows the first Chinese character against the letter group.
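Splitting an over-long letter group by matching candidate readings can be sketched as a prefix match with backtracking. The sketch assumes toneless pinyin strings and a tiny pronunciation table. The `PINYIN` table and the function name `split_group` are hypothetical, and a real system would consult a complete dictionary. Backtracking over the readings is one possible way to handle a character with several readings, not necessarily the procedure the patent intends.

```python
# Hypothetical pronunciation table; polyphonic characters list
# several candidate readings.
PINYIN = {
    "长": ["chang", "zhang"],
    "大": ["da", "dai"],
    "了": ["le", "liao"],
}


def split_group(group, chars, table=PINYIN):
    """Split a merged letter group into one reading per character.

    Tries each reading of the first character as a prefix of `group`,
    then recurses on the remaining letters and characters. Returns a
    list of readings, or None if no combination consumes the group.
    """
    if not chars:
        return [] if group == "" else None
    for reading in table.get(chars[0], []):
        if group.startswith(reading):
            rest = split_group(group[len(reading):], chars[1:], table)
            if rest is not None:
                return [reading] + rest
    return None
```

For example, the merged group `zhangda` over the characters 长 and 大 is rejected for the reading `chang` and accepted for `zhang`, after which `da` is matched against the next character.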
8. The method according to any one of claims 5-7, wherein the step of determining the first Chinese character in the Chinese character set that corresponds to the letter group specifically comprises:
determining, according to the text information, the positional relationship between the plurality of Chinese characters and the plurality of pinyin letters corresponding to the plurality of Chinese characters; and determining, according to the positional relationship, the first Chinese character in the Chinese character set that corresponds to the letter group; and/or
determining the Chinese character in the Chinese character set that corresponds to the letter group preceding this letter group, and taking the Chinese character that follows it in the Chinese character set as the first Chinese character corresponding to this letter group.
9. An electronic device, comprising: a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with one another via the communication bus;
the memory is configured to store at least one executable instruction, and the executable instruction causes the processor to perform the following operations:
identifying a plurality of Chinese characters included in the text information and a plurality of pinyin letters corresponding to the plurality of Chinese characters, respectively, to obtain a Chinese character set corresponding to the plurality of Chinese characters and a letter set corresponding to the plurality of pinyin letters;
dividing the letters in the letter set into a plurality of letter groups according to the spacing between adjacent letters;
adjusting the division of the letter groups in the letter set according to a preset adjustment rule, so that each letter group included in the letter set corresponds to one Chinese character included in the Chinese character set;
associating and typesetting each letter group included in the letter set with the Chinese character in the Chinese character set that corresponds to that letter group.
10. A computer storage medium, wherein at least one executable instruction is stored in the storage medium, and the executable instruction causes a processor to perform the following operations:
identifying a plurality of Chinese characters included in the text information and a plurality of pinyin letters corresponding to the plurality of Chinese characters, respectively, to obtain a Chinese character set corresponding to the plurality of Chinese characters and a letter set corresponding to the plurality of pinyin letters;
dividing the letters in the letter set into a plurality of letter groups according to the spacing between adjacent letters;
adjusting the division of the letter groups in the letter set according to a preset adjustment rule, so that each letter group included in the letter set corresponds to one Chinese character included in the Chinese character set;
associating and typesetting each letter group included in the letter set with the Chinese character in the Chinese character set that corresponds to that letter group.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711182001.2A CN107783956B (en) 2017-11-23 2017-11-23 Composition method, electronic equipment and the computer storage medium of text information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711182001.2A CN107783956B (en) 2017-11-23 2017-11-23 Composition method, electronic equipment and the computer storage medium of text information

Publications (2)

Publication Number Publication Date
CN107783956A (en) 2018-03-09
CN107783956B (en) 2019-03-15

Family

ID=61430627

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711182001.2A Active CN107783956B (en) 2017-11-23 2017-11-23 Composition method, electronic equipment and the computer storage medium of text information

Country Status (1)

Country Link
CN (1) CN107783956B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101876967A (en) * 2010-03-25 2010-11-03 深圳市万兴软件有限公司 Method for generating PDF text paragraphs
CN103136186A (en) * 2011-12-05 2013-06-05 北大方正集团有限公司 Method and device of pinyin type setting
CN103150300A (en) * 2011-12-06 2013-06-12 北大方正集团有限公司 Pinyin typesetting method and device
CN106940596A (en) * 2016-01-04 2017-07-11 北京峰盛博远科技股份有限公司 A kind of recognition methods of multiple characters of handwriting input and system
CN106598934A (en) * 2016-12-14 2017-04-26 掌阅科技股份有限公司 Electronic book data display method and device, and terminal equipment
CN107025215A (en) * 2017-02-13 2017-08-08 阿里巴巴集团控股有限公司 A kind of picture and text composition method and device

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109325215A (en) * 2018-12-04 2019-02-12 万兴科技股份有限公司 The output method and device of Word text
CN109325215B (en) * 2018-12-04 2023-02-10 万兴科技股份有限公司 Word text output method and device
CN113052179A (en) * 2021-03-09 2021-06-29 安徽淘云科技股份有限公司 Polyphone processing method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN107783956B (en) 2019-03-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant