CN106649246B - Line breaking method and device - Google Patents


Publication number
CN106649246B
Authority
CN
China
Prior art keywords
row
line
field
adjacent
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510729273.4A
Other languages
Chinese (zh)
Other versions
CN106649246A (en)
Inventor
刘建军
王学武
于芬芬
袁朝
任珊珊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zhiwen Artificial Intelligence Software Technology Co ltd
Original Assignee
Founder International Beijing Co Ltd


Abstract

The invention relates to the technical field of software, and in particular to a line breaking method and device that address the inability of prior-art line breaking to break lines intelligently according to the semantics of a line of text content. The method comprises: for a line of target content that has already been laid out with line breaks, determining, according to a word segmentation lexicon and/or a preset grammar rule, a target field in the line that needs to be adjusted and an adjustment mode; and then moving the target field, according to the determined adjustment mode, to the end of the adjacent previous line or the beginning of the adjacent next line. The text content of each line can thus be adjusted on the basis of the word segmentation lexicon and/or the preset grammar rule, so that each line remains semantically coherent and complete.

Description

Line breaking method and device
Technical Field
The invention belongs to the technical field of software, and particularly relates to a line breaking method and device.
Background
Given text content often cannot fit on a single line, so it has to be wrapped onto several lines. The choice of line break positions then matters: well-chosen breaks make the text read naturally and clearly, keep the semantics fluent, and give a layout that is both attractive and semantically complete.
In the prior art, when typesetting text content, line breaking can be performed in the following two ways:
Method 1: break lines automatically according to the maximum number of words that one line may display, that is, every line displays exactly that maximum number of words;
Method 2: insert line breaks manually (carriage returns) according to the context semantics.
Both line breaking methods have defects:
Method 1 cannot break lines intelligently according to the semantics of the text content, so it may damage the semantic integrity and the appearance of a line of text content;
Method 2 is time-consuming and wastes resources, and its result varies from person to person and cannot be made uniform.
In summary, prior-art line breaking of text content cannot break lines intelligently according to the semantics of a line in the text content.
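As a rough illustration of Method 1, the following Python sketch breaks text purely by a fixed character count. The function name `break_fixed_width` and the sample string are hypothetical and not taken from the patent; Latin letters stand in for the characters of the text, and the line limit is assumed to be counted in characters.

```python
def break_fixed_width(text: str, max_chars: int) -> list[str]:
    """Split text into lines of at most max_chars characters, ignoring semantics."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    # Every line except possibly the last holds exactly max_chars characters.
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


lines = break_fixed_width("reuse of sludge from wastewater", 12)
for line in lines:
    print(repr(line))
```

In this example the breaks fall inside "sludge" and "wastewater", which is the kind of damage to semantic integrity that Method 1 can cause.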
Disclosure of Invention
The invention provides a line breaking method and device, which are used for solving the problem that the line breaking mode of text content in the prior art cannot carry out intelligent line breaking according to the semantics of one line in the text content.
In one aspect, an embodiment of the present application provides a line breaking method, including:
for a line of target content that has been laid out with line breaks, determining, according to a word segmentation lexicon and/or a preset grammar rule, a target field in the line that needs to be adjusted and an adjustment mode;
and moving the target field in the line to the end of the adjacent previous line or the beginning of the adjacent next line according to the determined adjustment mode.
With the line breaking method provided by the embodiment of the present application, for a line of target content that has been laid out with line breaks, the target field that needs to be adjusted and the adjustment mode are determined according to the word segmentation lexicon and/or the preset grammar rule, and the target field is then moved to the end of the adjacent previous line or the beginning of the adjacent next line according to the determined adjustment mode. The text content of each line can thus be adjusted on the basis of the word segmentation lexicon and/or the preset grammar rule, so that each line remains semantically coherent and complete.
Optionally, determining the target field that needs to be adjusted in the line and the adjustment mode according to the word segmentation lexicon includes:
if the phrase formed by the field at the tail of the line and the field at the head of the adjacent next line belongs to the word segmentation lexicon, determining that the target field is the field at the tail of the line, and that the adjustment mode is to move the target field to the head of the adjacent next line; or
if the phrase formed by the field at the head of the line and the field at the tail of the adjacent previous line belongs to the word segmentation lexicon, determining that the target field is the field at the head of the line, and that the adjustment mode is to move the target field to the tail of the adjacent previous line.
Optionally, determining the target field that needs to be adjusted in the line and the adjustment mode according to the preset grammar rule includes:
if the field at the tail of the line and the field at the head of the adjacent next line meet the preset grammar rule, determining that the target field is the field at the tail of the line, and that the adjustment mode is to move the target field to the head of the adjacent next line; or
if the field at the head of the line and the field at the tail of the adjacent previous line meet the preset grammar rule, determining that the target field is the field at the head of the line, and that the adjustment mode is to move the target field to the tail of the adjacent previous line.
Optionally, moving the target field in the line to the end of the adjacent previous line or the beginning of the adjacent next line according to the determined adjustment mode further includes:
determining that, after the target field in the line is moved to the end of the adjacent previous line according to the adjustment mode, the number of words in the adjacent previous line is not greater than the maximum number of words of a line; or
determining that, after the target field in the line is moved to the beginning of the adjacent next line according to the adjustment mode, the number of words in the adjacent next line is not greater than the maximum number of words of a line.
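The length check can be sketched in Python as follows. This is a minimal illustration only: it assumes that a line is a list of fields, that the line limit is counted in characters, and that the move is simply skipped when the receiving line would overflow. The names `line_length` and `try_move_head_to_previous` are hypothetical.

```python
def line_length(fields: list[str]) -> int:
    """Length of a line, counted here in characters (an assumption)."""
    return sum(len(field) for field in fields)


def try_move_head_to_previous(lines: list[list[str]], i: int, max_chars: int) -> bool:
    """Move the head field of lines[i] to the tail of lines[i - 1] if it fits.

    Returns True if the field was moved, False if the move was skipped.
    """
    if i <= 0 or i >= len(lines) or not lines[i]:
        return False
    target = lines[i][0]
    # Check the receiving line against the maximum before moving anything.
    if line_length(lines[i - 1]) + len(target) > max_chars:
        return False
    lines[i - 1].append(lines[i].pop(0))
    return True


lines = [["abc", "de"], ["fg", "hij"]]
moved = try_move_head_to_previous(lines, 1, max_chars=7)
print(moved, lines)
```

The symmetric case, moving a tail field to the head of the next line, would apply the same check to the next line.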
Optionally, the method further includes:
if, after the target field in the line is moved to the end of the adjacent previous line according to the adjustment mode, the number of words in the adjacent previous line is greater than the maximum number of words of a line, determining a target field of the adjacent previous line and an adjustment mode for the adjacent previous line, and adjusting that target field according to that adjustment mode; or
if, after the target field in the line is moved to the beginning of the adjacent next line according to the determined adjustment mode, the number of words in the adjacent next line is greater than the maximum number of words of a line, determining a target field of the adjacent next line and an adjustment mode for the adjacent next line, and adjusting that target field according to that adjustment mode.
Optionally, the target field of the adjacent previous line is a field that, together with the field at the head of the line, forms a phrase that belongs to the word segmentation lexicon or meets the preset grammar rule;
the corresponding adjustment mode is to move the target field of the adjacent previous line to the head of the line;
the target field of the adjacent next line is a field that, together with the field at the tail of the line, forms a phrase that belongs to the word segmentation lexicon or meets the preset grammar rule;
the corresponding adjustment mode is to move the target field of the adjacent next line to the tail of the line.
In another aspect, an embodiment of the present application provides a line breaking device, including:
a determining unit, configured to determine, for a line of target content that has been laid out with line breaks, a target field in the line that needs to be adjusted and an adjustment mode according to a word segmentation lexicon and/or a preset grammar rule;
and an adjusting unit, configured to move the target field in the line to the end of the adjacent previous line or the beginning of the adjacent next line according to the determined adjustment mode.
Optionally, the determining unit is specifically configured to:
if the phrase formed by the field at the tail of the line and the field at the head of the adjacent next line belongs to the word segmentation lexicon, determine that the target field is the field at the tail of the line, and that the adjustment mode is to move the target field to the head of the adjacent next line; or
if the phrase formed by the field at the head of the line and the field at the tail of the adjacent previous line belongs to the word segmentation lexicon, determine that the target field is the field at the head of the line, and that the adjustment mode is to move the target field to the tail of the adjacent previous line.
Optionally, the determining unit is specifically configured to:
if the field at the tail of the line and the field at the head of the adjacent next line meet the preset grammar rule, determine that the target field is the field at the tail of the line, and that the adjustment mode is to move the target field to the head of the adjacent next line; or
if the field at the head of the line and the field at the tail of the adjacent previous line meet the preset grammar rule, determine that the target field is the field at the head of the line, and that the adjustment mode is to move the target field to the tail of the adjacent previous line.
Optionally, the determining unit is specifically configured to:
if, after the target field in the line is moved to the end of the adjacent previous line according to the adjustment mode, the number of words in the adjacent previous line is not greater than the maximum number of words of a line, move the target field to the end of the adjacent previous line according to the determined adjustment mode; or
if, after the target field in the line is moved to the beginning of the adjacent next line according to the adjustment mode, the number of words in the adjacent next line is not greater than the maximum number of words of a line, move the target field to the beginning of the adjacent next line according to the determined adjustment mode.
Optionally, the determining unit is specifically configured to:
if, after the target field in the line is moved to the end of the adjacent previous line according to the adjustment mode, the number of words in the adjacent previous line is greater than the maximum number of words of a line, determine a target field of the adjacent previous line and an adjustment mode for the adjacent previous line, and adjust that target field according to that adjustment mode; or
if, after the target field in the line is moved to the beginning of the adjacent next line according to the determined adjustment mode, the number of words in the adjacent next line is greater than the maximum number of words of a line, determine a target field of the adjacent next line and an adjustment mode for the adjacent next line, and adjust that target field according to that adjustment mode.
Optionally, the target field of the adjacent previous line is a field that, together with the field at the head of the line, forms a phrase that belongs to the word segmentation lexicon or meets the preset grammar rule;
the corresponding adjustment mode is to move the target field of the adjacent previous line to the head of the line;
the target field of the adjacent next line is a field that, together with the field at the tail of the line, forms a phrase that belongs to the word segmentation lexicon or meets the preset grammar rule;
the corresponding adjustment mode is to move the target field of the adjacent next line to the tail of the line.
Drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flowchart of a line breaking method according to an embodiment of the present invention;
FIG. 2 is a detailed flowchart of a line breaking method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a line breaking device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
With the line breaking method provided by the embodiment of the present application, for a line of target content that has been laid out with line breaks, the target field that needs to be adjusted and the adjustment mode are determined according to the word segmentation lexicon and/or the preset grammar rule, and the target field is then moved to the end of the adjacent previous line or the beginning of the adjacent next line according to the determined adjustment mode. The text content of each line can thus be adjusted on the basis of the word segmentation lexicon and/or the preset grammar rule, so that each line remains semantically coherent and complete.
The embodiments of the present application will be described in further detail with reference to the drawings attached hereto.
FIG. 1 shows a flowchart of a line breaking method according to an embodiment of the present invention. The method is executed by a line breaking device and includes:
Step 101: for a line of target content that has been laid out with line breaks, determining, according to a word segmentation lexicon and/or a preset grammar rule, a target field in the line that needs to be adjusted and an adjustment mode;
Step 102: moving the target field in the line to the end of the adjacent previous line or the beginning of the adjacent next line according to the determined adjustment mode.
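The separation between determining an adjustment (step 101) and applying it (step 102) can be sketched in Python as below. This is only an illustrative data model: the names `Direction`, `Adjustment` and `apply_adjustment` are hypothetical, and a line is assumed to be a list of fields.

```python
from dataclasses import dataclass
from enum import Enum


class Direction(Enum):
    TO_PREVIOUS_TAIL = "to_previous_tail"  # head field goes to the end of the previous line
    TO_NEXT_HEAD = "to_next_head"          # tail field goes to the beginning of the next line


@dataclass
class Adjustment:
    """Result of step 101: which line to adjust and in which direction."""
    line_index: int
    direction: Direction


def apply_adjustment(lines: list[list[str]], adj: Adjustment) -> None:
    """Step 102: move the target field in place according to the adjustment."""
    i = adj.line_index
    if i < 0 or i >= len(lines) or not lines[i]:
        raise ValueError("line index out of range or line is empty")
    if adj.direction is Direction.TO_NEXT_HEAD:
        if i + 1 >= len(lines):
            raise ValueError("there is no next line")
        lines[i + 1].insert(0, lines[i].pop())
    else:
        if i == 0:
            raise ValueError("there is no previous line")
        lines[i - 1].append(lines[i].pop(0))


lines = [["a", "b"], ["c", "d"]]
apply_adjustment(lines, Adjustment(0, Direction.TO_NEXT_HEAD))
print(lines)
```

Keeping the decision as a small value object makes it possible to inspect or validate an adjustment, for instance against a line length limit, before applying it.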
In step 101, the target content that has been laid out with line breaks may be content that was first broken into lines in some way and now needs to be laid out again by semantic line breaking. For example, after an article has been laid out in the conventional way, by the maximum number of words that one line may display, its title may need to be laid out again by semantic line breaking so that it is easier to understand; the title after the initial line breaking is then the target content of the present invention. As another example, the greeting text on a greeting card may first be laid out in a particular shape, such as a heart, after which all of the laid-out text is laid out again by semantic line breaking; all of that text is then the target content of the present invention.
Specifically, for a line of the target content that has been laid out with line breaks, the target field in the line that needs to be adjusted and the adjustment mode are determined according to the word segmentation lexicon and/or the preset grammar rule.
It should be particularly noted that the adjustment must be uniform across all lines. For example, every line is adjusted on the basis of the word segmentation lexicon and its line-head target field, or every line is adjusted on the basis of the preset grammar rule and its line-head target field. Other uniform combinations are also possible, such as the word segmentation lexicon together with adjusting the line-end target field.
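One way to keep the adjustment uniform is to fix a single criterion and a single direction for a whole pass over the lines. The Python sketch below assumes lines are lists of fields; the function name `rebreak`, the mode strings, and the toy lexicon are hypothetical, and the criterion is passed in as a function so that either a lexicon lookup or a grammar rule can be used.

```python
from typing import Callable

Lines = list[list[str]]
JoinTest = Callable[[str, str], bool]  # (tail field, head field) -> must share a line?


def rebreak(lines: Lines, should_join: JoinTest, mode: str) -> Lines:
    """Apply one criterion and one direction to every line boundary, top to bottom."""
    if mode not in ("tail_to_next", "head_to_previous"):
        raise ValueError("mode must be 'tail_to_next' or 'head_to_previous'")
    out = [list(line) for line in lines]  # work on a copy
    for i in range(len(out) - 1):
        upper, lower = out[i], out[i + 1]
        if not upper or not lower or not should_join(upper[-1], lower[0]):
            continue
        if mode == "tail_to_next":
            lower.insert(0, upper.pop())
        else:
            upper.append(lower.pop(0))
    return out


# Toy lexicon of field pairs that form one word (Latin stand-ins for real fields).
pairs = {("slu", "dge"), ("re", "use")}
lines = [["a", "slu"], ["dge", "b", "re"], ["use", "c"]]
print(rebreak(lines, lambda t, h: (t, h) in pairs, "tail_to_next"))
print(rebreak(lines, lambda t, h: (t, h) in pairs, "head_to_previous"))
```

This single pass does not check line length; a fuller version would combine it with the maximum-length condition described earlier.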
The word segmentation lexicon contains phrases that carry a complete meaning. The phrases may come from a dictionary or may be added to the lexicon manually, for example recent popular expressions such as "building owner" and "you understand", industry terms such as "frequency modulation" and "decoding", or abbreviations with a specific meaning such as "review" or "Olympic Commission". With continuous updating, word segmentation based on the lexicon becomes more accurate.
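A lexicon of this kind can be modelled, in the simplest case, as a set of strings that is loaded from a dictionary source and extended by hand. The Python sketch below is an assumption about representation only; `load_lexicon` and `add_words` are hypothetical helper names.

```python
def load_lexicon(words: list[str]) -> set[str]:
    """Build a lexicon from dictionary entries, ignoring blank entries."""
    return {w.strip() for w in words if w.strip()}


def add_words(lexicon: set[str], new_words: list[str]) -> None:
    """Manually extend the lexicon, e.g. with new expressions or industry terms."""
    lexicon.update(w.strip() for w in new_words if w.strip())


lexicon = load_lexicon(["frequency modulation", "decoding", " "])
add_words(lexicon, ["sludge", ""])
print(sorted(lexicon))
```

A set gives constant-time membership tests on average, which suits the repeated lookups made while matching candidate phrases.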
Optionally, determining the target field that needs to be adjusted in the line and the adjustment mode according to the word segmentation lexicon includes:
if the phrase formed by the field at the tail of the line and the field at the head of the adjacent next line belongs to the word segmentation lexicon, determining that the target field is the field at the tail of the line, and that the adjustment mode is to move the target field to the head of the adjacent next line; or
if the phrase formed by the field at the head of the line and the field at the tail of the adjacent previous line belongs to the word segmentation lexicon, determining that the target field is the field at the head of the line, and that the adjustment mode is to move the target field to the tail of the adjacent previous line.
Determining the target field and the adjustment mode for a line according to the word segmentation lexicon can be done in two modes:
Mode 1: adjust the field at the tail of each line.
If the phrase formed by the field at the tail of a line and the field at the head of the adjacent next line belongs to the word segmentation lexicon, the target field is the field at the tail of the line, and the adjustment mode is to move the target field to the head of the adjacent next line. Specifically, a reverse matching algorithm based on the word segmentation lexicon can be applied to the field at the tail of the line. The last character of the line, "dirty", and the first character at the head of the next line, "mud", are checked first to see whether together they form a phrase in the lexicon; if they do, matching ends, otherwise it continues. Here matching succeeds at once, because "sludge", formed by "dirty" and "mud", is a phrase in the lexicon. If this first attempt failed, that is, if "sludge" were not a phrase in the lexicon, matching would continue with further candidate strings built from "dirty", "mud" and their neighbouring characters, and so on, until a match succeeds.
For example, Table 1 shows an example of moving the target field at the tail of a line to the head of the next line.
[Table 1 is an image in the original: Figure BDA0000835438290000081]
Table 1: Moving the target field at the tail of a line to the head of the next line
The first line in Table 1 is adjusted according to Mode 1. Before adjustment the first line reads "stain obtained by treating printing water by coagulation method", where "stain" renders the line-end character "dirty". Under Mode 1, the character "dirty" at the tail of this line and the character "mud" at the head of the next line are found to form the phrase "sludge", which belongs to the word segmentation lexicon. The character "dirty" in the first line is therefore the target field, and the adjustment mode is to move it to the head of the adjacent next line.
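A character-level version of Mode 1 can be sketched in Python as follows. This is one possible reading of the matching described above, not the patent's definitive algorithm: it tries the shortest candidate first (one character from each side of the break) and then longer ones, up to an assumed maximum word length. The function names, the `max_word_len` parameter and the capital letters standing in for characters are all hypothetical.

```python
from typing import Optional


def find_split_word(line: str, next_line: str, lexicon: set[str],
                    max_word_len: int = 4) -> Optional[tuple[int, int]]:
    """Find a lexicon word that straddles the break between line and next_line.

    Returns (tail_len, head_len): how many characters of the word sit at the
    tail of line and at the head of next_line, or None if no word is found.
    """
    for tail_len in range(1, min(len(line), max_word_len - 1) + 1):
        for head_len in range(1, min(len(next_line), max_word_len - tail_len) + 1):
            if line[-tail_len:] + next_line[:head_len] in lexicon:
                return tail_len, head_len
    return None


def move_tail_to_next(lines: list[str], i: int, lexicon: set[str],
                      max_word_len: int = 4) -> list[str]:
    """Mode 1: move the split word's tail part from line i to the head of line i + 1."""
    out = list(lines)
    if i < 0 or i + 1 >= len(lines):
        return out
    hit = find_split_word(lines[i], lines[i + 1], lexicon, max_word_len)
    if hit is None:
        return out
    tail_len, _ = hit
    out[i] = lines[i][:-tail_len]
    out[i + 1] = lines[i][-tail_len:] + lines[i + 1]
    return out


# "D" and "E" play the roles of "dirty" and "mud"; "DE" plays the role of "sludge".
print(move_tail_to_next(["ABCD", "EFGH"], 0, {"DE"}))
```

Mode 2 would use the same search and instead move the head part of the word up to the tail of the previous line.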
Mode 2: adjust the field at the head of each line.
If the phrase formed by the field at the head of a line and the field at the tail of the adjacent previous line belongs to the word segmentation lexicon, the target field is the field at the head of the line, and the adjustment mode is to move the target field to the tail of the adjacent previous line.
Again taking Table 1 as an example, the second line in Table 1 is adjusted according to Mode 2. Before adjustment the second line reads "study of recycling of mud". Under Mode 2, the character "mud" at the head of this line and the character "dirty" at the tail of the previous line are found to form the phrase "sludge", which belongs to the word segmentation lexicon. The character "mud" in the second line is therefore the target field, and the adjustment mode is to move it to the tail of the adjacent previous line. The adjusted target content is shown in Table 2.
[Table 2 is an image in the original: Figure BDA0000835438290000091]
Table 2: Moving the target field at the head of a line to the tail of the previous line
As can be seen, applying Mode 1 or Mode 2 to each line of the target content determines the target field and the adjustment mode for that line, after which the target field can be adjusted accordingly. In this way the target content is broken into lines according to its semantics on the basis of the word segmentation lexicon, with no manual adjustment, which saves time and improves efficiency.
Optionally, determining the target field that needs to be adjusted in the line and the adjustment mode according to the preset grammar rule includes:
if the field at the tail of the line and the field at the head of the adjacent next line meet the preset grammar rule, determining that the target field is the field at the tail of the line, and that the adjustment mode is to move the target field to the head of the adjacent next line; or
if the field at the head of the line and the field at the tail of the adjacent previous line meet the preset grammar rule, determining that the target field is the field at the head of the line, and that the adjustment mode is to move the target field to the tail of the adjacent previous line.
Determining the target field and the adjustment mode for a line according to a grammar rule can be done in two modes:
Mode 1: adjust the field at the tail of each line.
If the field at the tail of the line and the field at the head of the adjacent next line meet the preset grammar rule, the target field is the field at the tail of the line, and the adjustment mode is to move the target field to the head of the adjacent next line.
Mode 2: adjust the field at the head of each line.
If the field at the head of the line and the field at the tail of the adjacent previous line meet the preset grammar rule, the target field is the field at the head of the line, and the adjustment mode is to move the target field to the tail of the adjacent previous line.
For example, the embodiment of the present invention provides the following grammar rules, without being limited to them, for semantic adjustment of the target content. Whether a sentence in the target content meets a preset grammar rule can be judged with a prior-art dictionary-based word segmentation method. Because the dictionary used by such a method records the part of speech of each character or word, the parts of speech of the characters or words in a line of the target content can be determined first, and then whether the preset grammar rule is met.
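The part-of-speech lookup followed by a rule check can be sketched in Python as below. The tiny `POS` dictionary, its tag names and the function `must_share_line` are hypothetical; in particular, treating any noun after a verb as that verb's object is a simplification made for illustration. The four checks correspond to the four grammar rules listed in this description.

```python
# Hypothetical part-of-speech dictionary; a real one would come from the
# dictionary used by the word segmentation method.
POS = {
    "determine": "verb",
    "deamination content": "noun",
    "of": "aux",            # structural auxiliary word
    "at": "prep",
    "Netherlands": "noun",
    "and": "conj",
    "adult education": "noun",
}


def must_share_line(tail: str, head: str, pos: dict[str, str] = POS) -> bool:
    """Return True if the tail field of one line and the head field of the
    next line should be on the same line according to the sketched rules."""
    t, h = pos.get(tail), pos.get(head)
    if t is None or h is None:
        return False  # unknown part of speech: make no claim
    if t == "verb" and h == "noun":
        return True   # rule 1: verb with its object (object approximated as a noun)
    if h == "aux":
        return True   # rule 2: structural auxiliary word with the preceding modifier
    if t == "prep" and h in ("noun", "verb", "pronoun"):
        return True   # rule 3: preposition with the following noun, verb, or pronoun
    if t == "conj":
        return True   # rule 4: conjunction with the following phrase
    return False


print(must_share_line("determine", "deamination content"))
print(must_share_line("Netherlands", "deamination content"))
```

Such a predicate only decides whether two fields belong together; whether the tail field moves down (Mode 1) or the head field moves up (Mode 2) is decided separately.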
Grammar rule 1: a verb is on the same line as its object.
For example, if the target content after line-breaking layout is a title, Table 3-1 compares the target content before and after adjustment according to Mode 1 and grammar rule 1.
[Table 3-1 is an image in the original: Figure BDA0000835438290000101]
Table 3-1: Adjusting the target content according to Mode 1 and grammar rule 1
In Table 3-1, for the first line of the target content before adjustment, it can be determined that the field "determine" at the tail of the first line is a verb and that the field "deamination content" at the head of the next line is the object of "determine". According to Mode 1, the verb "determine" at the tail of the first line is the target field, and the adjustment mode is to move the target field to the head of the adjacent next line. The adjusted target content is shown in Table 3-1, where the target field "determine" has been moved to the head of the next line.
In Table 3-1, for the second line of the target content before adjustment, it can be determined that the field "deamination content" at the head of the second line is the object of the field "determine" at the tail of the previous line. According to Mode 2, the field "deamination content" at the head of the second line is the target field, and the adjustment mode is to move the target field to the tail of the adjacent previous line. The adjusted target content is shown in Table 3-2, where the target field "deamination content" has been moved to the tail of the previous line.
Target content before adjustment | Adjusted target content
Determination using amino acid analyzer | Determination of the content of des-amino acids using an amino acid analyzer
Research and feasibility analysis of content of desamic acid | Study and feasibility analysis of
Table 3-2: Adjusting the target content according to Mode 2 and grammar rule 1
Grammar rule 2: a structural auxiliary word is on the same line as the modifier that precedes it.
For example, if the target content after line-breaking layout is a title, Table 4-1 compares the target content before and after adjustment according to Mode 1 and grammar rule 2.
[Table 4-1 is an image in the original: Figure BDA0000835438290000111]
Table 4-1: Adjusting the target content according to Mode 1 and grammar rule 2
In Table 4-1, for the first line of the target content before adjustment, it can be determined that the field "measure" at the tail of the first line and the field "of" at the head of the next line stand in the relation of a modifier and its structural auxiliary word. According to Mode 1, the modifier "measure" at the tail of the first line is the target field, and the adjustment mode is to move the target field to the head of the adjacent next line. The adjusted target content is shown in Table 4-1, where the target field "measure" has been moved to the head of the next line.
In Table 4-1, for the second line of the target content before adjustment, it can be determined that the field "of" at the head of the second line and the field "measure" at the tail of the previous line stand in the relation of a structural auxiliary word and its modifier. According to Mode 2, the field "of" at the head of the second line is the target field, and the adjustment mode is to move the target field to the tail of the adjacent previous line. The adjusted target content is shown in Table 4-2, where the target field "of" has been moved to the tail of the previous line.
[Table 4-2 is an image in the original: Figure BDA0000835438290000112]
Table 4-2: Adjusting the target content according to Mode 2 and grammar rule 2
Grammar rule three: the preposition and the component nouns, verbs and pronouns behind the preposition are in the same row.
For example, if the target content after the line-breaking typesetting is a title, the target content is compared before and after being adjusted according to the first mode and the third grammar rule with reference to table 5-1.
Figure BDA0000835438290000121
Table 5-1: Target content adjusted according to the first mode and the third grammar rule
In Table 5-1, for the first line of the target content before adjustment, it can be determined that the field "at" at the tail of the first line is a preposition, and that the field "Netherlands" at the head of the next line is a noun that forms a preposition + noun structure with "at". According to the first mode, the preposition "at" at the tail of the first line is determined as the target field, and the adjustment mode is determined as moving the target field to the head of the adjacent next line. The adjusted target content is shown in Table 5-1, where the target field "at" has been moved to the head of the next line.
In Table 5-1, for the second line of the target content before adjustment, it can be determined that the field "Netherlands" at the head of the second line is a noun and that the field "at" at the tail of the previous line is a preposition. According to the second mode, the field "Netherlands" at the head of the second line is determined as the target field, and the adjustment mode is determined as moving the target field to the tail of the adjacent previous line. The adjusted target content is shown in Table 5-2, where the target field "Netherlands" has been moved to the tail of the previous line.
Figure BDA0000835438290000122
Table 5-2: Target content adjusted according to the second mode and the third grammar rule
Grammar rule four: a conjunction and the phrase that follows it are kept on the same line.
For example, suppose the target content after line-breaking typesetting is a title. Table 6-1 compares the target content before and after adjustment according to the first mode and the fourth grammar rule.
Figure BDA0000835438290000123
Table 6-1: Target content adjusted according to the first mode and the fourth grammar rule
In Table 6-1, for the first line of the target content before adjustment, it can be determined that the field "college student and" at the tail of the first line ends with a conjunction, and that the field "adult education" at the head of the next line is the phrase connected by that conjunction. According to the first mode, the field "college student and" at the tail of the first line is determined as the target field, and the adjustment mode is determined as moving the target field to the head of the adjacent next line. The adjusted target content is shown in Table 6-1, where the target field "college student and" has been moved to the head of the next line.
In Table 6-1, for the second line of the target content before adjustment, it can be determined that the field "adult education" at the head of the second line and the field "college student and" at the tail of the previous line form a connection relationship. According to the second mode, the field "adult education" at the head of the second line is determined as the target field, and the adjustment mode is determined as moving the target field to the tail of the adjacent previous line. The adjusted target content is shown in Table 6-2, where the target field "adult education" has been moved to the tail of the previous line.
Figure BDA0000835438290000131
Table 6-2: Target content adjusted according to the second mode and the fourth grammar rule
It should be noted that the above grammar rules are only examples; other grammar rules are equally applicable to the scheme of the embodiment of the present invention, and the grammar rules may also be updated in implementation.
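Grammar rules three and four can be sketched as predicates over the field at the tail of one line and the field at the head of the next. This is a minimal illustration, not the patented implementation: the part-of-speech table `POS`, the function names, and the rule list `GRAMMAR_RULES` are all hypothetical, and a real system would obtain part-of-speech information from a tagger.

```python
from typing import Callable, List

# Hypothetical part-of-speech table (assumption); a real system would use a tagger.
POS = {
    "at": "preposition",
    "and": "conjunction",
    "Netherlands": "noun",
}

def rule_three(tail: str, head: str) -> bool:
    """Rule three: a preposition stays with the noun, verb, or pronoun after it."""
    return POS.get(tail) == "preposition" and POS.get(head) in {"noun", "verb", "pronoun"}

def rule_four(tail: str, head: str) -> bool:
    """Rule four: a conjunction stays with the phrase after it."""
    words = tail.split()
    return bool(words) and POS.get(words[-1]) == "conjunction" and bool(head)

# The rule list can be extended or replaced, since grammar rules may be updated.
GRAMMAR_RULES: List[Callable[[str, str], bool]] = [rule_three, rule_four]

def must_stay_together(tail: str, head: str) -> bool:
    """True if any rule says the tail field and the next head field share a line."""
    return any(rule(tail, head) for rule in GRAMMAR_RULES)
```

Keeping the rules in a plain list is one way to reflect that rules are examples and may be updated: adding a rule means appending another predicate.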
As can be seen from the above, for each line of the target content, the target field and the adjustment mode of that line can be determined according to the first mode or the second mode, and the target field can then be adjusted according to the adjustment mode. In this way, the target content is broken into lines according to its semantics based on the preset grammar rules, without manual adjustment, which saves time and improves efficiency.
Optionally, adjusting the target field in the line to the rearmost of the adjacent previous line or the foremost of the adjacent next line according to the determined adjustment mode further includes:
determining that, after the target field in the line is adjusted to the rearmost of the adjacent previous line according to the adjustment mode, the number of words in the adjacent previous line is not greater than the maximum number of words in a line; or
determining that, after the target field in the line is adjusted to the foremost of the adjacent next line according to the adjustment mode, the number of words in the adjacent next line is not greater than the maximum number of words in a line.
In the above manner, the target field of a line is adjusted to the adjacent previous line only when the number of words in the adjacent previous line after the adjustment is not greater than the maximum number of words in a line; likewise, the target field is adjusted to the adjacent next line only when the number of words in the adjacent next line after the adjustment is not greater than the maximum number of words in a line. Both checks therefore ensure that moving the target field to the previous line or the next line does not cause that line to exceed the maximum number of words in a line, which guarantees normal display.
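The two checks can be written as small predicates. The sketch below rests on assumptions that the text does not fix: a line is modeled as a list of fields, and the "number of words" in a line is taken to be the total character count of its fields. The function names are hypothetical.

```python
from typing import List

Line = List[str]  # assumption: a line is an ordered list of fields

def line_length(line: Line) -> int:
    """Number of words in a line, counted here as characters (an assumption)."""
    return sum(len(field) for field in line)

def fits_in_previous(lines: List[Line], i: int, max_len: int) -> bool:
    """Check before moving the head field of line i to the tail of line i-1."""
    return line_length(lines[i - 1]) + len(lines[i][0]) <= max_len

def fits_in_next(lines: List[Line], i: int, max_len: int) -> bool:
    """Check before moving the tail field of line i to the head of line i+1."""
    return line_length(lines[i + 1]) + len(lines[i][-1]) <= max_len
```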
Optionally, the method further includes:
if it is determined that, after the target field in the line is adjusted to the rearmost of the adjacent previous line according to the adjustment mode, the number of words in the adjacent previous line is greater than the maximum number of words in a line, determining the adjustment mode of the adjacent previous line and the target field of the adjacent previous line, and adjusting the target field of the adjacent previous line according to the adjustment mode of the adjacent previous line; or
if it is determined that, after the target field in the line is adjusted to the foremost of the adjacent next line according to the determined adjustment mode, the number of words in the adjacent next line is greater than the maximum number of words in a line, determining the adjustment mode of the adjacent next line and the target field of the adjacent next line, and adjusting the target field of the adjacent next line according to the adjustment mode of the adjacent next line.
In the above manner, when it is determined that adjusting the target field to the rearmost of the adjacent previous line would make the number of words in the adjacent previous line greater than the maximum number of words in a line, the field at the tail of the adjacent previous line is instead adjusted to the head of the current line. That is, the adjustment mode of the adjacent previous line and the target field of the adjacent previous line need to be determined, and the target field of the adjacent previous line is adjusted according to that adjustment mode. For example, in Table 2, the target field of the second line is "mud". If moving the target field "mud" to the rearmost of the previous line would make the number of words in the previous line greater than the maximum number of words in a line, "mud" cannot be moved there, and the target field at the tail of the previous line needs to be moved to the head of the current line instead. Of course, if moving the target field at the tail of the previous line to the head of the current line would cause the current line to exceed the maximum number of words in a line, it may be considered to move that field to the head of the current line and then move the field at the tail of the current line to the next line, so that the number of words in the current line is not greater than the maximum number of words in a line.
Similarly, when it is determined that adjusting the target field to the foremost of the adjacent next line would make the number of words in the adjacent next line greater than the maximum number of words in a line, the adjustment mode of the adjacent next line and the target field of the adjacent next line need to be determined, and the target field of the adjacent next line is adjusted according to that adjustment mode. For example, in Table 1, the target field of the first line is "dirty". If moving the target field "dirty" to the head of the next line would make the number of words in the next line greater than the maximum number of words in a line, "dirty" cannot be moved there, and the target field at the head of the next line needs to be moved to the tail of the current line instead. Of course, if moving the target field at the head of the next line to the tail of the current line would cause the current line to exceed the maximum number of words in a line, it may be considered to move the field at the tail of the current line to the head of the next line and then move the field at the tail of the next line to the line after it, so that the number of words in the next line is not greater than the maximum number of words in a line.
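The fallback described for the previous-line case can be sketched as follows. This is an illustrative reading under stated assumptions, not a definitive implementation: lines are lists of fields, length is counted in characters, the function name `join_across_break` is hypothetical, and the sketch pushes overflow only one line down without checking whether that line then overflows in turn.

```python
from typing import List

Line = List[str]  # assumption: a line is an ordered list of fields

def length(line: Line) -> int:
    return sum(len(field) for field in line)

def join_across_break(lines: List[Line], i: int, max_len: int) -> None:
    """Put the tail field of line i-1 and the head field of line i on one line, in place."""
    prev, cur = lines[i - 1], lines[i]
    if length(prev) + len(cur[0]) <= max_len:
        prev.append(cur.pop(0))      # second mode: move the head field up
        if not cur:
            del lines[i]             # drop a line that became empty
        return
    cur.insert(0, prev.pop())        # fallback: move the previous tail field down
    # If the current line now overflows, push its tail fields to the next line,
    # but never split the two fields that were just joined.
    while length(cur) > max_len and len(cur) > 2:
        if i + 1 == len(lines):
            lines.append([])
        lines[i + 1].insert(0, cur.pop())
```

The next-line case is symmetric and is omitted here for brevity.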
Optionally, the target field of the adjacent previous line is a field such that the phrase formed by that field and the field at the head of the current line belongs to the word segmentation lexicon or satisfies a preset grammar rule;
the adjustment mode of the adjacent previous line is: moving the target field of the adjacent previous line to the head of the current line;
the target field of the adjacent next line is a field such that the phrase formed by that field and the field at the tail of the current line belongs to the word segmentation lexicon or satisfies a preset grammar rule;
the adjustment mode of the adjacent next line is: moving the target field of the adjacent next line to the tail of the current line.
In the above manner, determining the target field and adjustment mode at the tail of the previous line corresponds to determining the target field and adjustment mode at the head of the current line; likewise, determining the target field and adjustment mode at the head of the next line corresponds to determining those at the tail of the current line. For example, in Table 1, suppose the current line is the second line. Based on the word segmentation lexicon, the target field at the head of the second line is determined to be "mud", and the target field at the tail of the previous line is determined to be "dirty", with the adjustment mode of moving it to the head of the next line. Thus, when the target field "mud" of the second line cannot be moved to the tail of the adjacent previous line, the target field "dirty" of the adjacent previous line is determined and moved to the head of the next line (that is, the current line), so that the target field is still adjusted correctly. The case of the adjacent next line and the case based on the preset grammar rules are essentially the same and are not described again here. This method therefore ensures that the correct target field is adjusted and that the adjustment itself is correct.
In the above step 102, the target fields in the line are adjusted to the rearmost of the adjacent previous line or the foremost of the next line according to the determined adjustment mode.
The line breaking method of the embodiment of the present application is described in detail below.
Fig. 2 is a detailed flowchart of the line breaking method according to the embodiment of the present invention.
Step 201, obtaining the target content after the line breaking typesetting.
Step 202, for each line in the target content, determining a target field and an adjustment mode based on the word segmentation lexicon and/or the preset grammar rules.
Step 203, adjusting each line of the target content based on the determined target field and the adjustment mode.
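Steps 201 to 203 can be sketched end to end for the lexicon-based case. The sketch is an illustration under assumptions, not the flow of Fig. 2 itself: lines are lists of fields, the lexicon is modeled as a set of (tail field, head field) pairs, length is counted in characters, and an adjustment that would overflow the destination line is simply skipped. All function names are hypothetical.

```python
from typing import List, Optional, Set, Tuple

Line = List[str]  # assumption: a line is an ordered list of fields

def length(line: Line) -> int:
    return sum(len(field) for field in line)

def determine(lines: List[Line], i: int, lexicon: Set[Tuple[str, str]]) -> Optional[str]:
    """Step 202: pick an adjustment mode for line i, or None if nothing applies."""
    line = lines[i]
    if line and i + 1 < len(lines) and lines[i + 1] and (line[-1], lines[i + 1][0]) in lexicon:
        return "to_next_head"        # first mode: tail field moves down
    if line and i > 0 and lines[i - 1] and (lines[i - 1][-1], line[0]) in lexicon:
        return "to_previous_tail"    # second mode: head field moves up
    return None

def adjust(lines: List[Line], i: int, mode: str, max_len: int) -> bool:
    """Step 203: apply the mode unless the destination line would overflow."""
    if mode == "to_next_head":
        if length(lines[i + 1]) + len(lines[i][-1]) > max_len:
            return False
        lines[i + 1].insert(0, lines[i].pop())
    else:
        if length(lines[i - 1]) + len(lines[i][0]) > max_len:
            return False
        lines[i - 1].append(lines[i].pop(0))
    return True

def rebreak(content: List[Line], lexicon: Set[Tuple[str, str]], max_len: int) -> List[Line]:
    lines = [list(line) for line in content]   # step 201: work on a copy of the content
    for i in range(len(lines)):
        mode = determine(lines, i, lexicon)
        if mode is not None:
            adjust(lines, i, mode, max_len)
    return [line for line in lines if line]    # drop lines that became empty
```

For instance, with the pair ("New", "York") in the lexicon, `rebreak([["visit", "New"], ["York", "today"]], {("New", "York")}, 12)` moves "New" to the head of the second line.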
Based on the same technical concept, an embodiment of the present invention also provides a line breaking device, which can execute the above method embodiment. The line breaking device provided by the embodiment of the present invention is shown in Fig. 3 and includes:
A determining unit 301, configured to determine, according to a word segmentation lexicon and/or a preset grammar rule, a target field and an adjustment mode that need to be adjusted in a row of the target content after the line breaking and typesetting;
an adjusting unit 302, configured to adjust the target field in the line to a rearmost part of an adjacent previous line or a foremost part of a next line according to the determined adjustment manner.
Optionally, the determining unit 301 is specifically configured to:
if it is determined that the phrase formed by the field at the tail of the line and the field at the head of the adjacent next line belongs to the word segmentation lexicon, determine that the target field is the field at the tail of the line, and determine that the adjustment mode is to move the target field to the head of the adjacent next line; or
if it is determined that the phrase formed by the field at the head of the line and the field at the tail of the adjacent previous line belongs to the word segmentation lexicon, determine that the target field is the field at the head of the line, and determine that the adjustment mode is to move the target field to the tail of the adjacent previous line.
Optionally, the determining unit 301 is specifically configured to:
if the field at the tail of the line and the field at the head of the adjacent next line meet the preset grammar rule, determining that a target field is the field at the tail of the line, and determining that an adjustment mode is to move the target field to the head of the adjacent next line; or
And if the field at the head of the line and the field at the tail of the adjacent previous line meet the preset grammar rule, determining that the target field is the field at the head of the line, and determining that the adjustment mode is to move the target field to the tail of the adjacent previous line.
Optionally, the determining unit 301 is specifically configured to:
if it is determined that, after the target field in the line is adjusted to the rearmost of the adjacent previous line according to the adjustment mode, the number of words in the adjacent previous line is not greater than the maximum number of words in a line, adjust the target field in the line to the rearmost of the adjacent previous line according to the determined adjustment mode; or
if it is determined that, after the target field in the line is adjusted to the foremost of the adjacent next line according to the adjustment mode, the number of words in the adjacent next line is not greater than the maximum number of words in a line, adjust the target field in the line to the foremost of the adjacent next line according to the determined adjustment mode.
Optionally, the determining unit 301 is specifically configured to:
if it is determined that, after the target field in the line is adjusted to the rearmost of the adjacent previous line according to the adjustment mode, the number of words in the adjacent previous line is greater than the maximum number of words in a line, determine the adjustment mode of the adjacent previous line and the target field of the adjacent previous line, and adjust the target field of the adjacent previous line according to the adjustment mode of the adjacent previous line; or
if it is determined that, after the target field in the line is adjusted to the foremost of the adjacent next line according to the determined adjustment mode, the number of words in the adjacent next line is greater than the maximum number of words in a line, determine the adjustment mode of the adjacent next line and the target field of the adjacent next line, and adjust the target field of the adjacent next line according to the adjustment mode of the adjacent next line.
Optionally, the target field of the adjacent previous line is a field such that the phrase formed by that field and the field at the head of the current line belongs to the word segmentation lexicon or satisfies a preset grammar rule;
the adjustment mode of the adjacent previous line is: moving the target field of the adjacent previous line to the head of the current line;
the target field of the adjacent next line is a field such that the phrase formed by that field and the field at the tail of the current line belongs to the word segmentation lexicon or satisfies a preset grammar rule;
the adjustment mode of the adjacent next line is: moving the target field of the adjacent next line to the tail of the current line.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (5)

1. A line breaking method, comprising:
for one line of target content after line-breaking typesetting, determining, according to a word segmentation lexicon and/or a preset grammar rule, a target field that needs to be adjusted in the line and an adjustment mode;
adjusting, according to the determined adjustment mode, the target field in the line to the rearmost of the adjacent previous line or the foremost of the adjacent next line;
wherein determining, according to the word segmentation lexicon, the target field that needs to be adjusted in the line and the adjustment mode comprises:
if it is determined that the phrase formed by the field at the tail of the line and the field at the head of the adjacent next line belongs to the word segmentation lexicon, and/or that the field at the tail of the line and the field at the head of the adjacent next line satisfy a preset grammar rule, determining that the target field is the field at the tail of the line, and determining that the adjustment mode is to move the target field to the head of the adjacent next line; or
if it is determined that the phrase formed by the field at the head of the line and the field at the tail of the adjacent previous line belongs to the word segmentation lexicon, and/or that the field at the tail of the line and the field at the head of the adjacent next line satisfy a preset grammar rule, determining that the target field is the field at the head of the line, and determining that the adjustment mode is to move the target field to the tail of the adjacent previous line;
wherein adjusting, according to the determined adjustment mode, the target field in the line to the rearmost of the adjacent previous line or the foremost of the adjacent next line further comprises:
determining that, after the target field in the line is adjusted to the rearmost of the adjacent previous line according to the adjustment mode, the number of words in the adjacent previous line is not greater than the maximum number of words in a line; or
determining that, after the target field in the line is adjusted to the foremost of the adjacent next line according to the adjustment mode, the number of words in the adjacent next line is not greater than the maximum number of words in a line.
2. The method of claim 1, further comprising:
if it is determined that, after the target field in the line is adjusted to the rearmost of the adjacent previous line according to the adjustment mode, the number of words in the adjacent previous line is greater than the maximum number of words in a line, determining the adjustment mode of the adjacent previous line and the target field of the adjacent previous line, and adjusting the target field of the adjacent previous line according to the adjustment mode of the adjacent previous line; or
if it is determined that, after the target field in the line is adjusted to the foremost of the adjacent next line according to the determined adjustment mode, the number of words in the adjacent next line is greater than the maximum number of words in a line, determining the adjustment mode of the adjacent next line and the target field of the adjacent next line, and adjusting the target field of the adjacent next line according to the adjustment mode of the adjacent next line;
the target field of the adjacent previous line is a field such that the phrase formed by that field and the field at the head of the line belongs to the word segmentation lexicon or satisfies a preset grammar rule;
the adjustment mode of the adjacent previous line is: moving the target field of the adjacent previous line to the head of the line;
the target field of the adjacent next line is a field such that the phrase formed by that field and the field at the tail of the next line belongs to the word segmentation lexicon or satisfies a preset grammar rule;
the adjustment mode of the adjacent next line is: moving the target field of the adjacent next line to the tail of the line.
3. A line breaking device, comprising:
a determining unit, configured to determine, for one line of target content after line-breaking typesetting, a target field that needs to be adjusted in the line and an adjustment mode according to a word segmentation lexicon and/or a preset grammar rule;
an adjusting unit, configured to adjust the target field in the line to the rearmost of the adjacent previous line or the foremost of the adjacent next line according to the determined adjustment mode;
wherein the determining unit is specifically configured to:
if it is determined that the phrase formed by the field at the tail of the line and the field at the head of the adjacent next line belongs to the word segmentation lexicon, and/or that the field at the head of the line and the field at the tail of the adjacent previous line satisfy the preset grammar rule, determine that the target field is the field at the tail of the line, and determine that the adjustment mode is to move the target field to the head of the adjacent next line; or
if it is determined that the phrase formed by the field at the head of the line and the field at the tail of the adjacent previous line belongs to the word segmentation lexicon, and/or that the field at the head of the line and the field at the tail of the adjacent previous line satisfy the preset grammar rule, determine that the target field is the field at the head of the line, and determine that the adjustment mode is to move the target field to the tail of the adjacent previous line;
wherein adjusting, according to the determined adjustment mode, the target field in the line to the rearmost of the adjacent previous line or the foremost of the adjacent next line further comprises:
determining that, after the target field in the line is adjusted to the rearmost of the adjacent previous line according to the adjustment mode, the number of words in the adjacent previous line is not greater than the maximum number of words in a line; or determining that, after the target field in the line is adjusted to the foremost of the adjacent next line according to the adjustment mode, the number of words in the adjacent next line is not greater than the maximum number of words in a line.
4. The apparatus of claim 3, wherein the determining unit is specifically configured to:
if it is determined that, after the target field in the line is adjusted to the rearmost of the adjacent previous line according to the adjustment mode, the number of words in the adjacent previous line is not greater than the maximum number of words in a line, adjust the target field in the line to the rearmost of the adjacent previous line according to the determined adjustment mode; or
if it is determined that, after the target field in the line is adjusted to the foremost of the adjacent next line according to the adjustment mode, the number of words in the adjacent next line is not greater than the maximum number of words in a line, adjust the target field in the line to the foremost of the adjacent next line according to the determined adjustment mode.
5. The apparatus of claim 4, wherein the determining unit is specifically configured to:
if it is determined that, after the target field in the line is adjusted to the rearmost of the adjacent previous line according to the adjustment mode, the number of words in the adjacent previous line is greater than the maximum number of words in a line, determine the adjustment mode of the adjacent previous line and the target field of the adjacent previous line, and adjust the target field of the adjacent previous line according to the adjustment mode of the adjacent previous line; or
if it is determined that, after the target field in the line is adjusted to the foremost of the adjacent next line according to the determined adjustment mode, the number of words in the adjacent next line is greater than the maximum number of words in a line, determine the adjustment mode of the adjacent next line and the target field of the adjacent next line, and adjust the target field of the adjacent next line according to the adjustment mode of the adjacent next line;
the target field of the adjacent previous line is a field such that the phrase formed by that field and the field at the head of the line belongs to the word segmentation lexicon or satisfies a preset grammar rule;
the adjustment mode of the adjacent previous line is: moving the target field of the adjacent previous line to the head of the line;
the target field of the adjacent next line is a field such that the phrase formed by that field and the field at the tail of the next line belongs to the word segmentation lexicon or satisfies a preset grammar rule;
the adjustment mode of the adjacent next line is: moving the target field of the adjacent next line to the tail of the line.
CN201510729273.4A 2015-10-30 2015-10-30 Line breaking method and device Active CN106649246B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510729273.4A CN106649246B (en) 2015-10-30 2015-10-30 Line breaking method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510729273.4A CN106649246B (en) 2015-10-30 2015-10-30 Line breaking method and device

Publications (2)

Publication Number Publication Date
CN106649246A CN106649246A (en) 2017-05-10
CN106649246B true CN106649246B (en) 2021-09-28

Family

ID=58809430

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510729273.4A Active CN106649246B (en) 2015-10-30 2015-10-30 Line breaking method and device

Country Status (1)

Country Link
CN (1) CN106649246B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110889267A (en) * 2019-11-29 2020-03-17 北京金山安全软件有限公司 Method and device for editing characters in picture, electronic equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1731389A (en) * 2004-08-04 2006-02-08 华建电子有限责任公司 Braille-Chinese contrapositive editing/typesetting system and editing/typesetting method
CN102081600A (en) * 2011-01-25 2011-06-01 珠海全志科技有限公司 E-book typesetting method and e-book typesetting system
CN102169591A (en) * 2011-05-20 2011-08-31 中国科学院计算技术研究所 Line selecting method and drawing method of text note in drawing
CN104166655A (en) * 2013-05-17 2014-11-26 北京四维图新科技股份有限公司 Electronic map lettering line folding method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6279017B1 (en) * 1996-08-07 2001-08-21 Randall C. Walker Method and apparatus for displaying text based upon attributes found within the text

Also Published As

Publication number Publication date
CN106649246A (en) 2017-05-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240111

Address after: Room 1120, 11th Floor, Building 3, Courtyard 3, Jinguan North Second Street, Shunyi District, Beijing, 101300

Patentee after: Beijing Zhiwen Artificial Intelligence Software Technology Co.,Ltd.

Address before: 100080, Beijing City, Haidian District, No. 52 West Fourth Ring Road, SMIC building 19

Patentee before: Founder International Co.,Ltd. (Beijing)
