CN106681979A - Article layout method and device, programmable device and article publishing platform - Google Patents



Publication number
CN106681979A
Authority
CN
China
Prior art keywords
text
typesetting
symbol
article
newline
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611046564.4A
Other languages
Chinese (zh)
Inventor
艾瑞坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Alibaba Literature Information Technology Co Ltd
Original Assignee
Guangzhou Alibaba Literature Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Alibaba Literature Information Technology Co Ltd
Priority to CN201611046564.4A
Publication of CN106681979A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/189 Automatic justification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Document Processing Apparatus (AREA)

Abstract

The invention discloses an article layout method and device, a programmable device and an article publishing platform. The article layout method includes: in response to a layout request for a target article, removing the specific format characters in the target article to obtain a preprocessed text; removing the redundant newline characters in the preprocessed text to obtain an intermediate text; and setting layout format characters for the intermediate text to obtain a laid-out text that meets a preset layout format. With this method, the various layout errors that may occur in the article need not be detected and handled one by one; a laid-out text meeting the requirements is obtained through a unified layout scheme, with low complexity and fast response.

Description

Article typesetting method and device, programmable device and article publishing platform
Technical field
The present invention relates to the technical field of electronic reading, and more particularly to an article typesetting method, an article typesetting device and a programmable device.
Background
With the rapid development of computer technology and Internet technology, electronic reading has risen quickly, and people increasingly prefer to read news, novels, articles and the like on mobile phones. Online literature has flourished accordingly: a large number of online authors publish their own works through the major online article publishing platforms (such as literature websites and novel websites) for website visitors to read, which greatly enriches users' electronic reading experience.
However, online authors vary widely in ability and educational level. Many articles cannot be published directly as written because of chaotic formatting, misused punctuation, sensitive content, rampant typos, disorganized paragraphs and similar problems. Each online article publishing platform therefore sets up an editorial department whose staff review, typeset and correct the articles to be published and remove sensitive content. This work is tedious, redundant and repetitive; it consumes a great deal of manpower and money, and the review process is itself error-prone and inefficient.
Some current online article publishing platforms provide an automatic typesetting function (for example, a one-click typesetting function) to authors who need to publish articles. However, the platform's developers generally have to write targeted repair code for hundreds of text format errors one by one, with a separate piece of handling code for each error case. The resulting algorithm is complex and bulky and runs inefficiently, and as the number of error cases grows the code becomes hard to maintain. Because it is difficult to anticipate every error case, some rare, extreme format errors are missed and cannot be repaired. In addition, for online articles that easily run to hundreds of thousands of words, the complexity and bulk of current automatic typesetting schemes make the response very slow, which harms the user experience, consumes more processing resources, and increases the load on the back-end server that implements the automatic typesetting scheme.
The inventor therefore considers it necessary to improve on the above problems in the prior art.
Summary of the invention
An object of the present invention is to provide a new technical solution for article typesetting.
According to a first aspect of the present invention, an article typesetting method is provided, including:
in response to a typesetting request for a target article, removing the specific format characters contained in the target article to obtain a preprocessed text, the specific format characters including at least a space character, an indentation character and a carriage return character;
removing the redundant newline characters contained in the preprocessed text to obtain an intermediate text;
setting typesetting format characters for the intermediate text to obtain a typeset text that meets a predetermined typesetting format, wherein the typesetting format characters include at least a predetermined placeholder.
Optionally, the predetermined typesetting format is that the first line of every paragraph is indented by two character positions and paragraphs are separated by one line, the typesetting format characters include the placeholder, the newline character and the indentation character, and the step of setting typesetting format characters for the intermediate text to obtain a typeset text that meets the predetermined typesetting format includes:
performing a character replacement step on each single newline character contained in the intermediate text, to replace it with two consecutive newline characters and two consecutive placeholders;
performing a character replacement step on the placeholders, to replace them with the indentation character and obtain the typeset text.
Optionally, removing the specific format characters contained in the target article to obtain the preprocessed text is: performing a character replacement step on the specific format characters contained in the target article, to replace them with the empty string and obtain the preprocessed text.
Optionally, the step of removing the redundant newline characters contained in the preprocessed text to obtain the intermediate text is: performing the character replacement step a predetermined number of times on the newline characters contained in the preprocessed text, so that multiple consecutive newline characters are replaced with a single newline character, to obtain the intermediate text.
Optionally, the specific format characters also include the placeholder.
Optionally, the character replacement step is performed by the str.replace() function of JavaScript.
According to a second aspect of the present invention, an article typesetting device is provided, including:
a preprocessing unit, configured to, in response to a typesetting request for a target article, remove the specific format characters contained in the text of the target article to obtain a preprocessed text, the specific format characters including at least a space character, an indentation character and a carriage return character;
an intermediate processing unit, configured to remove the redundant newline characters contained in the preprocessed text to obtain an intermediate text;
a typesetting setting unit, configured to set typesetting format characters for the intermediate text to obtain a typeset text that meets a predetermined typesetting format, wherein the typesetting format characters include at least a predetermined placeholder.
Optionally, the predetermined typesetting format is that the first line of every paragraph is indented by two character positions and paragraphs are separated by one line, the typesetting format characters include the placeholder, the newline character and the indentation character, and the typesetting setting unit includes:
a device for performing a character replacement step on each single newline character contained in the intermediate text, to replace it with two consecutive newline characters and two consecutive placeholders; and
a device for performing a character replacement step on the placeholders, to replace them with the indentation character and obtain the typeset text.
Optionally, the preprocessing unit is configured to perform a character replacement step on the specific format characters contained in the target article, to replace them with the empty string and obtain the preprocessed text.
Optionally, the intermediate processing unit is configured to perform the character replacement step a predetermined number of times on the newline characters contained in the preprocessed text, so that multiple consecutive newline characters are replaced with a single newline character, to obtain the intermediate text.
According to a third aspect of the present invention, a programmable device is provided, including a memory and a processor, wherein the memory is configured to store instructions, and the instructions are used to control the processor to operate so as to perform any one of the article typesetting methods of the first aspect of the present invention.
According to a fourth aspect of the present invention, an article publishing platform is provided, including any one of the article typesetting devices according to the second aspect of the present invention.
The inventor of the present invention has found that the prior art does not yet offer an article typesetting method, device, programmable device or article publishing platform that avoids detecting and separately handling each of the typesetting errors that may occur in the article to be typeset, and instead obtains a typeset text meeting the requirements through a unified typesetting scheme with low implementation complexity and fast response. The technical task to be achieved, or the technical problem to be solved, by the present invention was therefore never conceived of or anticipated by those skilled in the art, and the present invention is accordingly a new technical solution.
Other features and advantages of the present invention will become apparent from the following detailed description of exemplary embodiments of the invention with reference to the accompanying drawings.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the present invention and, together with the description, serve to explain the principles of the invention.
Fig. 1 is a block diagram showing an example of the hardware configuration of a computing system that can be used to implement embodiments of the present invention.
Fig. 2 is a flowchart of the article typesetting method in an embodiment of the present invention.
Fig. 3 is a schematic diagram of the target article in an example of the article typesetting method in an embodiment of the present invention.
Fig. 4 is a flowchart of an example of the article typesetting method in an embodiment of the present invention.
Fig. 5 is a schematic diagram of the preprocessed text in an example of the article typesetting method in an embodiment of the present invention.
Fig. 6 is a schematic diagram of the intermediate text in an example of the article typesetting method in an embodiment of the present invention.
Fig. 7 is a schematic diagram of the typesetting processing in an example of the article typesetting method in an embodiment of the present invention.
Fig. 8 is a schematic diagram of the typeset text in an example of the article typesetting method in an embodiment of the present invention.
Fig. 9 is a schematic block diagram of the article typesetting device in an embodiment of the present invention.
Detailed description of the embodiments
Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that, unless specifically stated otherwise, the relative arrangement of the components and steps, the numerical expressions and the numerical values set forth in these embodiments do not limit the scope of the present invention.
The following description of at least one exemplary embodiment is merely illustrative and is in no way intended to limit the invention, its application or its uses.
Techniques, methods and devices known to those of ordinary skill in the relevant art may not be discussed in detail, but where appropriate they should be regarded as part of the specification.
In all of the examples shown and discussed here, any specific value should be interpreted as merely exemplary and not as a limitation. Other examples of the exemplary embodiments may therefore have different values.
It should be noted that similar reference numerals and letters denote similar items in the following drawings, so once an item has been defined in one drawing it need not be discussed further in subsequent drawings.
<Hardware configuration>
Fig. 1 is a block diagram showing the hardware configuration of a computer system 1000 that can implement embodiments of the present invention.
As shown in Fig. 1, the computer system 1000 includes a computer 1110. The computer 1110 includes a processor 1120, a memory 1130, a fixed non-volatile memory interface 1140, a removable non-volatile memory interface 1150, a user input interface 1160, a network interface 1170, a video interface 1190 and a peripheral interface 1195, connected via a system bus 1121.
The memory 1130 includes ROM (read-only memory) and RAM (random access memory). A BIOS (basic input/output system) resides in the ROM. An operating system, application programs, other program modules and some program data reside in the RAM.
A fixed non-volatile memory such as a hard disk is connected to the fixed non-volatile memory interface 1140. The fixed non-volatile memory can store, for example, an operating system, application programs, other program modules and some program data.
Removable non-volatile memories such as a floppy disk drive and a CD-ROM drive are connected to the removable non-volatile memory interface 1150. For example, a floppy disk can be inserted into the floppy disk drive, and a CD (compact disc) can be inserted into the CD-ROM drive.
Input devices such as a mouse and a keyboard are connected to the user input interface 1160.
The computer 1110 can be connected to a remote computer 1180 through the network interface 1170. For example, the network interface 1170 can be connected to the remote computer through a local area network. Alternatively, the network interface 1170 can be connected to a modem (modulator-demodulator), and the modem is connected to the remote computer 1180 via a wide area network.
Remote computer 1180 can include the memory of such as hard disk, and it can store remote application.
Video interface 1190 is connected to monitor.
Peripheral interface 1195 is connected to printer and loudspeaker.
The computer system shown in Fig. 1 is merely illustrative and is in no way intended to limit the invention, its application or its uses. As applied to embodiments of the present invention, the memory 1130 of the computer 1110 is used to store instructions, and the instructions are used to control the processor 1120 to operate so as to perform any one of the article typesetting methods provided in the embodiments of the present invention. Although Fig. 1 shows multiple components for the computer 1110, the present invention may involve only some of them; for example, the computer 1110 may involve only the processor 1120 and the memory 1130. A skilled person can design the instructions according to the scheme disclosed herein. How instructions control a processor to operate is well known in the art and is not described in detail here.
<Embodiment>
This embodiment provides an article typesetting method which, as shown in Fig. 2, includes:
Step S2100: in response to a typesetting request for a target article, remove the specific format characters contained in the target article to obtain a preprocessed text, the specific format characters including at least a space character, an indentation character and a carriage return character.
In this embodiment, the target article is text in electronic file form that an author has written or presented with text editing software or a text editor. Its content may be a literary work such as news, a novel or an essay, and it is usually published through an article publishing platform provided over a network. The article publishing platform may be an online literature website, a literature forum, or an application that publishes articles for users to download, such as mobile reading software on a mobile phone.
For various reasons such as the author's educational level and editing habits, the text of the target article may have sloppy formatting, disorganized paragraphs and similar problems, and so fail to meet the requirements for publication. A request to typeset the target article can be triggered accordingly. The request may be generated automatically inside the article publishing platform when it detects that the target article does not meet the publication requirements, or it may be triggered by an operation of the author, for example by clicking a typesetting function interface provided by the publishing platform.
In response to the typesetting request for the target article, the specific format characters contained in the target article are removed. Specific format characters are characters that may cause typesetting errors in the target article, such as a chaotic layout or disorganized paragraphs. They are typically introduced by the author's slips or poor editing habits: for example, space characters that appear by mistake inside normal text, space characters used in place of indentation, or indentation that breaks the rules (such as superfluous or missing indentation characters before a paragraph). Format errors may also arise because text editing software or text editors are inconsistent: for example, the carriage return character is commonly used for line breaks in text editing, but some text editing software or editors do not support line breaks by carriage return, or support them in a different form, which can make the layout chaotic. In this embodiment, therefore, the specific format characters include at least the space character, the indentation character and the carriage return character. By removing the specific format characters that may cause such typesetting errors, the method does not need to resolve one by one the various typesetting errors they cause. The errors are instead handled uniformly, with less code, lower implementation complexity and lower consumption of processing resources.
The step of removing the specific format characters contained in the target article may delete them directly, or may perform a character replacement step on them to replace them with the empty string. In one example, therefore, the step of removing the specific format characters contained in the target article to obtain the preprocessed text may be: performing a character replacement step on the specific format characters contained in the target article to replace them with the empty string, obtaining the preprocessed text.
The specific format characters include at least the space character, the indentation character and the carriage return character. The character replacement step is typically a case-insensitive character replacement, and can be performed by a character replacement function written by the developer or by one provided by an existing text processing facility.
For example, the space character is the character " ", the indentation character is the character " ", the carriage return character is typically "\r", and the empty string is "". The character replacement can be performed by the str.replace() function of JavaScript (JavaScript is an interpreted scripting language that is dynamically typed, weakly typed and prototype-based, with built-in types; its str.replace() function supports character replacement):
For the space character " ", str.replace(/ /ig, "") replaces every space character with the empty string, case-insensitively;
For the indentation character " ", a str.replace call of the same form, with the indentation character as its pattern, replaces every indentation character with the empty string, case-insensitively;
For the carriage return character "\r", str.replace(/\r/ig, "") replaces every carriage return character with the empty string, case-insensitively. Alternatively, since a carriage return usually appears directly before a newline character in text, str.replace(/\r\n/ig, "\n") replaces each such pair with a single newline character, which likewise removes the carriage return.
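The three replacements above can be sketched as a single preprocessing function. This is a minimal sketch rather than the implementation in the source: the function name preprocess is hypothetical, and the choice of U+3000 (ideographic space) as the indentation character is an assumption, because the text does not fix which character serves as the indentation character. Only the `g` flag is used; the `i` flag written in the text has no effect on these patterns.

```javascript
// Sketch of step S2100: remove specific format characters by replacing them
// with the empty string.
// Assumption: INDENT_PATTERN matches U+3000 (ideographic space); the source
// does not fix which character serves as the indentation character.
const INDENT_PATTERN = /\u3000/g;

function preprocess(text) {
  return text
    .replace(/ /g, "")            // space characters
    .replace(INDENT_PATTERN, "")  // indentation characters (assumed U+3000)
    .replace(/\r/g, "");          // carriage returns, so "\r\n" becomes "\n"
}

console.log(JSON.stringify(preprocess("\u3000\u3000AB CD\r\nEF\r")));
// prints "ABCD\nEF"
```

Removing every space character suits text written without inter-word spaces; applied to space-delimited text it would also join adjacent words, as the "AB CD" input shows.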
After the preprocessed text is obtained in step S2100, the method proceeds to step S2200: remove the redundant newline characters contained in the preprocessed text to obtain an intermediate text.
Redundant newline characters are usually caused by the author's slips or poor editing habits. They produce inconsistent typesetting errors, such as paragraphs in the text being separated sometimes by one line and sometimes by two, three, four or five lines, so that the irregular paragraph spacing leaves the paragraphs disorganized.
In this embodiment, removing the redundant newline characters makes it unnecessary to identify and fix each kind of paragraph-spacing error separately. The errors are handled uniformly, with less code, lower implementation complexity and lower consumption of processing resources. Specifically, the redundant newline characters may be removed by identifying runs of multiple consecutive newline characters in the preprocessed text and deleting characters until a single newline character remains. Alternatively, the character replacement step may be repeated a predetermined number of times on the newline characters in the preprocessed text: each pass replaces every two consecutive newline characters with a single newline character, so that after the predetermined number of passes multiple consecutive newline characters have been replaced with a single one.
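The first option, identifying each run of consecutive newline characters and keeping only one, can be sketched with a single regular-expression pass. The source gives no code for this option, so the function name collapseRuns and the use of the `\n+` pattern are assumptions made for illustration.

```javascript
// Sketch of the "identify a run, keep one newline" option.
// The pattern \n+ matches a whole run of newline characters at once,
// so one pass is enough regardless of run length.
function collapseRuns(text) {
  return text.replace(/\n+/g, "\n");
}

console.log(JSON.stringify(collapseRuns("a\n\n\n\n\nb\nc")));
// prints "a\nb\nc"
```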
In one example, therefore, the step of removing the redundant newline characters contained in the preprocessed text to obtain the intermediate text may be: performing the character replacement step a predetermined number of times on the newline characters contained in the preprocessed text, so that multiple consecutive newline characters are replaced with a single newline character, to obtain the intermediate text.
The character replacement step is typically a case-insensitive character replacement, and can be performed by a character replacement function written by the developer or by one provided by an existing text processing facility.
For example, the newline character is "\n" and the replacement is performed by the str.replace() function of JavaScript. Specifically, str.replace(/\n\n/ig, "\n") replaces every two consecutive newline characters in the text with a single newline character, case-insensitively, and is repeated the predetermined number of times so that all runs of consecutive newline characters in the text are replaced with a single newline character. The predetermined number may be chosen from experience or from experiment. According to the inventor's experiments, when the predetermined number is 4, the accuracy of replacing all runs of consecutive newline characters with a single newline character exceeds 98%, so the predetermined number can be set to 4.
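The repeated pairwise replacement can be sketched as follows; the function name collapseNewlines and the passes parameter are hypothetical. Each pass turns a run of n newline characters into a run of ceil(n / 2), because the matches do not overlap. Four passes therefore reduce any run of up to 16 newline characters to one, while a run of 17 still leaves two, which is consistent with the text describing the result as highly accurate rather than exact.

```javascript
// Sketch of step S2200 with a fixed number of pairwise passes.
// Each pass replaces every non-overlapping pair "\n\n" with "\n",
// so a run of n newlines shrinks to ceil(n / 2).
function collapseNewlines(text, passes = 4) {
  let result = text;
  for (let i = 0; i < passes; i++) {
    result = result.replace(/\n\n/g, "\n");
  }
  return result;
}

// Runs of up to 16 newlines collapse fully with 4 passes.
console.log(JSON.stringify(collapseNewlines("a" + "\n".repeat(16) + "b")));
// prints "a\nb"

// A run of 17 shrinks 17 -> 9 -> 5 -> 3 -> 2, so two newlines remain.
console.log(JSON.stringify(collapseNewlines("a" + "\n".repeat(17) + "b")));
// prints "a\n\nb"
```

Raising the number of passes by one doubles the longest run that is fully collapsed, so the pass count trades a small amount of extra work against coverage of rare, very long runs.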
After the intermediate text is obtained in step S2200, the method proceeds to step S2300: set typesetting format characters for the intermediate text to obtain a typeset text that meets a predetermined typesetting format, wherein the typesetting format characters include at least a predetermined placeholder.
In this embodiment, typesetting format characters are characters that are placed in the text to form a typeset text meeting the predetermined typesetting format. They can be selected or combined according to the predetermined typesetting format, but include at least the predetermined placeholder. Based on the typesetting format characters, a unified typesetting process can be applied to the intermediate text to obtain a typeset text that meets the typesetting format, with less code, lower implementation complexity and lower consumption of processing resources.
The predetermined placeholder is a character that occupies a character position without disturbing the layout of the text content or confusing its meaning. Placed in the text, it keeps a fixed distance between characters and prevents other typesetting format characters from introducing layout confusion when they are set. The placeholder may be a character, or a combination of characters, that is extremely unlikely to occur in text editing. For example, the placeholder may be the character "¡¡", made up of two inverted exclamation marks.
The predetermined typesetting format can be set according to the specific application scenario or requirements, for example according to the typesetting rules of the article publishing platform on which the target article is published, or according to the reading habits of the target article's readers.
For example, the predetermined typesetting format is that the first line of every paragraph is indented by two character positions and paragraphs are separated by one line, the typesetting format characters include the placeholder, the newline character and the indentation character, and the step of setting typesetting format characters for the intermediate text to obtain a typeset text that meets the predetermined typesetting format includes:
performing a character replacement step on each single newline character contained in the intermediate text, to replace it with two consecutive newline characters and two consecutive placeholders;
performing a character replacement step on the placeholders, to replace them with the indentation character and obtain the typeset text.
The character replacement step is typically a case-insensitive character replacement, and can be performed by a character replacement function written by the developer or by one provided by an existing text processing facility.
Specifically, the newline character is "\n" and the placeholder is "¡¡". The character replacement steps above can be performed by the str.replace() function of JavaScript. More specifically, one str.replace call replaces each single newline character in the intermediate text with two consecutive newline characters and two placeholders, case-insensitively, which achieves paragraph separation and a preliminary first-line indent. Unlike the prior art, there is no need to write a large amount of code to detect and repair separately the at least 20 kinds of typesetting error that can occur in paragraph separation and first-line indentation. A further str.replace call then replaces each placeholder with the indentation character, case-insensitively, completing the first-line indent and yielding the typeset text that meets the predetermined typesetting format.
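The two replacement calls of step S2300 can be sketched as below. The exact arguments of the calls are not recoverable from the text, so this sketch rests on stated assumptions: the placeholder is the two-character string "¡¡" named in the text, two placeholders follow the two newline characters, each placeholder becomes one indentation character, and the indentation character is U+3000 (ideographic space). The function name applyLayout is hypothetical.

```javascript
// Sketch of step S2300 under the assumptions stated in the lead-in.
const PLACEHOLDER = "\u00A1\u00A1"; // "¡¡", two inverted exclamation marks
const INDENT_CHAR = "\u3000";       // assumption: ideographic space

function applyLayout(intermediateText) {
  return intermediateText
    // single newline -> blank line between paragraphs + two placeholders
    .replace(/\n/g, "\n\n" + PLACEHOLDER.repeat(2))
    // each placeholder -> one indentation character (two in total)
    .replace(/\u00A1\u00A1/g, INDENT_CHAR);
}

console.log(JSON.stringify(applyLayout("A\nB")));
// prints "A\n\n　　B"
```

In this sketch a paragraph receives its indent from the newline character that precedes it, so a text with no leading newline character leaves its first paragraph unindented.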
In some very low-probability scenarios, the original content of the target article may itself contain the placeholder, which could introduce layout confusion when typesetting format characters are set for the intermediate text in step S2300. This can be avoided by treating the predetermined placeholder as a specific format character and removing it in step S2100. For example, with the predetermined placeholder "¡¡", a str.replace call can replace the placeholder with the empty string, case-insensitively, to remove it. In one example, therefore, the specific format characters also include the placeholder.
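Removing the placeholder during preprocessing can be sketched as follows. The function name stripPlaceholders is hypothetical, and the pattern assumes the placeholder is exactly the two-character sequence "¡¡"; a single inverted exclamation mark is left alone.

```javascript
// Sketch: treat the placeholder as a specific format character and remove
// it in step S2100, so that step S2300 only ever sees placeholders that the
// typesetting itself inserted.
const PLACEHOLDER_PATTERN = /\u00A1\u00A1/g; // "¡¡"

function stripPlaceholders(text) {
  return text.replace(PLACEHOLDER_PATTERN, "");
}

console.log(JSON.stringify(stripPlaceholders("a\u00A1\u00A1b")));
// prints "ab"
```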
<Example>
The article typesetting method provided in this embodiment is further illustrated below with an example. The target article to be typeset is shown in Fig. 3, and the predetermined typesetting format is that the first line of every paragraph is indented by two character positions and paragraphs are separated by one line. Applying the article typesetting method provided in this embodiment to the target article, as shown in Fig. 4, includes:
Step S401: in response to the typesetting request for the target article, replace the specific format characters contained in the target article with the empty string to obtain the preprocessed text;
In this example, the specific format characters include the placeholder, the space character, the indentation character and the carriage return character, where the space character is the character " ", the indentation character is the character " ", the carriage return character is "\r", and the placeholder is "¡¡". The str.replace() function of JavaScript replaces the specific format characters with the empty string, case-insensitively, to remove them. Step S401 can include:
Step S401-1: for the space character " " contained in the target article, str.replace(/ /ig, "") replaces it with the empty string, case-insensitively;
Step S401-2: for the placeholder "¡¡" contained in the target article, a str.replace call replaces it with the empty string, case-insensitively;
Step S401-3: for the indentation character " " contained in the target article, a str.replace call of the same form replaces it with the empty string, case-insensitively;
Step S401-4: for the carriage return character "\r" contained in the target article, str.replace(/\r/ig, "") or str.replace(/\r\n/ig, "\n") removes it, case-insensitively;
In step S401, the steps that remove the space characters, placeholders, indentation characters and carriage return characters of the target article can be carried out at the same time or in a certain order; for example, a single str.replace call can remove the space characters and the placeholders together. This example does not limit the order in which the steps are carried out, so long as the space characters, placeholders, indentation characters and carriage return characters of the target article are removed and the preprocessed text is obtained;
In this example, the preprocessed text obtained by step S401 is shown in Figure 5; the method then proceeds to step S402.
Step S402: repeat, a predetermined number of times, the step of replacing every two consecutive newlines in the preprocessed text with a single newline by str.replace(/\n\n/ig, "\n"). In this example the predetermined number is 4, and the resulting intermediate text is shown in Figure 6; the method then proceeds to step S403.
Step S403: by a case-insensitive global str.replace() call, replace each single newline in the intermediate text with two consecutive newlines and two placeholders, which achieves paragraph separation and a preliminary first-line indent. The resulting text is shown in Figure 7; the method then proceeds to step S404.
Step S404: by a case-insensitive global str.replace() call, replace each placeholder with an indent character to obtain typeset text that conforms to the predetermined typesetting format, as shown in Figure 8.
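Steps S401 to S404 can be put together in one short sketch. It rests on assumptions that are not in the text: the placeholder is taken to be U+0001, the indent character is taken to be the full-width space U+3000, and `typeset` and `COLLAPSE_PASSES` are hypothetical names. Only the newline and carriage-return patterns and the count of 4 passes come from the example itself.

```javascript
// Assumed characters: neither value is given in the text above.
const PLACEHOLDER = "\u0001"; // hypothetical placeholder
const INDENT = "\u3000";      // assumed indent character (full-width space)
const COLLAPSE_PASSES = 4;    // predetermined number of passes used in the example

// Hypothetical helper that chains steps S401 to S404.
function typeset(article) {
  // S401: remove the specific format characters.
  let text = article
    .replace(/\r\n/g, "\n")                     // carriage return + newline -> newline
    .replace(/\r/g, "")                         // remaining carriage returns
    .replace(/ /g, "")                          // space characters
    .replace(new RegExp(PLACEHOLDER, "g"), "")  // placeholders already in the text
    .replace(new RegExp(INDENT, "g"), "");      // existing indent characters

  // S402: repeatedly replace two consecutive newlines with one.
  for (let i = 0; i < COLLAPSE_PASSES; i++) {
    text = text.replace(/\n\n/g, "\n");
  }

  // S403: each newline becomes two newlines plus two placeholders.
  text = text.replace(/\n/g, "\n\n" + PLACEHOLDER + PLACEHOLDER);

  // S404: each placeholder becomes an indent character.
  return text.replace(new RegExp(PLACEHOLDER, "g"), INDENT);
}

const messy = "\n第一段。\r\n\r\n\r\n\u3000\u3000第二段。 \n 第三段。";
console.log(JSON.stringify(typeset(messy)));
// → "\n\n　　第一段。\n\n　　第二段。\n\n　　第三段。"
```

Each pass of step S402 roughly halves a run of consecutive newlines, so 4 passes reduce any run of up to 16 newlines to a single newline. Removing every space character, as step S401-1 describes, suits text written without inter-word spaces; it would also delete the spaces between words in text that uses them.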
The article typesetting method provided in this embodiment has been described above with reference to the drawings and the example. The specific format characters in the target article are removed to obtain the preprocessed text, the redundant newlines in the preprocessed text are removed to obtain the intermediate text, and typesetting format characters including at least the predetermined placeholder are set on the intermediate text to obtain typeset text that conforms to the predetermined typesetting format. It is therefore unnecessary to detect and correct each of the various typesetting errors that may exist in the target article separately; applying a single unified typesetting scheme to the target article yields typeset text that meets the typesetting requirements. The method can be implemented with little code, which significantly reduces implementation complexity and the processing resources consumed.
Specifically, in actual testing on a target article of 400,000 characters, typesetting with the article typesetting method of this embodiment was run 1000 times and took 5 milliseconds on average. Compared with the tens of seconds or even minutes that the prior art needs to typeset a target article of the same length, this greatly improves response speed and the user experience.
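A measurement of this kind could be reproduced with a small timing harness such as the sketch below. It is not the test setup used in the embodiment: `typesetSketch` is a simplified hypothetical stand-in for the method, the placeholder (U+0001) and indent character (U+3000) are assumed, the sample article is synthetic, and the run count is reduced to 20 to keep the example quick. Timings depend on the machine and JavaScript engine.

```javascript
// Assumed characters: neither value is given in the text above.
const PLACEHOLDER = "\u0001"; // hypothetical placeholder
const INDENT = "\u3000";      // assumed indent character (full-width space)

// Simplified, hypothetical stand-in for the typesetting routine under test.
function typesetSketch(article) {
  let text = article.replace(/\r/g, "").replace(/ /g, "");
  for (let i = 0; i < 4; i++) {
    text = text.replace(/\n\n/g, "\n");
  }
  return text
    .replace(/\n/g, "\n\n" + PLACEHOLDER + PLACEHOLDER)
    .replace(new RegExp(PLACEHOLDER, "g"), INDENT);
}

// Runs fn(input) the given number of times and returns the mean duration in ms.
function averageMillis(fn, input, runs) {
  const start = performance.now();
  for (let i = 0; i < runs; i++) {
    fn(input);
  }
  return (performance.now() - start) / runs;
}

// Synthetic article: 40,000 repetitions of a 10-character unit = 400,000 characters.
const article = "这是一个段落。\n\n\n".repeat(40000);
const avg = averageMillis(typesetSketch, article, 20);
console.log(`average: ${avg.toFixed(2)} ms over 20 runs`);
```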
This embodiment also provides a programmable device comprising a memory and a processor, where the memory stores instructions and the instructions control the processor to perform any of the article typesetting methods provided in this embodiment. Specifically, the programmable device may be the computer 1110 shown in Figure 1, which is not described again here.
This embodiment also provides an article typesetting apparatus 9000 which, as shown in Figure 9, comprises a preprocessing unit 9100, an intermediate processing unit 9200, and a typesetting setting unit 9300, and which implements any of the article typesetting methods provided in this embodiment; the details are not repeated here.
The article typesetting apparatus 9000 comprises:
a preprocessing unit 9100, configured to, in response to a typesetting request for a target article, remove the specific format characters contained in the text of the target article to obtain preprocessed text, the specific format characters including at least a space character, an indent character, and a carriage return;
an intermediate processing unit 9200, configured to remove the redundant newlines contained in the preprocessed text to obtain intermediate text; and
a typesetting setting unit 9300, configured to set typesetting format characters on the intermediate text to obtain typeset text that conforms to a predetermined typesetting format, where the typesetting format characters include at least a predetermined placeholder.
Optionally, the predetermined typesetting format is a first-line indent of two character positions for each paragraph and a gap of one line between paragraphs, the typesetting format characters include the placeholder, the newline, and the indent character, and the typesetting setting unit 9300 comprises:
means for performing a character replacement step on each single newline contained in the intermediate text, replacing it with two consecutive newlines and two consecutive placeholders; and
means for performing a character replacement step on each placeholder, replacing it with the indent character, to obtain the typeset text.
Optionally, the preprocessing unit 9100 is configured to perform a character replacement step on the specific format characters contained in the target article, replacing them with the empty string, to obtain the preprocessed text.
Optionally, the intermediate processing unit 9200 is configured to repeat a character replacement step a predetermined number of times on the newlines contained in the preprocessed text, so that multiple consecutive newlines are replaced with a single newline, to obtain the intermediate text.
Optionally, the specific format characters also include the placeholder.
Optionally, the character replacement step is performed by the str.replace() function of JavaScript.
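The repeated replacement carried out by the intermediate processing unit can be sketched as below, which also shows how the predetermined number of passes bounds the run length that is fully collapsed. `collapseNewlines` is a hypothetical helper name; the pattern and replacement follow the str.replace() call given in the example.

```javascript
// Hypothetical helper: replaces every pair of consecutive newlines with one
// newline, repeating the replacement the given number of times.
function collapseNewlines(text, passes) {
  let result = text;
  for (let i = 0; i < passes; i++) {
    result = result.replace(/\n\n/g, "\n");
  }
  return result;
}

// A run of n newlines shrinks to ceil(n / 2) per pass: 16 -> 8 -> 4 -> 2 -> 1.
const run16 = "a" + "\n".repeat(16) + "b";
// One newline too many for 4 passes: 17 -> 9 -> 5 -> 3 -> 2.
const run17 = "a" + "\n".repeat(17) + "b";

console.log(JSON.stringify(collapseNewlines(run16, 4))); // → "a\nb"
console.log(JSON.stringify(collapseNewlines(run17, 4))); // → "a\n\nb"
```

With a fixed number of passes the cost stays predictable; a longer run of blank lines would need either more passes or a loop that repeats until the text stops changing.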
The article typesetting apparatus 9000 provided in this embodiment can be applied to the back-end server of an article publishing platform, to a browser, or to an application with browser-like functions.
This embodiment also provides an article publishing platform that includes any of the article typesetting apparatuses provided in this embodiment. The article publishing platform may be an online literature website that supports publishing articles, a literature forum that supports publishing articles, a browser or browser-like application that supports publishing articles, mobile reading software (a mobile phone app) that supports publishing articles, and so on.
The embodiments of the invention have been described above with reference to the drawings. This embodiment provides an article typesetting method, an article typesetting apparatus, a programmable device, and an article publishing platform. The specific format characters in the target article are removed to obtain the preprocessed text, the redundant newlines in the preprocessed text are removed to obtain the intermediate text, and typesetting format characters including at least the predetermined placeholder are set on the intermediate text to obtain typeset text that conforms to the predetermined typesetting format. It is therefore unnecessary to detect and correct each of the various typesetting errors that may exist in the target article separately; applying a single unified typesetting scheme to the target article yields typeset text that meets the typesetting requirements. Implementation complexity is low and few processing resources are consumed, which in turn improves the response speed of typesetting and the user experience.
Those skilled in the art will appreciate that the article typesetting apparatus 9000 can be implemented in various ways. For example, it can be implemented by configuring a processor with instructions: the instructions can be stored in a ROM and, when the device starts, read from the ROM into a programmable device to implement the article typesetting apparatus 9000. As another example, the article typesetting apparatus 9000 can be built into a dedicated device (such as an ASIC). The article typesetting apparatus 9000 can be divided into mutually independent units, or the units can be combined. It can be implemented in any one of the above ways or in a combination of two or more of them.
It is well known to those skilled in the art that, with the development of electronic information technology such as large-scale integrated circuit technology and the trend toward implementing software in hardware, it has become difficult to draw a clear boundary between the software and the hardware of a computer system, because any operation can be implemented in software or in hardware, and the execution of any instruction can be completed by hardware or equally by software. Whether a hardware or a software implementation is used for a given machine function depends on non-technical factors such as price, speed, reliability, storage capacity, and change cycle. For a person of ordinary skill in the field of electronic information technology, therefore, the most direct and clear way to describe a technical solution is to describe each operation in that solution. Knowing the operations to be performed, a person skilled in the art can design the desired product directly, taking those non-technical factors into account.
The present invention may be a system, a method, and/or a computer program product. The computer program product may include a computer-readable storage medium carrying computer-readable program instructions for causing a processor to implement aspects of the invention.
The computer-readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of these. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch cards or raised structures in a groove with instructions stored on them, and any suitable combination of these. A computer-readable storage medium, as used here, is not to be construed as a transitory signal in itself, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (for example, light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
The computer-readable program instructions described here can be downloaded from a computer-readable storage medium to the respective computing/processing devices, or to an external computer or external storage device over a network such as the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium within that computing/processing device.
The computer program instructions for carrying out the operations of the invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++ and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry such as programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) is personalized using state information of the computer-readable program instructions, and this electronic circuitry can execute the computer-readable program instructions to implement aspects of the invention.
Aspects of the invention are described here with reference to flowcharts and/or block diagrams of methods, apparatuses (systems), and computer program products according to embodiments of the invention. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium; they cause a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, so that the computer-readable medium storing the instructions comprises an article of manufacture that includes instructions implementing aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The computer-readable program instructions may also be loaded onto a computer, another programmable data processing apparatus, or another device, so that a series of operational steps is performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, such that the instructions executed on the computer, other programmable data processing apparatus, or other device implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The flowcharts and block diagrams in the drawings show the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to multiple embodiments of the invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of instructions, which comprises one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a special-purpose hardware-based system that performs the specified functions or acts, or by a combination of special-purpose hardware and computer instructions. It is well known to those skilled in the art that implementation in hardware, implementation in software, and implementation in a combination of software and hardware are all equivalent.
Various embodiments of the invention have been described above. The description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terms used here were chosen to best explain the principles of the embodiments, their practical application, or their technical improvement over technologies in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed here. The scope of the invention is defined by the appended claims.

Claims (12)

1. An article typesetting method, characterized by comprising:
in response to a typesetting request for a target article, removing the specific format characters contained in the target article to obtain preprocessed text, the specific format characters including at least a space character, an indent character, and a carriage return;
removing the redundant newlines contained in the preprocessed text to obtain intermediate text; and
setting typesetting format characters on the intermediate text to obtain typeset text that conforms to a predetermined typesetting format, wherein the typesetting format characters include at least a predetermined placeholder.
2. The method according to claim 1, characterized in that the predetermined typesetting format is a first-line indent of two character positions for each paragraph and a gap of one line between paragraphs, the typesetting format characters include the placeholder, the newline, and the indent character, and the step of setting typesetting format characters on the intermediate text to obtain typeset text that conforms to the predetermined typesetting format comprises:
performing a character replacement step on each single newline contained in the intermediate text, replacing it with two consecutive newlines and two consecutive placeholders; and
performing a character replacement step on each placeholder, replacing it with the indent character, to obtain the typeset text.
3. The method according to claim 1, characterized in that the step of removing the specific format characters contained in the target article to obtain preprocessed text is:
performing a character replacement step on the specific format characters contained in the target article, replacing them with the empty string, to obtain the preprocessed text.
4. The method according to claim 1, characterized in that the step of removing the redundant newlines contained in the preprocessed text to obtain intermediate text is:
repeating a character replacement step a predetermined number of times on the newlines contained in the preprocessed text, so that multiple consecutive newlines are replaced with a single newline, to obtain the intermediate text.
5. The method according to any one of claims 1-4, characterized in that
the specific format characters also include the placeholder.
6. The method according to any one of claims 2-4, characterized in that
the character replacement step is performed by the str.replace() function of JavaScript.
7. An article typesetting apparatus, characterized by comprising:
a preprocessing unit, configured to, in response to a typesetting request for a target article, remove the specific format characters contained in the text of the target article to obtain preprocessed text, the specific format characters including at least a space character, an indent character, and a carriage return;
an intermediate processing unit, configured to remove the redundant newlines contained in the preprocessed text to obtain intermediate text; and
a typesetting setting unit, configured to set typesetting format characters on the intermediate text to obtain typeset text that conforms to a predetermined typesetting format, wherein the typesetting format characters include at least a predetermined placeholder.
8. The apparatus according to claim 7, characterized in that the predetermined typesetting format is a first-line indent of two character positions for each paragraph and a gap of one line between paragraphs, the typesetting format characters include the placeholder, the newline, and the indent character, and the typesetting setting unit comprises:
means for performing a character replacement step on each single newline contained in the intermediate text, replacing it with two consecutive newlines and two consecutive placeholders; and
means for performing a character replacement step on each placeholder, replacing it with the indent character, to obtain the typeset text.
9. The apparatus according to claim 7, characterized in that the preprocessing unit is configured to perform a character replacement step on the specific format characters contained in the target article, replacing them with the empty string, to obtain the preprocessed text.
10. The apparatus according to claim 7, characterized in that the intermediate processing unit is configured to repeat a character replacement step a predetermined number of times on the newlines contained in the preprocessed text, so that multiple consecutive newlines are replaced with a single newline, to obtain the intermediate text.
11. A programmable device, characterized by comprising a memory and a processor, wherein the memory stores instructions and the instructions control the processor to perform the article typesetting method according to any one of claims 1-6.
12. An article publishing platform, characterized by comprising the article typesetting apparatus according to any one of claims 7-10.
CN201611046564.4A 2016-11-23 2016-11-23 Article layout method and device, programmable device and article publishing platform Pending CN106681979A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611046564.4A CN106681979A (en) 2016-11-23 2016-11-23 Article layout method and device, programmable device and article publishing platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611046564.4A CN106681979A (en) 2016-11-23 2016-11-23 Article layout method and device, programmable device and article publishing platform

Publications (1)

Publication Number Publication Date
CN106681979A true CN106681979A (en) 2017-05-17

Family

ID=58866611

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611046564.4A Pending CN106681979A (en) 2016-11-23 2016-11-23 Article layout method and device, programmable device and article publishing platform

Country Status (1)

Country Link
CN (1) CN106681979A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102567303A (en) * 2010-12-24 2012-07-11 北京大学 Typesetting method and device for variable official document data
CN102081600A (en) * 2011-01-25 2011-06-01 珠海全志科技有限公司 E-book typesetting method and e-book typesetting system
CN103778172A (en) * 2012-10-18 2014-05-07 万战斌 Examination paper information storing method and examination paper editing method and system
CN105183706A (en) * 2014-05-27 2015-12-23 腾讯科技(北京)有限公司 Method and device for processing rich text
CN105373526A (en) * 2015-10-23 2016-03-02 北大方正集团有限公司 Blank region processing method and system for electronic document

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111368523A (en) * 2018-12-26 2020-07-03 嘉太科技(北京)有限公司 Method and device for converting layout format of movie and television script
CN111666733A (en) * 2019-02-20 2020-09-15 珠海金山办公软件有限公司 Cell processing method and device
CN111666733B (en) * 2019-02-20 2023-10-27 珠海金山办公软件有限公司 Method and device for processing cells in document
CN110297927A (en) * 2019-05-17 2019-10-01 百度在线网络技术(北京)有限公司 Article dissemination method, device, equipment and storage medium
CN110297927B (en) * 2019-05-17 2022-07-29 百度在线网络技术(北京)有限公司 Article publishing method, device, equipment and storage medium
CN110348000A (en) * 2019-07-16 2019-10-18 仲恺农业工程学院 Typesetting document interactive computing method, device, equipment and computer readable medium
CN110348000B (en) * 2019-07-16 2023-12-26 仲恺农业工程学院 Typesetting document interaction calculation method, device, equipment and computer readable medium
CN110929495A (en) * 2019-11-08 2020-03-27 广州坚和网络科技有限公司 Typesetting method for automatically beautifying article
CN110929495B (en) * 2019-11-08 2023-08-29 广州坚和网络科技有限公司 Method for automatically beautifying typesetting of articles
CN111090671A (en) * 2019-12-19 2020-05-01 山大地纬软件股份有限公司 Method and device for eliminating difference between hollow character string and invalid character string in database
CN111859871A (en) * 2020-07-22 2020-10-30 中国联合网络通信集团有限公司 Data processing method, device, equipment and computer readable storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170517

RJ01 Rejection of invention patent application after publication