CN110458918A - Method and apparatus for output information - Google Patents

Method and apparatus for output information

Info

Publication number
CN110458918A
Authority
CN
China
Prior art keywords
text
picture
style
replaced
learn
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910757082.7A
Other languages
Chinese (zh)
Other versions
CN110458918B (en)
Inventor
章成全
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201910757082.7A
Publication of CN110458918A
Application granted
Publication of CN110458918B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/04 Context-preserving transformations, e.g. by using an importance map
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G06T11/001 Texturing; Colouring; Generation of texture or colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G06T11/60 Editing figures and text; Combining figures or text
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/413 Classification of content, e.g. text, photographs or tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Processing Or Creating Images (AREA)

Abstract

Embodiments of the present disclosure disclose a method and apparatus for outputting information. One specific embodiment of the method includes: obtaining a style picture that contains text to be replaced and a text picture that contains target text; learning, from the style picture, text style attributes of the text to be replaced; generating a foreground picture and a mask picture based on the text style attributes and the text picture; erasing the text to be replaced from the style picture to generate a background picture; and fusing the foreground picture, the mask picture and the background picture to generate and output a target picture. This embodiment integrates the advantages of incremental learning of the target text style and uses generative adversarial network learning to learn, across multiple dimensions, the geometric features and other attribute features of arbitrarily shaped text, while avoiding the loss of text structure features that other methods suffer when generating target text. Applied to image generation, the method runs quickly and can satisfy some real-time application scenarios.

Description

Method and apparatus for output information
Technical Field
Embodiments of the present disclosure relate to the field of computer technology, and in particular to a method and apparatus for outputting information.
Background
Text in images and videos, also called scene text, carries rich semantic information and is very useful in many multimedia applications. Text style transfer and background texture restoration have long been hot research topics in text editing. With the arrival of the deep learning era, text picture editing has also shifted from traditional methods to style transfer based on adversarial neural networks.
In text editing, preserving the original text style is very challenging. People have tried to transfer or copy the text style information from a given image or stylized text sample. Some current methods focus on character-level style conversion, for example font style transfer with autoencoder networks and methods based on generative adversarial networks.
Text picture editing currently faces two major challenges: converting the text style and preserving the background texture. First, text in scene images has many attributes, such as language, font, color, orientation, stroke size and spatial perspective, which makes it difficult to capture all text-related characteristics accurately and transfer them to the target text. Second, keeping the background consistent with the original image is also difficult, especially when the text appears in complex scenes such as menus or street shop signs. In addition, if the target text is shorter than the original text, the region no longer covered by characters should be erased and filled with an appropriate texture. Existing methods are therefore difficult to apply directly to text style transfer. Meanwhile, current image generation methods mostly reconstruct images with convolutional neural networks, but this generation process easily loses text attribute information, producing results with an irregular structure and fonts inconsistent with the original.
Summary
Embodiments of the present disclosure propose a method and apparatus for outputting information.
In a first aspect, an embodiment of the present disclosure provides a method for outputting information, including: obtaining a style picture that contains text to be replaced and a text picture that contains target text; learning, from the style picture, text style attributes of the text to be replaced; generating a foreground picture and a mask picture based on the text style attributes and the text picture; erasing the text to be replaced from the style picture to generate a background picture; and fusing the foreground picture, the mask picture and the background picture to generate and output a target picture.
In some embodiments, learning the text style attributes of the text to be replaced from the style picture includes: learning tilt and perspective transformation information of the text to be replaced from the style picture through a spatial transformer network.
In some embodiments, learning the text style attributes of the text to be replaced from the style picture includes: learning tilt and perspective transformation information of the text to be replaced from the style picture through quadrilateral detection combined with a thin plate spline transformation method.
In some embodiments, learning the text style attributes of the text to be replaced from the style picture includes: learning stroke thickness and font information of the text to be replaced from the style picture through a stroke filter.
In some embodiments, generating the mask picture based on the text style attributes includes: generating the mask picture based on the tilt and perspective transformation information and the stroke thickness and font information of the text to be replaced.
In some embodiments, learning the text style attributes of the text to be replaced from the style picture includes: learning color information of the text to be replaced from the style picture.
In some embodiments, obtaining the text picture that contains the target text includes: obtaining the target text; and generating a text picture of a predetermined format based on the target text.
In some embodiments, the foreground picture, the mask picture and the background picture are generated by a conditional generative adversarial network.
In a second aspect, an embodiment of the present disclosure provides an apparatus for outputting information, including: an acquiring unit configured to obtain a style picture that contains text to be replaced and a text picture that contains target text; a learning unit configured to learn, from the style picture, text style attributes of the text to be replaced; a first generation unit configured to generate a foreground picture and a mask picture based on the text style attributes and the text picture; a second generation unit configured to erase the text to be replaced from the style picture to generate a background picture; and a fusion unit configured to fuse the foreground picture, the mask picture and the background picture to generate and output a target picture.
In some embodiments, the learning unit is further configured to: learn tilt and perspective transformation information of the text to be replaced from the style picture through a spatial transformer network.
In some embodiments, the learning unit is further configured to: learn tilt and perspective transformation information of the text to be replaced from the style picture through quadrilateral detection combined with a thin plate spline transformation.
In some embodiments, the learning unit is further configured to: learn stroke thickness and font information of the text to be replaced from the style picture through a stroke filter.
In some embodiments, the first generation unit is further configured to: generate the mask picture based on the tilt and perspective transformation information and the stroke thickness and font information of the text to be replaced.
In some embodiments, the learning unit is further configured to: learn color information of the text to be replaced from the style picture.
In some embodiments, the acquiring unit is further configured to: obtain the target text; and generate a text picture of a predetermined format based on the target text.
In some embodiments, the foreground picture, the mask picture and the background picture are generated by a conditional generative adversarial network.
In a third aspect, an embodiment of the present disclosure provides an electronic device for outputting information, including: one or more processors; and a storage device storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method of any embodiment of the first aspect.
In a fourth aspect, an embodiment of the present disclosure provides a computer-readable medium storing a computer program which, when executed by a processor, implements the method of any embodiment of the first aspect.
The method and apparatus for outputting information provided by embodiments of the present disclosure build on the design concept of generative adversarial networks in deep learning and solve the problem of transferring arbitrary text styles while preserving rich background texture information. In addition, the present disclosure can handle text style transfer tasks at both the text line level and the paragraph level.
With the arbitrary scene text editing proposed in the present disclosure, an original image is taken as input, its text is edited into the target text, and a target image is output. In particular, accurate text attribute features can be learned even in complex scenes that contain curved or wavy text lines. For example, in practical OCR (Optical Character Recognition) detection, recognition and translation scenarios, most text appears in various irregular distributions. The invention enables scene text editing across many practical business scenarios, including natural street views, advertisements, bills, videos and menus, so that translated and edited text in a picture has a markedly better visual effect. This helps bring more traffic to PC and mobile clients and a better product experience to users.
Brief Description of the Drawings
Other features, objects and advantages of the present disclosure will become more apparent from the following detailed description of non-limiting embodiments, read with reference to the accompanying drawings:
Fig. 1 is an exemplary system architecture diagram to which one embodiment of the present disclosure can be applied;
Fig. 2 is a flowchart of one embodiment of the method for outputting information according to the present disclosure;
Fig. 3 is a schematic diagram of an application scenario of the method for outputting information according to the present disclosure;
Fig. 4 shows the results produced by the method for outputting information according to the present disclosure;
Fig. 5 is a structural schematic diagram of one embodiment of the apparatus for outputting information according to the present disclosure;
Fig. 6 is a structural schematic diagram of a computer system suitable for implementing the electronic device of embodiments of the present disclosure.
Detailed Description
The present disclosure is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the relevant invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts relevant to the invention.
It should be noted that, where no conflict arises, the embodiments of the present disclosure and the features in those embodiments may be combined with one another. The present disclosure is described in detail below with reference to the drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the method for outputting information or the apparatus for outputting information of the present disclosure can be applied.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102 and 103, a network 104 and a server 105. The network 104 is the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links or fiber optic cables.
A user may use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages and the like. Various communication client applications may be installed on the terminal devices 101, 102, 103, such as picture text editing applications, image processing applications, web browser applications, shopping applications, search applications, instant messaging tools, email clients and social platform software.
The terminal devices 101, 102, 103 may be hardware or software. When they are hardware, they may be various electronic devices that have a display screen and support picture text editing, including but not limited to smartphones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop computers and desktop computers. When they are software, they may be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module. No specific limitation is made here.
The server 105 may be a server that provides various services, for example a back-end editing server that provides text editing functions for images displayed on the terminal devices 101, 102, 103. The back-end editing server may analyze and otherwise process received data such as edit requests, and feed the processing result (for example, a picture containing the target text generated according to the text style of the original image) back to the terminal device.
It should be noted that the server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster composed of multiple servers or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module. No specific limitation is made here.
It should be noted that the method for outputting information provided by embodiments of the present disclosure is generally executed by the server 105; accordingly, the apparatus for outputting information is generally arranged in the server 105.
It should be understood that the numbers of terminal devices, networks and servers in Fig. 1 are merely illustrative. There may be any number of terminal devices, networks and servers, as required by the implementation.
With continued reference to Fig. 2, a flow 200 of one embodiment of the method for outputting information according to the present disclosure is shown. The method for outputting information includes the following steps:
Step 201: obtain a style picture that contains the text to be replaced and a text picture that contains the target text.
In this embodiment, the execution body of the method for outputting information (for example, the server shown in Fig. 1) may receive, over a wired or wireless connection, a picture text edit request from the terminal that the user uses for picture text editing. The request may contain two pictures: the first is the original style picture whose text needs to be replaced, and the second is the text picture containing the target text used for the replacement. The user may directly provide the text picture containing the target text. The text picture may show white characters on a black background.
In some optional implementations of this embodiment, the user may provide only the target text through the terminal, and the server then converts the obtained target text into a text picture of a predetermined format. For example, the target text may be converted into a text picture whose size matches the size of the text region identified in the style picture. In addition, the target text may be distributed over multiple lines, so embodiments of the present disclosure can perform text style transfer at both the text line level and the paragraph level.
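As a rough illustration of the two ways a request can carry the target text in step 201, the following Python sketch models an edit request. The class and field names (`EditRequest`, `style_picture`, `text_picture`, `target_text`) are hypothetical and do not come from the disclosure, and pictures are represented simply as raw bytes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EditRequest:
    """Hypothetical picture text edit request (all names are illustrative)."""
    style_picture: bytes                  # picture containing the text to be replaced
    text_picture: Optional[bytes] = None  # ready-made picture of the target text
    target_text: Optional[str] = None     # raw target text, to be rendered by the server

    def needs_rendering(self) -> bool:
        """Return True when the server must turn target_text into a text picture."""
        if self.text_picture is None and self.target_text is None:
            raise ValueError("request must carry a text picture or target text")
        return self.text_picture is None
```

A request that carries only `target_text` would be routed through the rendering step described above, while a request that already carries `text_picture` would skip it.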
Step 202: learn, from the style picture, the text style attributes of the text to be replaced.
In this embodiment, the text style attributes of the text to be replaced may be learned from the style picture through a feature extraction network, that is, a neural network that adaptively extracts abstract features, including but not limited to VGG, Inception, ResNet, MobileNet and ShuffleNet. The text style attributes may include at least one of the following: rotation, translation, scaling, text size, color, stroke thickness and font of the text in the style picture. The feature extraction network may be part of a conditional generative adversarial network. The input of the conditional generative adversarial network includes both the style picture and the text picture.
In some optional implementations of this embodiment, to capture the tilt and perspective transformation information of the original text to be replaced, a network such as an STN (Spatial Transformer Network) may be used directly, or quadrilateral detection combined with a TPS (Thin Plate Spline) transformation may be used.
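The disclosure leaves the learning of tilt and perspective information to an STN or to quadrilateral detection plus TPS. The sketch below does not implement either of them; it only shows, under the assumption that the learned geometry can be summarized as a 3x3 perspective matrix, what applying such a matrix to the corners of a text box looks like. The function names and the `TILT` matrix are illustrative.

```python
from typing import List, Tuple

Matrix3 = List[List[float]]
Point = Tuple[float, float]


def apply_perspective(h: Matrix3, point: Point) -> Point:
    """Map a 2D point through a 3x3 perspective (homography) matrix."""
    x, y = point
    xh = h[0][0] * x + h[0][1] * y + h[0][2]
    yh = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    # Divide by the homogeneous coordinate to return to image coordinates.
    return (xh / w, yh / w)


def warp_corners(h: Matrix3, width: float, height: float) -> List[Point]:
    """Warp the four corners of an axis-aligned text box of the given size."""
    corners = [(0.0, 0.0), (width, 0.0), (width, height), (0.0, height)]
    return [apply_perspective(h, c) for c in corners]


# Illustrative matrix (an assumption): a horizontal shear of 0.5, i.e. a simple tilt.
TILT: Matrix3 = [[1.0, 0.5, 0.0],
                 [0.0, 1.0, 0.0],
                 [0.0, 0.0, 1.0]]
```

With `TILT`, the bottom edge of a 4 x 2 box is shifted one unit to the right while the top edge stays in place, which is the kind of slant a mask picture would need to reproduce.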
In some optional implementations of this embodiment, information such as the stroke thickness and font of the original text to be replaced is learned through a stroke filter. The operation is similar to a correlation: a group of convolution kernels is learned on the style picture, and these kernels are then used to perform convolution on the text picture.
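The correlation-like operation can be sketched as a plain 2D cross-correlation. In this minimal example the kernel is a hand-written placeholder standing in for kernels that would be learned from the style picture; the function name `correlate2d` and the list-of-lists image representation are assumptions made for illustration.

```python
from typing import List

Image = List[List[float]]


def correlate2d(image: Image, kernel: Image) -> Image:
    """Valid-mode 2D cross-correlation (no kernel flipping, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output: Image = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0.0
            # Slide the kernel over the image and sum the element-wise products.
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output
```

Applied to a text picture, a kernel derived from thick strokes would respond differently from one derived from thin strokes, which is how such a filter could carry stroke information from one picture to the other.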
In some optional implementations of this embodiment, the color information of the text to be replaced may be learned from the style picture through SENet (Squeeze-and-Excitation Network). The color information includes the information of the three RGB channels.
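To make the channel-wise idea concrete, the sketch below shows a heavily simplified squeeze-and-excitation step on three RGB channels: global average pooling per channel, a gate per channel, and rescaling. It is an assumption-laden illustration rather than the network used in the disclosure: a real SE block uses two fully connected layers with a nonlinearity between them, whereas this version uses a single layer, and the weights passed in are placeholders rather than learned values.

```python
import math
from typing import List

Channel = List[List[float]]


def squeeze(channels: List[Channel]) -> List[float]:
    """Global average pooling: one mean value per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in channels]


def excite(descriptor: List[float], weights: List[List[float]],
           biases: List[float]) -> List[float]:
    """Single fully connected layer followed by a sigmoid gate per channel."""
    gates = []
    for w_row, b in zip(weights, biases):
        z = sum(w * d for w, d in zip(w_row, descriptor)) + b
        gates.append(1.0 / (1.0 + math.exp(-z)))
    return gates


def scale(channels: List[Channel], gates: List[float]) -> List[Channel]:
    """Multiply every pixel of each channel by that channel's gate."""
    return [[[v * g for v in row] for row in ch] for ch, g in zip(channels, gates)]
```

The output of `squeeze` on the R, G and B channels is a three-value summary of the picture's color content, which is the part of the block most directly related to the RGB color information mentioned above.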
Step 203: generate a foreground picture and a mask picture based on the text style attributes and the text picture.
In this embodiment, the mask picture is generated by adjusting the text picture according to the text style attributes, other than color, learned from the style picture. The mask picture shows white characters on a black background, where 1 represents white and 0 represents black. The mask picture may be generated with a conditional generative adversarial network, which comprises a generator network (Generator) and a discriminator network (Discriminator). All pictures generated in the present disclosure are produced by the generator network; the discriminator network then judges whether each picture is an original real picture or a generated one. A discriminator network typically extracts convolutional features and then outputs a judgment of 0 or 1. To simplify training, the discriminator network may be used for the adversarial game only on the background picture and the target picture.
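The 0-or-1 judgment made by a discriminator is commonly trained with a binary cross-entropy loss. The sketch below shows that loss under two assumptions that the disclosure does not state: label 1 stands for real pictures and label 0 for generated ones, and the discriminator outputs a probability in [0, 1]. The function name is illustrative.

```python
import math
from typing import Sequence


def discriminator_loss(real_scores: Sequence[float],
                       fake_scores: Sequence[float]) -> float:
    """Mean binary cross-entropy, with label 1 for real and 0 for generated pictures."""
    eps = 1e-12  # guards against log(0)
    real_term = sum(-math.log(max(p, eps)) for p in real_scores)
    fake_term = sum(-math.log(max(1.0 - p, eps)) for p in fake_scores)
    return (real_term + fake_term) / (len(real_scores) + len(fake_scores))
```

The loss is zero when every real picture is scored 1 and every generated picture is scored 0, and it grows as the discriminator becomes confidently wrong, which is the signal the generator tries to exploit during the adversarial game.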
The foreground picture is generated from all of the extracted text style attributes. In other words, the main difference between the foreground picture and the mask picture is that the foreground picture contains color information, whereas the mask picture is black and white.
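The relationship between the two pictures can be illustrated by filling a binary mask with a single color. This is a deliberate simplification: in the disclosure the foreground picture comes from a generator network, whereas the hypothetical `colorize_mask` helper below just paints every mask pixel equal to 1 with one RGB color.

```python
from typing import List, Tuple

Mask = List[List[int]]
Color = Tuple[int, int, int]


def colorize_mask(mask: Mask, color: Color) -> List[List[Color]]:
    """Paint pixels where mask == 1 with `color`; leave all other pixels black."""
    black: Color = (0, 0, 0)
    return [[color if m == 1 else black for m in row] for row in mask]
```

For example, `colorize_mask([[0, 1], [1, 0]], (255, 0, 0))` yields a picture with red pixels exactly where the mask is white.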
Step 204: erase the text to be replaced from the style picture to generate a background picture.
In this embodiment, a conditional generative adversarial network may be used to generate, from the style picture, a background picture with the text erased. A large number of paired pictures taken before and after text erasure may be used as samples to train the conditional generative adversarial network.
Step 205: fuse the foreground picture, the mask picture and the background picture to generate and output a target picture.
In this embodiment, based on the conditional generative adversarial network method, the target picture is generated by fusing the foreground picture, the background picture and the mask picture. The foreground picture and the background picture are fused through the mask picture according to the formula Target = Mask * Fore + (1 - Mask) * Back, where Target is the target picture, Mask is the mask picture, Fore is the foreground picture and Back is the background picture. Here * denotes element-wise multiplication over the three channels. Where Mask is 1, multiplying by Fore keeps the character strokes; at all other positions the product is 0. The mask picture shows white characters on a black background, where 1 represents white and 0 represents black.
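The fusion formula can be written out directly. The sketch below applies Target = Mask * Fore + (1 - Mask) * Back per pixel and per channel on small nested-list pictures; the function name `fuse` and the representation of pixels as RGB tuples are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[float, ...]


def fuse(mask: Sequence[Sequence[float]],
         fore: Sequence[Sequence[Pixel]],
         back: Sequence[Sequence[Pixel]]) -> List[List[Pixel]]:
    """Target = Mask * Fore + (1 - Mask) * Back, per pixel and per channel."""
    target = []
    for mask_row, fore_row, back_row in zip(mask, fore, back):
        row = []
        for m, f, b in zip(mask_row, fore_row, back_row):
            # The same mask value weights all three channels of the pixel.
            row.append(tuple(m * fc + (1 - m) * bc for fc, bc in zip(f, b)))
        target.append(row)
    return target
```

Where the mask is 1 the foreground pixel is taken, where it is 0 the background pixel is taken, and a fractional mask value (for example at an anti-aliased stroke edge) would blend the two.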
With continued reference to Fig. 3, Fig. 3 is a schematic diagram of an application scenario of the method for outputting information according to this embodiment. In the application scenario of Fig. 3, the user sends to the server, through a terminal, a style picture containing the text to be replaced, "INTRODUCE", together with the target text "barbarous". The server converts the target text "barbarous" into a text picture of a predetermined format, extracts the tilt and perspective transformation information of "INTRODUCE" through a transformation capture module, and then extracts the stroke thickness and font information of "INTRODUCE" through a stroke filter. A mask picture is then generated from the tilt and perspective transformation information and the stroke thickness and font information. Next, the color information of "INTRODUCE" is extracted through SENet, and the mask picture is filled with color according to this information to obtain the foreground picture. The text to be replaced is then erased from the style picture to generate the background picture. Finally, the foreground picture, the mask picture and the background picture are fused to generate and output the target picture.
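The order of operations in this scenario can be summarized as a small orchestration function. This is only a structural sketch: every stage is passed in as a callable, and all of the parameter names (`learn_style`, `make_mask`, `make_foreground`, `erase_text`, `fuse`) are hypothetical placeholders for the networks described above.

```python
from typing import Any, Callable

Picture = Any  # placeholder type; a real system would use an image array


def edit_text(style_picture: Picture,
              text_picture: Picture,
              learn_style: Callable[[Picture], Any],
              make_mask: Callable[[Any, Picture], Picture],
              make_foreground: Callable[[Any, Picture], Picture],
              erase_text: Callable[[Picture], Picture],
              fuse: Callable[[Picture, Picture, Picture], Picture]) -> Picture:
    """Run steps 202 to 205 in order and return the target picture."""
    style = learn_style(style_picture)          # step 202: geometry, stroke, color
    mask = make_mask(style, text_picture)       # step 203: mask from non-color attributes
    foreground = make_foreground(style, mask)   # step 203: mask filled with color
    background = erase_text(style_picture)      # step 204: text removed
    return fuse(foreground, mask, background)   # step 205: final blend
```

Passing the stages in as arguments keeps the sketch independent of any particular network implementation and makes the data flow between steps explicit.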
The present disclosure can run in AR scene image editing on a PC or mobile terminal. The task is to replace the text in a scene image with the target text while preserving the attributes of the text in the source image and the background attributes. In addition, if the target text is shorter than the original text, the region no longer covered by characters should be erased and filled with an appropriate texture, as shown in Fig. 4. For example, in an OCR text translation scenario, English text is detected and recognized in the scene image and translated into Chinese, and the generated Chinese text replaces the English in the scene image, so that the attributes of the Chinese text and the background in the generated target image stay visually consistent with the scene image.
The present disclosure performs style transfer at the text level, so the Pix2Pix method was chosen for comparison with our method. Experimental results show that Pix2Pix cannot fully adapt to the style of new text: the edited text has an irregular structure, the font is inconsistent with the original, and the background texture is cluttered and often blurred. In contrast, our method correctly transfers the foreground text and preserves the background texture, and the overall result resembles a real picture. Quantitative metrics were also compared against the Pix2Pix method; on both synthetic and real data, our method outperforms the traditional SRnet method.
The present disclosure integrates the advantages of incremental learning of the target text style and uses conditional generative adversarial network learning to learn, across multiple dimensions, the geometric features and other attribute features of arbitrarily shaped text, while avoiding the loss of text structure features that other methods suffer when generating target text. Applied to image generation, the method runs quickly and can satisfy some real-time application scenarios. Meanwhile, the present disclosure edits text images at the text line level and the paragraph level, and our method achieves excellent results in both subjective visual realism and objective quantitative scores on the ICDAR test set. In addition, the conditional generative adversarial network can also eliminate the problem that the background cannot stay consistent with the original image when text is edited across languages.
With further reference to Fig. 5, as an implementation of the method shown in the figures above, the present disclosure provides one embodiment of an apparatus for outputting information. This apparatus embodiment corresponds to the method embodiment shown in Fig. 2, and the apparatus can be applied in various electronic devices.
As shown in Fig. 5, the apparatus 500 for outputting information of this embodiment includes: an acquiring unit 501, a learning unit 502, a first generation unit 503, a second generation unit 504 and a fusion unit 505. The acquiring unit 501 is configured to obtain a style picture that contains text to be replaced and a text picture that contains target text; the learning unit 502 is configured to learn, from the style picture, text style attributes of the text to be replaced; the first generation unit 503 is configured to generate a foreground picture and a mask picture based on the text style attributes and the text picture; the second generation unit 504 is configured to erase the text to be replaced from the style picture to generate a background picture; and the fusion unit 505 is configured to fuse the foreground picture, the mask picture and the background picture to generate and output a target picture.
In this embodiment, for the specific processing of the acquiring unit 501, the learning unit 502, the first generation unit 503, the second generation unit 504 and the fusion unit 505 of the apparatus 500 for outputting information, refer to steps 201 to 205 in the embodiment corresponding to Fig. 2.
In some optional implementations of this embodiment, the learning unit 502 is further configured to: learn tilt and perspective transformation information of the text to be replaced from the style picture through a spatial transformer network.
In some optional implementations of this embodiment, the learning unit 502 is further configured to: learn tilt and perspective transformation information of the text to be replaced from the style picture through quadrilateral detection combined with a thin plate spline transformation.
In some optional implementations of this embodiment, the learning unit 502 is further configured to: learn stroke thickness and font information of the text to be replaced from the style picture through a stroke filter.
In some optional implementations of this embodiment, the first generation unit 503 is further configured to: generate the mask picture based on the tilt and perspective transformation information and the stroke thickness and font information of the text to be replaced.
In some optional implementations of this embodiment, the learning unit 502 is further configured to: learn color information of the text to be replaced from the style picture.
In some optional implementations of this embodiment, the acquiring unit 501 is further configured to: obtain the target text; and generate a text picture of a predetermined format based on the target text.
In some optional implementations of this embodiment, the foreground picture, the mask picture and the background picture are generated by a conditional generative adversarial network.
Below with reference to Fig. 6, it illustrates the electronic equipment that is suitable for being used to realize embodiment of the disclosure, (example is as shown in figure 1 Server) 600 structural schematic diagram.Server shown in Fig. 6 is only an example, should not be to the function of embodiment of the disclosure Any restrictions can be brought with use scope.
As shown in fig. 6, electronic equipment 600 may include processing unit (such as central processing unit, graphics processor etc.) 601, random access can be loaded into according to the program being stored in read-only memory (ROM) 602 or from storage device 608 Program in memory (RAM) 603 and execute various movements appropriate and processing.In RAM 603, it is also stored with electronic equipment Various programs and data needed for 600 operations.Processing unit 601, ROM 602 and RAM 603 pass through the phase each other of bus 604 Even.Input/output (I/O) interface 605 is also connected to bus 604.
In general, following device can connect to I/O interface 605: including such as touch screen, touch tablet, keyboard, mouse, taking the photograph As the input unit 606 of head, microphone, accelerometer, gyroscope etc.;Including such as liquid crystal display (LCD), loudspeaker, vibration The output device 607 of dynamic device etc.;Storage device 608 including such as tape, hard disk etc.;And communication device 609.Communication device 609, which can permit electronic equipment 600, is wirelessly or non-wirelessly communicated with other equipment to exchange data.Although Fig. 6 shows tool There is the electronic equipment 600 of various devices, it should be understood that being not required for implementing or having all devices shown.It can be with Alternatively implement or have more or fewer devices.Each box shown in Fig. 6 can represent a device, can also root According to needing to represent multiple devices.
In particular, according to embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication device 609, installed from the storage device 608, or installed from the ROM 602. When the computer program is executed by the processing unit 601, the above-described functions defined in the method of the embodiments of the present disclosure are performed. It should be noted that the computer-readable medium described in the embodiments of the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the embodiments of the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program, where the program may be used by or in connection with an instruction execution system, apparatus, or device. In the embodiments of the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and may send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted using any suitable medium, including but not limited to: an electric wire, an optical cable, RF (radio frequency), or any suitable combination of the above.
The above computer-readable medium may be included in the above electronic device, or it may exist separately without being assembled into the electronic device. The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: obtain a style picture including text to be replaced and a text picture including target text; learn a text style attribute of the text to be replaced from the style picture; generate a foreground picture and a mask picture based on the text style attribute and the text picture; erase the text to be replaced from the style picture to generate a background picture; and fuse the foreground picture, the mask picture, and the background picture to generate and output a target picture.
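The final fusion step can be illustrated with a minimal sketch. The blending rule is not spelled out in this passage, so the per-pixel alpha blend below (mask * foreground + (1 - mask) * background) is an assumption chosen for illustration, as are the function name `fuse` and the representation of a picture as nested lists of grayscale values in [0, 1].

```python
from typing import List

# Assumption: a picture is a grayscale image stored as rows of floats in [0, 1].
Image = List[List[float]]


def fuse(foreground: Image, mask: Image, background: Image) -> Image:
    """Blend three same-sized pictures pixel by pixel.

    Where the mask is 1 the foreground (new text) is kept, where it is 0
    the background (style picture with the old text erased) is kept.
    This blending rule is an illustrative assumption, not the patent's.
    """
    if not (len(foreground) == len(mask) == len(background)):
        raise ValueError("pictures must have the same height")
    fused: Image = []
    for fg_row, mask_row, bg_row in zip(foreground, mask, background):
        if not (len(fg_row) == len(mask_row) == len(bg_row)):
            raise ValueError("pictures must have the same width")
        fused.append(
            [m * f + (1.0 - m) * b for f, m, b in zip(fg_row, mask_row, bg_row)]
        )
    return fused
```

A soft mask (values strictly between 0 and 1) would mix the two pictures at stroke edges, which is one reason a mask picture might be generated alongside the foreground picture.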
Computer program code for performing the operations of the embodiments of the present disclosure may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the drawings illustrate the possible architecture, functions, and operations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
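As a hedged illustration of two blocks running substantially in parallel: in the method described here, generating the foreground and mask pictures takes the text style attribute and the text picture as input, while erasing the text takes only the style picture, so the two steps do not consume each other's output. The sketch below runs them concurrently with Python's standard `concurrent.futures`. The functions `generate_foreground_and_mask` and `erase_text` are hypothetical stand-ins that return placeholder strings; they are not the patent's networks.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_foreground_and_mask(style_attribute, text_picture):
    # Hypothetical stand-in for the foreground/mask generation step.
    return ("foreground", "mask")


def erase_text(style_picture):
    # Hypothetical stand-in for the background generation step.
    return "background"


def run_independent_steps(style_attribute, text_picture, style_picture):
    """Submit both steps to a thread pool and wait for both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        fg_future = pool.submit(
            generate_foreground_and_mask, style_attribute, text_picture
        )
        bg_future = pool.submit(erase_text, style_picture)
        foreground, mask = fg_future.result()
        background = bg_future.result()
    return foreground, mask, background
```

Whether concurrency pays off in practice depends on where the steps run (for example, on one GPU or several); the sketch only shows that the ordering in a flowchart need not be the execution order.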
The units described in the embodiments of the present disclosure may be implemented in software or in hardware. The described units may also be provided in a processor; for example, a processor may be described as including an acquiring unit, a learning unit, a first generation unit, a second generation unit, and a fusion unit. The names of these units do not, under certain circumstances, limit the units themselves; for example, the acquiring unit may also be described as "a unit that obtains a style picture including text to be replaced and a text picture including target text".
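One way to picture units provided in a processor when implemented in software is as pluggable callables wired together in the order of the method steps. The sketch below is an assumption made for illustration: the class name `OutputInformationProcessor`, the field names (including `learning_unit` for the unit that learns the text style attribute), and the `run` method are hypothetical and do not come from the source.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass
class OutputInformationProcessor:
    # Each field is one unit; any callable with a matching shape can be plugged in.
    acquiring_unit: Callable[[], Tuple[Any, Any]]
    learning_unit: Callable[[Any], Any]
    first_generation_unit: Callable[[Any, Any], Tuple[Any, Any]]
    second_generation_unit: Callable[[Any], Any]
    fusion_unit: Callable[[Any, Any, Any], Any]

    def run(self) -> Any:
        """Call the units in the order of the method steps."""
        style_picture, text_picture = self.acquiring_unit()
        style_attribute = self.learning_unit(style_picture)
        foreground, mask = self.first_generation_unit(style_attribute, text_picture)
        background = self.second_generation_unit(style_picture)
        return self.fusion_unit(foreground, mask, background)
```

For example, with trivial placeholder units:

```python
processor = OutputInformationProcessor(
    acquiring_unit=lambda: ("style", "text"),
    learning_unit=lambda style: {"learned_from": style},
    first_generation_unit=lambda attr, text: (f"fg({text})", "mask"),
    second_generation_unit=lambda style: f"bg({style})",
    fusion_unit=lambda fg, mask, bg: (fg, mask, bg),
)
result = processor.run()
```

Because each unit is identified by its role and not by its name, a unit can be renamed or swapped without changing the overall flow.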
The above description is merely a preferred embodiment of the present disclosure and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the present disclosure is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the inventive concept, for example, technical solutions formed by replacing the above features with (but not limited to) technical features having similar functions disclosed in the present disclosure.

Claims (18)

1. A method for outputting information, comprising:
obtaining a style picture including text to be replaced and a text picture including target text;
learning a text style attribute of the text to be replaced from the style picture;
generating a foreground picture and a mask picture based on the text style attribute and the text picture;
erasing the text to be replaced from the style picture to generate a background picture; and
fusing the foreground picture, the mask picture, and the background picture to generate and output a target picture.
2. The method according to claim 1, wherein the learning a text style attribute of the text to be replaced from the style picture comprises:
learning tilt and perspective transformation information of the text to be replaced from the style picture through a spatial transformer network.
3. The method according to claim 1, wherein the learning a text style attribute of the text to be replaced from the style picture comprises:
learning tilt and perspective transformation information of the text to be replaced from the style picture through quadrilateral detection combined with a thin plate spline transformation.
4. The method according to claim 3, wherein the learning a text style attribute of the text to be replaced from the style picture comprises:
learning stroke thickness and font information of the text to be replaced from the style picture through a stroke filter.
5. The method according to claim 4, wherein generating a mask picture based on the text style attribute comprises:
generating the mask picture based on the tilt and perspective transformation information and the stroke thickness and font information of the text to be replaced.
6. The method according to claim 1, wherein the learning a text style attribute of the text to be replaced from the style picture comprises:
learning color information of the text to be replaced from the style picture.
7. The method according to claim 1, wherein obtaining a text picture including target text comprises:
obtaining the target text;
generating a text picture in a predetermined format based on the target text.
8. The method according to any one of claims 1-7, wherein the foreground picture, the mask picture, and the background picture are generated by a conditional generative adversarial network.
9. An apparatus for outputting information, comprising:
an acquiring unit, configured to obtain a style picture including text to be replaced and a text picture including target text;
a learning unit, configured to learn a text style attribute of the text to be replaced from the style picture;
a first generation unit, configured to generate a foreground picture and a mask picture based on the text style attribute and the text picture;
a second generation unit, configured to erase the text to be replaced from the style picture to generate a background picture; and
a fusion unit, configured to fuse the foreground picture, the mask picture, and the background picture to generate and output a target picture.
10. The apparatus according to claim 9, wherein the learning unit is further configured to:
learn tilt and perspective transformation information of the text to be replaced from the style picture through a spatial transformer network.
11. The apparatus according to claim 9, wherein the learning unit is further configured to:
learn tilt and perspective transformation information of the text to be replaced from the style picture through quadrilateral detection combined with a thin plate spline transformation.
12. The apparatus according to claim 11, wherein the learning unit is further configured to:
learn stroke thickness and font information of the text to be replaced from the style picture through a stroke filter.
13. The apparatus according to claim 12, wherein the first generation unit is further configured to:
generate the mask picture based on the tilt and perspective transformation information and the stroke thickness and font information of the text to be replaced.
14. The apparatus according to claim 9, wherein the learning unit is further configured to:
learn color information of the text to be replaced from the style picture.
15. The apparatus according to claim 9, wherein the acquiring unit is further configured to:
obtain the target text;
generate a text picture in a predetermined format based on the target text.
16. The apparatus according to any one of claims 9-15, wherein the foreground picture, the mask picture, and the background picture are generated by a conditional generative adversarial network.
17. An electronic device for outputting information, comprising:
one or more processors; and
a storage device storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-8.
18. A computer-readable medium storing a computer program, wherein the program, when executed by a processor, implements the method according to any one of claims 1-8.
CN201910757082.7A 2019-08-16 2019-08-16 Method and device for outputting information Active CN110458918B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910757082.7A CN110458918B (en) 2019-08-16 2019-08-16 Method and device for outputting information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910757082.7A CN110458918B (en) 2019-08-16 2019-08-16 Method and device for outputting information

Publications (2)

Publication Number Publication Date
CN110458918A true CN110458918A (en) 2019-11-15
CN110458918B CN110458918B (en) 2023-05-09

Family

ID=68487137

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910757082.7A Active CN110458918B (en) 2019-08-16 2019-08-16 Method and device for outputting information

Country Status (1)

Country Link
CN (1) CN110458918B (en)

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111160142A (en) * 2019-12-14 2020-05-15 上海交通大学 Certificate bill positioning detection method based on numerical prediction regression model
CN111402367A (en) * 2020-03-27 2020-07-10 维沃移动通信有限公司 Image processing method and electronic equipment
CN111539184A (en) * 2020-04-29 2020-08-14 上海眼控科技股份有限公司 Text data manufacturing method and device based on deep learning, terminal and storage medium
CN111986076A (en) * 2020-08-21 2020-11-24 深圳市慧鲤科技有限公司 Image processing method and device, interactive display device and electronic equipment
CN112183296A (en) * 2020-09-23 2021-01-05 北京文思海辉金信软件有限公司 Simulated bill image generation and bill image recognition method and device
CN112801923A (en) * 2021-04-14 2021-05-14 江西博微新技术有限公司 Word processing method, system, readable storage medium and computer equipment
CN112825141A (en) * 2019-11-21 2021-05-21 上海高德威智能交通系统有限公司 Method and device for recognizing text, recognition equipment and storage medium
CN112839185A (en) * 2020-11-27 2021-05-25 北京百度网讯科技有限公司 Method, apparatus, device and medium for processing image
CN112911373A (en) * 2021-01-31 2021-06-04 云知声智能科技股份有限公司 Method, device and equipment for generating video subtitles and storage medium
CN113158977A (en) * 2021-05-12 2021-07-23 河南师范大学 Image character editing method for improving FANnet generation network
CN113223117A (en) * 2021-05-12 2021-08-06 北京世纪好未来教育科技有限公司 Image processing method and related device
CN113256490A (en) * 2020-02-13 2021-08-13 北京小米松果电子有限公司 Document image processing method, device and medium
CN113313786A (en) * 2020-02-27 2021-08-27 深圳云天励飞技术有限公司 Portrait picture coloring method and device and terminal equipment
CN113505772A (en) * 2021-06-23 2021-10-15 北京华创智芯科技有限公司 License plate image generation method and system based on generation countermeasure network
CN113553797A (en) * 2021-08-04 2021-10-26 哈尔滨工业大学 End-to-end consistent-style character and image generation method
CN114863256A (en) * 2022-05-12 2022-08-05 平安国际智慧城市科技股份有限公司 Picture skin changing method, device, equipment and storage medium
CN115310405A (en) * 2022-07-21 2022-11-08 北京汉仪创新科技股份有限公司 Font replacement method, system, device and medium based on countermeasure generation network
CN117710234A (en) * 2024-02-06 2024-03-15 青岛海尔科技有限公司 Picture generation method, device, equipment and medium based on large model

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013104183A1 (en) * 2012-01-09 2013-07-18 华为技术有限公司 Method for displaying user state, display terminal and server
CN103959282A (en) * 2011-09-28 2014-07-30 谷歌公司 Selective feedback for text recognition systems
CN104063380A (en) * 2013-03-18 2014-09-24 北京百度网讯科技有限公司 Method and device for converting picture files into webpage files
CN106664417A (en) * 2014-05-15 2017-05-10 英特尔公司 Content adaptive background-foreground segmentation for video coding
CN108108215A (en) * 2017-12-19 2018-06-01 北京百度网讯科技有限公司 Skin generation method, device, terminal and computer readable storage medium
CN108174237A (en) * 2017-12-28 2018-06-15 北京奇虎科技有限公司 Image combining method and device
CN109190722A (en) * 2018-08-06 2019-01-11 大连民族大学 Font style based on language of the Manchus character picture migrates transform method
CN109582926A (en) * 2018-11-26 2019-04-05 北京邮电大学 A kind of digital printing method of the anti printing and scanning attack based on fusion font
CN109635883A (en) * 2018-11-19 2019-04-16 北京大学 The Chinese word library generation method of the structural information guidance of network is stacked based on depth
CN109816589A (en) * 2019-01-30 2019-05-28 北京字节跳动网络技术有限公司 Method and apparatus for generating cartoon style transformation model
CN110008832A (en) * 2019-02-27 2019-07-12 西安电子科技大学 Based on deep learning character image automatic division method, information data processing terminal

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JULIET,SE 等: "Effective layer-based segmentation of compound images using morphology", 《JOURNAL OF REAL-TIME IMAGE PROCESSING》 *
张国和等: "最大稳定极值区域与笔画宽度变换的自然场景文本提取方法", 《西安交通大学学报》 *

Cited By (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112825141B (en) * 2019-11-21 2023-02-17 上海高德威智能交通系统有限公司 Method and device for recognizing text, recognition equipment and storage medium
CN112825141A (en) * 2019-11-21 2021-05-21 上海高德威智能交通系统有限公司 Method and device for recognizing text, recognition equipment and storage medium
US11928872B2 (en) 2019-11-21 2024-03-12 Shanghai Goldway Intelligent Transportation System Co., Ltd. Methods and apparatuses for recognizing text, recognition devices and storage media
CN111160142A (en) * 2019-12-14 2020-05-15 上海交通大学 Certificate bill positioning detection method based on numerical prediction regression model
CN111160142B (en) * 2019-12-14 2023-07-11 上海交通大学 Certificate bill positioning detection method based on numerical prediction regression model
CN113256490A (en) * 2020-02-13 2021-08-13 北京小米松果电子有限公司 Document image processing method, device and medium
CN113313786A (en) * 2020-02-27 2021-08-27 深圳云天励飞技术有限公司 Portrait picture coloring method and device and terminal equipment
CN113313786B (en) * 2020-02-27 2024-06-11 深圳云天励飞技术有限公司 Portrait picture coloring method and device and terminal equipment
CN111402367B (en) * 2020-03-27 2023-09-26 维沃移动通信有限公司 Image processing method and electronic equipment
CN111402367A (en) * 2020-03-27 2020-07-10 维沃移动通信有限公司 Image processing method and electronic equipment
WO2021190429A1 (en) * 2020-03-27 2021-09-30 维沃移动通信有限公司 Image processing method and electronic device
CN111539184A (en) * 2020-04-29 2020-08-14 上海眼控科技股份有限公司 Text data manufacturing method and device based on deep learning, terminal and storage medium
CN111986076A (en) * 2020-08-21 2020-11-24 深圳市慧鲤科技有限公司 Image processing method and device, interactive display device and electronic equipment
WO2022037111A1 (en) * 2020-08-21 2022-02-24 深圳市慧鲤科技有限公司 Image processing method and apparatus, interactive display apparatus, and electronic device
CN112183296A (en) * 2020-09-23 2021-01-05 北京文思海辉金信软件有限公司 Simulated bill image generation and bill image recognition method and device
CN112183296B (en) * 2020-09-23 2022-03-04 中电金信软件有限公司 Simulated bill image generation and bill image recognition method and device
JP2021182441A (en) * 2020-11-27 2021-11-25 北京百度網訊科技有限公司 Method for processing image, device, apparatus, medium, and program
JP7320570B2 (en) 2020-11-27 2023-08-03 ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド Method, apparatus, apparatus, medium and program for processing images
CN112839185A (en) * 2020-11-27 2021-05-25 北京百度网讯科技有限公司 Method, apparatus, device and medium for processing image
CN112911373A (en) * 2021-01-31 2021-06-04 云知声智能科技股份有限公司 Method, device and equipment for generating video subtitles and storage medium
CN112911373B (en) * 2021-01-31 2023-05-26 云知声智能科技股份有限公司 Video subtitle generating method, device, equipment and storage medium
CN112801923A (en) * 2021-04-14 2021-05-14 江西博微新技术有限公司 Word processing method, system, readable storage medium and computer equipment
CN113223117A (en) * 2021-05-12 2021-08-06 北京世纪好未来教育科技有限公司 Image processing method and related device
CN113158977A (en) * 2021-05-12 2021-07-23 河南师范大学 Image character editing method for improving FANnet generation network
CN113158977B (en) * 2021-05-12 2022-07-29 河南师范大学 Image character editing method for improving FANnet generation network
CN113505772A (en) * 2021-06-23 2021-10-15 北京华创智芯科技有限公司 License plate image generation method and system based on generation countermeasure network
CN113505772B (en) * 2021-06-23 2024-05-10 北京华创智芯科技有限公司 License plate image generation method and system based on generation countermeasure network
CN113553797A (en) * 2021-08-04 2021-10-26 哈尔滨工业大学 End-to-end consistent-style character and image generation method
CN114863256A (en) * 2022-05-12 2022-08-05 平安国际智慧城市科技股份有限公司 Picture skin changing method, device, equipment and storage medium
CN115310405A (en) * 2022-07-21 2022-11-08 北京汉仪创新科技股份有限公司 Font replacement method, system, device and medium based on countermeasure generation network
CN117710234A (en) * 2024-02-06 2024-03-15 青岛海尔科技有限公司 Picture generation method, device, equipment and medium based on large model
CN117710234B (en) * 2024-02-06 2024-05-24 青岛海尔科技有限公司 Picture generation method, device, equipment and medium based on large model

Also Published As

Publication number Publication date
CN110458918B (en) 2023-05-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant