CN111859893B - Image-text typesetting method, device, equipment and medium - Google Patents

Image-text typesetting method, device, equipment and medium

Info

Publication number
CN111859893B
CN111859893B (application CN202010750241.3A)
Authority
CN
China
Prior art keywords
image
character
text
characters
size
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010750241.3A
Other languages
Chinese (zh)
Other versions
CN111859893A (en)
Inventor
姚志强
周曦
吴媛
杨开
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Yunconghonghuang Intelligent Technology Co Ltd
Original Assignee
Guangzhou Yunconghonghuang Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Yunconghonghuang Intelligent Technology Co Ltd filed Critical Guangzhou Yunconghonghuang Intelligent Technology Co Ltd
Priority to CN202010750241.3A priority Critical patent/CN111859893B/en
Publication of CN111859893A publication Critical patent/CN111859893A/en
Application granted granted Critical
Publication of CN111859893B publication Critical patent/CN111859893B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/189Automatic justification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40Document-oriented image-based pattern recognition
    • G06V30/41Analysis of document content
    • G06V30/413Classification of content, e.g. text, photographs or tables

Abstract

The invention provides an image-text typesetting method, device, equipment and medium, comprising the following steps: acquiring an image and determining the character position of each character in the image; semantically recognizing the image to obtain at least one label of the image subject; calculating the correlation between each character and each label to obtain the correlation value of each character; and traversing all characters according to the correlation values of the characters and the labels to obtain the size sequence of all characters, so as to adjust the character sizes in the image. The labels of the image subject are identified semantically, the character size sequence is arranged using the correlation between each character and the labels, and the image character sizes are adjusted according to that sequence; the method reduces the workload of editors, enables rapid image-text setting, avoids manual adjustment, saves labor, improves typesetting speed, and realizes rapid and accurate image-text typesetting.

Description

Image-text typesetting method, device, equipment and medium
Technical Field
The invention relates to the technical field of artificial intelligence, and in particular to an image-text typesetting method, device, equipment and medium.
Background
Image-text typesetting is used to produce poster advertisements, magazine covers, PPTs, website pages and the like. Traditional image-text typesetting is finished manually by editors so as to meet the requirements of personalization and customization.
However, a planar layout contains both images and characters, and because image content varies, so do the appropriate character positions. Compared with manually setting the character sizes in an image, a method that intelligently adjusts character sizes according to the position, proportion and aesthetic value of the image is an urgent need in the art.
Disclosure of Invention
In view of the above-mentioned shortcomings of the prior art, an object of the present invention is to provide an image-text typesetting method, device, equipment and medium, which solve the problem that the prior art cannot intelligently adjust the size of the characters in an image during image-text typesetting.
In order to achieve the above and other related objects, the present invention provides an image-text typesetting method, comprising the following steps:
acquiring an image and determining the character position of each character in the image;
semantically recognizing the image to obtain at least one label of the image subject;
calculating the correlation between each character and each label to obtain the correlation value of each character;
traversing all the characters according to the correlation values of the characters and the labels to obtain the size sequence of all the characters so as to adjust the size of the characters of the image.
The invention also provides a picture and text typesetting device, comprising:
the acquisition module is used for acquiring an image and determining the character position of each character in the image;
the semantic recognition module is used for semantically recognizing the image to obtain at least one label of the image subject;
the correlation calculation module is used for calculating the correlation between each character and each label to obtain the correlation value of each character;
and the first character adjusting module is used for traversing all characters according to the correlation values of the characters and the labels to obtain the size sequence of all characters so as to adjust the size of the image characters.
The present invention also provides an apparatus comprising:
one or more processors; and
one or more machine-readable media having instructions stored thereon that, when executed by the one or more processors, cause the apparatus to perform a method as described in one or more of the above.
The present invention also provides one or more machine-readable media having instructions stored thereon, which when executed by one or more processors, cause an apparatus to perform the methods as described in one or more of the above.
As described above, the image-text typesetting method, device, equipment and medium provided by the invention have the following beneficial effects:
the labels of the image subject are identified semantically, the character size sequence is arranged using the correlation between each character and the labels, and the image character sizes are adjusted according to that sequence; the method reduces the workload of editors, enables rapid image-text setting, avoids manual adjustment, saves labor, improves typesetting speed, and realizes rapid and accurate image-text typesetting.
Drawings
Fig. 1 is a schematic flowchart of an image-text typesetting method according to an embodiment;
fig. 2 is a schematic flowchart of an image-text typesetting method according to another embodiment;
fig. 3 is a schematic flowchart of an image-text typesetting method according to another embodiment;
fig. 4 is a schematic diagram of a hardware structure of the image-text composition device according to an embodiment;
fig. 5 is a schematic hardware structure diagram of a terminal device according to an embodiment;
fig. 6 is a schematic diagram of a hardware structure of a terminal device according to another embodiment.
Description of the element reference numerals
M10 conversion module
M20 management module
1100 input device
1101 first processor
1102 output device
1103 first memory
1104 communication bus
1200 processing assembly
1201 second processor
1202 second memory
1203 communication assembly
1204 power supply assembly
1205 multimedia assembly
1206 voice assembly
1207 input/output interface
1208 sensor assembly
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict.
It should be noted that the drawings provided in the following embodiments only illustrate the basic idea of the present invention in a schematic way; they show only the components related to the invention, rather than the number, shape and size of the components in an actual implementation. In practice, the type, quantity and proportion of each component may vary freely, and the layout of the components may be more complicated.
In the related art, conventional image-text typesetting cannot satisfactorily meet users' need to adjust character sizes according to the position of the characters in the image, the image proportion and the aesthetic value of the image; editors must manually adjust the image-text layout (particularly the setting of character sizes in the image) for poster advertisements, magazine covers, PPTs, website pages and the like.
Based on the problems existing in the scheme, the invention discloses and provides a picture-text typesetting method, a picture-text typesetting device, electronic equipment and a storage medium.
Semantic recognition: refers to a computer simulating human understanding of the meaning of an image. For example, a computer cannot by itself understand an image scene; after semantic recognition, scene information such as "playground" or "grassland" can be identified. The scene or object that the image most likely expresses is called the image subject information.
Region growing: the process of expanding, according to given rules, from a certain seed pixel into a larger image region.
Correlation: the degree of association between two variables.
Referring to fig. 1, the present invention provides an image-text typesetting method, comprising the following steps:
s1, acquiring an image, and determining the character position of each character in the image;
each character can be each character in the image, or a paragraph sentence can be designated in the image, and each character is selected for explanation.
In addition, by the above method, not only the character position of each character in the image can be determined, but also the number (quantity) of characters in the image can be counted.
S2, semantically recognizing the image to obtain at least one label of the image subject;
The subject in the image may carry one or more labels. For example, if the image is a landscape image, semantically recognizing it may yield the labels of three subjects: beach, sea and people.
S3, calculating the correlation between each character and each label to obtain the correlation value of each character;
and expressing the correlation value of each character and each label in a numerical quantification mode, and expressing the correlation degree according to the numerical value.
S4, traversing all the characters according to the correlation values of the characters and the labels to obtain the size sequence of all the characters, so as to adjust the character sizes in the image.
In this embodiment, the labels of the image subject are identified semantically, the character size sequence is arranged using the correlation between each character and the labels, and the image character sizes are adjusted according to that sequence; the method reduces the workload of editors, enables rapid image-text setting, avoids manual adjustment, saves labor, improves typesetting speed, and realizes rapid and accurate image-text typesetting.
In an exemplary embodiment, an image semantic recognition module is trained on a sample set via a neural network, and the trained image semantic recognition module is used to recognize the image and obtain the label of the subject in the image.
Since the image may be an image of any scene, such as an indoor scene (bedroom, living room, etc.) or an outdoor scene (forest, street, etc.), the specific implementation of this embodiment is not limited, and there are many possible semantic levels for describing the scene. For example, for an indoor scene, the semantic hierarchy may include room layout, objects, scene properties (e.g., lighting conditions and openness of the scene), and the like. Where the spatial layout may determine the spatial structure and the objects appearing in the image may determine the scene type, e.g. if we move the tv and sofa out and move the bed and lights inside, the living room may become a bedroom. At the same time, various attributes that may be related to materials, surface properties, lighting, etc., are more refined scene description elements. Therefore, the image semantic recognition module can also be a neural network model trained to synthesize a real scene, and the deep neural representation of the image semantic recognition module can encode the similar multi-level semantics through learning.
For example, the semantic hierarchies may include, but are not limited to, one or more of the following: scene spatial layout, object categories, scene attributes. The prediction semantics corresponding to each semantic level may include one or more, which is not limited in this embodiment, for example, the prediction semantics corresponding to the scene spatial layout level may include description of an indoor spatial structure, such as whether the indoor spatial structure exists, and a position of a structural layout line (layout line) is determined; the prediction semantics corresponding to the object category hierarchy may include any object name, such as a sofa, a table lamp, a cloud, a tree, a cup, a bridge, and the like; the semantics corresponding to the scene attribute hierarchy may include descriptions of scenes in the image, such as bedrooms, living rooms, bright/dim lighting attributes, wood attributes of main materials in the scene, and so on.
In this way, the subject in the image can be accurately identified and its corresponding label obtained, and the font size in the image can then be adjusted according to the scene shown by the image subject.
In an exemplary embodiment, traversing all the characters according to the correlation between the characters and the labels to obtain the character size sequence and adjust the image character sizes proceeds as follows:
the correlation values of any two characters are compared; all characters are traversed; the characters are arranged in order according to the comparison results; and the image character sizes are adjusted according to the resulting character size sequence.
In this embodiment, two characters di and dj are selected arbitrarily and their correlations with the image subject labels are compared. Let the correlation values corresponding to di and dj be Ii and Ij. If Ii is greater than Ij, the order Si of di in the character size sequence is greater than the order Sj of dj; if Ii is smaller than Ij, Si is smaller than Sj; and if Ii equals Ij, Si equals Sj. The characters arranged in size order are denoted seq(1, n).
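The pairwise comparison described above amounts to ranking the characters by correlation value; a minimal sketch, with hypothetical values Ii:

```python
def size_order(correlations):
    # seq(1, n): for each character di, its order Si in the character size
    # sequence. A larger correlation value Ii gives a larger order, and
    # equal values give equal orders, matching the Ii = Ij case above.
    distinct = sorted(set(correlations))
    order = {v: i + 1 for i, v in enumerate(distinct)}
    return [order[v] for v in correlations]

# Hypothetical correlation values for four characters.
I = [0.8, 0.3, 0.8, 0.5]
seq = size_order(I)
```

Characters with equal correlation values receive equal orders, so ties in the comparison never produce an arbitrary size difference.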
The characters in the image are sorted by size according to how closely they correlate with the image subject (that is, the larger a character, the closer its correlation with the subject label), yielding the character size sequence. Using this sequence to intelligently adjust the character sizes in the image avoids manual participation, improves the efficiency of character design, speeds up typesetting, and realizes rapid and accurate image-text typesetting.
In an exemplary embodiment, referring to fig. 2, the image-text typesetting method provided by the invention further comprises the following steps:
step S5, determining the character size range by using the corresponding area of each character in the image, and adjusting the character size of the image according to the character size range; or
Step S5, determining the text size range by using the corresponding area of each text in the image, and adjusting the text size of the image according to the text size range and the text size sequence.
In the first way, building on step S1 of the above embodiment, the image character sizes are adjusted within the character size range. This prevents the sizes from being adjusted solely according to the size sequence, which could make the adjustment unreasonable, with characters too large or too small; for example, characters that are too large may harm the saliency of the image.
Alternatively, building on steps S1 to S4 of the above embodiment, the image character sizes are adjusted using both the character size range and the character size sequence, again preventing unreasonable adjustments such as characters that are too large or too small. Combining the two aspects, that is, adjusting the sizes according to the character size sequence within the character size range, reasonably and optimally controls the character sizes in the image and avoids displayed character sizes whose correlation with the image subject is low, which would harm the saliency of the image.
In an exemplary embodiment, the specific step of obtaining the size range of the text in the image includes:
dividing the image by using a region growing method to obtain a plurality of mutually disjoint regions with certain common characteristics, and determining the nearest segmented region where each character is located according to the regions;
and determining the character size range of each character according to the nearest segmentation area.
By determining the size range of the image characters in the image, the phenomenon that the image characters are too large or too small in adjustment is effectively prevented, and the optimization of the subsequent character typesetting design is facilitated.
In the above embodiment, dividing the image by using a region growing method includes:
step a, randomly selecting background pixel points from the character positions of each character and putting the background pixel points into a growth set;
b, selecting each pixel point in the growth set one by one, and calculating histograms of all other pixels of the pixel point in the K field;
c, when detecting that the histogram measurement of a certain pixel point is smaller than a preset histogram feature difference threshold, classifying the pixel point and a randomly selected background pixel point into the same class, and using the pixel point for updating a growth set;
and d, repeating the steps b and c until all pixel points in the growth set are detected.
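Steps a to d can be sketched as follows. This is an illustrative, simplified implementation: it assumes a grayscale image (the invention computes RGB histograms), uses the L1 distance as the histogram measurement, and starts from a given seed pixel rather than a randomly selected one:

```python
from collections import deque

def local_histogram(img, x, y, k=1, bins=8, levels=256):
    # Normalised intensity histogram over the k-neighborhood of pixel (x, y).
    h, w = len(img), len(img[0])
    hist = [0.0] * bins
    count = 0
    for yy in range(max(0, y - k), min(h, y + k + 1)):
        for xx in range(max(0, x - k), min(w, x + k + 1)):
            hist[img[yy][xx] * bins // levels] += 1.0
            count += 1
    return [v / count for v in hist]

def grow_region(img, seed, k=1, threshold=0.3):
    # Grow a region from the seed: a neighbouring pixel joins when the L1
    # distance between its local histogram and the seed's local histogram
    # is below the preset threshold (step c).
    h, w = len(img), len(img[0])
    seed_hist = local_histogram(img, seed[0], seed[1], k)
    region = {seed}
    queue = deque([seed])
    while queue:  # step d: repeat until no new pixel is added
        x, y = queue.popleft()
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if 0 <= nx < w and 0 <= ny < h and (nx, ny) not in region:
                hist = local_histogram(img, nx, ny, k)  # step b
                if sum(abs(p - q) for p, q in zip(hist, seed_hist)) < threshold:
                    region.add((nx, ny))
                    queue.append((nx, ny))
    return region

# Toy 4x4 grayscale image: uniform background (40) with one bright outlier.
img = [[40, 40, 40, 40],
       [40, 40, 40, 40],
       [40, 40, 40, 40],
       [40, 40, 40, 255]]
region = grow_region(img, seed=(0, 0))
```

With a larger neighborhood k and per-channel RGB histograms, the same loop follows the described steps b and c more faithfully.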
In this embodiment, since the character size should be smaller than the extent of the region in which the character lies, dividing the image with a region growing method yields a character size limiting threshold, which facilitates subsequently adjusting the image character sizes within that threshold range.
It should be noted that the k-neighborhood of any point a on the image is defined as:
N_k(a) = { b : |a_x - b_x| <= k, |a_y - b_y| <= k } (1)
In formula (1), a and b are pixel points; a_x and a_y denote the abscissa and ordinate of point a, and b_x and b_y denote the abscissa and ordinate of point b. Let H(a) be the color histogram of point a, defined as the statistical histogram of the RGB components over the k-neighborhood of a. The histogram of a pixel reflects the statistical distribution characteristics and the basic tone of the local color at that point.
It should be further noted that, in the image segmentation region where each character is located, if the region size is Area, the size Si of character di should be smaller than Area. Specifically, according to the nearest segmented region of each character in the image and the aesthetic relationship between the image and the characters, the character size range of each character is determined by the following formula:
w1*Area(i) <= di <= w2*Area(i) (2)
In formula (2), w1 and w2 are preset thresholds, Area(i) is the size of the image region where the ith character is located, and di is the size of the ith character.
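Formula (2) translates directly into code. The threshold values below are illustrative assumptions, since the text only states that w1 and w2 are preset:

```python
def char_size_range(area, w1=0.05, w2=0.3):
    # Formula (2): w1*Area(i) <= di <= w2*Area(i).
    # w1 and w2 are preset thresholds; the defaults here are assumptions.
    if not 0 < w1 <= w2:
        raise ValueError("thresholds must satisfy 0 < w1 <= w2")
    return w1 * area, w2 * area

# Character lying in a segmented region of size 1000 (hypothetical units).
lo, hi = char_size_range(1000.0)
```

The returned pair bounds the character size di for the subsequent random-value adjustment.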
In this embodiment, the image is divided using a region growing method; connected regions sharing the same characteristics (such as characters) can generally be segmented out, providing good boundary information and segmentation results. Even without prior knowledge, the method can accurately determine the size of a text region within the image.
In an exemplary embodiment, referring to fig. 3, the image-text typesetting method provided by the invention further comprises the following steps:
Step S6, taking random values within the character size range according to the character size sequence, and adjusting the character sizes in the image according to the random values.
For example, the following formula is used to adjust the size of the text in the image:
[formula (3): character sizes drawn at random within the range given by formula (2), in the order given by seq(1, n)]
In formula (3), seq(1, n) denotes the character size sequence. Combined with formula (2), a value is taken at random within each character's size range according to that sequence. Adjusting the character sizes in the image by random sampling needs no manual participation, is simple and convenient, and realizes intelligent character size adjustment with random values.
In an exemplary embodiment, the correctness of the character positions in the image is checked based on the correlation between the characters and the labels. Specifically, when the correlation between a character in the image and the image subject label is detected to be below a preset threshold, the character and the image are deemed mismatched, and the mismatched character is replaced. For example, if the image is detected to be an indoor scene whose subject is semantically recognized as a television, while the low-correlation, mismatched text reads "electric lamp", then text matching the television is substituted at the original text position in the image. This effectively avoids layouts in which image and text are mismatched or irrelevant, and ensures layout accuracy.
In an exemplary embodiment, further comprising: and using the same wire frame, or using the same color, or shortening the distance between the image and the text to increase the relevance of the image-text typesetting.
In this embodiment, whether for text referring to a corresponding image or for the image itself, the correspondence between image and text may be established according to the identified orientation relationship and an indication arrow. For example, after the orientation relationship between picture and text is identified, the correspondence may be enhanced in ways other than adding an indication arrow, for example by format processing of the layout. Such format processing may include: using the same wire frame or the same color for the picture and the text, or shortening the spacing between them, so that the user can easily associate corresponding pictures and texts.
Referring to fig. 4, the image-text typesetting device provided by the present invention includes:
an acquisition module 1, used for acquiring an image and determining the character position of each character in the image;
a semantic recognition module 2, configured to semantically recognize the image to obtain at least one tag of the image main body;
a correlation calculation module 3, configured to calculate a correlation between each of the characters and each of the labels, so as to obtain a correlation value of each of the characters;
and the first character adjusting module 4 is used for traversing all characters according to the correlation values of the characters and the labels to obtain the size sequence of all characters so as to adjust the size of the image characters.
In an exemplary embodiment, the obtaining module is further configured to count each text to determine the number of texts in the image.
In an exemplary embodiment, the first text adjustment module further comprises:
comparing the relative values of any two characters, traversing all the characters, arranging the characters in the order according to the comparison result, and adjusting the image character size according to the character size order.
In an exemplary embodiment, the teletext typesetting apparatus further comprises:
the second character adjusting module 5 is configured to determine the size range of the characters by using the corresponding area of each character in the image, and adjust the size of the characters in the image according to the size range of the characters; or
And the second character adjusting module 5 is configured to determine the character size range by using a corresponding region of each character in the image, and adjust the character size of the image according to the character size range and the character size sequence.
In an exemplary embodiment, the second text adjustment module includes:
the image region growing unit divides the image by using a region growing method to obtain a plurality of mutually disjoint regions with certain common characteristics, and determines the nearest segmentation region where the characters are located according to the regions;
and the image area determining unit is used for determining the character size range of each character according to the nearest segmented area.
Wherein, it should be noted that the dividing the image by using the region growing method includes:
step a, randomly selecting background pixel points from the character positions of each character and putting the background pixel points into a growth set;
b, selecting each pixel point in the growth set one by one, and calculating histograms of all other pixels of the pixel point in the K field;
c, when detecting that the histogram measurement of a certain pixel point is smaller than a preset histogram feature difference threshold, classifying the pixel point and a randomly selected background pixel point into the same class, and using the pixel point for updating a growth set;
and d, repeating the steps b and c until all pixel points in the growth set are detected.
It should be further noted that the image area determining unit is further configured to determine, according to a nearest segmented area of each character in the image, and in combination with an aesthetic relationship between the image and the character, a character size range of each character by using the following formula:
w1*Area(i)≤di≤w2*Area(i)
In the formula, w1 and w2 are preset thresholds, Area(i) is the size of the image region where the ith character is located, and di is the size of the ith character.
In an exemplary embodiment, the teletext typesetting apparatus further comprises:
and the third character adjusting module is used for randomly taking values in the character size range according to the character size sequence and adjusting the sizes of the characters in the image according to the randomly taken values.
In an exemplary embodiment, the teletext typesetting apparatus further comprises:
and the position checking module is used for checking whether the position of the characters in the image is proper or not according to the correlation between the characters and the label.
In an exemplary embodiment, the teletext typesetting apparatus further comprises:
and the relevancy enhancing module is used for increasing the relevancy of the image-text typesetting by using the same wire frame, the same color or shortening the distance between the image and the text.
In this embodiment, the image-text typesetting device and the image-text typesetting method are in a one-to-one correspondence relationship, and please refer to the above embodiment for details of technical details, technical functions and technical effects, which are not described herein in detail.
In summary, the present invention provides an image-text typesetting device, which identifies the labels of the image subject semantically, arranges the character size sequence using the correlation between each character and the labels, and adjusts the image character sizes according to that sequence; this reduces the workload of editors, enables rapid image-text setting, avoids manual adjustment, saves labor, improves typesetting speed, and realizes rapid and accurate image-text typesetting.
An embodiment of the present application further provides an apparatus, which may include: one or more processors; and one or more machine readable media having instructions stored thereon that, when executed by the one or more processors, cause the apparatus to perform the method of fig. 1. In practical applications, the device may be used as a terminal device, and may also be used as a server, where examples of the terminal device may include: the mobile terminal includes a smart phone, a tablet computer, an electronic book reader, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop, a vehicle-mounted computer, a desktop computer, a set-top box, an intelligent television, a wearable device, and the like.
Embodiments of the present application further provide a non-transitory readable storage medium in which one or more modules (programs) are stored; when the one or more modules are applied to a device, the device may be caused to execute the instructions of the steps included in the method of Fig. 1 according to the embodiments of the present application.
Fig. 5 is a schematic diagram of a hardware structure of a terminal device according to an embodiment of the present application. As shown, the terminal device may include: an input device 1100, a first processor 1101, an output device 1102, a first memory 1103, and at least one communication bus 1104. The communication bus 1104 is used to implement communication connections between these elements. The first memory 1103 may include a high-speed RAM, and may also include non-volatile memory (NVM), such as at least one disk memory; the first memory 1103 may store various programs for performing various processing functions and implementing the method steps of this embodiment.
Alternatively, the first processor 1101 may be, for example, a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), a Field Programmable Gate Array (FPGA), a controller, a microcontroller, a microprocessor, or other electronic components, and the first processor 1101 is coupled to the input device 1100 and the output device 1102 through a wired or wireless connection.
Optionally, the input device 1100 may include a variety of input devices, such as at least one of a user-oriented user interface, a device-oriented device interface, a software programmable interface, a camera, and a sensor. Optionally, the device interface facing the device may be a wired interface for data transmission between devices, or may be a hardware plug-in interface (e.g., a USB interface, a serial port, etc.) for data transmission between devices; optionally, the user-facing user interface may be, for example, a user-facing control key, a voice input device for receiving voice input, and a touch sensing device (e.g., a touch screen with a touch sensing function, a touch pad, etc.) for receiving user touch input; optionally, the programmable interface of the software may be, for example, an entry for a user to edit or modify a program, such as an input pin interface or an input interface of a chip; the output devices 1102 may include output devices such as a display, audio, and the like.
In this embodiment, the processor of the terminal device includes the functions for executing the modules of the image-text typesetting apparatus described above; for specific functions and technical effects, refer to the above embodiments, which are not repeated here.
Fig. 6 is a schematic diagram of a hardware structure of a terminal device according to an embodiment of the present application, showing a specific implementation of the embodiment of Fig. 5. As shown, the terminal device of this embodiment may include a second processor 1201 and a second memory 1202.
The second processor 1201 executes the computer program code stored in the second memory 1202 to implement the method described in fig. 1 in the above embodiment.
The second memory 1202 is configured to store various types of data, such as messages, pictures, and videos, to support operations at the terminal device; examples of such data also include instructions for any application or method operating on the terminal device. The second memory 1202 may include a random access memory (RAM) and may also include non-volatile memory, such as at least one disk memory.
Optionally, a second processor 1201 is provided in the processing component 1200. The terminal device may further include: a communication component 1203, a power component 1204, a multimedia component 1205, a voice component 1206, input/output interfaces 1207, and/or a sensor component 1208. The specific components included in the terminal device are set according to actual requirements, which is not limited in this embodiment.
The processing component 1200 generally controls the overall operation of the terminal device. The processing component 1200 may include one or more second processors 1201 to execute instructions to perform all or part of the steps of the data processing method described above. Further, the processing component 1200 may include one or more modules that facilitate interaction between the processing component 1200 and other components. For example, the processing component 1200 may include a multimedia module to facilitate interaction between the multimedia component 1205 and the processing component 1200.
The power supply component 1204 provides power to the various components of the terminal device. The power components 1204 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the terminal device.
The multimedia component 1205 includes a display screen that provides an output interface between the terminal device and the user. In some embodiments, the display screen may include a liquid crystal display (LCD) and a touch panel (TP). If the display screen includes a touch panel, the display screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or swipe action, but also detect the duration and pressure associated with the touch or swipe operation.
The voice component 1206 is configured to output and/or input voice signals. For example, the voice component 1206 includes a Microphone (MIC) configured to receive external voice signals when the terminal device is in an operational mode, such as a voice recognition mode. The received speech signal may further be stored in the second memory 1202 or transmitted via the communication component 1203. In some embodiments, the speech component 1206 further comprises a speaker for outputting speech signals.
The input/output interface 1207 provides an interface between the processing component 1200 and peripheral interface modules, which may be click wheels, buttons, etc. These buttons may include, but are not limited to: a volume button, a start button, and a lock button.
The sensor component 1208 includes one or more sensors for providing various aspects of status assessment for the terminal device. For example, the sensor component 1208 may detect an open/closed state of the terminal device, relative positioning of the components, presence or absence of user contact with the terminal device. The sensor assembly 1208 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact, including detecting the distance between the user and the terminal device. In some embodiments, the sensor assembly 1208 may also include a camera or the like.
The communication component 1203 is configured to facilitate communications between the terminal device and other devices in a wired or wireless manner. The terminal device may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In one embodiment, the terminal device may include a SIM card slot therein for inserting a SIM card therein, so that the terminal device may log onto a GPRS network to establish communication with the server via the internet.
As can be seen from the above, the communication component 1203, the voice component 1206, the input/output interface 1207 and the sensor component 1208 referred to in the embodiment of fig. 6 can be implemented as the input device in the embodiment of fig. 5.
The foregoing embodiments are merely illustrative of the principles and utilities of the present invention and are not intended to limit the invention. Any person skilled in the art may modify or change the above embodiments without departing from the spirit and scope of the present invention. Accordingly, all equivalent modifications or changes made by those skilled in the art without departing from the spirit and technical ideas disclosed herein shall be covered by the claims of the present invention.

Claims (20)

1. A picture and text typesetting method is characterized by comprising the following steps:
acquiring an image and determining the character position of each character in the image;
semantically recognizing the image to obtain at least one label of the image main body;
calculating the correlation between each character and each label to obtain the correlation value of each character;
comparing the correlation values of any two characters, traversing all the characters, arranging the characters in a size order according to the comparison results, and adjusting the sizes of the characters in the image according to the character size order.
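For illustration only, the correlation computation and ordering of claim 1 might look like the sketch below. The cosine similarity over character counts is an assumed stand-in: the claim does not fix a particular relevance measure, and a real system would likely use a learned text-label embedding model.

```python
import math
from collections import Counter

def correlation(text, label):
    """Toy text-label correlation: cosine similarity of character-count
    vectors. A hypothetical stand-in for the unspecified relevance model.
    """
    a, b = Counter(text), Counter(label)
    dot = sum(a[ch] * b[ch] for ch in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_texts(texts, labels):
    # Each text's correlation value is its summed correlation over all
    # labels; comparing any two texts by this value yields the size order.
    value = {t: sum(correlation(t, lbl) for lbl in labels) for t in texts}
    return sorted(texts, key=value.get, reverse=True)
```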
2. The teletext typesetting method according to claim 1, further comprising: counting the characters to determine the number of characters in the image.
3. The teletext method according to claim 1, further comprising:
determining the size range of the characters by utilizing the corresponding area of each character in the image, and adjusting the size of the characters in the image according to the size range of the characters; or
And determining the character size range by utilizing the corresponding area of each character in the image, and adjusting the image character size according to the character size range and the character size sequence.
4. The teletext typesetting method according to claim 3, wherein the step of determining the text size range using the corresponding region of each text in the image comprises:
dividing the image by using a region growing method to obtain a plurality of mutually disjoint regions with certain common characteristics, and determining the nearest segmented region where each character is located according to the regions;
and determining the character size range of each character according to the nearest segmentation area.
5. The teletext method according to claim 4, wherein the step of dividing the image by region growing method comprises:
step a, randomly selecting background pixel points from the character positions of each character and putting the background pixel points into a growth set;
b, selecting each pixel point in the growth set one by one, and calculating the histogram over all pixels in the K-neighborhood of the selected pixel point;
c, when it is detected that the histogram distance of a certain pixel point is smaller than a preset histogram feature difference threshold, classifying the pixel point into the same class as the randomly selected background pixel point, and adding the pixel point to the growth set;
and d, repeating the steps b and c until all pixel points in the growth set are detected.
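Steps a-d above can be sketched as a breadth-first region growing driven by histogram distance. The neighborhood size `k`, the L1 distance between normalized histograms, and the threshold value are assumptions for illustration; the patent only requires some histogram feature difference measure.

```python
import numpy as np
from collections import deque

def region_grow(image, seeds, k=3, thresh=0.2):
    """Histogram-based region growing sketch over a 2-D grayscale array.

    seeds: list of (row, col) background seed points (step a).
    `k`, `thresh`, and the L1 histogram distance are illustrative choices.
    """
    h, w = image.shape

    def hist(r, c):
        # Normalized intensity histogram of the (2k+1)x(2k+1) neighborhood.
        patch = image[max(r - k, 0):r + k + 1, max(c - k, 0):c + k + 1]
        counts, _ = np.histogram(patch, bins=8, range=(0, 256))
        return counts / counts.sum()

    region = set(seeds)
    frontier = deque(seeds)        # the growth set still to be examined
    ref = hist(*seeds[0])          # histogram at a chosen background seed
    while frontier:                # step d: repeat until all points checked
        r, c = frontier.popleft()  # step b: take each point one by one
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and (nr, nc) not in region:
                # Step c: merge the neighbor when its histogram is close
                # enough to the reference background histogram.
                if np.abs(hist(nr, nc) - ref).sum() < thresh:
                    region.add((nr, nc))
                    frontier.append((nr, nc))
    return region
```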
6. The teletext typesetting method according to claim 4, wherein the text size range of each character is determined according to the nearest segmented region of each character in the image, in combination with the aesthetic relationship between the image and the text, by using the following formula:
w1*Area(i)≤di≤w2*Area(i)
in the formula, w1 and w2 are respectively preset thresholds, area (i) is the size of the image area where the ith character is located, and di is the size of the ith character.
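A minimal illustration of this constraint follows; the threshold values for w1 and w2 are hypothetical, as the patent does not specify them.

```python
def text_size_range(region_area, w1=0.001, w2=0.01):
    """Size range for the i-th character from its nearest segmented
    region's area: implements w1 * Area(i) <= d_i <= w2 * Area(i).
    The values of w1 and w2 here are illustrative, not from the patent.
    """
    return w1 * region_area, w2 * region_area

# e.g. for a 200 x 100 pixel region, Area(i) = 20000, so any chosen
# size d_i must satisfy lo <= d_i <= hi.
lo, hi = text_size_range(20_000)
```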
7. The teletext typesetting method according to claim 3, wherein values are randomized within the text size range according to the text size order, and the size of the text in the image is adjusted by the randomized values.
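Claim 7's random sizing within each character's range, while still honoring the size order, could be sketched as below. The clamping strategy used to enforce the order is an assumption; the claim does not specify how the random values are reconciled with the ordering.

```python
import random

def pick_sizes(ranges_by_rank):
    """Pick a random size inside each character's allowed range while
    keeping the relevance order: rank 0 (most relevant) never ends up
    smaller than rank 1, and so on. Order is enforced by a running floor,
    which is one possible strategy, not the patent's prescribed one.

    ranges_by_rank: list of (lo, hi) tuples, most relevant first.
    """
    sizes = []
    floor = 0.0
    for lo, hi in reversed(ranges_by_rank):   # least relevant first
        size = random.uniform(max(lo, floor), hi)
        size = max(size, floor)               # never below a lower rank
        sizes.append(size)
        floor = size
    sizes.reverse()                           # back to most-relevant-first
    return sizes
```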
8. The teletext method according to claim 1, further comprising: and checking whether the character position in the image is proper or not according to the correlation between the characters and the label.
9. The teletext method according to claim 1, further comprising: and using the same wire frame, or using the same color, or shortening the distance between the image and the text to increase the relevance of the image-text typesetting.
10. An image-text composition device, comprising:
the acquisition module is used for acquiring an image and determining the character position of each character in the image;
the semantic recognition module is used for semantically recognizing the image to obtain at least one label of the image main body;
the correlation calculation module is used for calculating the correlation between each character and each label to obtain the correlation value of each character;
the first character adjusting module is used for comparing the sizes of the relevant values of any two characters, traversing all the characters and arranging the sizes of the characters in sequence according to the comparison result, and adjusting the sizes of the image characters according to the size sequence of the characters.
11. The teletext typesetting apparatus according to claim 10, wherein the obtaining module is further configured to count each text to determine the number of texts in the image.
12. The teletext typesetting apparatus according to claim 10, further comprising:
the second character adjusting module is used for determining the character size range by utilizing the corresponding area of each character in the image and adjusting the character size of the image according to the character size range; or
And the second character adjusting module is used for determining the character size range by utilizing the corresponding area of each character in the image and adjusting the character size of the image according to the character size range and the character size sequence.
13. The teletext typesetting apparatus according to claim 12, wherein the second text adjustment module comprises:
the image region growing unit is used for dividing the image by a region growing method to obtain a plurality of mutually disjoint regions with certain common characteristics, and determining the nearest segmented region where each character is located according to the regions;
and the image area determining unit is used for determining the character size range of each character according to the nearest segmented area.
14. The teletext typesetting apparatus according to claim 13, wherein the dividing of the image using region growing method comprises:
step a, randomly selecting background pixel points from the character positions of each character and putting the background pixel points into a growth set;
b, selecting each pixel point in the growth set one by one, and calculating the histogram over all pixels in the K-neighborhood of the selected pixel point;
c, when it is detected that the histogram distance of a certain pixel point is smaller than a preset histogram feature difference threshold, classifying the pixel point into the same class as the randomly selected background pixel point, and adding the pixel point to the growth set;
and d, repeating the steps b and c until all pixel points in the growth set are detected.
15. The teletext typesetting apparatus according to claim 13, wherein the image area determining unit is further configured to determine the text size range of each character according to the nearest segmented region of each character in the image, in combination with the aesthetic relationship between the image and the text, by using the following formula:
w1*Area(i)≤di≤w2*Area(i)
in the formula, w1 and w2 are respectively preset thresholds, area (i) is the size of the image area where the ith character is located, and di is the size of the ith character.
16. The teletext typesetting apparatus according to claim 12, further comprising:
and the third character adjusting module is used for randomly taking values in the character size range according to the character size sequence and adjusting the sizes of the characters in the image according to the randomly taken values.
17. The teletext typesetting apparatus according to claim 10, further comprising: and the position checking module is used for checking whether the position of the characters in the image is proper or not according to the correlation between the characters and the label.
18. The teletext typesetting apparatus according to claim 10, further comprising: and the relevancy enhancing module is used for increasing the relevancy of the image-text typesetting by using the same wire frame, the same color or shortening the distance between the image and the text.
19. An image-text composition apparatus, comprising:
one or more processors; and
one or more machine readable media having instructions stored thereon that, when executed by the one or more processors, cause the apparatus to perform a teletext method according to one or more of claims 1-9.
20. A machine-readable medium having stored thereon instructions which, when executed by one or more processors, cause an apparatus to perform a teletext method according to one or more of claims 1-9.
CN202010750241.3A 2020-07-30 2020-07-30 Image-text typesetting method, device, equipment and medium Active CN111859893B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010750241.3A CN111859893B (en) 2020-07-30 2020-07-30 Image-text typesetting method, device, equipment and medium


Publications (2)

Publication Number Publication Date
CN111859893A CN111859893A (en) 2020-10-30
CN111859893B true CN111859893B (en) 2021-04-09

Family

ID=72945127

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010750241.3A Active CN111859893B (en) 2020-07-30 2020-07-30 Image-text typesetting method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN111859893B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113536169B (en) * 2021-06-28 2022-08-05 上海硬通网络科技有限公司 Method, device, equipment and storage medium for typesetting characters of webpage
CN114255302B (en) * 2022-03-01 2022-05-13 北京瞭望神州科技有限公司 Wisdom country soil data processing all-in-one

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101196876A (en) * 2007-12-29 2008-06-11 北京大学 Method and system for implementing pre-typesetting
CN110097010A (en) * 2019-05-06 2019-08-06 北京达佳互联信息技术有限公司 Picture and text detection method, device, server and storage medium
CN110188755A (en) * 2019-05-30 2019-08-30 北京百度网讯科技有限公司 A kind of method, apparatus and computer readable storage medium of image recognition
CN110706310A (en) * 2019-08-23 2020-01-17 华为技术有限公司 Image-text fusion method and device and electronic equipment
CN110705547A (en) * 2019-09-06 2020-01-17 中国平安财产保险股份有限公司 Method and device for recognizing characters in image and computer readable storage medium
CN111291572A (en) * 2020-01-20 2020-06-16 Oppo广东移动通信有限公司 Character typesetting method and device and computer readable storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9208515B2 (en) * 2011-03-08 2015-12-08 Affinnova, Inc. System and method for concept development
CN107025215A (en) * 2017-02-13 2017-08-08 阿里巴巴集团控股有限公司 A kind of picture and text composition method and device
TWI657343B (en) * 2017-09-01 2019-04-21 莊坤衛 System capable of adaptively adjusting embedded web page element and method thereof


Also Published As

Publication number Publication date
CN111859893A (en) 2020-10-30

Similar Documents

Publication Publication Date Title
US11601630B2 (en) Video processing method, electronic device, and non-transitory computer-readable medium
CN109618222B (en) A kind of splicing video generation method, device, terminal device and storage medium
CN112200062B (en) Target detection method and device based on neural network, machine readable medium and equipment
CN111339246B (en) Query statement template generation method, device, equipment and medium
CN108961157B (en) Picture processing method, picture processing device and terminal equipment
CN108961267B (en) Picture processing method, picture processing device and terminal equipment
CN111859893B (en) Image-text typesetting method, device, equipment and medium
CN108898082B (en) Picture processing method, picture processing device and terminal equipment
US20210406549A1 (en) Method and apparatus for detecting information insertion region, electronic device, and storage medium
WO2022089170A1 (en) Caption area identification method and apparatus, and device and storage medium
CN110022397B (en) Image processing method, image processing device, storage medium and electronic equipment
CN111563810A (en) Credit wind control model generation method, credit evaluation system, machine-readable medium and device
CN111310725A (en) Object identification method, system, machine readable medium and device
CN108847066A (en) A kind of content of courses reminding method, device, server and storage medium
CN112200844A (en) Method, device, electronic equipment and medium for generating image
CN107895004A (en) Method, device, terminal device and storage medium
CN114548276A (en) Method and device for clustering data, electronic equipment and storage medium
CN108961314A (en) Moving image generation method, device, electronic equipment and computer readable storage medium
CN111275683A (en) Image quality grading processing method, system, device and medium
WO2023197648A1 (en) Screenshot processing method and apparatus, electronic device, and computer readable medium
CN111523541A (en) Data generation method, system, equipment and medium based on OCR
WO2023045635A1 (en) Multimedia file subtitle processing method and apparatus, electronic device, computer-readable storage medium, and computer program product
CN108898169B (en) Picture processing method, picture processing device and terminal equipment
CN111818364B (en) Video fusion method, system, device and medium
CN111914850B (en) Picture feature extraction method, device, server and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant