CN113221901A - Immature self-checking system-oriented picture literacy conversion method and system

Info

Publication number: CN113221901A
Application number: CN202110488166.2A
Authority: CN
Inventors: 梁循; 高君恒; 武文娟
Original assignee: Renmin University of China
Current assignee: Renmin University of China
Priority date: 2021-05-06
Filing date: 2021-05-06
Publication date: 2021-08-06

Abstract

The invention relates to a picture literacy conversion method and system for an immature self-checking system. The method comprises the following steps: acquiring a picture to be converted; recognizing the picture to obtain picture text information, and subdividing text segments; determining an overall presentation layout of the text segments; determining the internal presentation layout of each text segment; and generating a picture literacy conversion picture according to the overall presentation layout of the text segments and the internal presentation layout of each text segment. The invention can perform picture recognition in advance at the information output end to obtain the text information in the picture, and convert the text information, thereby obtaining a new picture that contains the same text information and is not easily judged to be in violation by an immature self-checking system, which saves time and labor cost and facilitates the reasonable allocation and effective use of resources.

Description

Immature self-checking system-oriented picture literacy conversion method and system

Technical Field

The invention relates to the technical field of optical character recognition, and in particular to a picture literacy conversion method and system for an immature self-checking system.

Background

With the continuous development of internet technology, social platforms of all kinds have become part of people's daily lives. In this age of rapid information growth, people's everyday social activities are gradually moving from the physical world to the network, and all exchanged information can be recorded and recognized; on everyday social platforms this information mainly consists of text, pictures and audio. Publishing lawful and compliant views on a social platform is entirely normal. However, because the self-checking mechanisms of platform systems are imperfect, much text is inexplicably judged by the system to be in violation and cannot be sent, and more and more authors choose to publish their articles on social platforms as pictures in order to avoid the system's self-checking mechanism.

However, with the continuing development of deep learning, picture literacy (the recognition of text in pictures) has become increasingly capable, and even the verification codes of various systems are now easy to recognize. Publishing lawful and compliant speech is a citizen's right, but because self-checking systems are not mature enough, many of the violations they report do not actually exist. If social platforms add deep-learning picture literacy to their existing immature self-checking systems, many lawful and compliant articles will become difficult to publish.

Disclosure of Invention

In view of the above problems, an object of the present invention is to provide a picture literacy conversion method for an immature self-checking system, which can perform picture recognition in advance at the information output end to obtain the text information in a picture, and convert the text information to obtain a new picture that contains the same text information and is not easily judged to be in violation by the immature self-checking system, thereby saving time and labor cost and facilitating the reasonable allocation and effective use of resources.

In order to achieve the above object, the invention adopts the following technical solution: a picture literacy conversion method for an immature self-checking system, comprising the following steps:

acquiring a picture to be converted;

identifying the picture to obtain picture text information, and subdividing text segments;

determining an overall presentation layout of the text segments;

determining the internal presentation layout of the text segment;

and generating a picture literacy conversion picture according to the overall presentation layout of the text segment and the internal presentation layout of the text segment.

Further, the process of acquiring the picture to be converted is as follows: selecting the required picture literacy conversion area through screen capture software, and saving it to obtain the picture.

Further, the process of identifying the picture to obtain the text information of the picture and subdividing the text segment includes:

performing character recognition on the picture using a deep-learning-based optical character recognition tool to obtain text information, wherein the recognition process comprises text detection, detection box correction and text recognition;

the optical character recognition tool automatically divides the text information into a plurality of text segments according to the positions of the text segments;

and checking the content of the text segments, revising the text content, and redefining the range of the text content covered by each text segment to subdivide the text segments.

Further, the text segment subdivision includes text segment merging, text segment splitting and/or text segment recombining.

Further, the overall presentation layout of the text passage comprises:

a horizontal layout, in which the text segments are arranged in order from top to bottom;

a vertical layout, in which the text segments are arranged in order from left to right;

and a horizontal-vertical staggered layout, in which horizontal text segments and vertical text segments appear together in the same layout.

Further, the process of determining the internal presentation layout of the text segment comprises:

determining an internal reading starting point of the text segment, wherein the internal reading starting point of the text segment comprises an upper left starting point and an upper right starting point;

determining the internal presentation layout of the text segment, specifically:

the internal presentation layout of the text segment is determined according to the determined internal reading starting point of the text segment, and if the upper left position is selected as the reading starting point of the text segment, the internal presentation layout of the text segment comprises a horizontal layout and a vertical layout; if the upper right position is selected as the reading starting point of the text segment, the layout inside the text segment is a vertical layout.

In a second aspect, the present invention provides a picture literacy conversion system for an immature self-checking system, the system comprising:

a picture acquisition unit, configured to acquire a picture to be converted;

a text segment subdivision unit, configured to recognize the picture to obtain picture text information and subdivide the text segments;

a layout determination unit, configured to determine an overall presentation layout of the text segments and an internal presentation layout of each text segment;

and a picture conversion unit, configured to generate a picture literacy conversion picture according to the overall presentation layout of the text segments and the internal presentation layout of each text segment.
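The four units above can be read as a simple pipeline in which each unit consumes the output of the previous one. The following is only a minimal sketch of that wiring: the class name `ConversionSystem`, its field names and the stand-in units are hypothetical, and a "picture" is represented by a plain string purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ConversionSystem:
    """Hypothetical wiring of the four units; each unit is a plain callable."""
    acquire: Callable[[], str]                       # picture acquisition unit
    subdivide: Callable[[str], List[str]]            # text segment subdivision unit
    decide_layout: Callable[[List[str]], List[str]]  # layout determination unit
    convert: Callable[[List[str], List[str]], str]   # picture conversion unit

    def run(self) -> str:
        picture = self.acquire()
        segments = self.subdivide(picture)
        layouts = self.decide_layout(segments)
        return self.convert(segments, layouts)


# Stand-in units (assumptions): a "picture" is just a string here.
system = ConversionSystem(
    acquire=lambda: "first line\nsecond line",
    subdivide=lambda picture: picture.split("\n"),
    decide_layout=lambda segments: ["horizontal"] * len(segments),
    convert=lambda segments, layouts: " | ".join(
        f"{layout}:{segment}" for segment, layout in zip(segments, layouts)
    ),
)
result = system.run()
```

Keeping the units as interchangeable callables mirrors the statement that the system may be built from integrated or separate functional modules.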

Further, the text segment subdividing unit includes:

the character recognition module is configured to perform character recognition on the picture using a deep-learning-based optical character recognition tool to obtain text information, wherein the recognition process comprises text detection, detection box correction and text recognition;

the text segment dividing module is configured to automatically divide the text information into a plurality of text segments, by means of the optical character recognition tool, according to the positions of the text segments in the initial picture;

and the text segment subdivision module is configured to check the content of the text segments, correct the text content, and then redefine the range of text content covered by each text segment so as to subdivide the text segments.
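The source does not say how the OCR tool decides which recognized lines belong to the same text segment. One plausible reading of "according to the positions of the text segments" is to start a new segment whenever the vertical gap between consecutive lines is large. The sketch below assumes exactly that; the function name `group_lines`, the `(top, bottom, text)` tuple format and the `max_gap` threshold are all hypothetical.

```python
from typing import List, Optional, Tuple

# (top, bottom, text) for one recognized line; coordinates in pixels (assumed format).
Line = Tuple[int, int, str]


def group_lines(lines: List[Line], max_gap: int = 10) -> List[List[str]]:
    """Group lines into segments: a vertical gap above max_gap starts a new segment."""
    segments: List[List[str]] = []
    previous_bottom: Optional[int] = None
    for top, bottom, text in sorted(lines):
        if previous_bottom is None or top - previous_bottom > max_gap:
            segments.append([])
        segments[-1].append(text)
        previous_bottom = bottom
    return segments


lines = [(0, 20, "A1"), (24, 44, "A2"), (80, 100, "B1"), (104, 124, "B2")]
segments = group_lines(lines)
```

Here the first two lines are 4 pixels apart and stay together, while the 36-pixel gap before the third line opens a second segment.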

In a third aspect, the present invention further provides a processing device comprising at least a processor and a memory, the memory storing a computer program, characterized in that the processor, when running the computer program, implements the picture literacy conversion method for the immature self-checking system.

In a fourth aspect, the present invention further provides a computer storage medium on which computer-readable instructions are stored, the computer-readable instructions being executable by a processor to perform the picture literacy conversion method for the immature self-checking system.

Due to the adoption of the technical scheme, the invention has the following advantages:

By using picture literacy technology and targeting existing immature self-checking systems, the invention uses an optical character recognition tool to obtain the text content of the original picture and subdivide the text segments, then determines the overall presentation layout and the internal presentation layout of each text segment, and finally generates the picture literacy conversion picture, thereby saving time and labor cost and facilitating the reasonable allocation and effective use of resources.

Drawings

Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Like reference numerals refer to like parts throughout the drawings. In the drawings:

FIG. 1 is a flow chart of a picture literacy conversion method according to a first embodiment of the present invention;

FIG. 2 is a schematic diagram of OCR tool recognition according to a first embodiment of the present invention;

FIG. 3 is a diagram illustrating text segment subdivision according to a first embodiment of the present invention;

FIG. 4 is a schematic diagram of a determined overall presentation layout according to a first embodiment of the present invention;

FIG. 5 is a schematic diagram of determining the internal presentation layout of a text segment according to a first embodiment of the present invention.

Detailed Description

Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.

It is to be understood that the terminology used herein is for the purpose of describing particular example embodiments only, and is not intended to be limiting. As used herein, the singular forms "a", "an" and "the" may be intended to include the plural forms as well, unless the context clearly indicates otherwise. The terms "comprises," "comprising," "including," and "having" are inclusive and therefore specify the presence of stated features, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, elements, components, and/or groups thereof. The method steps, processes, and operations described herein are not to be construed as necessarily requiring their performance in the particular order described or illustrated, unless specifically identified as an order of performance. It should also be understood that additional or alternative steps may be used.

For convenience of description, spatially relative terms, such as "inner", "outer", "lower", "upper", and the like, may be used herein to describe one element or feature's relationship to another element or feature as illustrated in the figures. Such spatially relative terms are intended to encompass different orientations of the device in use or operation in addition to the orientation depicted in the figures.

Existing social platforms each have their own text detection systems to prevent the spread of illegal content, but these detection systems are not mature, and much lawful and compliant text is mistakenly flagged and deleted. Here, such imperfect detection systems of social platforms are called immature self-checking systems. Aimed at existing immature self-checking systems, the invention performs picture recognition in advance at the information output end to obtain the text information in a picture, and converts the text information, thereby obtaining a new picture that contains the same text information and is not easily judged to be in violation by the immature self-checking system. Existing mature picture literacy systems can easily extract the text of a picture written in the normal logical order; they recognize single characters with high accuracy, and can even recognize highly deformed character verification codes such as those of Alibaba-affiliated services. However, highly deformed characters not only make machine recognition harder but also place a heavy burden on normal reading by the human eye, so deforming characters is clearly not a sound approach to picture literacy conversion. The purpose of picture literacy conversion is to let people obtain the information easily while the immature self-checking system cannot follow the logic of the article to judge a violation. At present, semantic understanding of sentences in deep learning does not let a machine understand the meaning of an individual Chinese character; at most, the machine learns that a certain character sequence has a certain meaning, and that meaning is usually an artificially assigned 'label'. That is, as long as the machine cannot concatenate the characters in the intended logical order, it is difficult for it to determine whether a passage of text is in violation. Therefore, the invention establishes a picture literacy conversion method for immature self-checking systems: when a picture containing text information is input, a new picture containing the same information is obtained through the specified picture literacy conversion operations without affecting human reading, so that the unreasonable operation of the immature self-checking system is avoided in a low-cost and highly practical manner.

Example one

As shown in FIG. 1, this embodiment applies picture literacy technology to existing immature self-checking systems: an optical character recognition tool is used to obtain the text content of an original picture and subdivide the text segments, the overall presentation layout and the internal presentation layout of each text segment are then determined, and finally a picture literacy conversion picture is generated. The specific steps are as follows:

S1, acquiring the original picture to be converted

Specifically, this embodiment obtains the original picture to be converted, which contains the text information, mainly through commonly used tools such as screen capture software and image processing software.

S2, performing picture literacy to obtain the picture text information; the specific process is as follows:

S21, performing character recognition on the original picture using a deep-learning-based optical character recognition (OCR) tool, wherein the recognition process comprises text detection, detection box correction and text recognition. Specifically:

Text detection finds the position of the text in the picture and provides a corresponding detection box.

Detection box correction applies ninety-degree rotations to the detection box obtained by text detection in order to obtain an upright text detection box.

Text recognition performs recognition on the corrected detection box and extracts the character information inside it.
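The detection box correction step can be illustrated on a small grid standing in for the cropped content of a detection box. This is a minimal sketch, not the OCR tool's actual implementation: the function name `rotate_clockwise` is hypothetical, and how the number of quarter turns is chosen (for example by an orientation classifier) is not specified in the source and is left to the caller.

```python
from typing import List, Sequence


def rotate_clockwise(grid: Sequence[Sequence[str]], quarter_turns: int) -> List[List[str]]:
    """Rotate a 2D grid clockwise by quarter_turns * 90 degrees."""
    result = [list(row) for row in grid]
    for _ in range(quarter_turns % 4):
        # Reversing the row order and transposing gives one 90-degree clockwise turn.
        result = [list(column) for column in zip(*result[::-1])]
    return result


# Hypothetical 2x3 crop of a detection box whose content lies on its side.
crop = [["a", "b", "c"],
        ["d", "e", "f"]]
upright = rotate_clockwise(crop, 1)
```

After one quarter turn the original top row `a b c` becomes the right-hand column read from top to bottom, and four quarter turns return the original grid.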

S22, text segment subdivision

The OCR tool automatically divides the text information into a plurality of text segments according to the positions of the text segments in the original picture. At this point, the content of the text segments needs to be inspected manually by eye to avoid deviations in meaning caused by inaccurate character recognition by the OCR tool. According to the result of the visual inspection, the text content is revised as needed: content such as punctuation marks and character recognition results is corrected manually, and redundant content is deleted. After that, the range of text content covered by each text segment is redefined to subdivide the text segments, wherein text segment subdivision includes text segment merging, text segment splitting and/or text segment recombining.
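The three subdivision operations can be sketched on a list of strings, one string per text segment. The function names are hypothetical, and "recombining" is interpreted here, as an assumption, as re-cutting the concatenated text at new boundaries; in every case the concatenated text is left unchanged and only the segment boundaries move.

```python
from typing import List


def merge_segments(segments: List[str], i: int, j: int) -> List[str]:
    """Merge segments i..j (inclusive) into one segment."""
    return segments[:i] + ["".join(segments[i:j + 1])] + segments[j + 1:]


def split_segment(segments: List[str], i: int, at: int) -> List[str]:
    """Split segment i into two segments at character offset `at`."""
    s = segments[i]
    return segments[:i] + [s[:at], s[at:]] + segments[i + 1:]


def recombine_segments(segments: List[str], boundaries: List[int]) -> List[str]:
    """Re-cut the concatenated text at new character offsets (assumed meaning)."""
    text = "".join(segments)
    cuts = [0] + sorted(boundaries) + [len(text)]
    return [text[a:b] for a, b in zip(cuts, cuts[1:])]


initial = ["ABCD", "EF", "GH"]
merged = merge_segments(initial, 1, 2)
split = split_segment(initial, 0, 2)
recombined = recombine_segments(initial, [3, 6])
```

Because each function returns a new list, the initial segments obtained from OCR remain available for comparison during the manual check.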

S3, determining the overall presentation layout of the text segment

In order to conform to the normal reading habits of people, the overall presentation layout designed by the embodiment includes three types, respectively: horizontal layout, vertical layout and horizontal-vertical staggered layout.

The horizontal layout is a conventional layout, and the text sections are regularly arranged from top to bottom.

The vertical layout requires that the text segments be arranged regularly from left to right.

The horizontal-vertical staggered layout allows the overall presentation layout to be customized according to user preference, with horizontal text segments and vertical text segments presented together in the same layout.
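For the two regular overall layouts, the position of each text segment follows directly from the sizes of the segments before it. The sketch below computes upper-left corners under that assumption; the function name `place_segments`, the `gap` spacing and the coordinate convention (x to the right, y downward) are hypothetical. The staggered layout is user-defined, so the sketch deliberately does not try to compute it.

```python
from typing import List, Tuple

Size = Tuple[int, int]    # (width, height) of a rendered segment
Origin = Tuple[int, int]  # (x, y) of its upper-left corner


def place_segments(sizes: List[Size], overall: str, gap: int = 10) -> List[Origin]:
    """Return upper-left corners for a horizontal or vertical overall layout."""
    if overall not in ("horizontal", "vertical"):
        raise ValueError("staggered layouts need user-specified positions")
    origins: List[Origin] = []
    x = y = 0
    for width, height in sizes:
        origins.append((x, y))
        if overall == "horizontal":  # segments stacked from top to bottom
            y += height + gap
        else:                        # segments placed from left to right
            x += width + gap
    return origins


sizes = [(100, 40), (100, 60), (80, 40)]
stacked = place_segments(sizes, "horizontal")
side_by_side = place_segments(sizes, "vertical")
```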

S4, determining the internal presentation layout of the text segment

Specifically, after determining the overall presentation layout of the text segments, the user needs to determine a reading starting point inside each text segment, and determine the presentation layout inside the text segment based on the reading starting point, which includes the following specific steps:

S41, determining the reading starting point inside the text segment

The invention targets immature self-checking systems, which use semantic understanding models from deep learning. Most network models for semantic understanding, however, only treat the upper left position as the reading starting point of a text segment, and their semantic understanding logic runs from left to right. In view of this, the present embodiment divides reading starting points into two categories: an upper left starting point and an upper right starting point.

S42, determining the internal presentation layout of the text segment

The internal presentation layout of a text segment depends on the reading starting point that the user has chosen for it. If the user selects the upper left position, which conforms to normal reading habits, as the reading starting point of the text segment, the internal presentation layout of the text segment may be either a horizontal layout or a vertical layout. If the user selects the upper right position as the reading starting point, the internal presentation layout of the text segment can only be a vertical layout, in order to conform to normal reading logic.
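The rule relating the reading starting point to the permitted internal layouts is small enough to express as a lookup table. This is a minimal sketch; the names `ALLOWED_LAYOUTS` and `is_valid_internal_layout` and the string labels are hypothetical.

```python
# Permitted internal layouts for each reading starting point, as stated in S42.
ALLOWED_LAYOUTS = {
    "upper_left": {"horizontal", "vertical"},
    "upper_right": {"vertical"},
}


def is_valid_internal_layout(start: str, layout: str) -> bool:
    """Check a (starting point, internal layout) pair against the S42 rule."""
    if start not in ALLOWED_LAYOUTS:
        raise ValueError(f"unknown reading starting point: {start}")
    return layout in ALLOWED_LAYOUTS[start]


checks = {
    (start, layout): is_valid_internal_layout(start, layout)
    for start in ("upper_left", "upper_right")
    for layout in ("horizontal", "vertical")
}
```

Only the combination of an upper right starting point with a horizontal layout is rejected.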

S5, generating a picture character-learning conversion picture

The final picture literacy conversion picture is generated according to the determined overall presentation layout of the text segments and the internal presentation layout of each text segment.
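How the characters of one text segment end up arranged can be sketched with a character grid in place of real image drawing. The function name `render_segment`, the `size` parameter (characters per row for a horizontal layout, characters per column for a vertical layout) and the use of plain strings as rows are all assumptions made for illustration; an actual implementation would draw the glyphs onto a picture.

```python
from typing import List


def render_segment(text: str, start: str, layout: str, size: int) -> List[str]:
    """Arrange `text` as rows of a character grid for the chosen internal layout."""
    if start not in ("upper_left", "upper_right"):
        raise ValueError("unknown reading starting point")
    chunks = [text[i:i + size] for i in range(0, len(text), size)]
    if layout == "horizontal":
        if start != "upper_left":
            raise ValueError("a horizontal layout requires an upper left starting point")
        return chunks  # each chunk is one row, read left to right
    if layout != "vertical":
        raise ValueError("unknown internal layout")
    # Each chunk is one column, read top to bottom; pad short columns with spaces.
    columns = [chunk.ljust(size) for chunk in chunks]
    if start == "upper_right":
        columns.reverse()  # the first column of text sits at the far right
    return ["".join(column[row] for column in columns).rstrip() for row in range(size)]


horizontal = render_segment("ABCDEFG", "upper_left", "horizontal", 3)
vertical_left = render_segment("ABCDEFG", "upper_left", "vertical", 3)
vertical_right = render_segment("ABCDEFG", "upper_right", "vertical", 3)
```

In the upper right case the grid rows read `GDA`, ` EB`, ` FC`, so a reader following the columns from right to left recovers `ABC`, `DEF`, `G`.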

To make the operation steps of the picture literacy conversion method provided in this embodiment clearer, the method is described in detail below through a specific example:

1. Obtaining an original picture, wherein the specific process comprises the following steps:

11. Select the relevant screen capture software.

12. Select the required picture literacy conversion area using the screen capture software.

13. Save it to obtain the original picture.

2. As shown in FIG. 2, obtaining the initial text segments using an OCR tool, wherein the specific process comprises the following steps:

21. Input the original picture.

22. Write a Python program that uses the OCR tool to obtain the text information of the picture, yielding the initial text segments, e.g., text segment 1, text segment 2, text segment 3 …

3. As shown in FIG. 3, subdividing the text segments, wherein the specific process comprises the following steps:

31. Check the initial text segments obtained from OCR recognition, and ensure that the text content of the text segments is consistent with that of the original picture.

32. Split, merge and recombine the initial text segments according to user preference to obtain the subdivided text segments.

4. Determining the presentation layouts, wherein the specific process comprises the following steps:

41. According to user requirements, select one of the horizontal layout, the vertical layout and the horizontal-vertical staggered layout as the overall presentation layout, and determine the number of text segments to be presented, as shown in FIG. 4.

42. Determine the reading starting point of each text segment, choosing between the upper left position and the upper right position; different text segments may use different reading starting points.

43. Determine the internal presentation layout of each text segment: if the reading starting point is the upper left position, either the horizontal or the vertical layout may be selected; if it is the upper right position, only the vertical layout may be selected, as shown in FIG. 5.

5. Generating and saving the picture literacy conversion picture, which specifically comprises the following steps:

51. Generate the converted picture according to the presentation layouts.

52. Visually check the correctness and readability of the picture, and save the picture literacy conversion picture.

Example two

The first embodiment provides a picture literacy conversion method for an immature self-checking system; correspondingly, this embodiment provides a picture literacy conversion system for an immature self-checking system. The system provided by this embodiment can implement the picture literacy conversion method of the first embodiment, and can be realized in software, hardware or a combination of the two. For example, the system may comprise integrated or separate functional modules or units to perform the corresponding steps of the method of the first embodiment. Since the system of this embodiment is substantially similar to the method embodiment, its description is relatively brief; for relevant details, reference may be made to the description of the first embodiment, and the system described in this embodiment is merely illustrative.

The picture literacy conversion system facing the immature self-checking system provided by the embodiment comprises:

Example three

The present embodiment provides a processing device for implementing the picture literacy conversion method for the immature self-checking system according to the first embodiment. The processing device may be a client-side device, such as a mobile phone, a notebook computer, a tablet computer or a desktop computer, that executes the picture literacy conversion method of the first embodiment.

The processing device comprises a processor, a memory, a communication interface and a bus, wherein the processor, the memory and the communication interface are connected through the bus so as to communicate with one another. The memory stores a computer program that can run on the processor, and the processor executes the picture literacy conversion method of the first embodiment when running the computer program.

Preferably, the Memory may be a high-speed Random Access Memory (RAM), and may also include a non-volatile Memory, such as at least one disk Memory.

Preferably, the processor may be various general processors such as a Central Processing Unit (CPU), a Digital Signal Processor (DSP), and the like, which are not limited herein.

Example four

The picture literacy conversion method for the immature self-checking system of the first embodiment may be embodied as a computer program product, which may include a computer-readable storage medium carrying computer-readable program instructions for executing the picture literacy conversion method of the first embodiment.

The computer readable storage medium may be a tangible device that retains and stores instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any combination of the foregoing.

Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the embodiments may be modified, or some of their features replaced by equivalents, without departing from the spirit or scope of the present invention.

Claims

1. A picture literacy conversion method for an immature self-checking system, characterized by comprising the following steps:

acquiring a picture to be converted;

determining an overall presentation layout of the text segments;

determining the internal presentation layout of the text segment;

2. The picture literacy conversion method for the immature self-checking system according to claim 1, wherein the process of acquiring the picture to be converted comprises: selecting the required picture literacy conversion area through screen capture software, and saving it to obtain the picture.

3. The picture literacy conversion method for the immature self-checking system according to claim 1, wherein the process of recognizing the picture to obtain the picture text information and subdividing the text segments comprises:

4. The immature self-checking system-oriented image literacy conversion method according to claim 3, wherein the text segment subdivision comprises text segment merging, text segment splitting and/or text segment recombination.

5. The immature self-checking system-oriented image literacy conversion method according to any one of claims 1-4, wherein the overall text segment presentation layout comprises:

6. The immature self-checking system-oriented image literacy conversion method according to claim 5, wherein the process of determining the text segment presentation layout comprises:

determining the internal presentation layout of the text segment, specifically:

7. An immature self-checking system oriented picture literacy conversion system is characterized in that the system comprises:

8. The immature self-checking system-oriented image literacy conversion system according to claim 7, wherein the text segment subdivision unit comprises:

9. A processing device comprising at least a processor and a memory, the memory having a computer program stored thereon, wherein the processor, when running the computer program, implements the picture literacy conversion method for the immature self-checking system according to any of claims 1 to 6.

10. A computer storage medium having computer-readable instructions stored thereon, the computer-readable instructions being executable by a processor to implement the picture literacy conversion method for the immature self-checking system according to any of claims 1 to 6.