CN114399670A - Control method for extracting characters in pictures in 5G messages in real time - Google Patents

Control method for extracting characters in pictures in 5G messages in real time

Info

Publication number
CN114399670A
Authority
CN
China
Prior art keywords
picture
pictures
steps
message
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210038976.2A
Other languages
Chinese (zh)
Inventor
黄书涵
陈淼生
郑仲嵩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Fufu Information Technology Co Ltd
Original Assignee
China Telecom Fufu Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Telecom Fufu Information Technology Co Ltd filed Critical China Telecom Fufu Information Technology Co Ltd
Priority to CN202210038976.2A priority Critical patent/CN114399670A/en
Publication of CN114399670A publication Critical patent/CN114399670A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Character Discrimination (AREA)

Abstract

The invention discloses a control method for extracting text from pictures in 5G messages in real time. It introduces graphics-based preprocessing tuned to the characteristics of 5G spam pictures, so that a small amount of time spent on image operations yields a large efficiency gain in the optical character recognition (OCR) stage. Sample pictures can be selected to steer the training direction of the model according to the current picture variants, which improves recognition accuracy. The preprocessing algorithm and the recognition model are optimized so that speed and accuracy reach reasonable thresholds and meet the requirements of real-time picture authentication. The extracted text then passes through an ordinary text filtering stage, and a final verdict is returned, so that spam picture messages can be managed and controlled.

Description

Control method for extracting characters in pictures in 5G messages in real time
Technical Field
The invention relates to the field of 5G technical application, in particular to a control method for extracting characters in pictures in 5G messages in real time.
Background
With the advent of the 5G era, operators have developed 5G messages based on RCS (Rich Communication Services) in the hope of offering richer message services beyond the traditional short message service (SMS) and multimedia message service (MMS). However, spam has persisted from short messages and multimedia messages through to 5G messages and cannot be eradicated. Improving the spam management and control platform is a long-term contest between operators and spam senders.
Unlike IM software, 5G messages have a weak client, so information management and control must be implemented on the equipment side. The real-time nature of 5G messaging also requires that management and control add little latency. Among the media types supported by 5G messages, real-time text monitoring is mature, while streaming media is difficult to filter in real time with current computing power. Before 5G messages, pictures in real-time communication were mainly carried by multimedia messages; the multimedia message traffic of every operator is currently very low, and this low load leaves equipment resources free to cope with real-time picture processing. Multimedia messages are also rarely used in real-time interactive scenarios, so their processing speed requirements are not strict.
Disclosure of Invention
The invention aims to provide a control method for extracting characters in a picture in a 5G message in real time.
The technical scheme adopted by the invention is as follows:
A control method for extracting characters in pictures in 5G messages in real time comprises the following steps:
Step 1, preprocessing the picture by graphics processing under the OpenCV framework, which specifically comprises the following steps:
Step 1-1, graying the picture; the grayscale image is a single-channel image containing only brightness information and no color information, and each pixel holds a brightness value;
Step 1-2, performing threshold segmentation to binarize the grayscale picture;
Because different pictures have different gray-level distributions, binarizing all pictures with one fixed threshold may blend the subject into the background or emphasize noise, which causes interference. The segmentation threshold therefore needs to be calculated for each picture.
Step 1-3, performing noise reduction on the binarized picture;
Analysis of a large number of samples shows that most noise points in spam pictures are isolated small dots distributed in large numbers over the non-subject areas of the picture. This is typically the result of the producer using algorithms to add various kinds of noise to the picture in order to obstruct information monitoring. A noise reduction algorithm is needed to remove these isolated noise points effectively.
Step 1-4, performing edge detection to obtain the character outlines, and obtaining text blocks after morphological dilation and erosion;
Useless graphic information in the binary image interferes with the efficiency and accuracy of character recognition; in addition, the line-splitting capability of cnocr is easily degraded by complicated text layout in the picture. Edge detection is first performed on the binary image to obtain an edge-highlighted image of the characters, followed by morphological erosion and dilation to smooth the block regions;
Step 1-5, acquiring the four-corner coordinates of the largest rectangle occupied by the outer edge of each text block, and extracting each text block from the binary image;
The pixel coordinates of each block outline are identified, and the resulting rectangle coordinates are used to extract the corresponding sub-image from the original binary image. Because cnocr scans pictures file by file, the extracted sub-images also need to be pieced together into one regular, line-by-line picture so that speed and recognition rate are optimal.
Step 1-6, splicing the text blocks neatly into one picture;
Step 2, performing optical character recognition under the cnocr suite, with a model trained on the characteristics of 5G spam pictures; this improves recognition of non-standard characters and of the fonts commonly used in spam pictures, and thereby improves the management and control effect.
Step 3, the management and control service process performs keyword matching on the extracted text and returns a management and control result in real time; meanwhile, the text is sent to a statistics module for natural language recognition to discover suspicious content and generate recommended policies.
Further, the threshold segmentation in step 1-2 adopts the OTSU method.
Further, step 1-3 adopts an eight-neighborhood algorithm for noise reduction, or combines an eight-neighborhood algorithm with a connected-domain algorithm.
Further, a Sobel operator is adopted in step 1-4 for edge detection.
Further, the preprocessing of step 1 also uses the analyzed feature data to drop pictures that clearly have no text characteristics from the processing flow, so as to reduce the processing load.
Further, the training of step 2 comprises the following procedures:
Step 2-1, selecting a large number of sample pictures and binarizing them uniformly;
Step 2-2, generating a training set and a test set and converting them into binary files;
Step 2-3, running training on the training set with a training script;
Step 2-4, verifying the effect on the test set with a test script and importing the new model.
By adopting the above technical scheme, the invention introduces graphics-based preprocessing tuned to the characteristics of 5G spam pictures, and a small amount of time spent on image operations yields a large efficiency gain in the optical character recognition stage. Sample pictures can be selected to steer the training direction of the model according to the current picture variants, which improves recognition accuracy.
Compared with the prior art, the main advantages of the invention are: 1. Recognition is fast. With GPU hardware, a large volume of concurrent real-time processing can be supported through architecture optimization; without a GPU, a large volume of concurrent quasi-real-time processing, or real-time processing for a low-traffic system, can still be achieved. 2. The optical character recognition is built on a domestic open-source suite whose Chinese recognition rate is far higher than that of Tesseract, which is commonly used in the industry. 3. The model is trained on a large number of existing sample pictures, so the recognition rate for non-standard fonts is high.
Drawings
The invention is described in further detail below with reference to the accompanying drawing and the detailed description.
Fig. 1 is a schematic flow chart of the control method for extracting text in pictures in 5G messages in real time according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.
In 5G messages, pictures are sent directly and integrated into the interactive scene in a manner similar to IM. However, 5G messages do not maintain a list of communication contacts as IM does, and the recipient of a 5G message may be addressed simply by number. Therefore, given that management and control cannot be placed on the terminal, the platform's requirements for real-time, accurate picture monitoring are much higher than for multimedia messages.
According to data fed back by multimedia message platforms and IM platforms, spam pictures in real-time communication mainly embed the relevant text in picture form and add interference elements to the picture to evade platform monitoring. For this kind of spam, the invention provides a text extraction method based on OpenCV image preprocessing and cnocr optical character recognition. The preprocessing algorithm and the recognition model are optimized so that speed and accuracy reach reasonable thresholds and meet the requirements of real-time picture authentication. The extracted text then passes through an ordinary text filtering stage, and a final verdict is returned, so that spam picture messages can be managed and controlled.
OpenCV is an open-source computer vision library released under a BSD license; it provides interfaces and functions for convenient image processing and is one of the most widely used vision libraries in the industry.
cnocr is a lightweight domestic open-source OCR library based on a convolutional neural network (CNN) and a recurrent neural network (RNN). It has built-in support for GPU hardware and outperforms Tesseract, Google's open-source library widely used in the industry, in Chinese recognition and model training.
As shown in fig. 1, the present invention discloses a control method for extracting text in a picture in a 5G message in real time, which comprises the following steps:
Step 1, graphics processing uses self-written algorithms and functions encapsulated in the OpenCV framework. It mainly comprises the following steps:
Step 1-1, graying. Gray values lie in the interval [0, 255], from darkest to brightest. The RGB image can be converted to a grayscale image by calling cvtColor().
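As a minimal illustration of the graying step, the sketch below converts a grid of RGB pixels to gray levels in [0, 255] in plain Python. It assumes the luma weights 0.299, 0.587 and 0.114 that OpenCV documents for its RGB-to-gray conversion; the function name `to_gray` and the nested-list image layout are illustrative assumptions and not part of the invention.

```python
def to_gray(rgb_image):
    """Convert a grid of (R, G, B) tuples to a grid of gray levels in [0, 255]."""
    gray = []
    for row in rgb_image:
        gray_row = []
        for r, g, b in row:
            # Weighted sum of the three channels, rounded to the nearest level.
            level = int(round(0.299 * r + 0.587 * g + 0.114 * b))
            # Clamp so the result is always a valid gray level.
            gray_row.append(max(0, min(255, level)))
        gray.append(gray_row)
    return gray
```

In an OpenCV program the same step would normally be a single call such as `cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)`; the loop above only makes the per-pixel arithmetic visible.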
Step 1-2, binarization. The OTSU method first counts the number of pixels at each gray level in the whole image and calculates the probability distribution of the gray levels. It then traverses the candidate gray levels; for each candidate, the pixels are split into background and foreground, giving the background and foreground proportions $\omega_1$ and $\omega_2$ of the whole image and the background and foreground average gray levels $\mu_1$ and $\mu_2$. Finally, the between-class variance $g$ is calculated through the objective function, and the gray level that maximizes $g$ is taken as the target threshold. The formula is as follows:
$$g = \omega_1 \, \omega_2 \, (\mu_1 - \mu_2)^2$$
The obtained threshold value is then passed as a parameter when calling threshold().
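The threshold search described in step 1-2 can be sketched in plain Python as follows. This is an illustrative sketch rather than the implementation used by the invention: the function names `otsu_threshold` and `binarize`, the nested-list image layout, and the choice of treating gray levels at or below the candidate as the first class are all assumptions.

```python
def otsu_threshold(gray):
    """Return the gray level that maximizes the between-class variance g."""
    # Histogram: number of pixels at each of the 256 gray levels.
    hist = [0] * 256
    for row in gray:
        for level in row:
            hist[level] += 1
    total = sum(hist)

    best_t, best_g = 0, -1.0
    for t in range(256):
        # Class 1 holds levels <= t, class 2 holds levels > t.
        n1 = sum(hist[: t + 1])
        n2 = total - n1
        if n1 == 0 or n2 == 0:
            continue  # one class is empty, so g is undefined for this t
        w1, w2 = n1 / total, n2 / total
        mu1 = sum(i * hist[i] for i in range(t + 1)) / n1
        mu2 = sum(i * hist[i] for i in range(t + 1, 256)) / n2
        g = w1 * w2 * (mu1 - mu2) ** 2
        if g > best_g:
            best_t, best_g = t, g
    return best_t


def binarize(gray, t):
    """Map pixels above t to 255 (white) and all others to 0 (black)."""
    return [[255 if level > t else 0 for level in row] for row in gray]
```

With OpenCV, the search is usually delegated to the library by passing the `cv2.THRESH_OTSU` flag to `cv2.threshold()`, which returns the computed threshold together with the binarized image.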
Step 1-3, noise reduction. The eight-neighborhood noise reduction principle is to traverse all non-white points in the picture and count the non-white points among the 8 points surrounding each one; if the count is below a certain threshold, the point is judged to be noise and is set to white. The time complexity of the method is only O(MN) for an M-by-N picture, which makes it a simple and effective preprocessing method.
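The eight-neighborhood rule can be sketched as below. The default `min_neighbors=2` is a hypothetical threshold chosen only for illustration, and counting neighbors on the unmodified input (so that the result does not depend on traversal order) is a design assumption the text does not specify.

```python
WHITE = 255


def denoise_eight_neighborhood(binary, min_neighbors=2):
    """Set non-white pixels with too few non-white neighbors to white."""
    height, width = len(binary), len(binary[0])
    result = [row[:] for row in binary]  # leave the input image untouched
    for y in range(height):
        for x in range(width):
            if binary[y][x] == WHITE:
                continue
            # Count non-white pixels among the (up to) 8 surrounding points.
            count = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < height and 0 <= nx < width
                            and binary[ny][nx] != WHITE):
                        count += 1
            if count < min_neighbors:
                result[y][x] = WHITE  # isolated point: treat as noise
    return result
```

Each pixel is visited once and inspects a constant number of neighbors, which matches the O(MN) cost stated above.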
Step 1-4, edge detection. Edge detection uses the Sobel operator to obtain the first-order gradient of the image. Two 3-by-3 matrices are convolved with the original image to obtain the horizontal and vertical gradient values at each point.
After the convolution, the square root of the sum of the squares of the horizontal and vertical gradient values is calculated; if it is larger than the threshold, the point is considered an edge point.
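The following sketch shows one way the Sobel step could look in plain Python, using the standard 3-by-3 Sobel kernels. The default `threshold=128`, the decision to leave border pixels unmarked, and the function name `sobel_edges` are assumptions made for the example.

```python
import math

# Standard Sobel kernels for the horizontal (x) and vertical (y) gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_edges(gray, threshold=128):
    """Mark interior pixels whose gradient magnitude exceeds the threshold."""
    height, width = len(gray), len(gray[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for ky in range(3):
                for kx in range(3):
                    pixel = gray[y + ky - 1][x + kx - 1]
                    gx += SOBEL_X[ky][kx] * pixel
                    gy += SOBEL_Y[ky][kx] * pixel
            # Square root of the sum of squares of the two gradient values.
            magnitude = math.sqrt(gx * gx + gy * gy)
            if magnitude > threshold:
                edges[y][x] = 255
    return edges
```

OpenCV exposes the same operator as `cv2.Sobel()`, which computes each directional gradient for the whole image in one call.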
Step 1-5, morphological processing. Dilation can be implemented by directly calling dilate(), and the function for erosion is erode().
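To make the effect of the two operations concrete, the sketch below implements dilation and erosion with a 3-by-3 square structuring element on a binary grid (255 for foreground, 0 for background). The kernel size, the helper name `_apply`, and ignoring neighbors that fall outside the image are assumptions for illustration.

```python
def _apply(binary, pick):
    """Replace each pixel by pick() over its 3-by-3 neighborhood."""
    height, width = len(binary), len(binary[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Neighbors outside the image are simply left out of the window.
            window = [
                binary[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            out[y][x] = pick(window)
    return out


def dilate(binary):
    """Grow foreground: a pixel becomes 255 if any neighbor is 255."""
    return _apply(binary, max)


def erode(binary):
    """Shrink foreground: a pixel stays 255 only if all neighbors are 255."""
    return _apply(binary, min)
```

Applying `erode(dilate(image))` fills small gaps between nearby strokes, which is how separate character edges can merge into one solid text block.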
Step 1-6, delimiting the text areas. Blocks whose area is too small or whose aspect ratio is clearly unsuitable are eliminated. For the remaining blocks, the boundingRect() function, which returns a standard rectangle of class Rect, gives the four-corner pixel coordinates of the smallest rectangle containing each block.
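A rough sketch of this step is given below. Instead of OpenCV's contour and bounding-rectangle functions it uses a plain breadth-first flood fill over 8-connected foreground pixels, which is a swapped-in technique chosen to keep the example self-contained. The limits `min_area` and `max_aspect` and the `(x, y, width, height)` box layout are hypothetical values and conventions.

```python
from collections import deque


def find_text_boxes(binary, min_area=6, max_aspect=20.0):
    """Return (x, y, w, h) boxes of foreground blocks that pass the filters."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y0 in range(height):
        for x0 in range(width):
            if binary[y0][x0] == 0 or seen[y0][x0]:
                continue
            # Flood-fill one connected block and track its extent.
            queue = deque([(y0, x0)])
            seen[y0][x0] = True
            top, bottom, left, right = y0, y0, x0, x0
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and binary[ny][nx] != 0
                                and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            w, h = right - left + 1, bottom - top + 1
            aspect = max(w / h, h / w)
            # Drop blocks that are too small or too elongated to be text.
            if w * h >= min_area and aspect <= max_aspect:
                boxes.append((left, top, w, h))
    return boxes
```

The four corners of a box follow directly from its tuple: `(x, y)`, `(x + w - 1, y)`, `(x, y + h - 1)` and `(x + w - 1, y + h - 1)`.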
Step 1-7, splicing the blocks. The corresponding sub-images are cut out of the original binary image according to the four-corner coordinates using the ROI (region of interest) method and spliced into one regular, line-by-line picture.
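Cropping and splicing can be sketched as follows, assuming boxes in a hypothetical `(x, y, width, height)` layout and a white (255) background. Stacking the blocks vertically with a one-row gap and padding each block on the right to a common width is one possible layout, not necessarily the one used by the invention.

```python
def crop(binary, box):
    """Cut the region of interest described by (x, y, w, h) out of the image."""
    x, y, w, h = box
    return [row[x:x + w] for row in binary[y:y + h]]


def splice_rows(blocks, background=255, gap=1):
    """Stack blocks vertically into one rectangular picture."""
    width = max(len(block[0]) for block in blocks)
    page = []
    for index, block in enumerate(blocks):
        if index > 0:
            # Blank separator rows keep neighboring text lines apart.
            page.extend([[background] * width for _ in range(gap)])
        for row in block:
            # Pad short rows on the right so every row has the same width.
            page.append(row + [background] * (width - len(row)))
    return page
```

A typical call would be `splice_rows([crop(image, box) for box in boxes])`, producing a single picture that can be handed to the OCR stage in one pass.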
Step 2, optical character recognition is implemented under the cnocr suite. The recognition process itself is unchanged; the main work is training a model suited to the characteristics of 5G spam pictures. The training comprises the following procedures:
Step 2-1, selecting sample pictures, cutting out single lines of text and saving them with uniformly numbered file names. The images are binarized to serve as the picture data source.
Step 2-2, writing a program that generates the training set and the test set from the picture file names, the corresponding correct characters, and the indexes of those characters in the label file.
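The program mentioned in step 2-2 might look roughly like the sketch below. The tab-separated record layout (sample number, character indexes, file name), the index numbering starting at 1, and the 80/20 split are all assumptions for illustration; the layout actually expected by the training tools should be checked against their documentation.

```python
import random


def build_label_index(charset):
    """Map each character to its position in the label file (from 1)."""
    return {ch: i for i, ch in enumerate(charset, start=1)}


def make_records(samples, label_index):
    """Turn (file name, text) pairs into tab-separated list records."""
    records = []
    for number, (filename, text) in enumerate(samples):
        # Replace every character of the text by its label index.
        indexes = [str(label_index[ch]) for ch in text]
        records.append("\t".join([str(number), *indexes, filename]))
    return records


def split_records(records, test_ratio=0.2, seed=0):
    """Shuffle reproducibly and split into (training set, test set)."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]
```

Fixing the shuffle seed keeps the split reproducible, so a retrained model can be compared against earlier ones on the same test set.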
Step 2-3, converting the data into a binary format using RecordIO from the artificial intelligence suite mxnet. To improve IO efficiency, mxnet does not read picture files directly; instead, the picture list and labels are converted into a binary file in RecordIO format, so that data can be read sequentially during training and the IO rate is greatly improved. The script im2rec is called for this conversion.
Step 2-4, training the model using the training script cnocr_train. To increase speed, training can be configured to use a GPU.
Step 2-5, cnocr provides the evaluation tool cnocr_evaluation.py, which can test the recognition effect of the new model on the test set.
Step 2-6, importing the new model file, that is, specifying the new model when calling ocr() in the program.
Step 3, the management and control service process parses the picture out of the protocol and feeds it into the processing flow; it performs keyword matching on the extracted text and returns a management and control result in real time; meanwhile, the text is sent to a statistics module for natural language recognition to discover suspicious content and generate recommended policies.
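The keyword-matching part of step 3 could be as simple as the sketch below. The function name `check_text`, the result layout, and the normalization (lower-casing and removing whitespace so that spaced-out text still matches) are illustrative assumptions; the statistics module is outside the scope of this example.

```python
def _normalize(text):
    """Lower-case the text and drop all whitespace."""
    return "".join(text.split()).lower()


def check_text(text, keywords):
    """Return a verdict and the list of keywords found in the text."""
    normalized = _normalize(text)
    # A keyword hits when its normalized form occurs in the normalized text.
    hits = [kw for kw in keywords if _normalize(kw) in normalized]
    return {"blocked": bool(hits), "hits": hits}
```

For example, `check_text(extracted_text, ["free prize", "loan"])` reports every configured keyword that occurs in the OCR output, and the `blocked` flag can be returned to the caller as the real-time result.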
The following is a description of specific effects of the present invention:
The test set of jpg pictures is divided into four groups of 20 pictures each, according to whether the pictures contain background noise and interfering color blocks and whether they contain 50 or 150 characters. An ordinary CPU server is used as the test hardware. The test set is preprocessed and then recognized with the self-trained OCR model, and the time spent in each stage is measured; for comparison, the original pictures are also passed directly to OCR. The results, with times in milliseconds, are shown in the table below.
Item | 50 characters, no interference | 50 characters, with interference | 150 characters, no interference | 150 characters, with interference
Preprocessing time (ms) | 17 | 101 | 58 | 133
OCR time (ms) | 37 | 151 | 111 | 205
Accuracy | 98.9% | 94.5% | 97.3% | 94.5%
Direct OCR time (ms) | 605 | 866 | 973 | 1380
Accuracy (direct OCR) | 44.5% | <10% | 26% | <10%
It can be seen that text extraction from a single ordinary picture is more than 6 times faster than direct OCR on the original picture, and accuracy rises from essentially unusable to about 95%. If the system is deployed on a cluster architecture with GPUs, the requirements of real-time and quasi-real-time authentication of 5G messages can basically be met. The comprehensive processing scheme proposed by the invention is therefore effective.
Possible future application scenarios of the invention: 1. real-time or quasi-real-time management and control of text in pictures in multimedia communication represented by the 5G message service; 2. recognition of image watermarks and text marks in big data service systems for images of all kinds.
The invention provides picture management and control capability for 5G real-time messaging and improves the overall security of the service. The image-text recognition capability does not depend on a real-time authentication interface provided by a specialist company, which saves interface usage costs; the network hosting the system does not need to connect to the specialist company's network, which improves the overall security of the system.
It is to be understood that the embodiments described are only a few embodiments of the present application and not all embodiments. The embodiments and features of the embodiments in the present application may be combined with each other without conflict. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the detailed description of the embodiments of the present application is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

Claims (6)

1. A control method for extracting characters in pictures in 5G messages in real time, characterized by comprising the following steps:
step 1, preprocessing the picture by graphics processing under the OpenCV framework, which specifically comprises the following steps:
step 1-1, graying the picture; the grayscale image is a single-channel image containing only brightness information and no color information, and each pixel holds a brightness value;
step 1-2, performing threshold segmentation to binarize the grayscale picture;
step 1-3, performing noise reduction on the binarized picture to effectively remove isolated noise points;
step 1-4, performing edge detection to obtain an edge-highlighted image of the characters, and obtaining text blocks after morphological dilation and erosion;
step 1-5, acquiring the four-corner coordinates of the largest rectangle occupied by the outer edge of each text block, and extracting each text block from the binary image;
step 1-6, splicing the text blocks neatly into one picture;
step 2, performing optical character recognition under the cnocr suite, with a model trained on the characteristics of 5G spam pictures, so as to improve recognition of non-standard characters and of the fonts commonly used in spam pictures;
step 3, the management and control service process performs keyword matching on the extracted text and returns a management and control result in real time; meanwhile, the text is sent to a statistics module for natural language recognition to discover suspicious content and generate recommended policies.
2. The method according to claim 1, wherein the threshold segmentation in step 1-2 adopts the OTSU method.
3. The method according to claim 1, wherein step 1-3 adopts an eight-neighborhood algorithm for noise reduction, or combines an eight-neighborhood algorithm with a connected-domain algorithm.
4. The method according to claim 1, wherein a Sobel operator is adopted in step 1-4 for edge detection.
5. The method according to claim 1, wherein the preprocessing of step 1 also uses the analyzed feature data to drop pictures that clearly have no text characteristics from the processing flow, so as to reduce the processing load.
6. The method according to claim 1, wherein the training of step 2 comprises the following procedures:
step 2-1, selecting a large number of sample pictures and binarizing them uniformly;
step 2-2, generating a training set and a test set and converting them into binary files;
step 2-3, running training on the training set with a training script;
step 2-4, verifying the effect on the test set with a test script and importing the new model.
CN202210038976.2A 2022-01-13 2022-01-13 Control method for extracting characters in pictures in 5G messages in real time Pending CN114399670A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210038976.2A CN114399670A (en) 2022-01-13 2022-01-13 Control method for extracting characters in pictures in 5G messages in real time

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210038976.2A CN114399670A (en) 2022-01-13 2022-01-13 Control method for extracting characters in pictures in 5G messages in real time

Publications (1)

Publication Number Publication Date
CN114399670A true CN114399670A (en) 2022-04-26

Family

ID=81230768

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210038976.2A Pending CN114399670A (en) 2022-01-13 2022-01-13 Control method for extracting characters in pictures in 5G messages in real time

Country Status (1)

Country Link
CN (1) CN114399670A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116206227A (en) * 2023-04-23 2023-06-02 上海帜讯信息技术股份有限公司 Picture examination system and method for 5G rich media information, electronic equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination