CN110502990B

CN110502990B - Method and system for data acquisition by image processing

Info

Publication number: CN110502990B
Application number: CN201910645911.2A
Authority: CN
Inventors: 金东赫; 蒋君超; 张凯
Original assignee: Shanghai Zhanwan Information Science & Technology Co ltd
Current assignee: Shanghai Zhanwan Information Science & Technology Co ltd
Priority date: 2019-07-17
Filing date: 2019-07-17
Publication date: 2022-06-03
Anticipated expiration: 2039-07-17
Also published as: CN110502990A

Abstract

The invention discloses a method and a system for data acquisition by image processing. The method comprises the following steps: S1, marking, in the collected image, a recognition frame around the numerical value of each parameter to be read; S2, performing matrixing processing and contrast sharpening processing on the recognition frame to highlight the characters in it; S3, recognizing the character area in the recognition frame, segmenting the character area at the spaces between characters to obtain block characters, and comparing the shape of each character in each block character with the shapes in a corresponding database to recognize each character by matching; and S4, storing the recognition result in a database of the edge computing gateway. The invention captures the screen of the control system HMI, performs image processing on the captured picture, recognizes data such as text, numbers and characters in the picture, and analyzes, outputs and stores the recognition result in the database of the edge computing gateway.

Description

Method and system for data acquisition by image processing

Technical Field

The invention relates to the technical field of data acquisition, in particular to a method and a system for acquiring data by utilizing image processing.

Background

In the industrial internet, many industrial devices, especially older control equipment such as numerical control cutting machines and numerical control bending machines, do not expose the data of the device controller through a standard communication protocol. Industrial field control systems usually run on operating system platforms such as Windows and Linux, and some run on dedicated embedded control systems. The HMI (human machine interface) of the control system generally displays the relevant equipment parameters, such as coordinate values and alarm information, in real time. These data are of significant value to the industrial internet and need to be acquired.

Disclosure of Invention

To address the problems and shortcomings of the prior art, the invention provides a method and a system for acquiring data by image processing.

The invention solves the technical problems through the following technical solution:

The invention provides a method for data acquisition by image processing, characterized by comprising the following steps:

S1, marking, in the collected image, a recognition frame around the numerical value of each parameter to be read;

S2, performing matrixing processing and contrast sharpening processing on the recognition frame to highlight the characters in the recognition frame;

S3, recognizing the character area in the recognition frame, segmenting the character area at the spaces between characters to obtain block characters, and comparing the shape of each character in each block character with the shapes in a corresponding database to recognize each character by matching;

S4, analyzing and outputting the recognition result and storing the analysis result in a database of the edge computing gateway.

Preferably, in step S1, the top, bottom, left or right border of the recognition frame is fine-tuned so that the recognition frame contains no background interference elements.

Preferably, in step S3, for partially covered characters in the block characters, the character that each covered glyph represents is recognized by comparing its shape with the shapes in the corresponding database.

Preferably, example pictures of covered characters are collected, the content of the recognition frame containing the covered characters in each example picture is extracted, and background denoising and contrast sharpening are applied. The processed covered text is re-labeled with the jTessBoxEditor tool, the re-labeled data are used for retraining to generate a new text library, and the new text library is used to recognize partially covered characters in the future.

The invention also provides a system for data acquisition by image processing, characterized by comprising a calibration module, a processing module, a recognition module and a storage module;

the calibration module is used for marking, in the acquired image, a recognition frame around the numerical value of each parameter to be read;

the processing module is used for performing matrixing processing and contrast sharpening processing on the recognition frame to highlight the characters in the recognition frame;

the recognition module is used for recognizing the character area in the recognition frame, segmenting the character area at the spaces between characters to obtain block characters, and comparing the shape of each character in each block character with the shapes in a corresponding database to recognize each character by matching;

and the storage module is used for analyzing and outputting the recognition result and storing the analysis result in a database of the edge computing gateway.

Preferably, the calibration module is used for fine-tuning the top, bottom, left or right border of the recognition frame so that the recognition frame contains no background interference elements.

Preferably, for partially covered characters in the block characters, the recognition module is used for recognizing the character that each covered glyph represents by comparing its shape with the shapes in the corresponding database.

Preferably, the system further comprises a sample acquisition module, which is used for collecting example pictures of covered characters, extracting the content of the recognition frame containing the covered characters in each example picture, applying background denoising and contrast sharpening, re-labeling the processed covered text with the jTessBoxEditor tool, retraining on the re-labeled data to generate a new text library, and using the new text library to recognize partially covered characters in the future.

On the basis of common knowledge in the field, the above preferred conditions can be combined arbitrarily to obtain the preferred embodiments of the invention.

The positive effects of the invention are as follows:

The invention captures the screen of the control system HMI, performs image processing on the captured picture, recognizes data such as text, numbers and characters in the picture, and analyzes, outputs and stores the recognition result in the database of the edge computing gateway.

Drawings

FIG. 1 is a flow chart of a method for data acquisition using image processing according to a preferred embodiment of the present invention.

FIG. 2 is a diagram illustrating the positioning of an image processing parameter identification box according to a preferred embodiment of the present invention.

FIG. 3 is a block diagram of a system for data acquisition using image processing according to a preferred embodiment of the present invention.

Detailed Description

To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art from these embodiments without inventive effort fall within the scope of the present invention.

As shown in FIG. 1, the present embodiment provides a method for data acquisition by image processing, which includes the following steps:

Step 101, marking, in the acquired image, a recognition frame around the numerical value of each parameter to be read, and fine-tuning the top, bottom, left or right border of the recognition frame so that it contains no background interference elements, as shown in FIG. 2.

For each parameter, four pixel positions (top, bottom, left and right) must be set so that the recognition frame (baseline) encloses as much of the parameter content as possible without including other background interference elements, such as borders and horizontal lines.
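The border fine-tuning of step 101 can be illustrated with a minimal sketch. It assumes the screenshot is held as a list of pixel rows; the names `RecognitionBox`, `adjust` and `crop`, and the toy pixel values, are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecognitionBox:
    """Pixel bounds of one parameter value; right and bottom are exclusive."""
    left: int
    top: int
    right: int
    bottom: int

    def adjust(self, d_left=0, d_top=0, d_right=0, d_bottom=0):
        """Return a copy with each border moved by the given pixel offset."""
        return RecognitionBox(self.left + d_left, self.top + d_top,
                              self.right + d_right, self.bottom + d_bottom)


def crop(image, box):
    """Cut the box out of an image stored as a list of pixel rows."""
    return [row[box.left:box.right] for row in image[box.top:box.bottom]]


# Toy screenshot: 0 = background, 9 = border line, 5 = parameter value pixels.
screenshot = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 9, 9, 9, 9, 9, 9, 0],
    [0, 9, 5, 5, 0, 5, 9, 0],
    [0, 9, 5, 0, 5, 5, 9, 0],
    [0, 9, 9, 9, 9, 9, 9, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

rough = RecognitionBox(left=1, top=1, right=7, bottom=5)  # still includes the border
tuned = rough.adjust(d_left=1, d_top=1, d_right=-1, d_bottom=-1)
value_pixels = crop(screenshot, tuned)
```

Here the rough box still contains the surrounding border line, while the tuned box contains only the pixels of the parameter value.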

Step 102, performing matrixing processing and contrast sharpening processing on the recognition frame to highlight the characters in the recognition frame.

Background noise is handled first by correctly cropping the bounding box of the parameter content. The bounding box should contain as little background noise as possible, such as borders and interference lines, and double backgrounds should also be avoided. Denoising is performed by converting the pixels of the cropped parameter content into a matrix, finding the distribution pattern of the pixel values on that basis, and then sharpening the chrominance values.
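One way to read step 102 is sketched below. The patent does not name a specific algorithm, so this example substitutes Otsu's threshold as an assumed way of "finding the distribution pattern of the pixels" and pushes every pixel to black or white as an assumed form of contrast sharpening. All function names are hypothetical.

```python
def to_gray_matrix(rgb_rows):
    """Turn rows of (r, g, b) tuples into a matrix of 0-255 gray levels."""
    # Integer form of the common 0.299 / 0.587 / 0.114 luma weights.
    return [[(r * 299 + g * 587 + b * 114) // 1000 for r, g, b in row]
            for row in rgb_rows]


def otsu_threshold(gray):
    """Pick the gray level that best separates the two pixel populations."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_level, best_var = 0, -1.0
    w_low, sum_low = 0, 0
    for level in range(256):
        w_low += hist[level]
        if w_low == 0:
            continue
        w_high = total - w_low
        if w_high == 0:
            break
        sum_low += level * hist[level]
        mean_low = sum_low / w_low
        mean_high = (sum_all - sum_low) / w_high
        between_var = w_low * w_high * (mean_low - mean_high) ** 2
        if between_var > best_var:
            best_var, best_level = between_var, level
    return best_level


def sharpen_contrast(gray):
    """Push every pixel to 0 or 255 depending on which side of the threshold it is."""
    level = otsu_threshold(gray)
    return [[255 if v > level else 0 for v in row] for row in gray]


# Toy recognition frame: dark, slightly noisy background with bright glyph pixels.
rgb_box = [
    [(20, 20, 20), (30, 30, 30), (200, 200, 200)],
    [(220, 220, 220), (20, 20, 20), (30, 30, 30)],
]
gray_box = to_gray_matrix(rgb_box)
sharp_box = sharpen_contrast(gray_box)
```

After sharpening, the small variations in the background and in the glyph pixels disappear, leaving a two-level matrix that is easier to segment.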

Step 103, recognizing the character area in the recognition frame, segmenting the character area at the spaces between characters to obtain block characters, and comparing the shape of each character in each block character with the shapes in a corresponding database to recognize each character by matching.
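The segmentation and shape matching of step 103 can be sketched as follows. The sketch assumes a binary matrix (1 = ink), treats any column without ink as a space between characters, and scores a block against each stored shape by counting agreeing pixels. The tiny 3-row glyphs and the names `split_by_gaps` and `match_shape` are hypothetical, not the patent's database or method names.

```python
def split_by_gaps(binary):
    """Split a binary matrix (1 = ink) into character blocks at empty columns."""
    width = len(binary[0])
    ink_cols = [any(row[x] for row in binary) for x in range(width)]
    blocks, start = [], None
    for x, has_ink in enumerate(ink_cols + [False]):  # sentinel closes the last block
        if has_ink and start is None:
            start = x
        elif not has_ink and start is not None:
            blocks.append([row[start:x] for row in binary])
            start = None
    return blocks


def match_shape(block, templates):
    """Return the label of the stored shape that agrees with the block on most pixels."""
    def score(template):
        if len(template) != len(block) or len(template[0]) != len(block[0]):
            return -1  # different size: cannot be the same glyph in this toy setup
        return sum(a == b for t_row, b_row in zip(template, block)
                   for a, b in zip(t_row, b_row))
    return max(templates, key=lambda label: score(templates[label]))


# Hypothetical shape database with tightly cropped glyphs.
templates = {
    "0": [[1, 1, 1], [1, 0, 1], [1, 1, 1]],
    "1": [[1], [1], [1]],
    "7": [[1, 1, 1], [0, 0, 1], [0, 0, 1]],
}

# Sharpened recognition frame showing "7", "1", "0" separated by empty columns.
frame = [
    [1, 1, 1, 0, 1, 0, 1, 1, 1],
    [0, 0, 1, 0, 1, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 0, 1, 1, 1],
]
text = "".join(match_shape(block, templates) for block in split_by_gaps(frame))
```

A production system would normalize glyph sizes before comparison; the size check here only keeps the toy example honest.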

For partially covered characters in the block characters, the character that each covered glyph represents is recognized by comparing its shape with the shapes in the corresponding database.
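A minimal sketch of matching partially covered characters is given below. It assumes the shape database stores, for each character, both the clean glyph and previously collected covered variants, and that the nearest variant by pixel (Hamming) distance wins. The rejection limit `max_distance` and all names are hypothetical.

```python
def hamming(a, b):
    """Count differing pixels between two equally sized binary matrices."""
    return sum(x != y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def recognize_covered(block, shape_db, max_distance=2):
    """Return the character whose stored variant is nearest to the block, or None."""
    best_char, best_dist = None, max_distance + 1
    for char, variants in shape_db.items():
        for shape in variants:
            if len(shape) != len(block) or len(shape[0]) != len(block[0]):
                continue
            dist = hamming(block, shape)
            if dist < best_dist:
                best_char, best_dist = char, dist
    return best_char


clean_0 = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]
covered_0 = [[1, 1, 1], [1, 0, 1], [0, 0, 0]]  # bottom row hidden by an overlay
clean_7 = [[1, 1, 1], [0, 0, 1], [0, 0, 1]]

clean_only = {"0": [clean_0], "7": [clean_7]}
with_covered = {"0": [clean_0, covered_0], "7": [clean_7]}

seen = [[1, 1, 1], [1, 0, 1], [0, 0, 0]]  # a "0" whose bottom row is covered
```

With only clean shapes the covered "0" is closer to "7" and would be misread; once the covered variant is in the database it is read as "0". This is the motivation for collecting covered examples.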

Example pictures of covered characters are collected, the content of the recognition frame containing the covered characters in each example picture is extracted, and background denoising and contrast sharpening are applied. The processed covered text is re-labeled with the jTessBoxEditor tool, the re-labeled data are used for retraining to generate a new text library, and the new text library is used to recognize partially covered characters in the future.
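jTessBoxEditor edits Tesseract box files, so the re-labeling step can be pictured as producing one box-file line per glyph. The sketch below assumes the usual box-file layout of `symbol left bottom right top page`, with y measured from the bottom edge of the image. The helper name `to_box_lines` and the sample coordinates are hypothetical, and the retraining commands themselves are not shown.

```python
def to_box_lines(labels, image_height, page=0):
    """Format (char, left, top, right, bottom) labels as box-file lines.

    The input uses a top-left origin, as most image tools do. Box files count
    y from the bottom edge of the image, so the y values are flipped.
    """
    lines = []
    for char, left, top, right, bottom in labels:
        lines.append(f"{char} {left} {image_height - bottom} "
                     f"{right} {image_height - top} {page}")
    return lines


# Two re-labeled glyphs from a 30-pixel-high sample image.
labels = [("7", 2, 4, 10, 16), ("0", 12, 4, 20, 16)]
box_lines = to_box_lines(labels, image_height=30)
```

Writing `"\n".join(box_lines)` to a `.box` file next to the sample image gives a file that can be opened and corrected by hand in the box editor.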

Step 104, analyzing and outputting the recognition result and storing the analysis result in a database of the edge computing gateway.
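Step 104 can be sketched as parsing the recognized text and writing it to a local database. The patent does not say which database the edge computing gateway uses; SQLite, the `readings` table and its columns are assumptions made for illustration only.

```python
import sqlite3
import time


def parse_value(text):
    """Convert recognized text to a float, or None if it is not numeric."""
    try:
        return float(text.strip())
    except ValueError:
        return None


def store_reading(conn, parameter, text, timestamp=None):
    """Insert one recognized parameter value into the readings table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings ("
        "ts REAL NOT NULL, parameter TEXT NOT NULL, "
        "raw_text TEXT NOT NULL, value REAL)"
    )
    conn.execute(
        "INSERT INTO readings (ts, parameter, raw_text, value) VALUES (?, ?, ?, ?)",
        (time.time() if timestamp is None else timestamp,
         parameter, text, parse_value(text)),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # a gateway would use a file path instead
store_reading(conn, "x_coordinate", " 123.45 ", timestamp=1000.0)
store_reading(conn, "alarm", "E-STOP", timestamp=1000.0)
rows = conn.execute(
    "SELECT parameter, raw_text, value FROM readings ORDER BY rowid"
).fetchall()
```

Keeping the raw recognized text next to the parsed value lets non-numeric content such as alarm messages be stored in the same table and makes misreads easier to audit later.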

As shown in FIG. 3, the embodiment further provides a system for data acquisition by image processing, which includes a calibration module 1, a processing module 2, a recognition module 3, and a storage module 4.

The calibration module 1 is used for marking, in the acquired image, a recognition frame around the numerical value of each parameter to be read, and for fine-tuning the top, bottom, left or right border of the recognition frame so that it contains no background interference elements.

The processing module 2 is used for performing matrixing processing and contrast sharpening processing on the recognition frame to highlight the characters in the recognition frame.

The recognition module 3 is used for recognizing the character area in the recognition frame, segmenting the character area at the spaces between characters to obtain block characters, and comparing the shape of each character in each block character with the shapes in the corresponding database to recognize each character by matching.

For partially covered characters in the block characters, the recognition module 3 recognizes the character that each covered glyph represents by comparing its shape with the shapes in the corresponding database.

The system also comprises a sample acquisition module, which is used for collecting example pictures of covered characters, extracting the content of the recognition frame containing the covered characters in each example picture, applying background denoising and contrast sharpening, re-labeling the processed covered text with the jTessBoxEditor tool, retraining on the re-labeled data to generate a new text library, and using the new text library to recognize partially covered characters in the future.

The storage module 4 is used for analyzing and outputting the recognition result and storing the analysis result in a database of the edge computing gateway.
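How the four modules hand data to one another can be sketched as a small pipeline. The class name `AcquisitionSystem`, the callable signatures and the stub functions are hypothetical placeholders; real modules would perform the calibration, processing, recognition and storage described above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Image = List[List[int]]


@dataclass
class AcquisitionSystem:
    calibrate: Callable[[Image], Dict[str, Image]]  # module 1: name -> cropped frame
    process: Callable[[Image], Image]               # module 2: denoise and sharpen
    recognize: Callable[[Image], str]               # module 3: frame -> text
    store: Callable[[str, str], None]               # module 4: persist (name, text)

    def run(self, screenshot: Image) -> Dict[str, str]:
        """Pass one screenshot through all four modules in order."""
        results = {}
        for name, frame in self.calibrate(screenshot).items():
            text = self.recognize(self.process(frame))
            self.store(name, text)
            results[name] = text
        return results


# Stub modules, only to show the data flow; they do no real recognition.
saved = []
system = AcquisitionSystem(
    calibrate=lambda img: {"speed": [row[:2] for row in img]},
    process=lambda frame: [[1 if v > 128 else 0 for v in row] for row in frame],
    recognize=lambda frame: str(sum(map(sum, frame))),
    store=lambda name, text: saved.append((name, text)),
)
result = system.run([[200, 10, 255], [0, 180, 90]])
```

Passing the modules in as callables keeps each one replaceable, for example to swap the recognition module after a new text library has been trained.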

The invention targets old equipment in industrial fields. Such equipment often does not support standard communication protocols, so conventional data acquisition approaches struggle to collect its data. This scheme makes up for that shortcoming of conventional data acquisition schemes: each output parameter in the screenshot can be recognized quickly and accurately, and the overall cost is low.

While specific embodiments of the invention have been described above, it will be appreciated by those skilled in the art that these are by way of example only, and that the scope of the invention is defined by the appended claims. Various changes and modifications to these embodiments may be made by those skilled in the art without departing from the spirit and scope of the invention, and these changes and modifications are within the scope of the invention.

Claims

1. A method for data acquisition using image processing, comprising the steps of:

S1, marking, in the collected image, a recognition frame around the numerical value of each parameter to be read, wherein the collected image is a screen capture of the control system HMI;

S4, analyzing and outputting the recognition result and storing the analysis result in a database of the edge computing gateway;

in step S1, the top, bottom, left or right border of the recognition frame is fine-tuned so that the recognition frame contains no background interference elements;

in step S3, for partially covered characters in the block characters, the character that each covered glyph represents is recognized by comparing its shape with the shapes in the corresponding database;

2. A system for data acquisition by image processing, characterized by comprising a calibration module, a processing module, a recognition module and a storage module;

the calibration module is used for marking, in the acquired image, a recognition frame around the numerical value of each parameter to be read, wherein the acquired image is a screen capture of the control system HMI;

the storage module is used for analyzing and outputting the recognition result and storing the recognition result in a database of the edge computing gateway;

the calibration module is used for fine-tuning the top, bottom, left or right border of the recognition frame so that the recognition frame contains no background interference elements;

for partially covered characters in the block characters, the recognition module is used for recognizing the character that each covered glyph represents by comparing its shape with the shapes in the corresponding database;