CN114926830A - Screen image recognition method, device, equipment and computer readable medium - Google Patents

Screen image recognition method, device, equipment and computer readable medium

Info

Publication number
CN114926830A
Authority
CN
China
Prior art keywords
screen image
cut
cutting
image
information set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210599436.1A
Other languages
Chinese (zh)
Other versions
CN114926830B (en)
Inventor
车文彬
刘超
张超
郭丽娜
陈逸帆
张俊杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Shurui Data Technology Co ltd
Original Assignee
Nanjing Shurui Data Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Shurui Data Technology Co ltd filed Critical Nanjing Shurui Data Technology Co ltd
Priority to CN202210599436.1A priority Critical patent/CN114926830B/en
Publication of CN114926830A publication Critical patent/CN114926830A/en
Application granted granted Critical
Publication of CN114926830B publication Critical patent/CN114926830B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/62Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/635Overlay text, e.g. embedded captions in a TV program
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4038Image mosaicing, e.g. composing plane images from plane sub-images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/22Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2200/00Indexing scheme for image data processing or generation, in general
    • G06T2200/32Indexing scheme for image data processing or generation, in general involving image mosaicing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20021Dividing image into blocks, subimages or windows
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20112Image segmentation details
    • G06T2207/20132Image cropping
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Character Input (AREA)
  • Character Discrimination (AREA)

Abstract

Embodiments of the present disclosure disclose a screen image recognition method, apparatus, device and computer readable medium. One embodiment of the method comprises: acquiring a screen image to be cut; cutting the screen image to be cut according to each preset cutting size to obtain a cut screen image group; performing component detection processing on the cut image units in each cut screen image in the cut screen image group to obtain a target component information set; performing character recognition on the screen image to be cut to obtain a character information set; and combining the target component information set and the character information set to obtain a screen picture identification information set. This embodiment enables more comprehensive identification of the components contained in a screen image.

Description

Screen image recognition method, device, equipment and computer readable medium
Technical Field
Embodiments of the present disclosure relate to the field of computer technology, and in particular to a screen image recognition method, apparatus, device and computer readable medium.
Background
Screen image recognition is a technique for recognizing the components and text contained in a screen image. At present, the approach generally adopted when recognizing a screen image is to recognize the screen image as a whole.
However, recognizing a screen image in this manner often runs into the following technical problems:
first, the detectable component categories are limited, so not all of the components contained in the screen image can be identified;
second, high image clarity is required, and when the clarity of the screen image is low, the accuracy of character recognition decreases.
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
Some embodiments of the present disclosure propose screen image recognition methods, apparatuses, devices and computer readable media to solve one or more of the technical problems mentioned in the background section above.
In a first aspect, some embodiments of the present disclosure provide a screen image recognition method, including: acquiring a screen image to be cut; cutting the screen image to be cut according to each preset cutting size to obtain a cut screen image group, wherein each cut screen image in the cut screen image group consists of cut image units of the corresponding cutting size; performing component detection processing on the cut image units in each cut screen image in the cut screen image group to obtain a target component information set; performing character recognition on the screen image to be cut to obtain a character information set; and combining the target component information set and the character information set to obtain a screen picture identification information set.
Optionally, performing component detection processing on the cut image units in each cut screen image in the cut screen image group to obtain a target component information set includes:
performing the following component detection steps for each cut image unit in the cut screen image, using a preset component detection box information set:
determining whether the cut image unit satisfies a preset condition, where the preset condition is that the probability value of the center point of the cut image unit being a component center point is greater than a preset probability value;
in response to determining that the cut image unit satisfies the preset condition, performing component detection centered on the cut image unit using the component detection box information set, to obtain component information.
Optionally, the component detection box information in the component detection box information set includes a component detection box and a component detection box type; and
the performing, in response to determining that the cut image unit satisfies the preset condition, component detection centered on the cut image unit using the component detection box information set to obtain component information includes:
determining the component detection box type in the component detection box information matched with the detected component in the component detection box information set as the component type of the detected component;
determining the center point of the cut image unit as the component center point of the detected component, and determining the coordinates of the component center point together with the detected component size as the component position of the detected component;
and combining the component type and the component position to obtain the component information of the detected component.
Optionally, the performing component detection processing on the cut image units in each cut screen image in the cut screen image group to obtain a target component information set further includes:
performing de-duplication processing on the obtained pieces of component information, and determining the de-duplicated component information as the target component information to obtain the target component information set.
In a second aspect, some embodiments of the present disclosure provide a screen image recognition apparatus, including: an acquisition unit configured to acquire a screen image to be cut; a cutting unit configured to cut the screen image to be cut according to each preset cutting size to obtain a cut screen image group, wherein each cut screen image in the cut screen image group is composed of cut image units of the corresponding cutting size; a component detection unit configured to perform component detection processing on the cut image units in each cut screen image in the cut screen image group to obtain a target component information set; a character recognition unit configured to perform character recognition on the screen image to be cut to obtain a character information set; and a combining unit configured to combine the target component information set and the character information set to obtain a screen picture identification information set.
In a third aspect, some embodiments of the present disclosure provide an electronic device, comprising: one or more processors; a storage device having one or more programs stored thereon, which when executed by one or more processors, cause the one or more processors to implement the method described in any of the implementations of the first aspect.
In a fourth aspect, some embodiments of the present disclosure provide a computer readable medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method described in any of the implementations of the first aspect.
The above embodiments of the present disclosure have the following advantages: with the screen image recognition method of some embodiments of the present disclosure, the components contained in a screen image can be detected more comprehensively. Specifically, the reason component detection is incomplete is that the detectable component categories are limited. Based on this, the screen image recognition method of some embodiments of the present disclosure first acquires a screen image to be cut. Then, the screen image to be cut is cut according to each preset cutting size to obtain a cut screen image group, where each cut screen image in the group consists of cut image units of the corresponding cutting size. Cutting the screen image to be cut into units in this way facilitates the subsequent component detection and character recognition. Next, component detection processing is performed on the cut image units in each cut screen image in the group to obtain a target component information set. Performing component detection on cut image units of different cutting sizes improves the comprehensiveness of the detection and avoids missed components. Then, character recognition is performed on the screen image to be cut to obtain a character information set. Finally, the target component information set and the character information set are combined to obtain a screen picture identification information set. As a result, the components contained in the screen image can be identified more comprehensively.
Drawings
The above and other features, advantages and aspects of various embodiments of the present disclosure will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numbers refer to the same or similar elements. It should be understood that the drawings are schematic and that elements and components are not necessarily drawn to scale.
FIG. 1 is a flow diagram of some embodiments of a screen image recognition method according to the present disclosure;
FIG. 2 is a schematic block diagram of some embodiments of a screen image recognition device of the present disclosure;
FIG. 3 is a schematic block diagram of an electronic device suitable for use in implementing some embodiments of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.
It should be noted that, for convenience of description, only the portions related to the invention at hand are shown in the drawings. The embodiments and the features of the embodiments in the present disclosure may be combined with each other without conflict.
It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.
It is noted that the modifiers "a", "an" and "the" in this disclosure are illustrative rather than limiting, and those skilled in the art will understand them to mean "one or more" unless the context clearly dictates otherwise.
The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Referring now to fig. 1, a flow 100 of some embodiments of a screen image recognition method according to the present disclosure is shown. The screen image recognition method comprises the following steps:
Step 101, acquiring a screen image to be cut.
In some embodiments, the execution body of the screen image recognition method may acquire the screen image to be cut through a wired or wireless connection. The screen image to be cut may be a screen image captured from a display screen.
Step 102, cutting the screen image to be cut according to each preset cutting size to obtain a cut screen image group.
In some embodiments, the execution body may cut the screen image to be cut according to each preset cutting size to obtain a cut screen image group. Each cut screen image in the cut screen image group may be composed of cut image units of the corresponding cutting size. Each cutting size may indicate the number of cuts along the length and the width of the screen image to be cut.
As an example, the cutting sizes may include 13 × 13, 26 × 26 and 52 × 52. Here, 13 × 13 may mean that the screen image to be cut is cut into 13 × 13 square cut image units of equal size, 26 × 26 may mean that it is cut into 26 × 26 square cut image units of equal size, and 52 × 52 may mean that it is cut into 52 × 52 square cut image units of equal size.
In some optional implementations of some embodiments, the execution body cutting the screen image to be cut according to each preset cutting size to obtain a cut screen image group may include the following steps:
First, in response to determining that the length and the width of the screen image to be cut are not equal, equal-length processing is performed on the screen image to be cut to obtain a screen image to be cut whose length and width are equal.
The equal-length processing may consist of splicing a blank area onto the short side of the screen image to be cut, so that after the blank area is spliced on, the length and the width of the screen image are equal.
Second, the screen image to be cut whose length and width are equal is cut to obtain the cut screen image group.
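As a rough illustration of the padding and grid cutting just described, the following Python sketch pads the short side with a blank area and slices the square image into grid × grid cut image units. The helper names (pad_to_square, cut_into_units), the use of NumPy, and the handling of leftover edge pixels are illustrative assumptions, not details fixed by this disclosure.

```python
import numpy as np

def pad_to_square(image: np.ndarray) -> np.ndarray:
    """Splice a blank (zero) area onto the short side so height == width."""
    h, w = image.shape[:2]
    size = max(h, w)
    padded = np.zeros((size, size) + image.shape[2:], dtype=image.dtype)
    padded[:h, :w] = image  # original content kept in the top-left corner
    return padded

def cut_into_units(image: np.ndarray, grid: int) -> list:
    """Cut a square image into grid x grid equally sized cut image units."""
    step = image.shape[0] // grid  # any remainder pixels are ignored in this sketch
    units = []
    for row in range(grid):
        for col in range(grid):
            units.append(image[row * step:(row + 1) * step,
                               col * step:(col + 1) * step])
    return units

# One cut screen image per preset cutting size, e.g. 13x13, 26x26, 52x52:
# cut_screen_image_group = {g: cut_into_units(pad_to_square(screen), g)
#                           for g in (13, 26, 52)}
```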
Step 103, performing component detection processing on the cut image units in each cut screen image in the cut screen image group to obtain a target component information set.
In some embodiments, the execution body performing component detection processing on the cut image units in each cut screen image in the cut screen image group to obtain the target component information set may include the following steps:
First, using a preset component detection box information set, the following component detection steps are performed for each cut image unit in the cut screen image:
In a first sub-step, it is determined whether the cut image unit satisfies a preset condition, where the preset condition may be that the probability value of the center point of the cut image unit being a component center point is greater than a preset probability value. The component detection box information in the component detection box information set may include a component detection box. The probability value that the center point of the cut image unit is a component center point may be determined using a YOLO (You Only Look Once) model. In practice, the preset probability value may be adjusted according to the actual application requirements and is not limited herein. The components may be page elements, such as buttons and text boxes, displayed in the screen image to be cut.
In a second sub-step, in response to determining that the cut image unit satisfies the preset condition, component detection centered on the cut image unit is performed using the component detection box information set to obtain component information. The size of the matched component detection box may be used as the component information of the component satisfying the preset condition.
Second, de-duplication processing is performed on the obtained pieces of component information, and the de-duplicated component information is determined as target component information to obtain the target component information set. The de-duplication may be performed using an NMS (Non-Maximum Suppression) algorithm.
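As a minimal sketch of how duplicate component information gathered from overlapping cut image units could be removed with NMS, the following Python code is illustrative only; the box format (x1, y1, x2, y2), the "prob" field and the 0.5 IoU threshold are assumptions rather than values specified by this disclosure.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def nms(detections, iou_threshold=0.5):
    """Keep the highest-probability detection, drop overlapping duplicates."""
    detections = sorted(detections, key=lambda d: d["prob"], reverse=True)
    kept = []
    for det in detections:
        if all(iou(det["box"], k["box"]) < iou_threshold for k in kept):
            kept.append(det)
    return kept

# component_candidates: detections from every cut image unit whose center-point
# probability exceeded the preset probability value.
# target_component_info_set = nms(component_candidates)
```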
In some optional implementations of some embodiments, the component detection box information in the component detection box information set may include a component detection box and a component detection box type. In response to determining that the cut image unit satisfies the preset condition, the execution body performing component detection centered on the cut image unit using the component detection box information set to obtain component information may include the following steps:
First, the component detection box type in the component detection box information that matches the detected component in the component detection box information set is determined as the component type of the detected component.
Second, the center point of the cut image unit is determined as the component center point of the detected component, and the coordinates of the component center point together with the detected component size are determined as the component position of the detected component.
Third, the component type and the component position are combined to obtain the component information of the detected component.
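Concretely, the component information assembled in the three steps above can be thought of as a small record combining a component type with a position derived from the unit center point and the detected size. The sketch below is only an illustration; the field names are hypothetical and not part of this disclosure.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class ComponentInfo:
    component_type: str              # type of the matched component detection box
    center: Tuple[float, float]      # center point of the cut image unit
    size: Tuple[float, float]        # detected component width and height

    @property
    def position(self) -> Tuple[float, float, float, float]:
        """Component position: center coordinates combined with the detected size."""
        return (*self.center, *self.size)

# Example: a detected "button" centered at (120, 48) measuring 80 x 24 pixels.
# info = ComponentInfo("button", (120.0, 48.0), (80.0, 24.0))
```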
Step 104, performing character recognition on the screen image to be cut to obtain a character information set.
In some embodiments, the execution body performing character recognition on the screen image to be cut to obtain a character information set may include the following steps:
First, the coordinates of each character area in the screen image to be cut are determined to obtain a character area coordinate set. The coordinates of each character area may be determined using a sliding window algorithm or a region proposal algorithm.
Second, the screen image to be cut is cut according to the character area coordinate set to obtain a character area image set.
That is, the area corresponding to each group of character area coordinates in the character area coordinate set is cut out of the screen image to be cut to obtain the character area image set.
Third, character recognition is performed on each character area image in the character area image set to obtain the character information set.
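A minimal sketch of the cropping step described above is given below, assuming each character area coordinate is an (x, y, width, height) rectangle; the detector itself (sliding window or region proposal) is left abstract, and the helper name crop_text_regions is hypothetical.

```python
from typing import List, Tuple
import numpy as np

Region = Tuple[int, int, int, int]  # (x, y, width, height)

def crop_text_regions(screen: np.ndarray, regions: List[Region]) -> List[np.ndarray]:
    """Cut each detected character area out of the screen image to be cut."""
    images = []
    for x, y, w, h in regions:
        images.append(screen[y:y + h, x:x + w].copy())
    return images

# text_region_images = crop_text_regions(screen_to_cut, text_region_coords)
# Each cropped image is then passed to the character recognition model separately,
# limiting recognition to small regions instead of the whole screen.
```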
In some optional implementations of some embodiments, the execution body performing character recognition on each character area image in the character area image set to obtain the character information set may include the following steps:
First, the character area image is input into a pre-trained character recognition model to obtain character content.
Second, the character content is combined with the center point coordinates of the character area coordinates corresponding to the character area image in the character area coordinate set to obtain character information.
Optionally, the character recognition model may include a convolutional layer, a recurrent layer and a transcription layer. The execution body inputting the character area image into the pre-trained character recognition model to obtain the character content may include the following steps:
First, the character area image is input into the convolutional layer to obtain a feature sequence.
Second, the feature sequence is input into the recurrent layer to obtain the label distribution of the feature sequence.
Third, the label distribution is input into the transcription layer to obtain the character content.
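The convolutional layer / recurrent layer / transcription layer structure described in these three steps matches a common CRNN-style text recognizer. The PyTorch-flavoured sketch below is an assumption-laden illustration: the layer sizes, the bidirectional LSTM and the CTC-style decoding are choices made here for concreteness, not values specified by this disclosure.

```python
import torch
import torch.nn as nn

class TextRecognitionModel(nn.Module):
    """Convolutional layer -> recurrent layer -> transcription (CTC) layer."""
    def __init__(self, num_classes: int, hidden: int = 256):
        super().__init__()
        # Convolutional layer: turns the character area image into a feature map.
        self.conv = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2, 2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d((2, 1)),
        )
        # Recurrent layer: produces a label distribution for each sequence step.
        self.rnn = nn.LSTM(128 * 8, hidden, bidirectional=True, batch_first=True)
        # Transcription layer: per-step class scores, decoded (e.g. with CTC) into text.
        self.fc = nn.Linear(hidden * 2, num_classes)

    def forward(self, x):                     # x: (batch, 1, 32, width)
        feat = self.conv(x)                   # (batch, 128, 8, width // 2)
        b, c, h, w = feat.shape
        seq = feat.permute(0, 3, 1, 2).reshape(b, w, c * h)  # feature sequence
        label_dist, _ = self.rnn(seq)         # label distribution of the sequence
        return self.fc(label_dist)            # logits for transcription
```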
The character recognition steps above are an inventive point of the embodiments of the present disclosure and address the technical problem mentioned in the background that "the accuracy of character recognition is reduced". The factor that causes the decrease in character recognition accuracy is as follows: high image clarity is required, and when the clarity of the screen image is low, the accuracy of character recognition decreases. Addressing this factor improves the character recognition accuracy. To this end, the coordinates of each character area in the screen image to be cut are first determined, and the screen image to be cut is then cut to obtain the character area image set. In this way, the range of character recognition is limited, recognition over a large area is avoided, the influence of image clarity on character recognition is reduced to a certain extent, and the accuracy of character recognition is improved.
Step 105, combining the target component information set and the character information set to obtain a screen picture identification information set.
In some embodiments, the execution body may combine the target component information set and the character information set to obtain a screen picture identification information set. Each piece of target component information in the target component information set and each piece of character information in the character information set may be taken as screen picture identification information to obtain the screen picture identification information set.
In some optional implementations of some embodiments, the screen picture identification information in the screen picture identification information set may include a component type and a component position, or character content and a character position. The execution body may further send the screen picture identification information set and the screen image to be cut to a target terminal for display.
The above embodiments of the present disclosure have the following advantages: with the screen image recognition method of some embodiments of the present disclosure, the components contained in a screen image can be detected more comprehensively. Specifically, the reason component detection is incomplete is that the detectable component categories are limited. Based on this, the screen image recognition method of some embodiments of the present disclosure first acquires a screen image to be cut. Then, the screen image to be cut is cut according to each preset cutting size to obtain a cut screen image group, where each cut screen image in the group consists of cut image units of the corresponding cutting size. Cutting the screen image to be cut into units in this way facilitates the subsequent component detection and character recognition. Next, component detection processing is performed on the cut image units in each cut screen image in the group to obtain a target component information set. Performing component detection on cut image units of different cutting sizes improves the comprehensiveness of the detection and avoids missed components. Then, character recognition is performed on the screen image to be cut to obtain a character information set. Finally, the target component information set and the character information set are combined to obtain a screen picture identification information set. As a result, the components contained in the screen image can be identified more comprehensively.
With further reference to fig. 2, as an implementation of the method illustrated in the above figure, the present disclosure provides some embodiments of a screen image recognition apparatus, which correspond to the method embodiments illustrated in fig. 1 and which may be applied in particular to various electronic devices.
As shown in fig. 2, the screen image recognition apparatus 200 of some embodiments includes: an acquisition unit 201, a cutting unit 202, a component detection unit 203, a character recognition unit 204, and a combining unit 205. The acquisition unit 201 is configured to acquire a screen image to be cut; the cutting unit 202 is configured to cut the screen image to be cut according to each preset cutting size to obtain a cut screen image group, where each cut screen image in the cut screen image group is composed of cut image units of the corresponding cutting size; the component detection unit 203 is configured to perform component detection processing on the cut image units in each cut screen image in the cut screen image group to obtain a target component information set; the character recognition unit 204 is configured to perform character recognition on the screen image to be cut to obtain a character information set; and the combining unit 205 is configured to combine the target component information set and the character information set to obtain a screen picture identification information set.
It will be understood that the units described in the apparatus 200 correspond to the various steps in the method described with reference to fig. 1. Thus, the operations, features and resulting advantages described above with respect to the method are also applicable to the apparatus 200 and the units included therein, and are not described herein again.
Referring now to FIG. 3, a block diagram of an electronic device 300 suitable for use in implementing some embodiments of the present disclosure is shown. The electronic device shown in fig. 3 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 3, electronic device 300 may include a processing device (e.g., central processing unit, graphics processor, etc.) 301 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 302 or a program loaded from a storage device 308 into a Random Access Memory (RAM) 303. In the RAM 303, various programs and data necessary for the operation of the electronic device 300 are also stored. The processing device 301, the ROM 302, and the RAM 303 are connected to each other via a bus 304. An input/output (I/O) interface 305 is also connected to bus 304.
Generally, the following devices may be connected to the I/O interface 305: input devices 306 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 307 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; and a communication device 309. The communication means 309 may allow the electronic device 300 to communicate with other devices, wireless or wired, to exchange data. While fig. 3 illustrates an electronic device 300 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided. Each block shown in fig. 3 may represent one device or may represent multiple devices, as desired.
In particular, according to some embodiments of the present disclosure, the processes described above with reference to the flow diagrams may be implemented as computer software programs. For example, some embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In some such embodiments, the computer program may be downloaded and installed from a network through the communication device 309, or installed from the storage device 308, or installed from the ROM 302. The computer program, when executed by the processing apparatus 301, performs the above-described functions defined in the methods of some embodiments of the present disclosure.
It should be noted that the computer readable medium described in some embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In some embodiments of the disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In some embodiments of the present disclosure, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
In some embodiments, the clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquiring a screen image to be cut; cutting the screen image to be cut according to each preset cutting size to obtain a cut screen image group, wherein the cut screen image in the cut screen image group consists of cut image units with corresponding cutting sizes; performing component detection processing by using a cutting image unit in each cutting screen image in the cutting screen image group to obtain a target component information set; carrying out character recognition on the screen image to be cut to obtain a character information set; and combining the target component information set and the character information set to obtain a screen picture identification information set.
Computer program code for carrying out operations for embodiments of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk or C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in some embodiments of the present disclosure may be implemented by software, and may also be implemented by hardware. The described units may also be provided in a processor, and may be described as: a processor includes an acquisition unit, a cutting unit, a component detection unit, a character recognition unit, and a combining unit. The names of these units do not in some cases constitute a limitation on the unit itself, and for example, the acquiring unit may also be described as a "unit that acquires a screen image to be cut".
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.

Claims (9)

1. A screen image recognition method, comprising:
acquiring a screen image to be cut;
cutting the screen image to be cut according to each preset cutting size to obtain a cut screen image group, wherein the cut screen image in the cut screen image group consists of cut image units with corresponding cutting sizes;
performing component detection processing by using a cutting image unit in each cutting screen image in the cutting screen image group to obtain a target component information set;
performing character recognition on the screen image to be cut to obtain a character information set;
and combining the target component information set and the character information set to obtain a screen picture identification information set.
2. The method of claim 1, wherein the screen picture identification information in the screen picture identification information set includes a component type and a component position, or character content and a character position; and
the method further comprises the following steps:
and sending the screen picture identification information set and the screen image to be cut to a target terminal for displaying.
3. The method according to claim 1, wherein the performing character recognition on the screen image to be cut to obtain a character information set comprises:
determining the coordinates of each character area in the screen image to be cut to obtain a character area coordinate set;
according to the character area coordinate set, performing cutting processing on the screen image to be cut to obtain a character area image set;
and performing character recognition on each character area image in the character area image set to obtain a character information set.
4. The method of claim 3, wherein the performing character recognition on each character area image in the character area image set to obtain a character information set comprises:
inputting the character area image into a character recognition model trained in advance to obtain character content;
and combining the character content with the center point coordinates of the character area coordinates corresponding to the character area image in the character area coordinate set to obtain character information.
5. The method of claim 4, wherein the character recognition model comprises: a convolutional layer, a recurrent layer and a transcription layer; and
the inputting the character area image into the character recognition model trained in advance to obtain character content comprises:
inputting the character area image into the convolutional layer to obtain a feature sequence;
inputting the feature sequence into the recurrent layer to obtain a label distribution of the feature sequence;
and inputting the label distribution into the transcription layer to obtain the character content.
6. The method according to claim 1, wherein the cutting the screen image to be cut according to the preset cutting sizes to obtain a cut screen image group comprises:
in response to the fact that the length and the width of the screen image to be cut are not equal, carrying out equal-length processing on the screen image to be cut to obtain the screen image to be cut with the same length and width;
and cutting the screen image to be cut with the same length and width to obtain a cut screen image group.
7. A screen image recognition apparatus comprising:
an acquisition unit configured to acquire a screen image to be cut;
the cutting unit is configured to cut the screen image to be cut according to preset cutting sizes to obtain a cutting screen image group, wherein the cutting screen image in the cutting screen image group is composed of cutting image units with corresponding cutting sizes;
a component detection unit configured to perform component detection processing by using a cut image unit in each cut screen image in the cut screen image group, so as to obtain a target component information set;
the character recognition unit is configured to perform character recognition on the screen image to be cut to obtain a character information set;
and the combining unit is configured to combine the target component information set and the character information set to obtain a screen picture identification information set.
8. An electronic device, comprising:
one or more processors;
a storage device having one or more programs stored thereon,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-6.
9. A computer-readable medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the method of any one of claims 1-6.
CN202210599436.1A 2022-05-30 2022-05-30 Screen image recognition method, apparatus, device and computer readable medium Active CN114926830B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210599436.1A CN114926830B (en) 2022-05-30 2022-05-30 Screen image recognition method, apparatus, device and computer readable medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210599436.1A CN114926830B (en) 2022-05-30 2022-05-30 Screen image recognition method, apparatus, device and computer readable medium

Publications (2)

Publication Number Publication Date
CN114926830A true CN114926830A (en) 2022-08-19
CN114926830B CN114926830B (en) 2023-09-12

Family

ID=82811845

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210599436.1A Active CN114926830B (en) 2022-05-30 2022-05-30 Screen image recognition method, apparatus, device and computer readable medium

Country Status (1)

Country Link
CN (1) CN114926830B (en)

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110099499A1 (en) * 2009-10-26 2011-04-28 Ayelet Pnueli Graphical user interface component identification
US20190250891A1 (en) * 2018-02-12 2019-08-15 Oracle International Corporation Automated code generation
US10635413B1 (en) * 2018-12-05 2020-04-28 Bank Of America Corporation System for transforming using interface image segments and constructing user interface objects
KR102179552B1 (en) * 2019-05-15 2020-11-17 주식회사 한컴위드 Apparatus and method for collecting evidence based on ocr
US20210019574A1 (en) * 2019-07-19 2021-01-21 UiPath, Inc. Retraining a computer vision model for robotic process automation
US11221833B1 (en) * 2020-03-18 2022-01-11 Amazon Technologies, Inc. Automated object detection for user interface generation
CN111652266A (en) * 2020-04-17 2020-09-11 北京三快在线科技有限公司 User interface component identification method and device, electronic equipment and storage medium
CN111652208A (en) * 2020-04-17 2020-09-11 北京三快在线科技有限公司 User interface component identification method and device, electronic equipment and storage medium
CN114205365A (en) * 2020-08-31 2022-03-18 华为技术有限公司 Application interface migration system and method and related equipment
CN112631586A (en) * 2020-12-24 2021-04-09 软通动力信息技术(集团)股份有限公司 Application development method and device, electronic equipment and storage medium
CN113377356A (en) * 2021-06-11 2021-09-10 四川大学 Method, device, equipment and medium for generating user interface prototype code
CN114332118A (en) * 2021-12-23 2022-04-12 北京达佳互联信息技术有限公司 Image processing method, device, equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HU, Chao: "Advantages and Performance of Componentization in Application Interface Design", Information & Communications, no. 03, pages 290-292 *
Xianyu Technology: "Heavyweight series! UI2CODE intelligent code generation: component recognition", Retrieved from the Internet <URL:https://blog.csdn.net/weixin_38912070/article/details/93856944> *

Also Published As

Publication number Publication date
CN114926830B (en) 2023-09-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant