WO2019233212A1 - Text recognition method, apparatus, mobile terminal, and storage medium

Text recognition method, apparatus, mobile terminal, and storage medium

Info

Publication number
WO2019233212A1
Authority
WO
WIPO (PCT)
Prior art keywords
control
user interface
touch operation
control image
image
Prior art date
Application number
PCT/CN2019/084377
Other languages
English (en)
French (fr)
Inventor
揭骏仁
林建华
Original Assignee
Oppo广东移动通信有限公司
Priority date
Filing date
Publication date
Application filed by OPPO Guangdong Mobile Telecommunications Co., Ltd.
Publication of WO2019233212A1

Classifications

    • G06V 30/40: Document-oriented image-based pattern recognition
    • G06F 18/24: Classification techniques
    • G06F 3/0483: Interaction with page-structured environments, e.g. book metaphor
    • G06F 3/0488: Interaction techniques based on graphical user interfaces [GUI] using a touch-screen or digitiser, e.g. input of commands through traced gestures
    • G06F 3/04842: Selection of displayed objects or displayed text elements
    • G06V 10/22: Image preprocessing by selection of a specific region containing or referencing a pattern; locating or processing of specific regions to guide the detection or recognition
    • G06V 30/10: Character recognition

Definitions

  • the present application relates to the technical field of mobile terminals, and more particularly, to a text recognition method, device, mobile terminal, and storage medium.
  • this application proposes a text recognition method, device, mobile terminal, and storage medium to improve the above problems.
  • an embodiment of the present application provides a text recognition method.
  • the method includes: detecting a touch operation acting on a user interface and, when the touch operation meets a preset condition, identifying the interface element on the user interface corresponding to the position of the touch operation; when the identification is not successful, capturing a control image corresponding to the position of the touch operation and recognizing the control image; and displaying at least one card superimposed on a partial area of the user interface, where the at least one card is used to display information recognized from the control image.
  • an embodiment of the present application provides a text recognition device.
  • the device includes: an interface element recognition module, configured to detect a touch operation acting on a user interface and, when the touch operation meets a preset condition, identify the interface element on the user interface corresponding to the position of the touch operation; an image capture module, configured to capture a control image corresponding to the position of the touch operation when the recognition is not successful and to recognize the control image; and a card display module, configured to display at least one card superimposed on a partial area of the user interface, where the at least one card is used to display information recognized from the control image.
  • an embodiment of the present application provides a mobile terminal including a touch screen, a memory, and a processor.
  • the touch screen and the memory are coupled to the processor, and the memory stores instructions.
  • when the instructions are executed by the processor, the processor performs the above method.
  • an embodiment of the present application provides a computer-readable storage medium, where the computer-readable storage medium stores program code, and the program code can be called by a processor to execute the foregoing method.
  • FIG. 1 is a schematic flowchart of a text recognition method according to an embodiment of the present application.
  • FIG. 2 is a schematic diagram of a user interface of a mobile terminal according to an embodiment of the present application
  • FIG. 3 is a schematic flowchart of a text recognition method according to another embodiment of the present application.
  • FIG. 4 is a schematic flowchart of step S240 of the text recognition method provided by the embodiment shown in FIG. 3 of the present application.
  • FIG. 5 shows another schematic diagram of a user interface of a mobile terminal according to an embodiment of the present application
  • FIG. 6 is a schematic flowchart of step S270 of the text recognition method provided by the embodiment shown in FIG. 3 of the present application;
  • FIG. 7 is a schematic flowchart of a text recognition method according to another embodiment of the present application.
  • FIG. 8 shows a module block diagram of a text recognition device according to an embodiment of the present application.
  • FIG. 9 shows a module block diagram of a text recognition device according to another embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of a mobile terminal according to an embodiment of the present application.
  • FIG. 11 shows a block diagram of a mobile terminal for performing a text recognition method according to an embodiment of the present application
  • FIG. 12 illustrates a storage unit for storing or carrying a program code for implementing a text recognition method according to an embodiment of the present application.
  • in view of this, the inventors have proposed the text recognition method, device, mobile terminal, and storage medium provided in the embodiments of the present application.
  • image recognition technology is used to improve the speed and accuracy of word recognition, thereby improving the user experience.
  • the specific text recognition method is described in detail in subsequent embodiments.
  • FIG. 1 is a schematic flowchart of a text recognition method according to an embodiment of the present application.
  • the text recognition method uses image recognition technology to improve the speed and accuracy of word recognition, thereby improving the user experience.
  • the text recognition method is applied to a text recognition device 200 as shown in FIG. 8 and a mobile terminal (FIG. 10) configured with the text recognition device 200.
  • a mobile terminal is taken as an example to describe the specific process of this embodiment.
  • the mobile terminal applied in this embodiment may be a smart phone, a tablet computer, a wearable electronic device, or the like, and is not specifically limited here.
  • the process shown in FIG. 1 will be described in detail below.
  • the text recognition method may specifically include the following steps:
  • Step S110 Detect a touch operation acting on a user interface, and when the touch operation meets a preset condition, identify an interface element on the user interface corresponding to a position of the touch operation.
  • a touch operation acting on a user interface is detected.
  • the touch operation may include a single-finger tap, a multi-finger tap, a single-finger long press, a multi-finger long press, a heavy press, multiple clicks, a slide operation, a copy operation, a pressing area, and the like, where a single-finger tap refers to a tap operation performed on the user interface with one finger, and a multi-finger tap refers to tap operations performed on the user interface with multiple fingers at the same time.
  • a single-finger long press means pressing the user interface with one finger for longer than a preset duration;
  • a multi-finger long press means pressing the user interface with multiple fingers for longer than a preset duration;
  • a heavy press means that the pressure applied to the user interface exceeds a preset intensity;
  • multiple clicks means that the number of clicks within a preset duration exceeds a preset number;
  • a slide operation refers to sliding a single finger across the user interface;
  • a copy operation refers to an operation of copying text information in the user interface to the pasteboard;
  • a pressing area means that the area pressed by a single finger on the user interface exceeds a preset area.
  • the mobile terminal sets and stores a preset condition in advance, where the preset condition serves as the basis for judging the touch operation; that is, after the touch operation is detected and obtained, it is compared with the preset condition to determine whether it satisfies the preset condition.
  • when the touch operation meets the preset condition, the position of the touch operation is obtained, for example, the coordinate information corresponding to the position of the touch operation, and the interface element on the user interface corresponding to that position is then identified.
  • the interface element includes, but is not limited to, text, picture, audio, and video.
  • the mobile terminal may determine and recognize at least one interface element based on the position of the touch operation.
  • as one way, only the interface element at the position of the touch operation may be identified, or all interface elements located in the same paragraph as that element may be identified, and so on; this is not specifically limited here.
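  • as a minimal sketch of the preset-condition check in step S110 (the event fields and all threshold values are illustrative assumptions, not values from this application):

```python
from dataclasses import dataclass

# Hypothetical thresholds standing in for the "preset conditions" of step S110.
LONG_PRESS_MS = 500      # minimum press duration for a long press
HEAVY_PRESS_FORCE = 0.8  # minimum normalized pressure for a heavy press
PRESS_AREA_PX = 900      # minimum contact area, in pixels

@dataclass
class TouchEvent:
    fingers: int
    duration_ms: int
    force: float   # normalized 0..1 pressure reading
    area_px: int   # contact area reported by the touch screen

def meets_preset_condition(ev: TouchEvent) -> bool:
    """Return True when the touch should trigger interface-element recognition."""
    if ev.duration_ms >= LONG_PRESS_MS:   # single- or multi-finger long press
        return True
    if ev.force >= HEAVY_PRESS_FORCE:     # heavy press
        return True
    if ev.area_px >= PRESS_AREA_PX:       # large pressing area
        return True
    return False
```

  • a slide, copy, or multiple-click condition would be checked the same way from additional event fields; the point is only that the detected operation is compared against stored thresholds before recognition starts.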
  • Step S120 when the recognition is not successful, capture a control image corresponding to the position of the touch operation and identify the control image.
  • the image corresponding to the position of the touch operation is automatically captured and acquired.
  • the system automatically obtains the control in which the text corresponding to the position of the touch operation is located, captures that control, and then calls the OCR (Optical Character Recognition) image-to-text module in the background for recognition.
  • the image-to-text recognition may be performed offline, that is, an image-to-text recognition library is ported to the mobile terminal and the recognition operation is performed on the picture information according to that local library; it may also be performed online, that is, the picture information is uploaded to a remote image-to-text server, which performs the recognition operation based on its internal recognition library and sends the recognition result back to the mobile terminal.
  • the recognition result may also carry the x coordinate, y coordinate, width, and height of each piece of text; details are not repeated here.
  • the user interface of the mobile terminal may display prompt information, where the prompt information is used to prompt the user that the image-to-text recognition operation is currently being performed.
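  • the offline/online dispatch described above can be sketched as follows (both recognizers are stand-in callables, and the result layout with per-text x/y/width/height mirrors the description but is an assumption):

```python
from typing import Callable, Dict, List

# A recognized text item carries position info, matching the x/y/width/height
# the description mentions, e.g. {"text": "hi", "x": 0, "y": 0, "w": 20, "h": 12}.
TextItem = Dict[str, object]

def recognize_control_image(
    image: bytes,
    offline_ocr: Callable[[bytes], List[TextItem]],
    online_ocr: Callable[[bytes], List[TextItem]],
    prefer_offline: bool = True,
) -> List[TextItem]:
    """Try one recognizer first, falling back to the other when it fails
    or returns nothing (e.g. the local recognition library is unavailable)."""
    primary, fallback = (
        (offline_ocr, online_ocr) if prefer_offline else (online_ocr, offline_ocr)
    )
    try:
        result = primary(image)
        if result:
            return result
    except Exception:
        pass  # fall through to the other recognizer
    return fallback(image)
```

  • `offline_ocr` would wrap the recognition library ported to the terminal, and `online_ocr` the upload to the image-to-text server; both names are illustrative.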
  • Step S130 Overlay and display at least one card on a part of the user interface, where the at least one card is used to display information identified by the control image.
  • FIG. 2 illustrates a schematic diagram of a user interface of a mobile terminal according to an embodiment of the present application.
  • the partial area of the user interface may be located in the lower half, the upper half, the left half, or the right half of the user interface.
  • in this embodiment, the partial area is located in the lower half of the user interface near the bottom, and its size is not specifically limited.
  • image-to-text recognition is performed on the captured control image, at least one keyword in the control image is acquired, and the at least one keyword is searched to obtain search result information corresponding to the content of the control image.
  • the search result information is displayed in the form of a card, where the card serves as a carrier of the search result information. Each card can display at least one piece of search result information; the amount of information displayed by each card can be the same or different, and the search result information displayed by each card can come from the same application or from different applications.
  • the at least one card may further display word segmentation information, that is, the at least one keyword obtained after recognition based on the control image. The user may perform word-selection editing based on the word segmentation information, for example, searching, translating, or sharing keywords in the word segmentation information.
  • the at least one card is displayed in a superimposed form on a partial area of the user interface. The card may be displayed in a layer above the partial area of the user interface, or it may cover part of the user interface and be displayed at a different level from it. In addition, in this embodiment, when the at least one card is superimposed on a partial area of the user interface, the original content in that area remains partially visible and is not completely blocked, so that it can still receive the user's click operations.
  • the text recognition method provided by this embodiment detects a touch operation acting on a user interface; when the touch operation meets a preset condition, it identifies the interface element on the user interface corresponding to the position of the touch operation. When the identification is not successful, it captures the control image corresponding to the position of the touch operation and recognizes it, and it displays at least one card superimposed on a partial area of the user interface, where the at least one card is used to display the information recognized from the control image. By using image recognition technology, the method improves the speed and accuracy of word recognition and thereby the user experience.
  • FIG. 3 is a schematic flowchart of a text recognition method according to another embodiment of the present application. The process shown in FIG. 3 will be described in detail below. The method may specifically include the following steps:
  • Step S210 Detect a touch operation acting on a user interface, and when the touch operation meets a preset condition, identify an interface element on the user interface corresponding to a position of the touch operation.
  • Step S220 When the identification is not successful, obtain an application program corresponding to the user interface.
  • an application program includes multiple user interfaces; after the user interface is obtained, the corresponding application program can be determined from it. As one way, the type, name, or purpose of the application can be obtained through the user interface.
  • Step S230 Determine whether the application is a key application, if not, execute step S270, and if yes, execute step S240.
  • the mobile terminal presets and stores a key application, and the key application is used as a basis for judging the application.
  • the key application may be a system native application or a third-party application downloaded and installed by the user.
  • the key application program can be configured by the mobile terminal system in advance or manually by the user. When the key application is configured by the system, it can be configured according to the frequency of use of the application; for example, an application whose frequency of use is higher than a certain frequency threshold is regarded as a key application, and an application whose frequency of use is below that threshold is regarded as a non-key application. Alternatively, the key application can be configured according to the type of application; for example, text display or instant messaging applications such as WeChat, QQ, Weibo, News, and Browser are regarded as key applications, and video display applications are regarded as non-key applications.
  • the key application is manually configured by the user, one or more applications can be selected as the key application according to the user's preferences or needs.
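  • the frequency-based and manual configuration of key applications can be sketched as follows (the threshold value and application names are illustrative assumptions, not values from this application):

```python
def classify_key_apps(usage_counts, threshold, user_selected=()):
    """Mark applications whose frequency of use meets `threshold`, plus any
    applications the user selected manually, as key applications."""
    key = {app for app, count in usage_counts.items() if count >= threshold}
    key.update(user_selected)  # user choices override the frequency rule
    return key
```

  • a later step (S230) only needs membership in this set to decide between automatic control-image capture and displaying the selection control.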
  • Step S240 When the application is the key application, capture a control image corresponding to the position of the touch operation and identify the control image.
  • image capture is performed on the text corresponding to the position of the touch operation to obtain a control image, and the control image is identified.
  • FIG. 4 is a schematic flowchart of step S240 of the text recognition method provided by the embodiment shown in FIG. 3 of the present application.
  • the process shown in FIG. 4 will be described in detail below.
  • the method may specifically include the following steps:
  • Step S241 When the application is the key application, obtain a control type corresponding to the position of the touch operation.
  • the control type of the control corresponding to the position of the current touch operation is detected and acquired.
  • the control type can at least include a text type, a picture type, a video type, and the like.
  • Step S242 Determine whether the control type meets a preset type.
  • the mobile terminal sets and stores a preset type in advance, and the preset type serves as the basis for judging the control type. For example, the preset type may be a text view; after the control type is obtained, it is compared with the text view type to determine whether the control type satisfies the text view type.
  • Step S243 When the type of the control meets a preset type, capture a control image corresponding to the position of the touch operation and identify the control image.
  • a control image corresponding to the position of the touch operation is intercepted and an automatic OCR is performed to identify the control image.
  • Step S250 Determine whether the control image can identify valid information, wherein the confidence probability of the valid information is higher than a preset value; if yes, go to step S260; if no, go to step S270.
  • based on the control image, it is judged whether valid information can be identified from it, where the confidence probability of the valid information is higher than a preset value.
  • the information obtained after the control image is recognized is examined. As one method, it is first detected whether the information contains text information; when it does not, the information is empty and the recognition fails. When the information does contain text information, the confidence probability of the text information is further obtained and judged. As one method, the mobile terminal stores a confidence probability algorithm and a preset value in advance; the confidence probability of the information is calculated by this algorithm and compared with the preset value to determine whether it is higher than the preset value. When the confidence probability is higher than the preset value, it indicates that valid information can be identified from the control image.
  • when garbled text appears after the control image is parsed and recognized, a preliminary analysis of the parsed result is required and the garbled characters are filtered out. If no valid information remains after filtering, the selection control is displayed on the user interface; if valid information remains, at least one card is displayed superimposed on a partial area of the user interface.
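  • the validity check of step S250 can be sketched as follows (a minimal illustration; the garble filter, the confidence source, and the 0.6 preset value are assumptions, not values from this application):

```python
import re

PRESET_CONFIDENCE = 0.6  # illustrative preset value

def filter_garbled(text: str) -> str:
    """Keep word characters, whitespace, basic punctuation, and CJK ideographs;
    drop everything else as garbled (a deliberately crude filter)."""
    return "".join(re.findall(r"[\w\s.,!?\u4e00-\u9fff]", text))

def has_valid_info(recognized_text: str, confidence: float) -> bool:
    """Valid information = non-empty text after garble filtering, with a
    confidence probability above the preset value."""
    cleaned = filter_garbled(recognized_text).strip()
    if not cleaned:
        return False  # empty result: recognition failed
    return confidence > PRESET_CONFIDENCE
```

  • the caller would display the card when this returns True and the selection control when it returns False.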
  • Step S260 Overlay and display at least one card on a part of the user interface.
  • when valid information can be identified from the control image, the result is displayed, that is, at least one card is displayed superimposed on a partial area of the user interface.
  • a selection control is displayed below the card on the same interface as the card, so as to provide the user with an entry for manual frame selection when the user is not satisfied with the valid information.
  • Step S270 Display a selection control on the user interface, where the selection control is used to trigger manual frame selection or cancellation of recognition.
  • FIG. 5 illustrates another schematic diagram of a user interface of a mobile terminal according to an embodiment of the present application.
  • when the application program is not a key application program, or valid information cannot be recognized from the control image, a selection control is displayed on the user interface, where the selection control is used to trigger manual frame selection or to cancel the recognition.
  • FIG. 6 illustrates a schematic flowchart of step S270 of the text recognition method provided by the embodiment shown in FIG. 3 of the present application.
  • the process shown in FIG. 6 will be described in detail below.
  • the method may specifically include the following steps:
  • Step S271 Obtain the duration for recognizing the control image.
  • Step S272 Determine whether the duration exceeds a preset duration.
  • the recognition duration is acquired and compared with a preset duration, where the preset duration is set and stored in the mobile terminal in advance as the basis for judging the duration; for example, the preset duration may be 8s or 10s.
  • Step S273 When the duration exceeds the preset duration, display the selection control on the user interface.
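  • the duration check of steps S271 to S273 can be sketched as follows (the synchronous call and the injected clock are simplifications; a real terminal would run recognition off the UI thread and poll the elapsed time):

```python
import time

PRESET_DURATION_S = 8.0  # the description gives 8s or 10s as example values

def recognize_with_timeout(recognize, image, now=time.monotonic):
    """Run `recognize` and report whether the selection control should be shown.

    Returns (result, show_selection_control). `recognize` is a stand-in for
    the OCR call; `now` is injectable so the timing logic can be tested."""
    start = now()
    result = recognize(image)
    elapsed = now() - start
    if elapsed > PRESET_DURATION_S or result is None:
        return None, True  # too slow or failed: offer manual frame selection
    return result, False
```

  • when the second element is True, the caller displays the selection control of step S270 instead of a result card.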
  • as one method, if the user selects manual frame selection, then after the corresponding area is manually framed, a QR code identification control, a product identification control, and a text identification control are displayed below the frame selection control. The QR code identification control triggered by the user performs two-dimensional code recognition on the captured image; the product identification control triggered by the user performs product recognition on the captured image; and the text identification control triggered by the user performs text recognition on the captured image. Further, during the recognition process, a circular progress prompt is displayed on the user interface, and a result card pops up after the recognition is completed.
  • the text recognition method provided by this further embodiment detects a touch operation acting on a user interface; when the touch operation satisfies a preset condition, it identifies the interface element on the user interface corresponding to the position of the touch operation. When the identification is not successful, it obtains the application corresponding to the user interface and determines whether the application is a key application. When the application is not a key application, a selection control is displayed on the user interface, where the selection control is used to trigger manual frame selection or to cancel the recognition. When the application is a key application, the control image corresponding to the position of the touch operation is captured and recognized, and it is determined whether valid information can be identified from the control image, where the confidence probability of the valid information is higher than a preset value. When valid information can be identified, at least one card is displayed superimposed on a partial area of the user interface; otherwise, the selection control is displayed on the user interface.
  • FIG. 7 is a schematic flowchart of a text recognition method according to another embodiment of the present application. The process shown in FIG. 7 will be described in detail below. The method may specifically include the following steps:
  • Step S310 Detect a touch operation acting on a user interface, and when the touch operation meets a preset condition, identify an interface element on the user interface corresponding to a position of the touch operation.
  • Step S320 when the identification is not successful, obtain a touch center position corresponding to the touch operation.
  • when the identification is not successful, the touch center position corresponding to the touch operation is obtained. As one way, the touch area of the touch operation is obtained, and a calculation is performed based on the touch area to obtain its center position, where that center position is the touch center position corresponding to the touch operation.
  • Step S330 Determine whether the touch center position is on an effective control of the user interface, where the effective control includes at least one interface element.
  • the user interface includes multiple controls.
  • the controls can be divided by judging whether each of the multiple controls includes an interface element: when a control includes at least one interface element, the control can be regarded as an effective control; when a control does not include any interface element, it can be regarded as a blank or invalid control.
  • the coordinate position of each effective control is detected, and whether the touch center position is on the effective control is determined by the coordinate position of the touch center position and the coordinate position of the effective control.
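  • the hit test of steps S330 to S350 can be sketched as follows (the control representation, with `bounds` and `elements` keys, is an illustrative assumption):

```python
from typing import List, Optional, Tuple

Rect = Tuple[int, int, int, int]  # x, y, width, height

def find_effective_control(center: Tuple[int, int],
                           controls: List[dict]) -> Optional[dict]:
    """Return the effective control (one holding at least one interface
    element) whose bounds contain the touch center, or None when the caller
    should fall back to capturing the full user interface image."""
    cx, cy = center
    for ctrl in controls:
        if not ctrl.get("elements"):  # blank/invalid control: skip
            continue
        x, y, w, h = ctrl["bounds"]
        if x <= cx < x + w and y <= cy < y + h:
            return ctrl
    return None
```

  • a hit yields the effective-control capture of step S340; a miss yields the full-screen capture of step S350.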
  • Step S340 when the touch center position is on the effective control, intercept an effective control image and identify the effective control image.
  • the effective control image is intercepted and the image-to-text recognition is performed on the effective control image.
  • the text on the effective control can be intercepted and identified.
  • Step S350 when the touch center position is not on the effective control, capture a user interface image and identify the user interface image.
  • the user interface image is intercepted and the user interface image is subjected to image-to-text recognition.
  • the full screen of the user interface can be intercepted and identified.
  • Step S360 Overlay and display at least one card on a part of the user interface, and the at least one card is used to display information identified by the effective control image or the user interface image.
  • the text recognition method provided by this embodiment detects a touch operation acting on a user interface; when the touch operation meets a preset condition, it identifies the interface element on the user interface corresponding to the position of the touch operation. When the identification is not successful, the touch center position corresponding to the touch operation is obtained, and it is determined whether the touch center position is on an effective control of the user interface, where the effective control includes at least one interface element. When the touch center position is on the effective control, the effective control image is captured and recognized; when it is not, the user interface image is captured and recognized. In this way, the image to be recognized is captured according to the touch center position, which improves the speed of word recognition and thereby the user experience.
  • FIG. 8 illustrates a module block diagram of a text recognition device 200 provided by an embodiment of the present application.
  • the following describes the block diagram shown in FIG. 8.
  • the text recognition device 200 includes: an interface element recognition module 210, an image capture module 220, and a card display module 230, where:
  • the interface element recognition module 210 is configured to detect a touch operation acting on a user interface, and when the touch operation meets a preset condition, identify an interface element on the user interface corresponding to a position of the touch operation. .
  • The image capture module 220 is configured to capture a control image corresponding to the position of the touch operation and recognize the control image when the recognition is unsuccessful. Referring to FIG. 9, which shows a block diagram of a text recognition device 200 according to another embodiment of the present application, the image capture module 220 further includes: an application acquisition submodule 221, an application determination submodule 222, a control image recognition submodule 223, a selection control display submodule 224, a touch center position acquisition submodule 225, a touch center position determination submodule 226, an effective control image recognition submodule 227, and a user interface image recognition submodule 228, where:
  • the application obtaining sub-module 221 is configured to obtain an application corresponding to the user interface when the identification is not successful.
  • the application judging sub-module 222 is configured to determine whether the application is a key application.
  • The control image recognition submodule 223 is configured to capture a control image corresponding to the position of the touch operation and recognize the control image when the application is the key application. Further, the control image recognition submodule 223 includes a control type acquisition unit, a control type judgment unit, and a control image recognition unit, where:
  • a control type acquisition unit is configured to acquire a control type corresponding to the position of the touch operation when the application is the key application.
  • a control type judgment unit is configured to determine whether the control type meets a preset type.
  • a control image recognition unit is configured to capture a control image corresponding to a position of the touch operation and identify the control image when the control type meets a preset type.
  • a selection control display sub-module 224 is configured to display a selection control on the user interface when the application is not the key application, wherein the selection control is used to trigger manual frame selection or cancellation of recognition.
  • the touch center position acquisition sub-module 225 is configured to acquire a touch center position corresponding to the touch operation when the recognition is not successful.
  • the touch center position determination sub-module 226 is configured to determine whether the touch center position is on an effective control of the user interface, where the effective control includes at least one interface element.
  • the effective control image recognition sub-module 227 is configured to capture an effective control image and identify the effective control image when the touch center position is on the effective control.
  • the user interface image recognition sub-module 228 is configured to capture a user interface image and identify the user interface image when the touch center position is not on the effective control.
  • the card display module 230 is configured to superimpose and display at least one card on a part of the user interface, and the at least one card is used to display information identified by the control image. Further, the card display module 230 includes: a valid information judgment sub-module 231, a card display sub-module 232, and a selection control display sub-module 233, wherein:
  • the valid information judging sub-module 231 is configured to determine whether the control image can identify valid information, wherein a confidence probability of the valid information is higher than a preset value.
  • the card display submodule 232 is configured to superimpose and display at least one card on a part of the user interface when the control image can identify the valid information.
  • the selection control display submodule 233 is configured to display the selection control on the user interface when the control image cannot identify the valid information. Further, the selection control display sub-module 233 includes a duration acquisition unit, a duration determination unit, and a selection control display unit, wherein:
  • a duration obtaining unit is configured to obtain a duration for identifying the control image.
  • a duration judging unit is configured to determine whether the duration exceeds a preset duration.
  • a selection control display unit is configured to display the selection control on the user interface when the duration exceeds the preset duration.
  • The text recognition method, device, mobile terminal, and storage medium provided by the embodiments of the present application detect a touch operation acting on a user interface; when the touch operation meets a preset condition, recognize the interface element on the user interface corresponding to the position of the touch operation; when the recognition is unsuccessful, capture a control image corresponding to the position of the touch operation and recognize the control image; and overlay at least one card on a partial region of the user interface, the at least one card being used to display the information recognized from the control image. Image recognition technology thereby improves the accuracy of word recognition and the user experience.
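  The fallback pipeline this summary describes can be sketched roughly as follows. All of the callables here (`identify_element`, `capture_control_image`, `ocr`, `show_cards`) are hypothetical placeholders for the modules named in the embodiments, not actual APIs:

```python
def text_recognition_flow(position, condition_met, identify_element,
                          capture_control_image, ocr, show_cards):
    """Fallback pipeline sketch: try direct element recognition first; if it
    fails, capture the control image at the touch position and OCR it; then
    overlay the result as at least one card on part of the user interface."""
    if not condition_met:
        return None                          # preset condition not met
    text = identify_element(position)
    if not text:                             # recognition unsuccessful
        text = ocr(capture_control_image(position))
    show_cards(text)                         # display card(s) over part of the UI
    return text
```
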
  • An embodiment of the present application further provides a mobile terminal 100, which includes an electronic body portion 10 including a housing 12 and a main display screen 120 disposed on the housing 12.
  • The housing 12 can be made of metal, such as steel or an aluminum alloy.
  • the main display screen 120 generally includes a display panel 111, and may also include a circuit for responding to a touch operation on the display panel 111, and the like.
  • the display panel 111 may be a liquid crystal display (Liquid Crystal Display, LCD).
  • In some embodiments, the display panel 111 also serves as a touch screen 109.
  • the mobile terminal 100 can be used as a smart phone terminal.
  • In this case, the electronic body portion 10 usually also includes one or more processors 102 (only one is shown in the figure), a memory 104, an RF (Radio Frequency) module 106, an audio circuit 110, a sensor 114, an input module 118, and a power module 122.
  • a person of ordinary skill in the art can understand that the structure shown in FIG. 11 is only schematic, and it does not limit the structure of the electronic body portion 10.
  • the electronic body portion 10 may further include more or fewer components than those shown in FIG. 11, or have a different configuration from that shown in FIG. 11.
  • All components other than the processor 102 are peripherals, and the processor 102 is coupled to these peripherals through a plurality of peripheral interfaces 124. The peripheral interface 124 may be implemented based on, but is not limited to, the following standards: Universal Asynchronous Receiver/Transmitter (UART), General Purpose Input/Output (GPIO), Serial Peripheral Interface (SPI), and Inter-Integrated Circuit (I2C).
  • In some examples, the peripheral interface 124 may include only a bus; in other examples, the peripheral interface 124 may further include other elements, such as one or more controllers, for example a display controller for connecting the display panel 111 or a memory controller for connecting the memory.
  • these controllers can also be separated from the peripheral interface 124 and integrated into the processor 102 or a corresponding peripheral.
  • the memory 104 may be used to store software programs and modules, and the processor 102 executes various functional applications and data processing by running the software programs and modules stored in the memory 104.
  • the memory 104 may include a high-speed random access memory, and may further include a non-volatile memory, such as one or more magnetic storage devices, a flash memory, or other non-volatile solid-state memory.
  • the memory 104 may further include memories remotely disposed with respect to the processor 102, and these remote memories may be connected to the electronic body portion 10 or the main display screen 120 through a network. Examples of the above network include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
  • the RF module 106 is used to receive and send electromagnetic waves, to realize the mutual conversion of electromagnetic waves and electrical signals, and to communicate with a communication network or other equipment.
  • the RF module 106 may include various existing circuit elements for performing these functions, such as an antenna, a radio frequency transceiver, a digital signal processor, an encryption / decryption chip, a subscriber identity module (SIM) card, a memory, and the like .
  • the RF module 106 can communicate with various networks, such as the Internet, an intranet, and a wireless network, or communicate with other devices through a wireless network.
  • the wireless network may include a cellular telephone network, a wireless local area network, or a metropolitan area network.
  • The above wireless network can use various communication standards, protocols, and technologies, including but not limited to Global System for Mobile Communication (GSM), Enhanced Data GSM Environment (EDGE), wideband code division multiple access (W-CDMA), code division multiple access (CDMA), time division multiple access (TDMA), Wireless Fidelity (WiFi) (such as the Institute of Electrical and Electronics Engineers standards IEEE 802.11a, IEEE 802.11b, IEEE 802.11g, and/or IEEE 802.11n), Voice over Internet Protocol (VoIP), Worldwide Interoperability for Microwave Access (Wi-Max), other protocols for mail, instant messaging, and short messaging, and any other suitable communication protocol, including protocols that have not yet been developed.
  • the audio circuit 110, the earpiece 101, the sound jack 103, and the microphone 105 collectively provide an audio interface between the user and the electronic body portion 10 or the main display screen 120.
  • Specifically, the audio circuit 110 receives sound data from the processor 102, converts the sound data into electrical signals, and transmits the electrical signals to the earpiece 101.
  • the earpiece 101 converts electrical signals into sound waves that human ears can hear.
  • the audio circuit 110 also receives electrical signals from the microphone 105, converts the electrical signals into sound data, and transmits the sound data to the processor 102 for further processing. Audio data may be obtained from the memory 104 or through the RF module 106. In addition, the audio data may also be stored in the memory 104 or transmitted through the RF module 106.
  • the sensor 114 is disposed in the electronic body portion 10 or the main display screen 120.
  • Examples of the sensor 114 include, but are not limited to, a light sensor, a motion sensor, a pressure sensor, a gravity acceleration sensor, and other sensors.
  • the sensor 114 may include a light sensor 114F and a pressure sensor 114G.
  • the pressure sensor 114G can detect a pressure generated by pressing on the mobile terminal 100. That is, the pressure sensor 114G detects the pressure generated by the contact or pressing between the user and the mobile terminal, such as the pressure generated by the contact or pressing between the user's ear and the mobile terminal. Therefore, the pressure sensor 114G can be used to determine whether a contact or a press has occurred between the user and the mobile terminal 100, and the magnitude of the pressure.
  • the light sensor 114F and the pressure sensor 114G are disposed adjacent to the display panel 111.
  • When the light sensor 114F detects an object approaching the main display screen 120, for example when the electronic body portion 10 is moved to the ear, the processor 102 turns off the display output.
  • As a kind of motion sensor, a gravity acceleration sensor can detect the magnitude of acceleration in various directions (generally three axes) and the magnitude and direction of gravity when stationary, and can be used in applications that identify the posture of the mobile terminal 100 (such as landscape/portrait switching, related games, and magnetometer attitude calibration) and in vibration-recognition functions (such as a pedometer or tap detection).
  • the electronic body portion 10 may be configured with other sensors such as a gyroscope, a barometer, a hygrometer, and a thermometer, which are not described herein again.
  • In this embodiment, the input module 118 may include the touch screen 109 provided on the main display screen 120. The touch screen 109 can collect the user's touch operations on or near it (such as operations performed by the user on or near the touch screen 109 with a finger, a stylus, or any other suitable object or accessory) and drive the corresponding connection device according to a preset program.
  • Optionally, the touch screen 109 may include a touch detection device and a touch controller. The touch detection device detects the user's touch position and the signal produced by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into touch point coordinates, and sends the coordinates to the processor 102, and can also receive and execute commands sent by the processor 102.
  • In addition, the touch detection function of the touch screen 109 can be implemented using resistive, capacitive, infrared, surface acoustic wave, and other types of technology.
  • the input module 118 may further include other input devices, such as keys 107.
  • the keys 107 may include, for example, character keys for inputting characters, and control keys for triggering control functions. Examples of the control buttons include a "return to the home screen" button, an on / off button, and the like.
  • The main display screen 120 is used to display information input by the user, information provided to the user, and the various graphical user interfaces of the electronic body portion 10; these graphical user interfaces may be composed of graphics, text, icons, numbers, video, and any combination thereof. In one example, the touch screen 109 may be disposed on the display panel 111 so as to form a whole with the display panel 111.
  • the power module 122 is configured to provide power to the processor 102 and other components.
  • The power module 122 may include a power management system, one or more power sources (such as a battery or AC power), a charging circuit, a power failure detection circuit, an inverter, a power status indicator, and any other components related to the generation, management, and distribution of power in the electronic body portion 10 or the main display screen 120.
  • the mobile terminal 100 further includes a locator 119, which is configured to determine an actual location where the mobile terminal 100 is located.
  • The locator 119 uses a positioning service to locate the mobile terminal 100, where the positioning service should be understood as a technology or service that obtains the position information (such as latitude and longitude coordinates) of the mobile terminal 100 through a specific positioning technology and marks the location of the positioned object on an electronic map.
  • It should be understood that the above-mentioned mobile terminal 100 is not limited to a smartphone terminal; it refers to a computer device that can be used while mobile. Specifically, the mobile terminal 100 refers to a mobile computer device equipped with a smart operating system.
  • the mobile terminal 100 includes, but is not limited to, a smart phone, a smart watch, a tablet computer, and the like.
  • FIG. 12 illustrates a structural block diagram of a computer-readable storage medium provided by an embodiment of the present application.
  • the computer-readable storage medium 300 stores program code, which can be called by a processor to execute the method described in the foregoing method embodiment.
  • the computer-readable storage medium 300 may be an electronic memory such as a flash memory, an EEPROM (Electrically Erasable Programmable Read-Only Memory), an EPROM, a hard disk, or a ROM.
  • the computer-readable storage medium 300 includes a non-transitory computer-readable storage medium.
  • The computer-readable storage medium 300 has storage space for program code 310 that performs any of the method steps in the above methods. The program code can be read from or written into one or more computer program products.
  • The program code 310 may, for example, be compressed in a suitable form.
  • The terms "first" and "second" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Therefore, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "plurality" means at least two, for example two or three, unless specifically and explicitly defined otherwise.
  • Any process or method description in a flowchart or otherwise described herein can be understood as a module, fragment, or portion of code that includes one or more executable instructions for implementing a particular logical function or step of a process
  • The scope of the preferred embodiments of the present application includes additional implementations in which functions may be performed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order, depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present application pertain.
  • The logic and/or steps represented in a flowchart or otherwise described herein, for example a sequenced list of executable instructions that may be considered to implement logical functions, may be embodied in any computer-readable medium for use by or in combination with an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch and execute instructions from the instruction execution system, apparatus, or device).
  • a "computer-readable medium” may be any device that can contain, store, communicate, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
  • More specific examples (a non-exhaustive list) of computer-readable media include the following: an electrical connection (mobile terminal) with one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a fiber optic device, and a portable compact disc read-only memory (CDROM).
  • The computer-readable medium may even be paper or another suitable medium on which the program can be printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise suitably processing it, and then stored in a computer memory.
  • each part of the application may be implemented by hardware, software, firmware, or a combination thereof.
  • multiple steps or methods may be implemented by software or firmware stored in a memory and executed by a suitable instruction execution system.
  • For example, if implemented in hardware, as in another embodiment, any one or a combination of the following techniques known in the art may be used: discrete logic circuits having logic gate circuits for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gate circuits, programmable gate arrays (PGA), field-programmable gate arrays (FPGA), and so on.
  • each functional unit in each embodiment of the present application may be integrated into one processing module, or each unit may exist separately physically, or two or more units may be integrated into one module.
  • the above integrated modules can be implemented in the form of hardware or software functional modules. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
  • the aforementioned storage medium may be a read-only memory, a magnetic disk, or an optical disk.

Abstract

The embodiments of the present application disclose a text recognition method, device, mobile terminal, and storage medium, relating to the technical field of mobile terminals. The method includes: detecting a touch operation acting on a user interface and, when the touch operation meets a preset condition, recognizing an interface element on the user interface corresponding to the position of the touch operation; when the recognition is unsuccessful, capturing a control image corresponding to the position of the touch operation and recognizing the control image; and overlaying at least one card on a partial region of the user interface, the at least one card being used to display information recognized from the control image. The text recognition method, device, mobile terminal, and storage medium provided by the embodiments of the present application use image recognition technology to improve the speed and accuracy of word extraction and recognition, thereby improving the user experience.

Description

Text recognition method, device, mobile terminal, and storage medium
Cross-Reference to Related Applications
This application claims priority to Chinese Patent Application No. 201810586716.2, filed on June 7, 2018, which is hereby incorporated by reference in its entirety for all purposes.
Technical Field
This application relates to the technical field of mobile terminals and, more particularly, to a text recognition method, device, mobile terminal, and storage medium.
Background
With the development of science and technology, mobile terminals have become one of the most commonly used electronic products in daily life. Users frequently obtain information through mobile terminals and read extensively to satisfy their reading needs.
Summary
In view of the above problems, this application proposes a text recognition method, device, mobile terminal, and storage medium to improve upon these problems.
In a first aspect, an embodiment of the present application provides a text recognition method. The method includes: detecting a touch operation acting on a user interface and, when the touch operation meets a preset condition, recognizing an interface element on the user interface corresponding to the position of the touch operation; when the recognition is unsuccessful, capturing a control image corresponding to the position of the touch operation and recognizing the control image; and overlaying at least one card on a partial region of the user interface, the at least one card being used to display information recognized from the control image.
In a second aspect, an embodiment of the present application provides a text recognition device. The device includes: an interface element recognition module, configured to detect a touch operation acting on a user interface and, when the touch operation meets a preset condition, recognize an interface element on the user interface corresponding to the position of the touch operation; an image capture module, configured to capture a control image corresponding to the position of the touch operation and recognize the control image when the recognition is unsuccessful; and a card display module, configured to overlay at least one card on a partial region of the user interface, the at least one card being used to display information recognized from the control image.
In a third aspect, an embodiment of the present application provides a mobile terminal including a touch screen, a memory, and a processor. The touch screen and the memory are coupled to the processor, and the memory stores instructions that, when executed by the processor, cause the processor to perform the above method.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing program code that can be invoked by a processor to perform the above method.
Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a text recognition method provided by an embodiment of the present application;
FIG. 2 is a schematic diagram of a user interface of a mobile terminal provided by an embodiment of the present application;
FIG. 3 is a schematic flowchart of a text recognition method provided by another embodiment of the present application;
FIG. 4 is a schematic flowchart of step S240 of the text recognition method of the embodiment shown in FIG. 3;
FIG. 5 is another schematic diagram of a user interface of a mobile terminal provided by an embodiment of the present application;
FIG. 6 is a schematic flowchart of step S270 of the text recognition method of the embodiment shown in FIG. 3;
FIG. 7 is a schematic flowchart of a text recognition method provided by yet another embodiment of the present application;
FIG. 8 is a module block diagram of a text recognition device provided by an embodiment of the present application;
FIG. 9 is a module block diagram of a text recognition device provided by another embodiment of the present application;
FIG. 10 is a schematic structural diagram of a mobile terminal provided by an embodiment of the present application;
FIG. 11 is a block diagram of a mobile terminal for performing the text recognition method according to an embodiment of the present application;
FIG. 12 shows a storage unit of an embodiment of the present application for storing or carrying program code that implements the text recognition method according to the embodiments of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the protection scope of the present application.
Currently, when chatting online, reading text, viewing pictures, or watching videos on a mobile terminal, users often become interested in some of the content and search for more detailed information. To do so, a user must first copy or memorize the content of interest, then open a browser and paste the copied content (or type the memorized content) into the browser's search box to search for detailed information. This process is cumbersome, time-consuming, and error-prone. Through long-term research, the inventors found that the system's built-in accessibility mode can be used to obtain and recognize text content based on the user's touch operation, but this approach often fails or produces errors because of the influence of the touch operation itself. In view of these technical problems, the inventors propose the text recognition method, device, mobile terminal, and storage medium provided by the embodiments of the present application, which use image recognition technology to improve the speed and accuracy of word extraction and recognition, thereby improving the user experience. The specific text recognition method is described in detail in the following embodiments.
Referring to FIG. 1, FIG. 1 shows a schematic flowchart of a text recognition method provided by an embodiment of the present application. The text recognition method uses image recognition technology to improve the speed and accuracy of word extraction and recognition, thereby improving the user experience. In a specific embodiment, the text recognition method is applied to the text recognition device 200 shown in FIG. 8 and to a mobile terminal (FIG. 10) equipped with the text recognition device 200. A mobile terminal is taken as an example below to explain the specific flow of this embodiment. It will be understood that the mobile terminal to which this embodiment applies may be a smartphone, a tablet computer, a wearable electronic device, or the like, which is not specifically limited here. The flow shown in FIG. 1 is described in detail below; the text recognition method may specifically include the following steps:
Step S110: Detect a touch operation acting on a user interface, and when the touch operation meets a preset condition, recognize an interface element on the user interface corresponding to the position of the touch operation.
In this embodiment, a touch operation acting on the user interface is detected. As one approach, the touch operation may include a single-finger tap, a multi-finger tap, a single-finger long press, a multi-finger long press, a heavy press, multiple taps, a slide operation, a copy operation, a press area, and so on. A single-finger tap is a tap on the user interface with one finger; a multi-finger tap is a simultaneous tap on the user interface with multiple fingers; a single-finger long press is a one-finger press on the user interface lasting longer than a preset duration; a multi-finger long press is a simultaneous multi-finger press lasting longer than a preset duration; a heavy press is a press on the user interface whose force exceeds a preset force; multiple taps means that the number of taps within a preset time exceeds a preset count; a slide operation is a one-finger slide on the user interface; a copy operation copies text information on the user interface to the clipboard; and press area means that the area of a one-finger press on the user interface exceeds a preset area.
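  The kinds of touch operation listed above can be sketched as a simple classifier. The thresholds below (duration, force, tap count, area) are illustrative defaults standing in for the preset values the embodiment describes, not values taken from the application:

```python
def classify_touch(fingers, duration_s, force, taps, area,
                   preset_duration=0.5, preset_force=3.0,
                   preset_taps=2, preset_area=1.5):
    """Return the kinds of touch operation a sampled touch matches,
    following the threshold-based definitions described above."""
    kinds = []
    if duration_s > preset_duration:     # press longer than preset duration
        kinds.append("multi-finger long press" if fingers > 1
                     else "single-finger long press")
    if force > preset_force:             # press force exceeds preset force
        kinds.append("heavy press")
    if taps > preset_taps:               # taps within preset time exceed count
        kinds.append("multiple clicks")
    if area > preset_area:               # press area exceeds preset area
        kinds.append("large press area")
    return kinds
```
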
Further, the mobile terminal presets and stores a preset condition, which serves as the basis for judging the touch operation. That is, after the touch operation is detected, it is compared with the preset condition to determine whether the touch operation meets the preset condition. As one approach, when the touch operation meets the preset condition, the position of the touch operation is obtained (for example, the coordinate information corresponding to the position of the touch operation), and the interface element on the user interface at that position is then recognized. Specifically, interface elements include, but are not limited to, text, pictures, audio, and video. The mobile terminal may determine and recognize at least one interface element based on the position of the touch operation; for example, it may recognize the interface element at the position of the touch operation, or recognize all interface elements in the same paragraph as that element, and so on, which is not specifically limited here.
Step S120: When the recognition is unsuccessful, capture a control image corresponding to the position of the touch operation and recognize the control image.
When the interface element corresponding to the position of the touch operation is not recognized successfully, that is, when the recognition result for that interface element is empty, an image of the text at the position of the touch operation is automatically captured to obtain a control image, and the control image is recognized. Specifically, the system automatically obtains the control containing the text at the position of the touch operation, captures that control, and then invokes an OCR (Optical Character Recognition) image-to-text module in the background to recognize it. Image-to-text recognition may be performed offline, by porting an image-to-text recognition library to the mobile terminal and performing the image-to-text recognition operation on the picture information with that on-device library; or online, by transmitting the image to a remote image-to-text server for recognition: the picture information is uploaded to the server, which performs image-to-text recognition using its internal recognition library and sends the recognition result back to the mobile terminal. Further, besides returning the text information in the image, image-to-text recognition may also return the x coordinate, y coordinate, width, height, and so on of each character, which is not repeated here.
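  The offline/online choice described above amounts to a small dispatch: prefer an on-device recognition library when one is available, otherwise upload the image to a remote OCR server. Both callables below are hypothetical stand-ins for the ported recognition library and the server client:

```python
def recognize_image(image, local_ocr=None, upload_to_server=None):
    """Offline-first OCR dispatch: use the on-device image-to-text library
    when present; otherwise send the image to the remote OCR server."""
    if local_ocr is not None:
        return local_ocr(image)          # offline: ported recognition library
    if upload_to_server is not None:
        return upload_to_server(image)   # online: remote image-to-text server
    raise RuntimeError("no OCR backend available")
```
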
As one approach, while the system performs image-to-text recognition, the user interface of the mobile terminal may display prompt information indicating that an image-to-text recognition operation is currently in progress.
Step S130: Overlay at least one card on a partial region of the user interface, the at least one card being used to display information recognized from the control image.
Referring to FIG. 2, FIG. 2 shows a schematic diagram of a user interface of a mobile terminal provided by an embodiment of the present application. The partial region of the user interface may be located in the lower half, the upper half, the left half, or the right half of the user interface, among other locations. Optionally, in this embodiment, the partial region is located near the bottom of the lower half of the user interface, and its size is not specifically limited. Specifically, image-to-text recognition is performed on the captured control image to obtain at least one keyword from the control image, and the at least one keyword is searched to obtain search result information corresponding to the content of the control image. The search result information is displayed in the form of cards, where a card is a carrier of search result information: each card can display at least one search result, the number of search results displayed by each of the cards may be the same or different, and the search results displayed on each card may come from the same application or from different applications. Further, the at least one card may also display word segmentation information, that is, the at least one keyword obtained by recognizing the control image, based on which the user can select and edit words, for example to search, translate, or share a keyword from the segmentation information.
Further, the at least one card is displayed overlaid on the partial region of the user interface. It will be understood that the card may be stacked above the partial region of the user interface, or may cover the partial region while being displayed on a different layer from the user interface. In addition, in this embodiment, when the at least one card is overlaid on the partial region of the user interface, the original content in that region remains partially visible rather than completely blocked, so that the user can still tap it.
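  The card-building step above can be sketched as follows. The `search` callable and the per-card result count are hypothetical: the embodiment does not specify how results are distributed across cards, so the slicing here is purely illustrative.

```python
def build_cards(keywords, search, per_card=3):
    """Build card payloads from recognized keywords: each card carries a
    slice of the search results plus the segmented keywords so the user
    can re-select words for searching, translating, or sharing."""
    results = [hit for kw in keywords for hit in search(kw)]
    cards = []
    for start in range(0, len(results), per_card):
        cards.append({"results": results[start:start + per_card],
                      "segmentation": keywords})
    return cards
```
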
The text recognition method provided by this embodiment of the present application detects a touch operation acting on a user interface; when the touch operation meets a preset condition, recognizes an interface element on the user interface corresponding to the position of the touch operation; when the recognition is unsuccessful, captures a control image corresponding to the position of the touch operation and recognizes the control image; and overlays at least one card on a partial region of the user interface, the at least one card being used to display the information recognized from the control image. Through image recognition technology, the speed and accuracy of word extraction and recognition are improved, thereby improving the user experience.
Referring to FIG. 3, FIG. 3 shows a schematic flowchart of a text recognition method provided by another embodiment of the present application. The flow shown in FIG. 3 is described in detail below; the method may specifically include the following steps:
Step S210: Detect a touch operation acting on a user interface, and when the touch operation meets a preset condition, recognize an interface element on the user interface corresponding to the position of the touch operation.
Step S220: When the recognition is unsuccessful, obtain the application corresponding to the user interface.
An application includes multiple user interfaces; after obtaining the user interface, the application corresponding to it can be obtained. As one approach, the type, name, or purpose of the application can be obtained through the user interface.
Step S230: Determine whether the application is a key application; if not, perform step S270; if so, perform step S240.
Further, the mobile terminal presets and stores key applications, which serve as the basis for judging an application. A key application may be a native system application or a third-party application downloaded and installed by the user, and key applications may be preconfigured by the mobile terminal system itself or configured manually by the user. Specifically, when key applications are configured by the mobile terminal system itself, the system may configure them according to usage frequency, for example treating applications whose usage frequency is above a certain frequency threshold as key applications and applications whose usage frequency is not above that threshold as non-key applications; or according to application type, for example treating text-display or instant-messaging applications such as WeChat, QQ, Weibo, news applications, and browsers as non-key applications, and video-display applications as key applications. When key applications are configured manually by the user, one or more applications may be selected as key applications according to the user's preferences or needs.
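  The configuration strategies above can be sketched as one check. For brevity this collapses the alternative strategies (manual configuration, usage frequency, application type) into a single function; the field names, the frequency threshold, and the non-key type set are all illustrative assumptions:

```python
def is_key_application(app, frequency_threshold=10.0,
                       non_key_types=("text", "instant_messaging",
                                      "news", "browser")):
    """Decide whether an application counts as a key application:
    manual user configuration wins, then usage frequency above the
    threshold, then application type outside the non-key set."""
    if app.get("user_marked") is not None:       # manual configuration wins
        return app["user_marked"]
    if app.get("usage_frequency", 0) > frequency_threshold:
        return True                              # frequently used: key app
    return app.get("type") not in non_key_types  # e.g. video apps are key
```
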
Step S240: When the application is a key application, capture a control image corresponding to the position of the touch operation and recognize the control image.
When the application is determined to be a key application, an image of the text at the position of the touch operation is captured to obtain a control image, and the control image is recognized.
Referring to FIG. 4, FIG. 4 shows a schematic flowchart of step S240 of the text recognition method of the embodiment shown in FIG. 3. The flow shown in FIG. 4 is described in detail below; the method may specifically include the following steps:
Step S241: When the application is the key application, obtain the control type corresponding to the position of the touch operation.
As one approach, when the application is determined to be a key application, the control type of the control at the position of the current touch operation is detected and obtained. It will be understood that the control type may include at least a text type, a picture type, a video type, and so on.
Step S242: Determine whether the control type meets a preset type.
Further, the mobile terminal presets and stores a preset type, which serves as the basis for judging the control type. As one approach, the preset type may be text view; therefore, after the control type is detected and obtained, it is compared with text view to determine whether the control type is a text view.
Step S243: When the control type meets the preset type, capture a control image corresponding to the position of the touch operation and recognize the control image.
When the control type is determined to meet the preset type, the control image corresponding to the position of the touch operation is captured and automatically recognized by OCR.
Step S250: Determine whether valid information can be recognized from the control image, where the confidence probability of the valid information is higher than a preset value; if so, perform step S260; if not, perform step S270.
Further, whether valid information can be recognized from the control image is judged, where the confidence probability of the valid information is higher than a preset value. Specifically, the information obtained by recognizing the control image is examined. As one approach, it is first checked whether the information contains text information; when it does not, the information is empty and the recognition has failed. When the information contains text information, its confidence probability is obtained and judged. As one approach, the mobile terminal stores a confidence probability algorithm and a preset value in advance; the algorithm computes the confidence probability of the information, which is then compared with the preset value to determine whether the confidence probability is higher than the preset value. When the confidence probability is higher than the preset value, the control image is deemed able to yield valid information.
As one approach, if garbled text appears when the control image is parsed and recognized, the parsed result needs to be preliminarily screened to filter out garbled characters and stray symbols. If no valid information remains after filtering, a selection control is displayed on the user interface; if valid information remains, at least one card is overlaid on a partial region of the user interface.
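  The validity check of step S250, combined with the garble filtering just described, can be sketched as follows. The regex filter and the preset confidence value are illustrative stand-ins; the actual confidence algorithm is not specified by the embodiment:

```python
import re

def extract_valid_info(text, confidence, preset_value=0.8):
    """Decide whether OCR output counts as valid information: filter out
    garbled runs and stray symbols, then require non-empty text whose
    confidence probability exceeds the preset value."""
    filtered = re.sub(r"[^\w\s]", "", text or "").strip()  # drop stray symbols
    if not filtered:
        return None                     # no text remains: recognition failed
    if confidence <= preset_value:
        return None                     # confidence too low: not valid info
    return filtered
```
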
Step S260: Overlay at least one card on a partial region of the user interface.
Further, if valid information can be recognized from the control image, the result is presented, that is, at least one card is overlaid on a partial region of the user interface. As one approach, a selection control is displayed below the card on the same interface as the card, providing the user with an entry for manual frame selection if the user is not satisfied with the valid information.
Step S270: Display a selection control on the user interface, where the selection control is used to trigger manual frame selection or cancel recognition.
Referring to FIG. 5, FIG. 5 shows another schematic diagram of a user interface of a mobile terminal provided by an embodiment of the present application. Further, if the application is not a key application, or valid information cannot be recognized from the control image, the selection control is displayed on the user interface, where the selection control is used to trigger manual frame selection or cancel recognition.
Referring to FIG. 6, FIG. 6 shows a schematic flowchart of step S270 of the text recognition method of the embodiment shown in FIG. 3. The flow shown in FIG. 6 is described in detail below; the method may specifically include the following steps:
Step S271: Obtain the duration of recognizing the control image.
Step S272: Determine whether the duration exceeds a preset duration.
As one approach, while the system recognizes the control image, the recognition duration is obtained and compared with a preset duration, where the preset duration is preset and stored in the mobile terminal as the basis for judging the duration; for example, the preset duration may be 8 s, 10 s, and so on. In this embodiment, when the duration exceeds the preset duration, the control image recognition has taken too long and is deemed to have failed, so the selection control is displayed on the user interface, letting the user choose whether to continue with manual frame selection.
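  Steps S271 and S272 can be sketched as a simple timing check. This blocking version measures the duration after recognition returns (a real implementation would more likely run recognition asynchronously); the injectable clock is only there to make the sketch testable:

```python
import time

def recognize_with_timeout(recognize, image, preset_duration=8.0,
                           clock=time.monotonic):
    """Run recognition and compare the elapsed duration against the preset
    duration (the text gives 8 s or 10 s as examples). On timeout the
    result is discarded so the caller can show the selection control."""
    start = clock()
    result = recognize(image)
    elapsed = clock() - start
    if elapsed > preset_duration:
        return None, elapsed            # too slow: treated as failed
    return result, elapsed
```
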
Step S273: When the duration exceeds the preset duration, display the selection control on the user interface.
As one approach, if the user chooses manual frame selection, after the corresponding region is manually framed, a QR code recognition control, a product recognition control, and a text recognition control are displayed below the frame selection control. When triggered by the user, the QR code recognition control performs QR code recognition on the captured image, the product recognition control performs product recognition on the captured image, and the text recognition control performs text recognition on the captured image. Further, during recognition, a circular progress indicator is displayed on the user interface, and the corresponding card pops up when recognition finishes.
The text recognition method provided by this further embodiment of the present application detects a touch operation acting on a user interface; when the touch operation meets a preset condition, recognizes an interface element on the user interface corresponding to the position of the touch operation; and when the recognition is unsuccessful, obtains the application corresponding to the user interface and determines whether it is a key application. When the application is not a key application, a selection control is displayed on the user interface, the selection control being used to trigger manual frame selection or cancel recognition. When the application is a key application, a control image corresponding to the position of the touch operation is captured and recognized, and it is determined whether valid information, whose confidence probability is higher than a preset value, can be recognized from the control image. When it can, at least one card is overlaid on a partial region of the user interface; when it cannot, the selection control is displayed on the user interface. Through the combination of automatic frame-selection recognition and manual frame-selection recognition, the speed and accuracy of word extraction and recognition are improved, thereby improving the user experience.
Referring to FIG. 7, FIG. 7 shows a schematic flowchart of a text recognition method provided by yet another embodiment of the present application. The flow shown in FIG. 7 is described in detail below; the method may specifically include the following steps:
Step S310: Detect a touch operation acting on a user interface, and when the touch operation meets a preset condition, recognize an interface element on the user interface corresponding to the position of the touch operation.
Step S320: When the recognition is unsuccessful, obtain the touch center position corresponding to the touch operation.
In this embodiment, when the recognition is unsuccessful, the touch center position corresponding to the touch operation is obtained. As one approach, the touch region of the touch operation is obtained, and a calculation based on the touch region yields the center position of the touch region, which is the touch center position corresponding to the touch operation.
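  The center-position calculation described above can be sketched as a centroid over the sampled touch points, assuming the touch region is reported as a set of (x, y) coordinates:

```python
def touch_center(touch_region):
    """Compute the touch center position as the centroid of the sampled
    touch points that make up the touch region."""
    xs = [p[0] for p in touch_region]
    ys = [p[1] for p in touch_region]
    return sum(xs) / len(xs), sum(ys) / len(ys)
```
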
Step S330: Determine whether the touch center position is on an effective control of the user interface, where the effective control includes at least one interface element.
Further, the user interface includes multiple controls. As one approach, the controls may be divided according to whether they include interface elements: when a control includes at least one interface element, it may be regarded as an effective control; when it includes no interface element, it may be regarded as a blank or invalid control. In this embodiment, the coordinate position of each effective control is detected, and whether the touch center position is on an effective control is judged from the coordinate position of the touch center and the coordinate position of the effective control.
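  The coordinate comparison of step S330 can be sketched as a bounding-rectangle hit test that skips blank/invalid controls. The dictionary layout (`rect` as x, y, width, height plus an `elements` list) is an illustrative representation of the control data, not the application's actual structure:

```python
def is_on_effective_control(touch_center, controls):
    """Check whether the touch center falls inside any effective control,
    i.e. a control holding at least one interface element, by comparing
    its coordinates against each control's bounding rectangle."""
    x, y = touch_center
    for control in controls:
        if not control["elements"]:          # blank/invalid control: skip
            continue
        cx, cy, w, h = control["rect"]
        if cx <= x <= cx + w and cy <= y <= cy + h:
            return True
    return False
```
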
Step S340: When the touch center position is on the effective control, capture an effective control image and recognize the effective control image.
As one approach, when the touch center position is on an effective control, the effective control image is captured and image-to-text recognition is performed on the effective control image; here, the text on the effective control may be captured and recognized.
It should be noted that the embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for identical or similar parts between embodiments, reference may be made to one another. Since the device embodiments are substantially similar to the method embodiments, their description is relatively brief; for relevant details, refer to the corresponding parts of the method embodiments. Any processing described in the method embodiments can be implemented by a corresponding processing module in the device embodiments and is not repeated there.
Referring again to FIG. 10, based on the above text recognition method and apparatus, an embodiment of the present application further provides a mobile terminal 100, which includes an electronic body 10. The electronic body 10 includes a housing 12 and a main display screen 120 arranged on the housing 12. The housing 12 may be made of metal, such as steel or aluminium alloy. In this embodiment, the main display screen 120 generally includes a display panel 111 and may also include circuitry for responding to touch operations performed on the display panel 111. The display panel 111 may be a liquid crystal display (LCD) panel; in some embodiments, the display panel 111 is also a touch screen 109.
Referring also to FIG. 11, in a practical application scenario the mobile terminal 100 may be used as a smartphone, in which case the electronic body 10 generally further includes one or more (only one is shown) processors 102, a memory 104, an RF (radio frequency) module 106, an audio circuit 110, sensors 114, an input module 118, and a power module 122. A person of ordinary skill in the art will understand that the structure shown in FIG. 11 is merely illustrative and does not limit the structure of the electronic body 10. For example, the electronic body 10 may include more or fewer components than shown in FIG. 11, or have a configuration different from that shown in FIG. 11.
A person of ordinary skill in the art will understand that, relative to the processor 102, all other components are peripherals, and the processor 102 is coupled to these peripherals through a plurality of peripheral interfaces 124. The peripheral interfaces 124 may be implemented based on, but are not limited to, the following standards: Universal Asynchronous Receiver/Transmitter (UART), General Purpose Input/Output (GPIO), Serial Peripheral Interface (SPI), and Inter-Integrated Circuit (I2C). In some examples the peripheral interfaces 124 may include only a bus; in other examples they may further include other elements, such as one or more controllers, for example a display controller for connecting the display panel 111 or a memory controller for connecting the memory. These controllers may also be detached from the peripheral interfaces 124 and integrated into the processor 102 or into the corresponding peripheral.
The memory 104 may be used to store software programs and modules; the processor 102 runs the software programs and modules stored in the memory 104 to execute various functional applications and perform data processing. The memory 104 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples the memory 104 may further include memory arranged remotely relative to the processor 102; such remote memory may be connected to the electronic body 10 or the main display screen 120 through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The RF module 106 is used to receive and send electromagnetic waves and to convert between electromagnetic waves and electrical signals, thereby communicating with a communication network or with other devices. The RF module 106 may include various existing circuit elements for performing these functions, for example an antenna, a radio-frequency transceiver, a digital signal processor, an encryption/decryption chip, a subscriber identity module (SIM) card, memory, and so on. The RF module 106 may communicate with various networks such as the Internet, intranets, and wireless networks, or communicate with other devices through a wireless network. The wireless network may include a cellular telephone network, a wireless local area network, or a metropolitan area network. The wireless network may use various communication standards, protocols, and technologies, including but not limited to Global System for Mobile Communication (GSM), Enhanced Data GSM Environment (EDGE), Wideband Code Division Multiple Access (W-CDMA), Code Division Multiple Access (CDMA), Time Division Multiple Access (TDMA), Wireless Fidelity (WiFi) (such as the IEEE standards IEEE 802.11a, IEEE 802.11b, IEEE 802.11g and/or IEEE 802.11n), Voice over Internet Protocol (VoIP), Worldwide Interoperability for Microwave Access (Wi-Max), other protocols for mail, instant messaging, and short messages, as well as any other suitable communication protocol, even including protocols that have not yet been developed.
The audio circuit 110, an earpiece 101, a sound jack 103, and a microphone 105 together provide an audio interface between the user and the electronic body 10 or the main display screen 120. Specifically, the audio circuit 110 receives sound data from the processor 102, converts the sound data into an electrical signal, and transmits the electrical signal to the earpiece 101. The earpiece 101 converts the electrical signal into sound waves audible to the human ear. The audio circuit 110 also receives an electrical signal from the microphone 105, converts the electrical signal into sound data, and transmits the sound data to the processor 102 for further processing. Audio data may be obtained from the memory 104 or through the RF module 106; in addition, audio data may also be stored in the memory 104 or sent through the RF module 106.
The sensors 114 are arranged in the electronic body 10 or in the main display screen 120. Examples of the sensors 114 include, but are not limited to, light sensors, motion sensors, pressure sensors, gravity acceleration sensors, and other sensors.
Specifically, the sensors 114 may include a light sensor 114F and a pressure sensor 114G. The pressure sensor 114G is a sensor that can detect pressure produced by pressing on the mobile terminal 100; that is, the pressure sensor 114G detects pressure produced by contact or pressing between the user and the mobile terminal, for example pressure produced by contact or pressing between the user's ear and the mobile terminal. The pressure sensor 114G can therefore be used to determine whether contact or pressing has occurred between the user and the mobile terminal 100, as well as the magnitude of the pressure.
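The contact decision described for the pressure sensor 114G reduces to a threshold test on the reading; the threshold value below is purely illustrative, since the document does not specify one.

```python
PRESS_THRESHOLD = 0.2  # assumed minimum reading (arbitrary units) that counts as contact

def contact_detected(pressure: float) -> bool:
    """Report whether a pressure sensor reading indicates user contact or pressing."""
    return pressure > PRESS_THRESHOLD

print(contact_detected(0.75))  # True: e.g. the ear pressing against the screen
print(contact_detected(0.00))  # False: no contact
```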
Referring again to FIG. 11, in the embodiment shown in FIG. 11 the light sensor 114F and the pressure sensor 114G are arranged adjacent to the display panel 111. When the light sensor 114F detects an object close to the main display screen 120, for example when the electronic body 10 is moved to the ear, the processor 102 turns off the display output.
As one type of motion sensor, a gravity acceleration sensor can detect the magnitude of acceleration in various directions (generally along three axes) and, when stationary, can detect the magnitude and direction of gravity. It can be used for applications that recognize the attitude of the mobile terminal 100 (such as switching between landscape and portrait, related games, and magnetometer attitude calibration) and for vibration-recognition functions (such as a pedometer or tap detection). In addition, the electronic body 10 may also be equipped with other sensors such as a gyroscope, a barometer, a hygrometer, and a thermometer, which are not described in detail here.
In this embodiment, the input module 118 may include the touch screen 109 arranged on the main display screen 120. The touch screen 109 can collect the user's touch operations on or near it (for example operations performed on or near the touch screen 109 with a finger, a stylus, or any other suitable object or accessory) and drive the corresponding connection device according to a preset program. Optionally, the touch screen 109 may include a touch detection device and a touch controller. The touch detection device detects the orientation of the user's touch, detects the signal produced by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into touch-point coordinates, and sends them to the processor 102; it can also receive commands from the processor 102 and execute them. The touch detection function of the touch screen 109 may be implemented using various types of technology, such as resistive, capacitive, infrared, and surface acoustic wave. In addition to the touch screen 109, in other variant embodiments the input module 118 may also include other input devices, such as keys 107. The keys 107 may include, for example, character keys for entering characters and control keys for triggering control functions. Examples of control keys include a "home" key and a power on/off key.
The main display screen 120 is used to display information entered by the user, information provided to the user, and the various graphical user interfaces of the electronic body 10; these graphical user interfaces may be composed of graphics, text, icons, numbers, video, and any combination thereof. In one example, the touch screen 109 may be arranged on the display panel 111 so as to form a single unit with the display panel 111.
The power module 122 is used to supply power to the processor 102 and the other components. Specifically, the power module 122 may include a power management system, one or more power sources (such as a battery or alternating current), a charging circuit, a power failure detection circuit, an inverter, a power status indicator, and any other components related to the generation, management, and distribution of power within the electronic body 10 or the main display screen 120.
The mobile terminal 100 further includes a locator 119, which is used to determine the actual location of the mobile terminal 100. In this embodiment, the locator 119 uses a positioning service to locate the mobile terminal 100; the positioning service should be understood as a technology or service that obtains the location information of the mobile terminal 100 (such as latitude and longitude coordinates) through a particular positioning technology and marks the position of the located object on an electronic map.
It should be understood that the mobile terminal 100 described above is not limited to a smartphone; it should be taken to mean a computer device that can be used while moving. Specifically, the mobile terminal 100 refers to a mobile computer device equipped with a smart operating system, including but not limited to a smartphone, a smart watch, a tablet computer, and so on.
Referring to FIG. 12, FIG. 12 shows a structural block diagram of a computer-readable storage medium provided by an embodiment of the present application. The computer-readable storage medium 300 stores program code that can be called by a processor to execute the methods described in the above method embodiments.
The computer-readable storage medium 300 may be an electronic memory such as flash memory, EEPROM (electrically erasable programmable read-only memory), EPROM, a hard disk, or ROM. Optionally, the computer-readable storage medium 300 includes a non-transitory computer-readable storage medium. The computer-readable storage medium 300 has storage space for program code 310 that executes any of the method steps of the above methods. The program code can be read from, or written into, one or more computer program products. The program code 310 may, for example, be compressed in an appropriate form.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "an example", "a specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present application. In this specification, illustrative uses of these terms do not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, where no contradiction arises, a person skilled in the art may combine the different embodiments or examples described in this specification and the features of the different embodiments or examples.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "a plurality of" means at least two, for example two or three, unless otherwise explicitly and specifically defined.
Any process or method description in a flowchart, or otherwise described herein, may be understood as representing a module, segment, or portion of code comprising one or more executable instructions for implementing specific logical functions or steps of the process; and the scope of the preferred embodiments of the present application includes additional implementations in which functions may be executed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order, depending on the functions involved. This should be understood by those skilled in the art to which the embodiments of the present application belong.
The logic and/or steps represented in a flowchart or otherwise described herein, for example a sequenced list of executable instructions for implementing logical functions, may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate, or transmit a program for use by, or in connection with, an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection (mobile terminal) with one or more wires, a portable computer disk cartridge (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), a fiber-optic device, and a portable compact disc read-only memory (CDROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program can be printed, since the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner where necessary, and then stored in a computer memory.
It should be understood that the parts of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented by software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any of the following technologies known in the art, or a combination thereof, may be used: discrete logic circuits with logic gate circuits for implementing logical functions on data signals, application-specific integrated circuits with suitable combinational logic gate circuits, programmable gate arrays (PGA), field programmable gate arrays (FPGA), and so on.
A person of ordinary skill in the art will understand that all or some of the steps carried by the methods of the above embodiments can be completed by instructing the relevant hardware through a program; the program may be stored in a computer-readable storage medium and, when executed, includes one of the steps of the method embodiments or a combination thereof. In addition, the functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist physically on its own, or two or more units may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like. Although the embodiments of the present application have been shown and described above, it will be understood that the above embodiments are exemplary and shall not be construed as limiting the present application; a person of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present application.
Finally, it should be noted that the above embodiments are merely intended to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art will understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (20)

  1. A text recognition method, characterized in that the method comprises:
    detecting a touch operation acting on a user interface and, when the touch operation satisfies a preset condition, recognizing an interface element on the user interface corresponding to the position of the touch operation;
    when recognition is unsuccessful, capturing a control image corresponding to the position of the touch operation and performing recognition on the control image;
    displaying at least one card overlaid on a partial region of the user interface, the at least one card being used to display information recognized from the control image.
  2. The method according to claim 1, characterized in that the capturing, when recognition is unsuccessful, a control image corresponding to the position of the touch operation and performing recognition on the control image further comprises:
    when recognition is unsuccessful, obtaining the application corresponding to the user interface;
    judging whether the application is a key application;
    when the application is a key application, capturing the control image corresponding to the position of the touch operation and performing recognition on the control image;
    when the application is not a key application, displaying a selection control on the user interface, wherein the selection control is used to trigger manual box selection or to cancel recognition.
  3. The method according to claim 2, characterized in that the capturing, when the application is a key application, the control image corresponding to the position of the touch operation and performing recognition on the control image comprises:
    when the application is a key application, obtaining a control type corresponding to the position of the touch operation;
    judging whether the control type satisfies a preset type;
    when the control type satisfies the preset type, capturing the control image corresponding to the position of the touch operation and performing recognition on the control image.
  4. The method according to claim 3, characterized in that the displaying at least one card overlaid on a partial region of the user interface comprises:
    judging whether valid information can be recognized from the control image, wherein the confidence probability of the valid information is higher than a preset value;
    when valid information can be recognized from the control image, displaying at least one card overlaid on a partial region of the user interface;
    when valid information cannot be recognized from the control image, displaying the selection control on the user interface.
  5. The method according to claim 4, characterized in that the displaying the selection control on the user interface when valid information cannot be recognized from the control image comprises:
    obtaining the duration of the recognition performed on the control image;
    judging whether the duration exceeds a preset duration;
    when the duration exceeds the preset duration, displaying the selection control on the user interface.
  6. The method according to claim 4, characterized in that the displaying the selection control on the user interface when valid information cannot be recognized from the control image further comprises:
    when valid information cannot be recognized from the control image, displaying on the user interface the selection control and prompt information indicating that recognition was unsuccessful.
  7. The method according to any one of claims 3-6, characterized in that the preset type is text view.
  8. The method according to any one of claims 2-7, characterized in that the judging whether the application is a key application comprises:
    obtaining the usage frequency of the application;
    judging whether the usage frequency is greater than a frequency threshold;
    when the usage frequency is greater than the frequency threshold, determining that the application is a key application.
  9. The method according to any one of claims 2-7, characterized in that the judging whether the application is a key application comprises:
    obtaining the type of the application;
    judging whether the type is a preset type;
    when the type is the preset type, determining that the application is a key application.
  10. The method according to any one of claims 2-9, characterized in that, after the displaying a selection control on the user interface, the method further comprises:
    if a manual box-selection operation on the selection control is detected, displaying on the user interface a box-selection control, a QR-code recognition control, a product recognition control, and a text recognition control.
  11. The method according to any one of claims 1-10, characterized in that the capturing, when recognition is unsuccessful, a control image corresponding to the position of the touch operation and performing recognition on the control image further comprises:
    when recognition is unsuccessful, obtaining the touch centre position corresponding to the touch operation;
    judging whether the touch centre position falls on a valid control of the user interface, wherein the valid control includes at least one interface element;
    when the touch centre position falls on the valid control, capturing a valid control image and performing recognition on the valid control image;
    when the touch centre position does not fall on the valid control, capturing a user interface image and performing recognition on the user interface image.
  12. The method according to claim 11, characterized in that the judging whether the touch centre position falls on a valid control of the user interface comprises:
    detecting the coordinate information of each valid control of the user interface and the coordinate information of the touch centre;
    judging whether the coordinate information of the touch centre falls within the coordinate information of any one of the valid controls.
  13. The method according to any one of claims 1-12, characterized in that the recognizing an interface element on the user interface corresponding to the position of the touch operation comprises:
    obtaining the coordinate information corresponding to the position of the touch operation;
    obtaining at least one interface element located at the coordinate information, and performing recognition on the at least one interface element.
  14. The method according to any one of claims 1-12, characterized in that the recognizing an interface element on the user interface corresponding to the position of the touch operation comprises:
    obtaining the coordinate information corresponding to the position of the touch operation;
    obtaining at least one element located at the coordinate information, and obtaining the paragraph in which the at least one element is located;
    obtaining all elements included in the paragraph, and determining all elements included in the paragraph as the at least one interface element.
  15. The method according to any one of claims 1-14, characterized in that the capturing a control image corresponding to the position of the touch operation and performing recognition on the control image comprises:
    capturing the control image corresponding to the position of the touch operation and performing recognition on the control image, and displaying prompt information on the user interface, wherein the prompt information is used to inform the user that an image-to-text recognition operation is in progress.
  16. The method according to any one of claims 1-15, characterized in that the capturing a control image corresponding to the position of the touch operation and performing recognition on the control image comprises:
    capturing the control image corresponding to the position of the touch operation and obtaining at least one keyword from the control image;
    searching for the at least one keyword to obtain search result information corresponding to the content of the control image.
  17. The method according to any one of claims 1-16, characterized in that the interface element comprises text, a picture, audio, and/or video.
  18. A text recognition apparatus, characterized in that the apparatus comprises:
    an interface element recognition module, configured to detect a touch operation acting on a user interface and, when the touch operation satisfies a preset condition, recognize an interface element on the user interface corresponding to the position of the touch operation;
    an image capture module, configured to, when recognition is unsuccessful, capture a control image corresponding to the position of the touch operation and perform recognition on the control image;
    a card display module, configured to display at least one card overlaid on a partial region of the user interface, the at least one card being used to display information recognized from the control image.
  19. A mobile terminal, characterized in that it comprises a touch screen, a memory, and a processor, the touch screen and the memory being coupled to the processor, the memory storing instructions which, when executed by the processor, cause the processor to execute the method according to any one of claims 1-17.
  20. A computer-readable storage medium, characterized in that the computer-readable storage medium stores program code that can be called by a processor to execute the method according to any one of claims 1-17.
PCT/CN2019/084377 2018-06-07 2019-04-25 Text recognition method and apparatus, mobile terminal, and storage medium WO2019233212A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810586716.2A CN109002759A (zh) 2018-06-07 2018-06-07 Text recognition method and apparatus, mobile terminal, and storage medium
CN201810586716.2 2018-06-07

Publications (1)

Publication Number Publication Date
WO2019233212A1 true WO2019233212A1 (zh) 2019-12-12

Family

ID=64600402

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/084377 WO2019233212A1 (zh) 2018-06-07 2019-04-25 文本识别方法、装置、移动终端以及存储介质

Country Status (2)

Country Link
CN (1) CN109002759A (zh)
WO (1) WO2019233212A1 (zh)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111598128A (zh) * 2020-04-09 2020-08-28 腾讯科技(上海)有限公司 Control state recognition and control method, apparatus, device, and medium for a user interface
CN113159029A (zh) * 2020-12-18 2021-07-23 深圳简捷电子科技有限公司 Method and system for accurately capturing local information in a picture
CN113900621A (zh) * 2021-11-09 2022-01-07 杭州逗酷软件科技有限公司 Operation instruction processing method, control method, apparatus, and electronic device
CN114967994A (zh) * 2021-02-26 2022-08-30 Oppo广东移动通信有限公司 Text processing method and apparatus, and electronic device
CN115035520A (zh) * 2021-11-22 2022-09-09 荣耀终端有限公司 Image text recognition method, electronic device, and storage medium
CN117148981A (zh) * 2023-08-28 2023-12-01 广州文石信息科技有限公司 Text input method, apparatus, and device for an ink screen, and storage medium

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109002759A (zh) 2018-06-07 2018-12-14 Oppo广东移动通信有限公司 Text recognition method and apparatus, mobile terminal, and storage medium
CN109803050B (zh) * 2019-01-14 2020-09-25 南京点明软件科技有限公司 Full-screen guided clicking method suitable for blind users operating a mobile phone
CN109857673B (zh) * 2019-02-25 2022-02-15 北京云测信息技术有限公司 Control recognition method and apparatus
CN110069161B (zh) * 2019-04-01 2023-03-10 努比亚技术有限公司 Screen recognition method, mobile terminal, and computer-readable storage medium
CN110704153B (zh) * 2019-10-10 2021-11-19 深圳前海微众银行股份有限公司 Interface logic parsing method, apparatus, and device, and readable storage medium
CN114332887A (zh) * 2019-12-26 2022-04-12 腾讯科技(深圳)有限公司 Image processing method and apparatus, computer device, and storage medium
CN111338540B (zh) * 2020-02-11 2022-02-18 Oppo广东移动通信有限公司 Picture text processing method and apparatus, electronic device, and storage medium
CN111242109B (zh) * 2020-04-26 2021-02-02 北京金山数字娱乐科技有限公司 Method and apparatus for manual word capture
CN112068763A (zh) * 2020-09-22 2020-12-11 深圳市欢太科技有限公司 Target information management method and apparatus, electronic device, and storage medium
CN114564141A (zh) * 2020-11-27 2022-05-31 华为技术有限公司 Text extraction method and apparatus
CN112596656A (zh) * 2020-12-28 2021-04-02 北京小米移动软件有限公司 Content recognition method and apparatus, and storage medium
CN112822539B (zh) * 2020-12-30 2023-07-14 咪咕文化科技有限公司 Information display method and apparatus, server, and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103294665A (zh) * 2012-02-22 2013-09-11 汉王科技股份有限公司 Text translation method for an electronic reader, and electronic reader
CN105045504A (zh) * 2015-07-23 2015-11-11 小米科技有限责任公司 Image content extraction method and apparatus
US9690474B2 (en) * 2007-12-21 2017-06-27 Nokia Technologies Oy User interface, device and method for providing an improved text input
CN106951893A (zh) * 2017-05-08 2017-07-14 奇酷互联网络科技(深圳)有限公司 Text information acquisition method and apparatus, and mobile terminal
CN107256109A (zh) * 2017-05-27 2017-10-17 北京小米移动软件有限公司 Information display method and apparatus, and terminal
CN107391017A (zh) * 2017-07-20 2017-11-24 广东欧珀移动通信有限公司 Text processing method and apparatus, mobile terminal, and storage medium
CN109002759A (zh) * 2018-06-07 2018-12-14 Oppo广东移动通信有限公司 Text recognition method and apparatus, mobile terminal, and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104778194A (zh) * 2014-12-26 2015-07-15 北京奇虎科技有限公司 Search method and apparatus based on touch operation
CN111381751A (zh) * 2016-10-18 2020-07-07 北京字节跳动网络技术有限公司 Text processing method and apparatus


Also Published As

Publication number Publication date
CN109002759A (zh) 2018-12-14

Similar Documents

Publication Publication Date Title
WO2019233212A1 (zh) Text recognition method and apparatus, mobile terminal, and storage medium
CN108496150B (zh) Screen capture and reading method, and terminal
US11237703B2 (en) Method for user-operation mode selection and terminals
US20180253422A1 (en) Method, apparatus, and system for providing translated content
WO2019233316A1 (zh) Data processing method and apparatus, mobile terminal, and storage medium
CN109074171B (zh) Input method and electronic device
CN108958576B (zh) Content recognition method and apparatus, and mobile terminal
CN108388671B (zh) Information sharing method and apparatus, mobile terminal, and computer-readable medium
US11250046B2 (en) Image viewing method and mobile terminal
CN111597542A (zh) Verification information sharing method and apparatus, electronic device, and storage medium
CN108881979B (zh) Information processing method and apparatus, mobile terminal, and storage medium
CN109032491A (zh) Data processing method and apparatus, and mobile terminal
CN109085982B (zh) Content recognition method and apparatus, and mobile terminal
CN112306799A (zh) Method for obtaining abnormality information, terminal device, and readable storage medium
WO2019223484A1 (zh) Information display method and apparatus, mobile terminal, and storage medium
WO2019201109A1 (zh) Text processing method and apparatus, mobile terminal, and storage medium
WO2019228370A1 (zh) Data processing method and apparatus, mobile terminal, and storage medium
CN111602134A (zh) Mail translation method and electronic device
CN109062648B (zh) Information processing method and apparatus, mobile terminal, and storage medium
CN108762641B (zh) Text editing method and terminal device
CN108810262B (zh) Application configuration method, terminal, and computer-readable storage medium
CN108803961B (zh) Data processing method and apparatus, and mobile terminal
CN112749074A (zh) Test case recommendation method and apparatus
CN105513098B (zh) Image processing method and apparatus
CN109684006B (zh) Terminal control method and apparatus

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19814008

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19814008

Country of ref document: EP

Kind code of ref document: A1