CN117935273A - Real-time text recognition and interaction method based on augmented reality - Google Patents

Publication number
CN117935273A
Authority
CN
China
Prior art keywords
text
real
environment
recognition
interaction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311837498.2A
Other languages
Chinese (zh)
Inventor
章惠龙
郭磊
王乐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Longyao Vision Technology Co ltd
Original Assignee
Beijing Longyao Vision Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2023-12-28
Filing date
2023-12-28
Publication date
2024-04-26
Application filed by Beijing Longyao Vision Technology Co ltd
Priority to CN202311837498.2A
Publication of CN117935273A

Abstract

The invention discloses a real-time text recognition and interaction method and system based on augmented reality, relating to the technical field of optical character recognition. The method comprises the following steps: the front end acquires a text image in a real environment and extracts a target text region from the text image; the back end detects a text region from the target text region by using a first preset model, performs character recognition on the text region by using a second preset model to obtain text content, and sends the text content to the front end; the front end visually displays the text content in an AR environment. The method uses OCR technology and AR technology to realize real-time text recognition and interaction in a real environment. Through cooperation of the front end and the back end, text content can be detected and recognized quickly and accurately and displayed visually in an AR environment, and rich interaction instructions are provided so that a user can interact with the text content flexibly.

Description

Real-time text recognition and interaction method based on augmented reality
Technical Field
The invention relates to the technical field of optical character recognition, in particular to a real-time text recognition and interaction method based on augmented reality.
Background
Augmented Reality (AR) technology combines virtual information with the real world. In recent years, as AR technology has continued to develop, it has been applied in an increasingly wide range of fields. However, implementing real-time text recognition and interaction functions in an AR environment remains a technical challenge. Traditional Optical Character Recognition (OCR) systems focus mainly on recognizing text from static images and ignore the dynamic and real-time requirements of AR environments. Combining OCR technology with AR technology to provide real-time text recognition and interaction functions is therefore an urgent problem in the current technical field. Most existing AR applications still focus on the enhancement of images and video and ignore interaction with text. This prevents users from acquiring, recognizing and understanding text information in real time in an AR environment and limits the application scope of AR technology. Developing a method capable of recognizing and interacting with text in real time is therefore of great significance for expanding the application fields of AR technology and improving the user experience.
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides a real-time text recognition and interaction method based on augmented reality.
In one aspect, a real-time text recognition and interaction method based on augmented reality includes:
the front end acquires a text image in a real environment, and a target text region is extracted from the text image;
the back end detects a text region from the target text region by using a first preset model, performs character recognition on the text region by using a second preset model to obtain text content, and sends the text content to the front end;
the front end performs visual display on the text content in an AR environment.
Preferably, the front end is constructed based on the A-Frame technology.
Preferably, extracting a target text region from the text image includes:
identifying text regions from the text image, and highlighting the text regions;
acquiring a region selection instruction of a user;
and determining a target text region from the text regions according to the region selection instruction.
Preferably, the first preset model is a DBNet deep learning model, and the second preset model is a CRNN model.
Preferably, before the text content is sent to the front end, the method further comprises: formatting the text content according to a preset format.
Preferably, the front end performs visual display on the text content in an AR environment, including:
acquiring a first text interaction instruction of a user;
executing the first text interaction instruction in an AR environment;
wherein the first text interaction instruction includes adding an access link and translating text.
Preferably, the front end performs visual display on the text content in an AR environment, and further includes:
acquiring a second text interaction instruction of a user;
rendering the text according to the second text interaction instruction, and updating the visual display effect of the text content in the AR environment;
wherein the second text interaction instruction comprises one or more of zooming in, zooming out, rotating, splitting and assembling.
In another aspect, a real-time text recognition and interaction system based on augmented reality comprises a front end and a back end;
the front end is used for acquiring a text image in a real environment, acquiring a target text region from the text image and sending the target text region to the back end;
the back end is used for detecting a text region from the target text region by using a first preset model, performing character recognition on the text region by using a second preset model to obtain text content, and sending the text content to the front end;
the front end is also used for visually displaying the text content in an AR environment.
The beneficial effects of the invention are as follows: the invention provides a real-time text recognition and interaction method and system based on augmented reality, which realize real-time text recognition and interaction in a real environment by using OCR technology and AR technology. Through cooperation of the front end and the back end, text content can be detected and recognized quickly and accurately and displayed visually in an AR environment, and rich interaction instructions are provided so that a user can interact with the text content flexibly.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. Like elements or portions are generally identified by like reference numerals throughout the several figures. In the drawings, elements or portions thereof are not necessarily drawn to scale.
FIG. 1 is a flowchart of a real-time text recognition and interaction method based on augmented reality according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a real-time text recognition and interaction system based on augmented reality according to an embodiment of the present invention.
Detailed Description
Embodiments of the technical scheme of the present invention will be described in detail below with reference to the accompanying drawings. The following examples are only for more clearly illustrating the technical aspects of the present invention, and thus are merely examples, and are not intended to limit the scope of the present invention.
It is noted that unless otherwise indicated, technical or scientific terms used herein should be given the ordinary meaning as understood by one of ordinary skill in the art to which this application belongs.
Example 1
As shown in fig. 1, an embodiment of the present invention provides a real-time text recognition and interaction method based on augmented reality, which includes:
Step 1: the front end acquires a text image in a real environment and extracts a target text region from the text image.
In an embodiment of the invention, the front end is built based on the A-Frame technology. Of course, other web technologies may be selected to construct the front end, which is not limited in the embodiment of the present invention.
In an embodiment of the present invention, extracting a target text region from the text image includes: identifying text regions from the text image, and highlighting the text regions; acquiring a region selection instruction of a user; and determining a target text region from the text regions according to the region selection instruction.
The front end provides an AR function, can identify text areas in the real world, and highlights the text areas on a user interface in real time, so that a user can quickly locate and pay attention to specific text areas in a given text image according to own needs, and the operation is simple and convenient.
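The region selection step can be illustrated with a minimal sketch. It assumes, purely for illustration, that highlighted text regions are axis-aligned boxes in image pixels and that the region selection instruction is a single tap point; the names `TextRegion` and `select_target_region` are hypothetical and do not come from the embodiment.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class TextRegion:
    """Axis-aligned bounding box of a highlighted text region, in image pixels."""
    x: float
    y: float
    width: float
    height: float

    def contains(self, px: float, py: float) -> bool:
        # A point on the border counts as inside the region.
        return (self.x <= px <= self.x + self.width
                and self.y <= py <= self.y + self.height)

    def area(self) -> float:
        return self.width * self.height


def select_target_region(regions: Sequence[TextRegion],
                         tap_x: float, tap_y: float) -> Optional[TextRegion]:
    """Return the smallest highlighted region containing the tap, or None."""
    hits = [r for r in regions if r.contains(tap_x, tap_y)]
    # Preferring the smallest hit resolves nested regions in favor of the
    # most specific one (an assumed tie-breaking rule).
    return min(hits, key=TextRegion.area) if hits else None
```

Choosing the smallest containing box is only one possible rule; a real front end might instead use a drag rectangle or a gaze ray as the selection instruction.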
Step 2: the back end detects a text region from the target text region by using a first preset model, performs character recognition on the text region by using a second preset model to obtain text content, and sends the text content to the front end.
In the embodiment of the invention, the first preset model is a DBNet deep learning model. DBNet is a deep learning model designed specifically for text detection: it extracts image features through a convolutional neural network (CNN) and then predicts and localizes text regions by applying differentiable binarization to a predicted probability map. The second preset model is a CRNN model. CRNN is a deep learning model for sequence recognition that comprises convolutional layers, recurrent layers and a transcription layer, so that sequential data can be processed effectively. Through the CRNN model, the back end can accurately recognize characters and acquire the text content. Of course, other text region detection models and character recognition models may be selected, which is not limited in the embodiment of the present invention.
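As an illustration of what a CRNN transcription layer does, the sketch below shows greedy CTC decoding, a decoding scheme commonly paired with CRNN. The embodiment does not state which decoding scheme is used, so this is an assumption; the blank index `BLANK = 0` and the function name `ctc_greedy_decode` are likewise hypothetical.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank label


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]],
                      alphabet: str) -> str:
    """Collapse per-frame best labels: merge repeats, then drop blanks.

    frame_scores[t][k] is the score of label k at time step t, where
    label 0 is the blank and label k > 0 maps to alphabet[k - 1].
    """
    decoded: List[str] = []
    previous = BLANK
    for scores in frame_scores:
        # Pick the highest-scoring label for this frame.
        best = max(range(len(scores)), key=scores.__getitem__)
        # Emit a character only when the label is not blank and differs
        # from the label of the previous frame.
        if best != BLANK and best != previous:
            decoded.append(alphabet[best - 1])
        previous = best
    return "".join(decoded)
```

A blank frame between two identical labels is what allows doubled characters to be recovered, which is why the previous label is tracked even when it is blank.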
In the embodiment of the present invention, before the text content is sent to the front end, the method further includes: formatting the text content according to a preset format.
To better present the text content in the AR environment, the back end formats the recognized text content according to a preset format (e.g., font, color, size, etc.). The formatted text content is then sent to the front end for display.
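The formatting step might look like the following sketch, which assumes the back end sends the text content to the front end as a JSON message. The embodiment does not specify a wire format, so the field names, the default style values and the names `TextStyle` and `format_text_content` are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TextStyle:
    """Preset display format; the default values are illustrative only."""
    font: str = "sans-serif"
    color: str = "#FFFFFF"
    size: float = 1.0


def format_text_content(text: str, style: TextStyle = TextStyle()) -> str:
    """Serialize recognized text plus its preset style as a JSON string."""
    payload = {"text": text.strip(), "style": asdict(style)}
    # ensure_ascii=False keeps non-Latin characters readable in the message.
    return json.dumps(payload, ensure_ascii=False)
```

Keeping the style next to the text lets the front end render each message without a separate configuration lookup.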
Step 3: the front end performs visual display on the text content in an AR environment.
In the embodiment of the invention, the front end performs visual display on the text content in an AR environment, which includes: acquiring a first text interaction instruction of a user; executing the first text interaction instruction in the AR environment; wherein the first text interaction instruction includes adding an access link and translating text.
In the embodiment of the invention, the front end performs visual display on the text content in the AR environment, which further includes: acquiring a second text interaction instruction of a user; rendering the text according to the second text interaction instruction, and updating the visual display effect of the text content in the AR environment; wherein the second text interaction instruction comprises one or more of zooming in, zooming out, rotating, splitting and assembling.
The user may issue a first text interaction instruction, such as an operation instruction to add an access link and translate text, through a gesture, an interface control, or other interaction means. The front end will perform corresponding operations in the AR environment, such as adding an access link or performing text translation, etc., according to the user's first text interaction instruction.
The interaction mode provided by the embodiment enables the user to acquire more information more conveniently, and cross-cultural exchange and understanding of text contents are deepened.
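One way to dispatch the first text interaction instruction is sketched below. The instruction names `"add_link"` and `"translate"`, the `ARText` type and the idea of passing the translator in as a callable are assumptions made for illustration; the embodiment does not name a translation service.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional


@dataclass(frozen=True)
class ARText:
    """Text content shown in the AR scene, with an optional access link."""
    content: str
    link: Optional[str] = None


def execute_first_instruction(item: ARText, instruction: str, **params) -> ARText:
    """Apply an 'add_link' or 'translate' instruction and return the new item."""
    if instruction == "add_link":
        # params["url"] is the access link to attach to the text.
        return replace(item, link=params["url"])
    if instruction == "translate":
        # params["translate"] is any callable mapping source text to target text.
        translate: Callable[[str], str] = params["translate"]
        return replace(item, content=translate(item.content))
    raise ValueError(f"unknown first text interaction instruction: {instruction}")
```

Returning a new immutable item rather than mutating in place makes it straightforward for the front end to re-render only when the displayed text actually changes.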
Illustratively, the scaling operation: the user may zoom in or out on the text via gestures or interface controls to view a particular portion of the text in more detail. Rotation viewing: the rotation function is provided, so that a user can observe the text from different angles, and the naturalness and intuitiveness of interaction are improved. Text splitting and assembling: allowing the user to split and reassemble the text to explore different portions of the text or related content. Touch and gesture control: advanced touch and gesture recognition techniques are utilized to enable users to interact directly with text, providing intuitive operational experience.
The user can easily perform interactive operation on the text in the AR environment, and the functions of enlarging, reducing, rotating, splitting, assembling and the like are realized, so that visual display and exploration of text contents are performed more intuitively and flexibly.
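The second text interaction instructions can be modeled as pure functions over a small view state, as in the sketch below. The `TextView` type, the whitespace-based splitting rule and the function names are hypothetical; an actual front end would apply the resulting scale and rotation to the rendered AR entity.

```python
from dataclasses import dataclass, replace
from typing import List, Sequence


@dataclass(frozen=True)
class TextView:
    """Display state of one piece of text in the AR scene."""
    content: str
    scale: float = 1.0
    rotation_deg: float = 0.0


def zoom(view: TextView, factor: float) -> TextView:
    """Zoom in (factor > 1) or out (0 < factor < 1)."""
    if factor <= 0:
        raise ValueError("zoom factor must be positive")
    return replace(view, scale=view.scale * factor)


def rotate(view: TextView, degrees: float) -> TextView:
    """Rotate the text, keeping the angle in the range [0, 360)."""
    return replace(view, rotation_deg=(view.rotation_deg + degrees) % 360.0)


def split(view: TextView) -> List[TextView]:
    """Split the text into one view per whitespace-separated word."""
    return [replace(view, content=word) for word in view.content.split()]


def assemble(parts: Sequence[TextView]) -> TextView:
    """Join the parts back into one view, reusing the first part's display state."""
    if not parts:
        raise ValueError("nothing to assemble")
    return replace(parts[0], content=" ".join(p.content for p in parts))
```

Splitting on whitespace suits space-delimited languages only; Chinese text, for example, would need a word segmenter instead.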
The embodiment of the invention provides a real-time text recognition and interaction method based on augmented reality, which comprises the following steps: the front end acquires a text image in a real environment and extracts a target text region from the text image; the back end detects a text region from the target text region by using a first preset model, performs character recognition on the text region by using a second preset model to obtain text content, and sends the text content to the front end; the front end visually displays the text content in an AR environment. The method uses OCR technology and AR technology to realize real-time text recognition and interaction in a real environment. Through cooperation of the front end and the back end, text content can be detected and recognized quickly and accurately and displayed visually in an AR environment, and rich interaction instructions are provided so that a user can interact with the text content flexibly.
Example 2
As shown in fig. 2, an embodiment of the present invention provides a real-time text recognition and interaction system based on augmented reality, which includes a front end and a back end; the front end is used for acquiring a text image in a real environment, acquiring a target text region from the text image and sending the target text region to the back end; the back end is used for detecting a text region from the target text region by using a first preset model, performing character recognition on the text region by using a second preset model to obtain text content, and sending the text content to the front end; the front end is also used for visually displaying the text content in an AR environment.
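The division of work between the two ends can be sketched as follows, with the two preset models injected as plain callables so that any detector and recognizer can be plugged in. The class names, the in-process call standing in for network transport and the image representation (rows of pixel values) are assumptions for illustration only.

```python
from typing import Callable, List, Sequence, Tuple

Image = Sequence[Sequence[int]]   # hypothetical grayscale image as rows of pixels
Box = Tuple[int, int, int, int]   # (x, y, width, height)


def crop(image: Image, box: Box) -> List[List[int]]:
    """Cut the given box out of the image."""
    x, y, w, h = box
    return [list(row[x:x + w]) for row in image[y:y + h]]


class BackEnd:
    """Runs detection (first model), then recognition (second model)."""

    def __init__(self, detect: Callable[[Image], List[Box]],
                 recognize: Callable[[Image], str]) -> None:
        self._detect = detect
        self._recognize = recognize

    def process(self, target_region: Image) -> str:
        boxes = self._detect(target_region)
        # One recognized line of text per detected text region.
        return "\n".join(self._recognize(crop(target_region, b)) for b in boxes)


class FrontEnd:
    """Crops the user-selected target region and hands it to the back end."""

    def __init__(self, back_end: BackEnd) -> None:
        self._back_end = back_end

    def handle_frame(self, image: Image, target: Box) -> str:
        # In a deployed system this call would cross the network.
        return self._back_end.process(crop(image, target))
```

Sending only the cropped target region, as the front end does here, keeps the data passed to the back end small, which matters for real-time use.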
It should be understood that the real-time text recognition and interaction system based on augmented reality shown in fig. 2 and the real-time text recognition and interaction method based on augmented reality provided by the foregoing embodiment belong to the same inventive concept. For the more specific working principles of each module and unit in this embodiment, reference may be made to the foregoing embodiment; they are not repeated here.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention, and are intended to be included within the scope of the appended claims and description.

Claims (8)

1. A real-time text recognition and interaction method based on augmented reality, comprising:
the front end acquires a text image in a real environment, and a target text region is extracted from the text image;
the back end detects a text region from the target text region by using a first preset model, performs character recognition on the text region by using a second preset model to obtain text content, and sends the text content to the front end;
the front end performs visual display on the text content in an AR environment.
2. The augmented reality-based real-time text recognition and interaction method of claim 1, wherein the front end is constructed based on the A-Frame technology.
3. The augmented reality-based real-time text recognition and interaction method of claim 1, wherein extracting a target text region from the text image comprises:
identifying text regions from the text image, and highlighting the text regions;
acquiring a region selection instruction of a user;
and determining a target text region from the text regions according to the region selection instruction.
4. The augmented reality-based real-time text recognition and interaction method according to claim 1, wherein the first preset model is a DBNet deep learning model and the second preset model is a CRNN model.
5. The augmented reality-based real-time text recognition and interaction method of claim 1, wherein, before the text content is sent to the front end, the method further comprises: formatting the text content according to a preset format.
6. The augmented reality-based real-time text recognition and interaction method of claim 1, wherein the front-end visually presents the text content in an AR environment, comprising:
acquiring a first text interaction instruction of a user;
executing the first text interaction instruction in an AR environment;
wherein the first text interaction instruction includes adding an access link and translating text.
7. The augmented reality-based real-time text recognition and interaction method of claim 6, wherein the front-end visually presents the text content in an AR environment, further comprising:
acquiring a second text interaction instruction of a user;
rendering the text according to the second text interaction instruction, and updating the visual display effect of the text content in the AR environment;
wherein the second text interaction instruction comprises one or more of zooming in, zooming out, rotating, splitting and assembling.
8. A real-time text recognition and interaction system based on augmented reality, comprising a front end and a back end;
the front end is used for acquiring a text image in a real environment, acquiring a target text region from the text image and sending the target text region to the back end;
the back end is used for detecting a text region from the target text region by using a first preset model, performing character recognition on the text region by using a second preset model to obtain text content, and sending the text content to the front end;
the front end is also used for visually displaying the text content in an AR environment.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311837498.2A CN117935273A (en) 2023-12-28 2023-12-28 Real-time text recognition and interaction method based on augmented reality

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311837498.2A CN117935273A (en) 2023-12-28 2023-12-28 Real-time text recognition and interaction method based on augmented reality

Publications (1)

Publication Number Publication Date
CN117935273A (en) 2024-04-26

Family

ID=90758475

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311837498.2A Pending CN117935273A (en) 2023-12-28 2023-12-28 Real-time text recognition and interaction method based on augmented reality

Country Status (1)

Country Link
CN (1) CN117935273A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination