CN117935273A - Real-time text recognition and interaction method based on augmented reality - Google Patents
Real-time text recognition and interaction method based on augmented reality
- Publication number
- CN117935273A (application number CN202311837498.2A)
- Authority
- CN
- China
- Prior art keywords
- text
- real
- environment
- recognition
- interaction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention discloses a real-time text recognition and interaction method and system based on augmented reality, and relates to the technical field of optical character recognition. The method comprises the following steps: the front end acquires a text image in a real environment and extracts a target text region from the text image; the back end detects a text region in the target text region by using a first preset model, performs character recognition on the text region by using a second preset model to obtain text content, and sends the text content to the front end; the front end visually displays the text content in an AR environment. The method uses OCR technology and AR technology to realize real-time text recognition and interaction in a real environment. Through cooperation of the front end and the back end, text content can be detected and recognized rapidly and accurately and displayed visually in an AR environment, and rich interaction instructions are provided so that a user can interact with the text content flexibly.
Description
Technical Field
The invention relates to the technical field of optical character recognition, in particular to a real-time text recognition and interaction method based on augmented reality.
Background
Augmented reality (AR) is a technology that combines virtual information with the real world. In recent years, as AR technology has continued to develop, it has been applied in an increasingly wide range of fields. However, implementing real-time text recognition and interaction in an AR environment remains a technical challenge. Traditional optical character recognition (OCR) systems focus mainly on recognizing text in static images and ignore the dynamic, real-time requirements of AR environments. Combining OCR technology with AR technology to provide real-time text recognition and interaction is therefore an urgent problem in the field. Most existing AR applications still focus on enhancing images and video and neglect interaction with text. This prevents users from acquiring, recognizing and understanding text information in real time in an AR environment, and limits the application scope of AR technology. Developing a method that can recognize and interact with text in real time is therefore of great significance for expanding the application fields of AR technology and improving the user experience.
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides a real-time text recognition and interaction method based on augmented reality.
In one aspect, a real-time text recognition and interaction method based on augmented reality includes:
the front end acquires a text image in a real environment and extracts a target text region from the text image;
the back end detects a text region in the target text region by using a first preset model, performs character recognition on the text region by using a second preset model to obtain text content, and sends the text content to the front end;
the front end visually displays the text content in an AR environment.
Preferably, the front end is constructed based on A-Frame technology.
Preferably, extracting a target text region from the text image includes:
identifying text regions in the text image, and highlighting the text regions;
acquiring a region selection instruction of a user;
determining a target text region from the text regions according to the region selection instruction.
Preferably, the first preset model is a DBNet deep learning model, and the second preset model is a CRNN model.
Preferably, before the text content is sent to the front end, the method further includes: formatting the text content according to a preset format.
Preferably, the front end performs visual display on the text content in an AR environment, including:
acquiring a first text interaction instruction of a user;
executing the first text interaction instruction in an AR environment;
wherein the first text interaction instruction includes adding an access link and translating text.
Preferably, the front end performs visual display on the text content in an AR environment, and further includes:
Acquiring a second text interaction instruction of a user;
rendering the text according to the second text interaction instruction, and updating the visual display effect of the text content in the AR environment;
wherein the second text interaction instruction includes one or more of zooming in, zooming out, rotating, splitting and assembling.
In another aspect, a real-time text recognition and interaction system based on augmented reality includes a front end and a back end;
the front end is used for acquiring a text image in a real environment, extracting a target text region from the text image, and sending the target text region to the back end;
the back end is used for detecting a text region in the target text region by using a first preset model, performing character recognition on the text region by using a second preset model to obtain text content, and sending the text content to the front end;
the front end is also used for visually displaying the text content in an AR environment.
The beneficial effects of the invention are as follows: the invention provides a real-time text recognition and interaction method and system based on augmented reality, which use OCR technology and AR technology to realize real-time text recognition and interaction in a real environment. Through cooperation of the front end and the back end, text content can be detected and recognized rapidly and accurately and displayed visually in an AR environment, and rich interaction instructions are provided so that a user can interact with the text content flexibly.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. Like elements or portions are generally identified by like reference numerals throughout the several figures. In the drawings, elements or portions thereof are not necessarily drawn to scale.
FIG. 1 is a flowchart of a real-time text recognition and interaction method based on augmented reality according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a real-time text recognition and interaction system based on augmented reality according to an embodiment of the present invention.
Detailed Description
Embodiments of the technical scheme of the present invention will be described in detail below with reference to the accompanying drawings. The following examples are only for more clearly illustrating the technical aspects of the present invention, and thus are merely examples, and are not intended to limit the scope of the present invention.
It is noted that unless otherwise indicated, technical or scientific terms used herein should be given the ordinary meaning as understood by one of ordinary skill in the art to which this application belongs.
Example 1
As shown in FIG. 1, an embodiment of the present invention provides a real-time text recognition and interaction method based on augmented reality, which includes:
Step 1: the front end acquires a text image in a real environment and extracts a target text region from the text image.
In an embodiment of the invention, the front end is built based on A-Frame technology. Of course, other web technologies may be selected to construct the front end, which is not limited in the embodiment of the present invention.
In an embodiment of the present invention, extracting a target text region from the text image includes: identifying text regions in the text image, and highlighting the text regions; acquiring a region selection instruction of a user; and determining a target text region from the text regions according to the region selection instruction.
The front end provides an AR function, can identify text areas in the real world, and highlights the text areas on a user interface in real time, so that a user can quickly locate and pay attention to specific text areas in a given text image according to own needs, and the operation is simple and convenient.
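As a non-limiting illustration, the region selection described above can be sketched as follows. The sketch assumes that highlighted text regions are axis-aligned boxes in image pixel coordinates and that the region selection instruction is a single tap point; the `Box` type, the `select_target_region` function, and the rule of preferring the smallest box under the tap are hypothetical choices made for this example and are not specified by the embodiment.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned text region in image pixel coordinates (hypothetical type)."""
    x: int
    y: int
    width: int
    height: int

    def contains(self, px: int, py: int) -> bool:
        # Left/top edges are inclusive, right/bottom edges are exclusive.
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)

    def area(self) -> int:
        return self.width * self.height


def select_target_region(regions: List[Box], tap: Tuple[int, int]) -> Optional[Box]:
    """Return the smallest highlighted region containing the tap, or None."""
    hits = [box for box in regions if box.contains(*tap)]
    return min(hits, key=Box.area) if hits else None


# Three highlighted regions; the second is nested inside the first.
highlighted = [Box(0, 0, 200, 100), Box(20, 20, 50, 30), Box(300, 40, 80, 20)]
print(select_target_region(highlighted, (30, 30)))    # the nested 50x30 box
print(select_target_region(highlighted, (250, 250)))  # no region under the tap
```

Preferring the smallest containing box is one way to resolve nested highlights in favor of the most specific region; other policies (for example, nearest box center) would serve equally well.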
Step 2: the back end detects a text region in the target text region by using a first preset model, performs character recognition on the text region by using a second preset model to obtain text content, and sends the text content to the front end.
In the embodiment of the invention, the first preset model is a DBNet deep learning model. DBNet is a deep learning model designed for text detection: image features are extracted through a convolutional neural network (CNN), and text regions are then predicted and localized from a segmentation map by differentiable binarization. The second preset model is a CRNN model. CRNN is a deep learning model for sequence recognition that comprises convolutional layers, recurrent layers and a transcription layer, so that sequential data can be processed effectively. Through the CRNN model, the back end can accurately recognize characters and obtain the text content. Of course, other text region detection models and character recognition models may be selected, which is not limited in the embodiment of the present invention.
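As a non-limiting illustration of the transcription layer, the following sketch shows greedy CTC decoding, which is the decoding commonly paired with CRNN. The embodiment does not state which decoding is used, so CTC is an assumption here, as are the function name `ctc_greedy_decode`, the convention that class index 0 is the blank symbol, and the toy alphabet and scores.

```python
from typing import List, Sequence

BLANK = 0  # class index reserved for the CTC blank symbol (assumed convention)


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]], alphabet: str) -> str:
    """Collapse per-frame class scores into a character string.

    frame_scores[t][k] is the score of class k at time step t. Class 0 is
    the blank; class k >= 1 maps to alphabet[k - 1].
    """
    # Pick the highest-scoring class for each frame.
    best = [max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores]
    chars: List[str] = []
    previous = BLANK
    for index in best:
        # Emit a character only when it is not blank and differs from the
        # previous frame; a blank between two equal classes keeps both.
        if index != BLANK and index != previous:
            chars.append(alphabet[index - 1])
        previous = index
    return "".join(chars)


# Toy scores over classes (blank, 'a', 'b'); the per-frame argmax is a a _ a b b.
scores = [
    [0.1, 0.8, 0.1],
    [0.2, 0.7, 0.1],
    [0.9, 0.05, 0.05],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
    [0.1, 0.2, 0.7],
]
print(ctc_greedy_decode(scores, "ab"))  # aab
```

The blank frame between the second and third "a" predictions is what allows the repeated letter to survive the collapse step.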
In the embodiment of the present invention, before the text content is sent to the front end, the method further includes: formatting the text content according to a preset format.
To better present the text content in the AR environment, the back end formats the recognized text content according to a preset format (e.g., font, color, size, etc.). The formatted text content is then sent to the front end for display.
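As a non-limiting illustration, the formatting step can be sketched as wrapping the recognized text together with preset style attributes before transmission. The embodiment names font, color and size as examples of the preset format but does not define a wire format; the JSON layout, the `TextStyle` type, the default values, and the whitespace normalization below are assumptions made for this example.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TextStyle:
    """Preset display format (field names and defaults are hypothetical)."""
    font: str = "sans-serif"
    color: str = "#FFFFFF"
    size: int = 24


def format_text_content(text: str, style: TextStyle = TextStyle()) -> str:
    """Normalize whitespace and serialize text plus style as a JSON payload."""
    cleaned = " ".join(text.split())
    return json.dumps({"text": cleaned, "style": asdict(style)}, ensure_ascii=False)


payload = format_text_content("  EXIT \n 出口 ", TextStyle(color="#00FF00", size=32))
print(payload)
```

Keeping the style alongside the text in one payload lets the front end render the content without a second request; `ensure_ascii=False` keeps non-Latin text readable in the serialized form.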
Step 3: the front end visually displays the text content in an AR environment.
In the embodiment of the invention, the front end performs visual display on the text content in an AR environment, and the method comprises the following steps: acquiring a first text interaction instruction of a user; executing the first text interaction instruction in an AR environment; wherein the first text interaction instruction includes adding an access link and translating text.
In the embodiment of the invention, the front end visually displaying the text content in the AR environment further includes: acquiring a second text interaction instruction of a user; rendering the text according to the second text interaction instruction, and updating the visual display effect of the text content in the AR environment; wherein the second text interaction instruction includes one or more of zooming in, zooming out, rotating, splitting and assembling.
The user may issue a first text interaction instruction, such as an operation instruction to add an access link and translate text, through a gesture, an interface control, or other interaction means. The front end will perform corresponding operations in the AR environment, such as adding an access link or performing text translation, etc., according to the user's first text interaction instruction.
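As a non-limiting illustration, the first text interaction instructions can be sketched as operations on a displayed text item. The `LinkedText` type, the function names, and the dictionary-based translator are hypothetical stand-ins; the embodiment does not specify how links are stored or which translation service is used, so the translator is passed in as a plain callable.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional


@dataclass(frozen=True)
class LinkedText:
    """Recognized text shown in the AR scene (hypothetical type)."""
    content: str
    link: Optional[str] = None


def add_access_link(item: LinkedText, url: str) -> LinkedText:
    """Attach an access link to the text without changing its content."""
    return replace(item, link=url)


def translate_text(item: LinkedText, translator: Callable[[str], str]) -> LinkedText:
    """Replace the content with the translator's output, keeping the link."""
    return replace(item, content=translator(item.content))


# Toy translator standing in for a real translation service.
glossary = {"出口": "Exit", "入口": "Entrance"}


def toy_translator(text: str) -> str:
    return glossary.get(text, text)


sign = LinkedText("出口")
sign = add_access_link(sign, "https://example.com/floor-plan")
sign = translate_text(sign, toy_translator)
print(sign)
```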
The interaction mode provided by the embodiment enables the user to acquire more information more conveniently, and cross-cultural exchange and understanding of text contents are deepened.
Illustratively, the second text interaction instructions may include the following. Zooming: the user may zoom in or out on the text via gestures or interface controls to view a particular portion of the text in more detail. Rotating: a rotation function lets the user observe the text from different angles, improving the naturalness and intuitiveness of interaction. Splitting and assembling: the user may split and reassemble the text to explore different portions of the text or related content. Touch and gesture control: touch and gesture recognition techniques enable the user to interact directly with the text, providing an intuitive operating experience.
The user can easily perform interactive operation on the text in the AR environment, and the functions of enlarging, reducing, rotating, splitting, assembling and the like are realized, so that visual display and exploration of text contents are performed more intuitively and flexibly.
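As a non-limiting illustration, the zooming, rotating, splitting and assembling operations can be sketched as pure functions over a display state that the front end would then re-render. The `DisplayedText` type, the multiplicative zoom factor, and word-level splitting are assumptions made for this example; the embodiment does not specify the granularity of splitting or how scale and rotation are represented.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class DisplayedText:
    """Display state of one text item in the AR scene (hypothetical type)."""
    content: str
    scale: float = 1.0
    rotation_deg: float = 0.0


def zoom(item: DisplayedText, factor: float) -> DisplayedText:
    """Zoom in (factor > 1) or out (0 < factor < 1)."""
    if factor <= 0:
        raise ValueError("zoom factor must be positive")
    return replace(item, scale=item.scale * factor)


def rotate(item: DisplayedText, degrees: float) -> DisplayedText:
    """Rotate by the given angle, keeping the result in [0, 360)."""
    return replace(item, rotation_deg=(item.rotation_deg + degrees) % 360)


def split(item: DisplayedText) -> List[DisplayedText]:
    """Split the text into one item per whitespace-separated word."""
    return [replace(item, content=word) for word in item.content.split()]


def assemble(parts: List[DisplayedText]) -> DisplayedText:
    """Join parts back into one item, taking scale and rotation from the first."""
    if not parts:
        raise ValueError("nothing to assemble")
    return replace(parts[0], content=" ".join(part.content for part in parts))


label = DisplayedText("hello AR world")
print(zoom(label, 2.0))
print(rotate(rotate(label, 270), 180))
print(assemble(split(label)) == label)
```

Returning new state objects rather than mutating in place makes it straightforward to re-render after each instruction and to undo an operation by keeping the previous state.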
The embodiment of the invention provides a real-time text recognition and interaction method based on augmented reality, which comprises the following steps: the front end acquires a text image in a real environment and extracts a target text region from the text image; the back end detects a text region in the target text region by using a first preset model, performs character recognition on the text region by using a second preset model to obtain text content, and sends the text content to the front end; the front end visually displays the text content in an AR environment. The method uses OCR technology and AR technology to realize real-time text recognition and interaction in a real environment. Through cooperation of the front end and the back end, text content can be detected and recognized rapidly and accurately and displayed visually in an AR environment, and rich interaction instructions are provided so that a user can interact with the text content flexibly.
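As a non-limiting illustration, the back-end portion of the method can be sketched as a two-stage pipeline in which the detection model and the recognition model are supplied as interchangeable callables. The type aliases, the `crop` and `recognize_text` functions, the reading-order sort, and the fake models below are hypothetical; real DBNet and CRNN models would be plugged in where the stand-ins are used.

```python
from typing import Callable, List, Sequence, Tuple

Image = Sequence[Sequence[int]]        # grayscale pixel rows (stand-in type)
Region = Tuple[int, int, int, int]     # x, y, width, height
Detector = Callable[[Image], List[Region]]
Recognizer = Callable[[Image], str]


def crop(image: Image, region: Region) -> List[List[int]]:
    """Cut the rectangular region out of the image."""
    x, y, w, h = region
    return [list(row[x:x + w]) for row in image[y:y + h]]


def recognize_text(target: Image, detect: Detector, recognize: Recognizer) -> List[str]:
    """Detect text regions in the target image, then read each region."""
    # Sort top-to-bottom, then left-to-right, to approximate reading order.
    regions = sorted(detect(target), key=lambda r: (r[1], r[0]))
    return [recognize(crop(target, region)) for region in regions]


# Stand-ins for the first and second preset models.
def fake_detector(image: Image) -> List[Region]:
    return [(3, 2, 3, 2), (0, 0, 2, 1)]


def fake_recognizer(patch: Image) -> str:
    return f"{len(patch[0])}x{len(patch)}"  # reports the patch size as "text"


target_image = [[0] * 6 for _ in range(4)]  # 6 pixels wide, 4 pixels high
print(recognize_text(target_image, fake_detector, fake_recognizer))  # ['2x1', '3x2']
```

Passing the models in as callables mirrors the embodiment's statement that other detection and recognition models may be substituted.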
Example two
As shown in FIG. 2, an embodiment of the present invention provides a real-time text recognition and interaction system based on augmented reality, which includes a front end and a back end; the front end is used for acquiring a text image in a real environment, extracting a target text region from the text image, and sending the target text region to the back end; the back end is used for detecting a text region in the target text region by using a first preset model, performing character recognition on the text region by using a second preset model to obtain text content, and sending the text content to the front end; the front end is also used for visually displaying the text content in an AR environment.
It should be understood that the real-time text recognition and interaction system based on augmented reality shown in FIG. 2 and the real-time text recognition and interaction method based on augmented reality provided by the foregoing embodiment share the same inventive concept. For the more specific working principles of each module in this embodiment, reference may be made to the foregoing embodiment; they are not repeated here.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention, and are intended to be included within the scope of the appended claims and description.
Claims (8)
1. A real-time text recognition and interaction method based on augmented reality, comprising:
the front end acquires a text image in a real environment and extracts a target text region from the text image;
the back end detects a text region in the target text region by using a first preset model, performs character recognition on the text region by using a second preset model to obtain text content, and sends the text content to the front end;
the front end performs visual display on the text content in an AR environment.
2. The augmented reality-based real-time text recognition and interaction method of claim 1, wherein the front end is constructed based on A-Frame technology.
3. The augmented reality-based real-time text recognition and interaction method of claim 1, wherein extracting a target text region from the text image comprises:
identifying a text region from the text image, and highlighting the text region;
Acquiring a region selection instruction of a user;
And determining a target text region from the text regions according to the region selection instruction.
4. The augmented reality-based real-time text recognition and interaction method according to claim 1, wherein the first preset model is a DBNet deep learning model and the second preset model is a CRNN model.
5. The augmented reality-based real-time text recognition and interaction method of claim 1, wherein, before the text content is sent to the front end, the method further comprises: formatting the text content according to a preset format.
6. The augmented reality-based real-time text recognition and interaction method of claim 1, wherein the front-end visually presents the text content in an AR environment, comprising:
acquiring a first text interaction instruction of a user;
executing the first text interaction instruction in an AR environment;
wherein the first text interaction instruction includes adding an access link and translating text.
7. The augmented reality-based real-time text recognition and interaction method of claim 6, wherein the front-end visually presents the text content in an AR environment, further comprising:
Acquiring a second text interaction instruction of a user;
rendering the text according to the second text interaction instruction, and updating the visual display effect of the text content in the AR environment;
wherein the second text interaction instruction comprises one or more of zooming in, zooming out, rotating, splitting and assembling.
8. A real-time text recognition and interaction system based on augmented reality, comprising a front end and a back end;
the front end is used for acquiring a text image in a real environment, extracting a target text region from the text image, and sending the target text region to the back end;
the back end is used for detecting a text region in the target text region by using a first preset model, performing character recognition on the text region by using a second preset model to obtain text content, and sending the text content to the front end;
the front end is also used for visually displaying the text content in an AR environment.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311837498.2A CN117935273A (en) | 2023-12-28 | 2023-12-28 | Real-time text recognition and interaction method based on augmented reality |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117935273A (en) | 2024-04-26 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination |