WO2022267696A1

WO2022267696A1 - Content recognition method and apparatus, electronic device, and storage medium

Info

Publication number: WO2022267696A1
Application number: PCT/CN2022/090382
Authority: WO
Inventors: 徐思琪
Original assignee: Oppo广东移动通信有限公司
Priority date: 2021-06-24
Filing date: 2022-04-29
Publication date: 2022-12-29
Also published as: CN115527135A

Abstract

Embodiments of the present application disclose a content recognition method and apparatus, an electronic device, and a storage medium. The method comprises: displaying a collected image in real time; if the displayed image includes specified content, displaying a prompting identifier on the specified content; recognizing the specified content in response to a touch operation acting on the prompting identifier; and outputting a recognition result. Thus, while an electronic device displays a collected image in real time, the electronic device automatically marks specified content appearing in the displayed image with a prompting identifier, and a touch operation acting on the prompting identifier directly triggers the electronic device to recognize the specified content. The operation process for triggering image recognition is thereby simplified, and user experience is improved.

Description

Content identification method, device, electronic device and storage medium

Cross References to Related Applications

This application claims priority to Chinese application No. 202110706063.9 filed on June 24, 2021, which is hereby incorporated by reference in its entirety for all purposes.

Technical Field

The present application relates to the technical field of electronic equipment, and more specifically, to a content identification method, device, electronic equipment, and storage medium.

Background Art

With the development of content recognition technology, more electronic devices support image-based content recognition.

Summary of the Invention

In view of the above problems, the present application proposes a content identification method, device, electronic device and storage medium to address them.

In a first aspect, the present application provides a content recognition method applied to an electronic device. The method includes: displaying a captured image in real time; if the displayed image includes specified content, displaying a prompt mark at the specified content; recognizing the specified content in response to a touch operation acting on the prompt mark; and outputting a recognition result.

In a second aspect, the present application provides a content recognition device that runs on an electronic device. The device includes: an image display unit configured to display a captured image in real time; a content identification unit configured to display a prompt mark at specified content if the displayed image includes the specified content; a recognition unit configured to recognize the specified content in response to a touch operation acting on the prompt mark; and a content output unit configured to output a recognition result.

In a third aspect, the present application provides an electronic device including one or more processors and a memory. One or more programs are stored in the memory and configured to be executed by the one or more processors, and the one or more programs are configured to perform the method described above.

In a fourth aspect, the present application provides a computer-readable storage medium storing program code executable by a processor, wherein the method described above is performed when the program code runs.

Brief Description of the Drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings that need to be used in the description of the embodiments will be briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application. For those skilled in the art, other drawings can also be obtained based on these drawings without any creative effort.

FIG. 1 shows a schematic diagram of a scene of a content recognition method proposed by the present application;

FIG. 2 shows a flow chart of a content identification method proposed by an embodiment of the present application;

FIG. 3 shows a schematic diagram of an image collected by an electronic device in the present application;

FIG. 4 shows a schematic diagram of a prompt mark in the present application;

FIG. 5 shows a schematic diagram of multiple items of specified content, each with a corresponding prompt mark, in the present application;

FIG. 6 shows a schematic diagram of a display mode of a recognition result in the present application;

FIG. 7 shows a schematic diagram of another display mode of recognition results in the present application;

FIG. 8 shows a flow chart of a content identification method proposed by another embodiment of the present application;

FIG. 9 shows a schematic diagram of displaying recognition results based on a full-screen mode in the present application;

FIG. 10 shows another schematic diagram of displaying recognition results based on a full-screen mode in the present application;

FIG. 11 shows a flow chart of a content identification method proposed by another embodiment of the present application;

FIG. 12 shows a flow chart of a content identification method proposed by another embodiment of the present application;

FIG. 13 shows a schematic diagram of the locking interface in the present application;

FIG. 14 shows a flow chart of a content identification method proposed by another embodiment of the present application;

FIG. 15 shows a schematic diagram of the operation menu in the present application;

FIG. 16 shows a flow chart of a content identification method proposed by another embodiment of the present application;

FIG. 17 shows a schematic diagram of a complete display of objects in the present application;

FIG. 18 shows a schematic diagram of the size comparison of objects after increasing the focal length in the present application;

FIG. 19 shows a flow chart of a content identification method proposed by another embodiment of the present application;

FIG. 20 shows a structural block diagram of a content identification device proposed by the present application;

FIG. 21 shows a structural block diagram of another content identification device proposed by the present application;

FIG. 22 shows a structural block diagram of an electronic device for performing a content identification method according to an embodiment of the present application;

FIG. 23 shows a storage unit for storing or carrying program code that implements the content identification method according to an embodiment of the present application.

Detailed Description

The following will clearly and completely describe the technical solutions in the embodiments of the application with reference to the drawings in the embodiments of the application. Apparently, the described embodiments are only some of the embodiments of the application, not all of them. Based on the embodiments in this application, all other embodiments obtained by persons of ordinary skill in the art without creative efforts fall within the protection scope of this application.

As image-based content recognition becomes more common in electronic devices, electronic devices can be used for content recognition in more scenarios. For example, in some scenarios a WiFi password is printed on paper or posted on a wall. In this case, the user first operates the electronic device to photograph the WiFi password and obtain an image including it, and then operates another application to perform text recognition on that image, thereby extracting the WiFi password. However, after studying the related recognition process, the inventor found that it requires first taking a photo with the electronic device to obtain the image to be recognized, and then operating the electronic device to perform content recognition on the photographed image, which makes the recognition process cumbersome and the user experience poor.

Therefore, the inventor proposes the content recognition method, device, electronic device, and storage medium of the present application. In this method, the collected images are displayed in real time; when the displayed image includes specified content, a prompt mark may be displayed at the specified content; the specified content is then recognized in response to a touch operation acting on the prompt mark, and a recognition result is output. In this way, while the electronic device is displaying the collected image in real time, the electronic device automatically marks the specified content appearing in the displayed image with a prompt mark, and a touch operation acting on the prompt mark directly triggers the electronic device to recognize the specified content, thereby simplifying the operation process for triggering image recognition and improving the user experience.

In one implementation manner, the method provided in this embodiment may further include the following process: acquiring a background image, where the background image includes the image displayed by the electronic device when the touch operation acts on the prompt mark; and displaying the recognition result with the background image as the background.

In an implementation manner, the method provided in this embodiment may further include the following process: acquiring an image to be processed, where the image to be processed is the image displayed by the electronic device when the touch operation acts on the prompt mark; performing blurring processing on the image to be processed and using the blurred image as a background image; and displaying the recognition result with the background image as the background.

In an implementation manner, the method provided in this embodiment may further include the following process: displaying a first trigger control; and displaying the recognition result based on a full-screen mode in response to a touch operation acting on the first trigger control.

In one implementation, the method provided in this embodiment may further include the following process: displaying the recognition result; displaying a second trigger control; and displaying a locking interface in response to a touch operation acting on the second trigger control, where the locking interface includes the image displayed by the electronic device when the touch operation acted on the prompt mark, and the prompt mark corresponding to the specified content in that image.

In an implementation manner, the method provided in this embodiment may further include the following process: in response to the first operation, resume real-time display of the collected images.

In an implementation manner, the method provided in this embodiment may further include the following process: displaying the recognition result; displaying a third trigger control; displaying an operation menu in response to a touch operation acting on the third trigger control, where the operation menu includes at least one operation control and each operation control corresponds to a different operation; in response to a touch operation acting on an operation control, taking the operation corresponding to the touched operation control as the target operation; and performing the target operation on the recognition result.

In an implementation manner, the method provided in this embodiment may further include the following process: performing zoom processing on the captured image in response to the zoom request.

In one embodiment, the method provided in this embodiment may further include the following process: if there is an area selection operation acting on the image displayed in real time, detecting whether the object in the selected area is completely displayed; if it is completely displayed, generating a size-increasing zoom request, where the size-increasing zoom request is used to make the object in the selected area be completely displayed at a first target size; and zooming the collected image in response to the size-increasing zoom request.

In an implementation manner, the method provided in this embodiment may further include the following process: if the object is not completely displayed, generating a size-reducing zoom request, where the size-reducing zoom request is used to make the object in the selected area be displayed at a second target size; and zooming the collected image in response to the size-reducing zoom request.

In some implementations, the specified content includes: text content or a target object.

In an implementation manner, the method provided in this embodiment may further include the following process: if the specified content is text content, use the scene image corresponding to the scene expressed by the semantics of the recognition result as the background image.

In an implementation manner, the method provided in this embodiment may further include the following process: if the specified content is text content, use an image corresponding to a keyword in the recognition result as a background image.

In one embodiment, the method provided in this embodiment may further include the following process: fusing the recognition result and the background image into one image to obtain a fused image; and displaying the fused image.

In an implementation manner, the method provided in this embodiment may further include the following process: displaying the background image, and suspending the recognition result on the displayed background image.

In an implementation manner, the method provided in this embodiment may further include the following process: if the specified content is text content, the recognition result is enlarged in size and then displayed.

In one embodiment, the method provided in this embodiment may further include the following process: starting a camera program in response to a user operation; and after the camera program is started, displaying the captured image in real time in the interface of the camera program.

The application environment involved in this application will first be introduced below.

In one manner, the content identification method provided in the embodiments of the present application may be executed independently by one electronic device. In this manner, the electronic device collects images through its own image acquisition device, displays the collected images in real time, and executes the content recognition method provided by the embodiments of the present application on the images displayed in real time. In another manner, the method may be executed cooperatively by at least two electronic devices. The scene shown in FIG. 1 includes an electronic device 100 and an electronic device 200. The electronic device 200 can collect images through its image acquisition device and pass the collected images to its own network module, which transmits them to the network module of the electronic device 100. The processor of the electronic device 100 can then obtain the images received by its network module and execute the content identification method provided by the embodiments of the present application on the obtained images. The electronic devices (for example, the electronic device 100 and the electronic device 200) may be mobile phones, tablet computers, or other devices. When two electronic devices cooperate to execute the method, they may be of the same type or of different types. For example, besides the case shown in FIG. 1 in which both electronic devices are smart phones, one electronic device may be a smart phone and the other a smart watch.

Various embodiments of the present application will be described in detail below with reference to the accompanying drawings.

Referring to FIG. 2, a content identification method provided by the present application is applied to an electronic device, and the method includes:

S110: Display the collected images in real time.

The displayed image is collected by an image acquisition device (for example, a camera). Displaying the image in real time can therefore be understood as displaying the image collected by the image acquisition device as it is collected, so that the user can preview the images currently being collected by the image acquisition device of the electronic device. During real-time display, if the area captured by the image acquisition device changes, the displayed content changes synchronously.

As a manner, the electronic device can start the camera program in response to the user's operation, and after the camera program is started, the captured image can be displayed in real time on the displayed interface of the camera program.

S120: If the displayed image includes specified content, display a prompt mark at the specified content.

In the embodiments of the present application, the specified content may be text content or a target object, where the target object may include a human face and the like. After detecting that the displayed image includes the specified content, the electronic device may display a prompt mark at the specified content as a way of prompting the user. Optionally, the prompt mark can be a frame surrounding the specified content. Exemplarily, in the scene shown in FIG. 3, the user uses an electronic device to shoot the content on a distant projection screen 10. Correspondingly, the electronic device displays the captured image in real time in the interface 11, so that the projection screen 10 and its content are displayed synchronously in the interface 11. If the projection screen 10 includes text content, the electronic device displays a frame 12 surrounding the text content as a prompt mark at the area containing the text content. It should be noted that the specific style of the prompt mark is not limited in the embodiments of the present application, and it may take styles other than the frame shown in FIG. 3. For example, as shown in FIG. 4, the prompt mark can also be a colored transparent layer 13 that covers the specified content, so that the user can still see the specified content through the transparent layer 13. The color of the transparent layer may be yellow or green.

In one manner, different types of specified content may have different corresponding prompt marks, so that the user can quickly distinguish the types of specified content by their prompt marks. Exemplarily, as shown in FIG. 5, the image collected by the electronic device contains a person and a bus stop sign beside the person. The person's face is determined to be one item of specified content, and the bus stop sign, because it contains text content, is also determined to be an item of specified content. For specified content of the face type, the frame 14 can be used as the prompt mark, and for specified content of the text type, the transparent layer 15 can be used as the prompt mark. Alternatively, a transparent layer may be used as the prompt mark for face-type content and a frame as the prompt mark for text content. Moreover, different types of specified content may also share the same prompt mark.
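The correspondence between content types and prompt mark styles described above can be sketched as a small lookup table. This is an illustrative sketch only: the names `ContentType`, `MarkStyle`, `DEFAULT_STYLES`, `SHARED_STYLE`, and `assign_prompt_marks` are hypothetical, and the application does not prescribe any particular data structure.

```python
from enum import Enum


class ContentType(Enum):
    TEXT = "text"
    FACE = "face"


class MarkStyle(Enum):
    FRAME = "frame"                          # e.g. frame 14 in FIG. 5
    TRANSPARENT_LAYER = "transparent_layer"  # e.g. transparent layer 15 in FIG. 5


# Hypothetical default: a different style per content type, as in FIG. 5.
DEFAULT_STYLES = {
    ContentType.FACE: MarkStyle.FRAME,
    ContentType.TEXT: MarkStyle.TRANSPARENT_LAYER,
}

# Alternative: every content type shares the same prompt mark style.
SHARED_STYLE = {content_type: MarkStyle.FRAME for content_type in ContentType}


def assign_prompt_marks(detections, styles=DEFAULT_STYLES):
    """Pair each detected region with the prompt mark style for its type.

    `detections` is a list of (content_type, region) tuples; the region
    is passed through unchanged.
    """
    return [(region, styles[content_type]) for content_type, region in detections]
```

Swapping `DEFAULT_STYLES` for `SHARED_STYLE` corresponds to the case in which different types of specified content use the same prompt mark.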

S130: Identify the specified content in response to a touch operation acting on the prompt mark.

In the embodiments of the present application, the touch operation may be implemented in various ways. In one manner, the touch operation may be a click operation. In another manner, the touch operation may be a double-click operation. The touch operation that triggers recognition of the specified content can also be configured by the user to meet personalized needs. The recognition method may differ for different specified content. Optionally, if the specified content is text content, recognizing the text content can be understood as extracting the text content from the image. Optionally, the electronic device may extract the text content from the image based on optical character recognition (OCR). Optical character recognition refers to the process of analyzing and recognizing image files of text materials to obtain text and layout information, that is, recognizing the text in the image and returning it in the form of text. It should be noted that, taking text content as an example of the specified content, recognizing in the aforementioned S120 that the image displayed in real time contains specified content can be understood as determining that a certain area in the image contains text content, while the electronic device does not yet know what the text in that area says. By then recognizing the specified content, the specific text can be determined.

Similarly, if the specified content is a human face, when the electronic device detects a human face in the picture displayed in real time, the electronic device only detects a human face in the image, but does not know who the human face represents. Then, by recognizing the face, the identity information of the person to whom the face belongs can be further obtained. The identity information may include age or gender.
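The two-stage flow of S120 and S130 (first locate a region and mark it, then recognize its content only when the prompt mark is touched) can be sketched as follows. This is a minimal sketch under stated assumptions: `PromptMark`, `handle_touch`, and the `recognizers` mapping are hypothetical names, prompt marks are modeled as axis-aligned rectangles, and the actual recognizers (for example an OCR engine or a face analysis model) are supplied by the caller rather than implemented here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in screen pixels


@dataclass(frozen=True)
class PromptMark:
    content_type: str  # "text" or "face"; known after detection in S120
    box: Box           # where the mark is drawn; the content itself is still unknown

    def contains(self, x: int, y: int) -> bool:
        left, top, right, bottom = self.box
        return left <= x <= right and top <= y <= bottom


def handle_touch(
    x: int,
    y: int,
    marks: List[PromptMark],
    recognizers: Dict[str, Callable[[Box], object]],
) -> Optional[object]:
    """Run recognition (S130) only for the prompt mark that was touched.

    Returns the recognition result, or None if the touch missed every mark.
    """
    for mark in marks:
        if mark.contains(x, y):
            return recognizers[mark.content_type](mark.box)
    return None
```

Deferring recognition until a mark is touched keeps the real-time preview limited to detection; whether a real device splits the work this way is an implementation choice that the application leaves open.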

S140: Output the recognition result.

In the embodiments of the present application, the recognition result may be output in multiple ways. In one manner, outputting the recognition result may include displaying the recognition result. In this manner, the electronic device may display the recognition result on a certain interface. Exemplarily, the left image of FIG. 6 displays two types of prompt marks, the transparent layer 15 and the frame 14. The electronic device can recognize the text content at the transparent layer 15 in response to a touch operation acting on the transparent layer 15, and then display the recognition result in the interface 16 as shown in the right image of FIG. 6. Furthermore, if there is a touch operation acting on the frame 14, the electronic device can recognize the face at the frame 14 in response to that touch operation and display the recognition result, which includes the recognized gender and age, directly in the original interface. Optionally, when the specified content is text content, the recognition result may be enlarged before being displayed, so that the user can see it more clearly. The magnification factor may be one adapted to the screen of the electronic device, so that the user has a better reading experience.

In another manner, the electronic device may output the recognition result by sending it to an external device. In this manner, the electronic device can directly send the content of the recognition result to the external device. The external device may be a server or an electronic device of another user. When the external device is a server, the electronic device can upload the recognition result directly to the server and instruct the server to store the uploaded content in association with the account corresponding to the electronic device, so that the electronic device can later retrieve the uploaded content from the server using that account. When the external device is an electronic device of another user, if a communication connection is established between the external device and the electronic device through short-range wireless communication (WiFi or Bluetooth), the electronic device can send the recognition result to the external device over that connection. If the account of the external device is associated with the account of the electronic device, the electronic device may send the recognition result to the external device through the association between the accounts.

It should be noted that, when multiple output manners are available, the electronic device can determine in several ways which manner to use for outputting the recognition result.

In one manner, the electronic device may determine the output manner according to the touch operation that triggers recognition of the specified content. Optionally, in this manner, the click operation may be configured to correspond to outputting the recognition result by displaying it, and the double-click operation may be configured to correspond to outputting the recognition result by sending it to an external device. Exemplarily, referring to FIG. 6 again, if a click operation acts on the transparent layer 15, the electronic device outputs the recognition result in the manner shown in the right image of FIG. 6; if a double-click operation acts on the transparent layer 15, the electronic device does not display the recognition result but sends it directly to the external device. Conversely, the click operation may be configured to output the recognition result by sending it to an external device, and the double-click operation may be configured to output the recognition result by displaying it.

As another manner, a default output manner may be configured in a setting interface of the electronic device. Then, when the electronic device needs to output the recognition result, it will read the default output mode in the electronic device, and then output the recognition result according to the default output mode. For example, if the default output method is output by display, the electronic device will only display the recognition result.
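The two ways of choosing an output manner described above can be sketched together. The names `Gesture`, `OutputMode`, `GESTURE_TO_MODE`, and `select_output_mode` are hypothetical. The application presents the gesture-based choice and the configured default as separate manners; letting a gesture mapping take precedence over the default, as done here, is an assumption made only so that both fit in one example.

```python
from enum import Enum
from typing import Mapping, Optional


class Gesture(Enum):
    CLICK = "click"
    DOUBLE_CLICK = "double_click"


class OutputMode(Enum):
    DISPLAY = "display"
    SEND_TO_EXTERNAL_DEVICE = "send_to_external_device"


# One possible configuration; the roles of the two gestures may be swapped.
GESTURE_TO_MODE = {
    Gesture.CLICK: OutputMode.DISPLAY,
    Gesture.DOUBLE_CLICK: OutputMode.SEND_TO_EXTERNAL_DEVICE,
}


def select_output_mode(
    gesture: Gesture,
    gesture_map: Optional[Mapping[Gesture, OutputMode]] = None,
    default_mode: OutputMode = OutputMode.DISPLAY,
) -> OutputMode:
    """Pick the output manner for a recognition result.

    If a gesture mapping is configured and covers the gesture, use it
    (assumed precedence); otherwise fall back to the default output
    manner read from the device settings.
    """
    if gesture_map is not None and gesture in gesture_map:
        return gesture_map[gesture]
    return default_mode
```

For example, `select_output_mode(Gesture.DOUBLE_CLICK, GESTURE_TO_MODE)` selects sending to an external device, while calling it without a mapping returns the configured default.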

It should be noted that if the recognition result is displayed in an interface other than the interface that displays the collected images in real time, the interface displaying the recognition result can also be provided with a control that triggers a jump back to the interface for displaying the collected images in real time. The configuration and triggering manner of this control are not specifically limited in the embodiments of the present application.

This embodiment provides a content recognition method. In this method, the captured image is displayed in real time; if the displayed image contains specified content, a prompt mark can be displayed at the specified content; the specified content is then recognized in response to a touch operation acting on the prompt mark, and a recognition result is output. In this way, while the electronic device is displaying the collected image in real time, the electronic device automatically marks the specified content appearing in the displayed image with a prompt mark, and a touch operation acting on the prompt mark directly triggers the electronic device to recognize the specified content, thereby simplifying the operation process for triggering image recognition and improving the user experience.

Referring to FIG. 8, a content identification method provided by the present application is applied to an electronic device, and the method includes:

S210: Display the collected images in real time.

S220: If the displayed image includes specified content, display a prompt mark at the specified content.

S230: Identify the specified content in response to a touch operation acting on the prompt mark.

S240: Acquire a background image, where the background image includes the image displayed by the electronic device when the touch operation acts on the prompt mark.

In S240, when a touch operation acting on the prompt mark is detected, the image currently displayed by the electronic device may be used as the background image for the subsequent display of the recognition result.

Furthermore, it should be noted that, in the embodiment of the present application, in addition to the manner of determining the background image shown in S240, there may be other manners of determining the background image. Optionally, the corresponding background image may also be determined according to the content of the recognition result.

In one manner, if the specified content is text content, the scene image corresponding to the scene expressed by the semantics of the recognition result may be used as the background image. For example, if the semantics of the recognition result express teaching content, the expressed scene may be a teaching scene, and the scene image corresponding to the teaching scene may be determined as the background image. Optionally, the scene image corresponding to the teaching scene may be an image of a classroom or an image of a blackboard. For another example, if the semantics of the recognition result express a bus, as in FIG. 5, the expressed scene may be a traffic scene, and the scene image corresponding to the traffic scene may be determined as the background image. Optionally, the scene image corresponding to the traffic scene may be an image of a vehicle. The electronic device can obtain the scene expressed by the recognition result through a pre-trained neural network model.

As another way, if the specified content is text content, the image corresponding to the keyword in the recognition result may be used as the background image. Exemplarily, if the recognition result includes the keyword "mobile phone", then the image corresponding to the mobile phone can be used as the background image. For another example, if the recognition result includes "bus", then the image corresponding to the bus may be used as the background image. Wherein, the correspondence between keywords and images may be stored in the electronic device, so that after the electronic device obtains the keywords in the recognition result, the image corresponding to the keywords may be obtained through the correspondence.
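The keyword-based selection of a background image can be sketched as a lookup against a stored correspondence. The table contents, the file paths, and the function name `background_for_text` are all hypothetical, and case-insensitive substring matching is an assumed way of finding keywords in the recognition result; the application does not specify how keywords are extracted. The `fallback` argument stands for any other background source, such as the image displayed when the prompt mark was touched (S240).

```python
from typing import Mapping, Optional

# Hypothetical stored correspondence between keywords and background images.
KEYWORD_IMAGES = {
    "mobile phone": "images/mobile_phone.png",
    "bus": "images/bus.png",
}


def background_for_text(
    recognized_text: str,
    keyword_images: Mapping[str, str] = KEYWORD_IMAGES,
    fallback: Optional[str] = None,
) -> Optional[str]:
    """Return the image mapped to the first stored keyword found in the text.

    Matching is a case-insensitive substring test (an assumption). If no
    keyword matches, the fallback is returned instead.
    """
    lowered = recognized_text.lower()
    for keyword, image in keyword_images.items():
        if keyword in lowered:
            return image
    return fallback
```

Substring matching is deliberately simple and would also match a keyword embedded in a longer word; a real implementation might tokenize the text first.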

S250: Display the recognition result with the background image as the background.

When displaying the recognition result, the recognition result may be displayed based on the background image. In one manner, displaying the recognition result based on the background image may include: fusing the recognition result and the background image into one image, and then displaying the fused image. In this manner, after the recognition result is obtained, the background image is obtained first, and the recognition result is then merged into the background image to obtain a background image that incorporates the recognition result. In another manner, displaying the recognition result based on the background image may include: displaying the background image, and floating the recognition result over the displayed background image.

As one manner, outputting the recognition result further includes: displaying a first trigger control; and after displaying the recognition result with the background image as the background, the method further includes: in response to a touch operation acting on the first trigger control, displaying the recognition result in a full-screen mode. The full-screen mode can be understood as displaying the recognition result across the full screen. Optionally, in one full-screen mode, the electronic device may display only the background image and the recognition result. In another full-screen mode, in addition to the background image and the recognition result, the electronic device may also display a status bar at the top of its screen. The status bar may include at least one of a battery status and a wireless signal status.

Exemplarily, as shown in FIG. 9, in S250 the recognition result may be displayed in the style shown in the left image of FIG. 9. In that style, in addition to the result interface 16, a status bar 17 and an operation area 18 are also displayed. A first trigger control named "Reading Mode" can be displayed in the status bar 17. If a touch operation on the first trigger control named "Reading Mode" is detected, the electronic device switches to the style shown in the right image of FIG. 9 to display the recognition result. In the right image of FIG. 9, the status bar 17 and the operation area 18 are no longer displayed, and the full-screen display interface 19 is used to display the recognition result, thereby realizing full-screen display of the recognition result. A background image (not shown in the figure) may be displayed on the interface 19. As shown in FIG. 10, in another full-screen display mode, the interface 19 for displaying the recognition result does not cover the entire display screen; in addition to the interface 19, a status bar 191 may also be displayed. The status bar 191 can display the battery status and the wireless signal status. The wireless signal status may include a WiFi status and a mobile communication signal status.

It should be noted that for specific descriptions of steps in this embodiment that are the same as those in other embodiments, reference may be made to relevant content in other embodiments, and details are not repeated in this embodiment.

This embodiment provides a content recognition method. While the electronic device is displaying the collected images in real time, it automatically marks, by means of a prompt mark, the specified content appearing in the displayed image; a touch operation acting on the prompt mark then directly triggers the electronic device to recognize the specified content, which simplifies the operation process of triggering image recognition and improves user experience. Moreover, in this embodiment, the recognition result is displayed with the background image as the background, which helps reduce the degree of interface change perceived by the user.

Referring to FIG. 11, a content identification method provided by the present application is applied to an electronic device, and the method includes:

S260: Display the collected images in real time.

S261: If the displayed image includes specified content, display a prompt mark at the specified content.

S262: Identify the specified content in response to a touch operation acting on the prompt mark.

S263: Acquire an image to be processed, where the image to be processed is the image displayed by the electronic device when the touch operation acts on the prompt mark.

S264: Perform blurring processing on the image to be processed, and use the blurred image as a background image.

S265: Display the recognition result with the background image as the background.

This embodiment provides a content recognition method. While the electronic device is displaying the collected images in real time, it automatically marks, by means of a prompt mark, the specified content appearing in the displayed image; a touch operation acting on the prompt mark then directly triggers the electronic device to recognize the specified content, which simplifies the operation process of triggering image recognition and improves user experience. Blurring the image to be processed makes the final background image visually blurred, which makes the recognition result easier to read and draws the user's attention to the recognition result itself.
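The blurring in S264 can be illustrated with a mean (box) filter. This is only a sketch under stated assumptions: the application does not name a blur algorithm, and the grayscale list-of-rows image format, the function name `box_blur`, and the `radius` parameter are hypothetical.

```python
def box_blur(image, radius=1):
    """Blur a grayscale image (list of rows of ints) with a mean filter.

    Each output pixel is the integer mean of the pixels within `radius`
    of it; pixels outside the image are simply left out of the mean.
    """
    height, width = len(image), len(image[0])
    blurred = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect the neighbours that lie inside the image bounds.
            window = [
                image[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            row.append(sum(window) // len(window))
        blurred.append(row)
    return blurred
```

A larger `radius` gives a stronger blur, so fine detail in the frozen frame recedes further behind the recognition result drawn on top of it.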

Referring to FIG. 12, a content identification method provided by the present application is applied to an electronic device, and the method includes:

S310: Display the collected images in real time.

S320: If the displayed image includes specified content, display a prompt mark at the specified content.

S330: Identify the specified content in response to a touch operation acting on the prompt mark.

S340: Display the recognition result.

S350: Display a second trigger control.

It should be noted that, in the embodiments of the present application, displaying the second trigger control and displaying the recognition result can be performed simultaneously, so that the user visually perceives the second trigger control and the recognition result as being displayed together on the screen of the electronic device.

S360: In response to a touch operation acting on the second trigger control, display a lock interface, where the lock interface includes the image displayed by the electronic device when the touch operation acts on the prompt mark, and the prompt mark corresponding to the specified content in the displayed image.

Exemplarily, as shown in FIG. 13, if a touch operation on the transparent layer 15 is detected in the situation shown in the left image of FIG. 13, the electronic device can recognize the corresponding specified content and display the recognition result in the style shown in the middle image of FIG. 13, in which the second trigger control 20 is also displayed. If a touch operation on the second trigger control 20 is detected, the electronic device displays the lock interface 21 shown in the right image of FIG. 13. As shown in the right image of FIG. 13, the image content in the lock interface 21 is the same as the image content displayed by the electronic device when the touch operation acted on the prompt mark (that is, the image content shown in the left image of FIG. 13). The difference is that the image content in the lock interface 21 is a static image, whereas the content shown in the left image of FIG. 13 is an image captured in real time by the image acquisition device of the electronic device. A static image here means that the image content in the lock interface 21 remains unchanged even if the image captured by the image acquisition device changes. Correspondingly, in the situation shown in the left image of FIG. 13, the image displayed by the electronic device changes in real time as the image captured by the image acquisition device changes. Optionally, while the electronic device displays the lock interface, it may stop image collection so as to reduce power consumption.

As one manner, after displaying the lock interface in response to the touch operation acting on the second trigger control, the method further includes: in response to a first operation, resuming real-time display of the collected images. The first operation may be a double-click operation acting on the display screen. As another manner, when the lock interface is displayed, the electronic device may display a recovery control for resuming the real-time display of the collected images, so that when the electronic device detects a touch operation acting on the recovery control, it resumes the real-time display of the collected images. If the image acquisition device stopped image acquisition while the lock interface was displayed, resuming the real-time display may include: starting the image acquisition device, and displaying in real time the images collected by the image acquisition device after it is started. If, while the lock interface was displayed, the image acquisition device kept collecting images and buffering them to a designated area, but the buffered images were not read from the designated area for real-time display, then resuming the real-time display can be understood as resuming reading images from the designated area for display.

This embodiment provides a content recognition method. While the electronic device is displaying the collected images in real time, it automatically marks, by means of a prompt mark, the specified content appearing in the displayed image; a touch operation acting on the prompt mark then directly triggers the electronic device to recognize the specified content, which simplifies the operation process of triggering image recognition and improves user experience. Moreover, in this embodiment, the second trigger control can be displayed together with the recognition result, so that by touching the second trigger control the user can cause the electronic device to display the lock interface, which includes the image displayed by the electronic device when the touch operation acted on the prompt mark and the prompt marks corresponding to the specified content in that image. In this way, when there are multiple prompt marks, the user can trigger another prompt mark in the lock interface to recognize further specified content and display its recognition result.
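The lock and resume behaviour of this embodiment can be modelled as a small state machine. The sketch below is illustrative only: the class `Preview`, its method names, and the use of strings as stand-ins for frames are all hypothetical, and the optional `stop_camera` flag mirrors the optional power-saving behaviour described above.

```python
class Preview:
    """Toy model of the live preview and the lock interface."""

    def __init__(self):
        self.locked = False
        self.camera_running = True
        self.shown_frame = None      # what is currently on screen
        self.frame_at_touch = None   # frame shown when the prompt mark was touched

    def on_camera_frame(self, frame):
        # While locked, new frames are ignored, so the shown image stays static.
        if self.camera_running and not self.locked:
            self.shown_frame = frame

    def on_prompt_mark_touched(self):
        # Remember the image that was on screen at the moment of the touch.
        self.frame_at_touch = self.shown_frame

    def lock(self, stop_camera=False):
        # Lock interface: show the remembered frame as a static image.
        self.locked = True
        self.shown_frame = self.frame_at_touch
        if stop_camera:
            self.camera_running = False  # optional, to reduce power consumption

    def resume(self):
        # Restart acquisition if it was stopped, then show live frames again.
        self.camera_running = True
        self.locked = False
```

A double-click or a touch on a recovery control would both map to `resume()` in this model; which gesture is used does not change the state transitions.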

Referring to FIG. 14, a content identification method provided by the present application is applied to an electronic device, and the method includes:

S410: Display the collected images in real time.

S420: If the displayed image includes specified content, display a prompt mark at the specified content.

S430: Identify the specified content in response to a touch operation acting on the prompt mark.

S440: Display the recognition result.

S450: Display a third trigger control.

It should be noted that, in the embodiments of the present application, displaying the third trigger control and displaying the recognition result can be performed simultaneously, so that the user visually perceives the third trigger control and the recognition result as being displayed together on the screen of the electronic device.

S460: Display an operation menu in response to a touch operation acting on the third trigger control, where the operation menu includes at least one operation control, and each operation control corresponds to a different operation.

As shown in the right image of FIG. 15, the third trigger control 22 may be displayed at the same time as the recognition result. Then, in response to a touch operation acting on the third trigger control 22, an operation menu 23 may be displayed. The operation menu 23 includes an operation control named "send", an operation control named "copy", an operation control named "save as document", and an operation control named "save as sticky note". The operation corresponding to the operation control named "send" includes sending the recognition result to a third-party application program. The third-party application program may include an instant messaging application program or a short message program. The operation corresponding to the operation control named "copy" includes copying the recognition result, so that after the copy operation is performed, the electronic device can enter the recognition result by pasting it at any other position where text input is possible. The operation corresponding to the operation control named "save as document" may include storing the recognition result in the form of a document. The operation corresponding to the operation control named "save as sticky note" may include storing the recognition result in the form of a sticky note.

S470: In response to a touch operation acting on an operation control, take the operation corresponding to the touched operation control as the target operation.

S480: Execute the target operation on the recognition result.

It should be noted that, when the electronic device displays the recognition result in the full-screen mode, the electronic device can also display the third trigger control at the same time, and the function of the third trigger control displayed in the full-screen mode is the same as that of the third trigger control shown in FIG. 15.

This embodiment provides a content recognition method. While the electronic device is displaying the collected images in real time, it automatically marks, by means of a prompt mark, the specified content appearing in the displayed image; a touch operation acting on the prompt mark then directly triggers the electronic device to recognize the specified content, which simplifies the operation process of triggering image recognition and improves user experience. Moreover, in this embodiment, after the recognition result is displayed, a third trigger control for triggering further operations on the recognition result is also displayed, so that the user can call up the operation menu by operating the third trigger control directly and perform further operations on the recognition result through the operation controls in the operation menu.
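Steps S460 to S480 amount to a dispatch table from operation controls to operations. The following sketch shows that structure only; the function names are hypothetical, and the four lists are stand-ins for a messaging application, the clipboard, document storage, and a sticky-note application rather than real platform APIs.

```python
def build_operation_menu(outbox, clipboard, documents, notes):
    """Map each operation control name to a different operation.

    The four list arguments are hypothetical stand-ins for the real
    destinations of the recognition result.
    """
    return {
        "send": outbox.append,
        "copy": clipboard.append,
        "save as document": documents.append,
        "save as sticky note": notes.append,
    }


def on_operation_control_touched(menu, control_name, recognition_result):
    # S470: the operation of the touched control becomes the target operation.
    target_operation = menu[control_name]
    # S480: execute the target operation on the recognition result.
    target_operation(recognition_result)
```

Keeping the mapping in one table means the same menu can be reused unchanged in the full-screen mode, which matches the note that the third trigger control behaves identically there.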

Referring to FIG. 16, a content identification method provided by the present application is applied to an electronic device, and the method includes:

S510: Display the collected images in real time.

S520: In response to the zoom request, perform zoom processing on the captured image.

It should be noted that, in some cases, an object in the collected image is too small for the electronic device to effectively recognize the object or the text on it. In other cases, an object in the collected image cannot be completely displayed on the screen, which also prevents the electronic device from effectively identifying the object. Zoom processing changes how much of the object in the captured image is displayed on the screen, so that the object is more likely to be completely displayed. The zoom processing in the embodiments of the present application may be performed by changing the focal length of the image acquisition device, or by digital zooming. In digital zooming, the processor of the electronic device enlarges the area occupied by each pixel of the captured image so as to achieve the zoom effect.
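Digital zooming as described above (each pixel of the captured image occupies a larger area) can be sketched as a centre crop followed by nearest-neighbour upscaling. The resampling method, the integer `factor`, the list-of-rows image format, and the name `digital_zoom` are assumptions for illustration; the application does not specify them.

```python
def digital_zoom(image, factor):
    """Zoom into the centre of an image by an integer factor >= 1.

    The central 1/factor portion of the image is cropped and stretched
    back to the original size with nearest-neighbour sampling, so each
    source pixel covers roughly factor x factor output pixels.
    """
    height, width = len(image), len(image[0])
    crop_h, crop_w = max(1, height // factor), max(1, width // factor)
    top, left = (height - crop_h) // 2, (width - crop_w) // 2
    return [
        [
            image[top + y * crop_h // height][left + x * crop_w // width]
            for x in range(width)
        ]
        for y in range(height)
    ]
```

Unlike changing the focal length, this adds no new detail: it only redistributes the pixels already captured, which is why very small text may remain hard to recognize after digital zoom alone.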

As one manner, performing zoom processing on the collected image in response to the zoom request includes: if there is an area selection operation acting on the image displayed in real time, detecting whether the object in the selected area is completely displayed; if it is completely displayed, generating a size-increasing zoom request, where the size-increasing zoom request is used to make the object in the selected area be completely displayed at a first target size; and, in response to the size-increasing zoom request, performing zoom processing on the captured image. Complete display of an object can be understood as the overall outline of the object being within the collection range of the image acquisition device. Exemplarily, as shown in FIG. 17, the left image of FIG. 17 includes a girl and a bus stop sign next to her. In the left image of FIG. 17, the overall outlines of the girl and the bus stop sign are both within the collection range of the image acquisition device, so the girl and the bus stop sign are completely displayed. In the right image of FIG. 17, neither the girl's feet nor the right side of the bus stop sign can be seen, so neither the girl nor the bus stop sign is completely displayed.

If the object is not completely displayed, a size-reducing zoom request is generated, where the size-reducing zoom request is used to display the object in the selected area at a second target size; and, in response to the size-reducing zoom request, zoom processing is performed on the captured image.

It should be noted that the first target size in the embodiments of the present application is the maximum size at which the object in the selected area can still be completely displayed. It can be understood that when the focal length is increased, the size of the object in the captured image increases accordingly. Exemplarily, as shown in FIG. 18, the left image of FIG. 18 shows the image before the focal length is increased, and the right image of FIG. 18 shows the image after the focal length is increased. In the image after the focal length is increased, the bus stop sign is larger than in the image on the left, which makes the text content on the bus stop sign easier to detect. However, once the object is enlarged beyond a certain extent, it may no longer be completely displayed. Therefore, keeping the object completely displayed while increasing the focal length helps improve the probability that the specified content at the object is successfully detected. The second target size is the size at which the object in the selected area is displayed as completely as possible. It can be understood that, in the process of reducing the focal length, the object may still not be completely displayed even when the focal length is reduced to the minimum supported by the image acquisition device; however, compared with before the focal length was reduced, a larger portion of the object is displayed on the screen, which improves the probability that the specified content at the object is successfully detected.
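The choice between a size-increasing and a size-reducing zoom request can be sketched geometrically. The sketch rests on several assumptions that are not in the application: the object is described by a bounding box in frame coordinates that may extend beyond the frame (a real device would have to estimate this), zooming scales the picture about the frame centre, and the `min_scale` and `max_scale` limits stand in for the zoom range supported by the image acquisition device. The name `zoom_request` is hypothetical.

```python
import math


def zoom_request(box, frame_w, frame_h, min_scale=0.5, max_scale=4.0):
    """Return ("increase" | "reduce", scale) for a selected object.

    box is (left, top, right, bottom) in frame coordinates. Scaling about
    the frame centre maps x to cx + s * (x - cx), so each box edge limits
    the scale s at which that edge still lies inside the frame.
    """
    left, top, right, bottom = box
    cx, cy = frame_w / 2, frame_h / 2
    limits = []
    if left < cx:
        limits.append(cx / (cx - left))
    if right > cx:
        limits.append(cx / (right - cx))
    if top < cy:
        limits.append(cy / (cy - top))
    if bottom > cy:
        limits.append(cy / (bottom - cy))
    # Largest scale at which the whole box is still inside the frame.
    fit_scale = min(limits, default=math.inf)
    if fit_scale >= 1:
        # Completely displayed: enlarge up to the first target size.
        return ("increase", min(fit_scale, max_scale))
    # Not completely displayed: shrink towards the second target size,
    # but never below the smallest zoom the camera supports.
    return ("reduce", max(fit_scale, min_scale))
```

When the clamp to `min_scale` takes effect, the object is still cut off after zooming out, which corresponds to the case described above where only a larger portion of the object, not all of it, becomes visible.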

S530: If the displayed image includes specified content, display a prompt mark at the specified content.

S540: Identify the specified content in response to a touch operation acting on the prompt mark.

S550: Output the recognition result.

This embodiment provides a content recognition method. While the electronic device is displaying the collected images in real time, it automatically marks, by means of a prompt mark, the specified content appearing in the displayed image; a touch operation acting on the prompt mark then directly triggers the electronic device to recognize the specified content, which simplifies the operation process of triggering image recognition and improves user experience. Moreover, in this embodiment, the electronic device may zoom the image acquisition device in response to the zoom request triggered by the area selection operation, so that the object in the area selected by the area selection operation is displayed at a target size, which increases the probability that it is detected in the image displayed in real time.

Next, the embodiments of the present application are described using a scenario. As shown in FIG. 19, the method in this scenario includes:

S610: Turn on the camera.

S620: Enter the enlargement mode.

After the electronic device starts the camera, multiple modes can be configured in the camera. The multiple modes may include a magnification mode. After entering the magnification mode, the electronic device can detect whether there is target information in the image displayed in the viewfinder frame. Wherein, the target information here can be understood as the specified content in the foregoing embodiments. The image displayed in the viewfinder frame is the image collected by the image acquisition device displayed in real time.

S630: A target information pre-recognition frame appears in the viewfinder frame.

The method may further include S631: Obtain the location circled by the user. Circling a location in S631 can be understood as the area selection operation in the foregoing embodiments; after acquiring the location circled by the user, the electronic device performs the same operations as those performed in response to the area selection operation in the foregoing embodiments.

The pre-recognition frame can be understood as one kind of prompt mark described in the foregoing embodiments.

S640: Click the pre-recognition frame.

After the electronic device detects the operation of clicking the pre-recognition frame, it can recognize the target information at the pre-recognition frame.

S650: The recognized content is enlarged and output.

After the recognized content is enlarged and output, S651 may be executed: performing further operations through the operation menu. The further operations may include the operations corresponding to the operation controls in the operation menu in the foregoing embodiments.

S660: Enter the reading mode.

Wherein, the reading mode can be understood as entering a full-screen mode in the foregoing embodiments to display the recognition result. After entering the reading mode, S661 can be executed: performing further operations through the operation menu. The further operation may include operations corresponding to the operation controls in the operation menu in the foregoing embodiments.

Referring to FIG. 20, a content identification device 600 provided by the present application runs on an electronic device, and the device 600 includes:

The image display unit 610 is configured to display the collected images in real time. Optionally, the image display unit 610 is specifically configured to start the camera program in response to a user operation and, after starting the camera program, display the captured image in real time on the interface of the camera program. The content identification unit 620 is configured to display a prompt mark at the specified content if the displayed image includes the specified content. The identifying unit 630 is configured to identify the specified content in response to a touch operation acting on the prompt mark. The content output unit 640 is configured to output the recognition result. As one manner, the content output unit 640 is specifically configured to obtain a background image, where the background image includes the image displayed by the electronic device when the touch operation acts on the prompt mark, and to display the recognition result with the background image as the background. Optionally, the content output unit 640 is specifically configured to acquire an image to be processed, where the image to be processed is the image displayed by the electronic device when the touch operation acts on the prompt mark; blur the image to be processed; and use the blurred image as the background image. The content output unit 640 is also specifically configured to display a first trigger control and, after displaying the recognition result with the background image as the background, to display the recognition result in the full-screen mode in response to a touch operation acting on the first trigger control.

As one manner, the content output unit 640 is specifically configured to display the recognition result. The content output unit 640 is further configured to display a second trigger control and, in response to a touch operation acting on the second trigger control, display a lock interface, where the lock interface includes the image displayed by the electronic device when the touch operation acts on the prompt mark, and the prompt mark corresponding to the specified content in the displayed image. The content output unit 640 is further configured to resume real-time display of the collected images in response to the first operation.

As one manner, the content output unit 640 is specifically configured to display the recognition result. The content output unit 640 is further configured to display a third trigger control; in response to a touch operation acting on the third trigger control, display an operation menu, where the operation menu includes at least one operation control and each operation control corresponds to a different operation; in response to a touch operation acting on an operation control, take the operation corresponding to the touched operation control as the target operation; and perform the target operation on the recognition result.

Optionally, the content output unit 640 is specifically configured to use the scene image corresponding to the scene expressed by the semantics of the recognition result as the background image if the specified content is text content.

Optionally, the content output unit 640 is specifically configured to use an image corresponding to a keyword in the recognition result as a background image if the specified content is text content.

Optionally, the content output unit 640 is specifically configured to fuse the recognition result and the background image into one image to obtain a fused image; and display the fused image.

Optionally, the content output unit 640 is specifically configured to display the background image, and suspend the recognition result on the displayed background image.

Optionally, the content output unit 640 is specifically configured to, if the specified content is text content, enlarge the size of the recognition result before displaying it.

As one manner, as shown in FIG. 21, the device further includes a zoom unit 650 configured to perform zoom processing on the captured image in response to a zoom request. Optionally, the zoom unit 650 is specifically configured to: if there is an area selection operation acting on the image displayed in real time, detect whether the object in the selected area is completely displayed; if it is completely displayed, generate a size-increasing zoom request, where the size-increasing zoom request is used to make the object in the selected area be completely displayed at a first target size, and perform zoom processing on the captured image in response to the size-increasing zoom request; if it is not completely displayed, generate a size-reducing zoom request, where the size-reducing zoom request is used to display the object in the selected area at a second target size, and perform zoom processing on the captured image in response to the size-reducing zoom request. The specified content includes text content or a target object.

The content recognition device provided by the present application can display the collected image in real time; when the displayed image contains specified content, it displays a prompt mark at the specified content, then recognizes the specified content in response to a touch operation acting on the prompt mark, and outputs the recognition result. In this way, while the electronic device is displaying the collected image in real time, it automatically marks, by means of a prompt mark, the specified content appearing in the displayed image, and a touch operation acting on the prompt mark directly triggers the electronic device to recognize the specified content, which simplifies the operation process of triggering image recognition and improves user experience.

It should be noted that the device embodiments in this application correspond to the foregoing method embodiments, and the specific implementation principles of each unit in the device embodiments are similar to those in the foregoing method embodiments. For the specific content of the device embodiments, reference may be made to the method embodiments, and details are not repeated here.

An electronic device provided by the present application will be described below with reference to FIG. 22 .

Referring to FIG. 22, based on the above content identification method and apparatus, an embodiment of the present application further provides an electronic device 1000 that can implement the above content identification method. The electronic device 1000 includes one or more processors 102 (only one is shown in the figure), a memory 104, a network module 106, a sensor module 108, an image acquisition device 110, and a screen 112, which are coupled to each other. The memory 104 stores programs capable of executing the contents of the foregoing embodiments, and the processor 102 can execute the programs stored in the memory 104.

The processor 102 may include one or more cores for processing data. The processor 102 uses various interfaces and circuits to connect the various parts of the electronic device 1000, and performs the various functions of the electronic device 1000 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 104 and by calling data stored in the memory 104. Optionally, the processor 102 may be implemented in at least one hardware form among digital signal processing (DSP), a field-programmable gate array (FPGA), and a programmable logic array (PLA). The processor 102 may integrate one or a combination of a central processing unit (CPU), a graphics processing unit (GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs, and the like; the GPU is used to render and draw the displayed content; and the modem is used to handle wireless communication. It can be understood that the modem may also not be integrated into the processor 102 and may instead be implemented by a separate communication chip.

The memory 104 may include random access memory (RAM), and may also include read-only memory (ROM). The memory 104 may be used to store instructions, programs, codes, sets of codes, or sets of instructions. The memory 104 may include a program storage area and a data storage area. The program storage area may store instructions for implementing an operating system, instructions for implementing at least one function (such as a touch function, a sound playback function, or an image playback function), instructions for implementing the foregoing method embodiments, and the like. For example, a content identification device may be stored in the memory 104. The content identification device may be the aforementioned device 600. The data storage area may store data created by the electronic device 1000 during use (such as a phonebook, audio and video data, and chat record data).

The network module 106 is used to receive and send electromagnetic waves and to convert between electromagnetic waves and electrical signals, so as to communicate with communication networks or other devices, such as audio playback devices. The network module 106 may include various existing circuit elements for performing these functions, such as antennas, radio frequency transceivers, digital signal processors, encryption/decryption chips, Subscriber Identity Module (SIM) cards, and memory. The network module 106 can communicate with various networks, such as the Internet, an intranet, or a wireless network, or communicate with other devices through a wireless network. The wireless network may include a cellular telephone network, a wireless local area network, or a metropolitan area network. For example, the network module 106 can exchange information with a base station.

The sensor module 108 may include at least one sensor. Specifically, the sensor module 108 may include, but is not limited to: a light sensor, a motion sensor, a pressure sensor, an infrared heat sensor, a distance sensor, an acceleration sensor, and other sensors.

The pressure sensor may be a sensor for detecting pressure generated by pressing on the electronic device 1000. That is, the pressure sensor detects pressure generated by contact or pressing between the user and the electronic device 1000, e.g., contact or pressing between the user's ear and the electronic device 1000. The pressure sensor can therefore be used to determine whether contact or pressing occurs between the user and the electronic device 1000, as well as the magnitude of the pressure.

The acceleration sensor can detect the magnitude of acceleration in each direction (generally along three axes) and, when stationary, the magnitude and direction of gravity. It can be used in applications that recognize the posture of the electronic device 1000 (such as switching between landscape and portrait screens, related games, and magnetometer posture calibration) and in vibration-recognition functions (such as a pedometer or tap detection). In addition, the electronic device 1000 may also be configured with other sensors, such as a gyroscope, a barometer, a hygrometer, and a thermometer, which will not be described in detail here.
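As an illustration of the posture recognition mentioned above, the following minimal sketch classifies the posture of a device from a three-axis gravity reading taken while the device is stationary. The function name `screen_orientation`, the axis convention (x along the short edge of the screen, y along the long edge, z out of the screen), and the `flat_threshold` value are assumptions made for this example only and are not taken from the present application.

```python
import math


def screen_orientation(ax: float, ay: float, az: float,
                       flat_threshold: float = 0.8) -> str:
    """Classify device posture from a stationary accelerometer reading (m/s^2).

    Assumed (hypothetical) axes: x = short screen edge,
    y = long screen edge, z = out of the screen.
    """
    magnitude = math.sqrt(ax * ax + ay * ay + az * az)
    if magnitude == 0.0:
        return "unknown"  # no gravity component was measured
    # If most of gravity lies along z, the device is lying flat.
    if abs(az) / magnitude > flat_threshold:
        return "flat"
    # Otherwise the dominant in-plane axis decides the posture.
    return "portrait" if abs(ay) >= abs(ax) else "landscape"
```

With this convention, a reading whose gravity component lies mainly along the y axis is classified as portrait, and one lying mainly along the x axis as landscape.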

The image acquisition device 110 can be used for image acquisition, so that the electronic device 1000 can display the acquired image on the screen 112.

Please refer to FIG. 23, which shows a structural block diagram of a computer-readable storage medium provided by an embodiment of the present application. Program code is stored in the computer-readable storage medium 1100, and the program code can be invoked by a processor to execute the methods described in the foregoing method embodiments.

The computer-readable storage medium 1100 may be an electronic memory such as flash memory, EEPROM (electrically erasable programmable read-only memory), EPROM, a hard disk, or ROM. Optionally, the computer-readable storage medium 1100 includes a non-transitory computer-readable storage medium. The computer-readable storage medium 1100 has storage space for program code 1110 that performs any of the method steps in the above methods. The program code can be read from or written into one or more computer program products. The program code 1110 may, for example, be compressed in a suitable form.

In summary, the present application provides a content identification method, device, electronic equipment, and storage medium. In this method, the collected image is displayed in real time; when the displayed image includes specified content, a prompt mark is displayed at the specified content; then, in response to a touch operation acting on the prompt mark, the specified content is recognized and the recognition result is output. In this way, while the electronic device is displaying the collected image in real time, it automatically marks the specified content appearing in that image with a prompt mark, and a touch operation acting on the prompt mark directly triggers the electronic device to recognize the specified content. This simplifies the operation needed to trigger recognition of the image and improves the user experience.
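The flow summarized above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the present application: the names `PromptMark`, `marks_for_frame`, `mark_at`, and `on_touch` are hypothetical, a prompt mark is assumed to cover a rectangular screen region, and the detector and recognizer are passed in as placeholder callables because no particular detection or recognition algorithm is implied here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in screen pixels


@dataclass
class PromptMark:
    """A prompt mark drawn over one region of specified content."""
    box: Box


def marks_for_frame(frame: object,
                    detect: Callable[[object], List[Box]]) -> List[PromptMark]:
    """Create one prompt mark per region of specified content in the frame."""
    return [PromptMark(box) for box in detect(frame)]


def mark_at(marks: List[PromptMark], x: int, y: int) -> Optional[PromptMark]:
    """Return the prompt mark hit by a touch at (x, y), if any."""
    for mark in marks:
        left, top, right, bottom = mark.box
        if left <= x <= right and top <= y <= bottom:
            return mark
    return None


def on_touch(frame: object, marks: List[PromptMark], x: int, y: int,
             recognize: Callable[[object, Box], str]) -> Optional[str]:
    """Recognize the content under the touched prompt mark and return the result."""
    mark = mark_at(marks, x, y)
    if mark is None:
        return None  # the touch did not act on a prompt mark
    return recognize(frame, mark.box)
```

In this sketch, recognition runs only when the touch lands on a prompt mark, which mirrors the idea that the touch on the mark itself is the trigger and no separate capture step is required.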

In the description of this specification, descriptions referring to the terms "one embodiment", "some embodiments", "example", "specific examples", or "some examples" mean that specific features, structures, materials, or characteristics described in connection with the embodiment or example are included in at least one embodiment or example of the present application. In this specification, the schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the described specific features, structures, materials, or characteristics may be combined in any suitable manner in any one or more embodiments or examples. In addition, those skilled in the art can combine the different embodiments or examples described in this specification, and the features of those embodiments or examples, provided they do not conflict with each other.

In addition, the terms "first" and "second" are used for descriptive purposes only, and cannot be interpreted as indicating or implying relative importance or implicitly specifying the quantity of indicated technical features. Thus, the features defined as "first" and "second" may explicitly or implicitly include at least one of these features. In the description of the present application, "plurality" means at least two, such as two, three, etc., unless otherwise specifically defined.

Any process or method description in a flowchart or otherwise described herein may be understood to represent a module, segment, or portion of code comprising one or more executable instructions for implementing specific logical functions or steps of the process. The scope of the preferred embodiments of the present application includes additional implementations in which functions may be performed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order, depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present application belong.

The logic and/or steps represented in the flowcharts or otherwise described herein may, for example, be considered a sequenced listing of executable instructions for implementing logical functions, and may be embodied in any computer-readable medium for use by, or in conjunction with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from the instruction execution system, apparatus, or device and execute them).

It should be understood that each part of the present application may be realized by hardware, software, firmware, or a combination thereof. In the embodiments described above, various steps or methods may be implemented by software or firmware stored in memory and executed by a suitable instruction execution system. Alternatively, if implemented in hardware, as in another embodiment, they can be implemented by any one or a combination of the following techniques known in the art: discrete logic circuits, application-specific integrated circuits (ASICs) with suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.

Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some of their technical features can be replaced with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims

1. A content identification method, applied to an electronic device, the method comprising:

displaying a collected image in real time;

if the displayed image includes specified content, displaying a prompt mark at the specified content;

in response to a touch operation acting on the prompt mark, identifying the specified content; and

outputting a recognition result.
2. The method according to claim 1, wherein said outputting the recognition result comprises:

acquiring a background image, the background image including the image displayed by the electronic device when the touch operation acts on the prompt mark; and

displaying the recognition result with the background image as the background.
3. The method according to claim 1, wherein said outputting the recognition result comprises:

acquiring an image to be processed, the image to be processed being the image displayed by the electronic device when the touch operation acts on the prompt mark;

performing blurring processing on the image to be processed, and using the blurred image as a background image; and

displaying the recognition result with the background image as the background.
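The blurring step recited above can be illustrated with a minimal sketch. The claim does not name a particular blurring algorithm; the box blur used here, the function name `box_blur`, the `radius` parameter, and the representation of the image as rows of grayscale values are all assumptions made purely for illustration.

```python
from typing import List

Image = List[List[int]]  # grayscale pixel rows, values 0-255 (assumed format)


def box_blur(image: Image, radius: int = 1) -> Image:
    """Return a blurred copy: each pixel becomes the mean of its neighborhood."""
    height = len(image)
    width = len(image[0]) if height else 0
    blurred = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0
            count = 0
            # Average over the (2*radius+1)^2 window, clipped at the borders.
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    total += image[ny][nx]
                    count += 1
            blurred[y][x] = total // count
    return blurred
```

The function leaves the input image untouched and returns a new image, so the frame that was on screen at the moment of the touch remains available alongside its blurred copy.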
4. The method according to claim 2 or 3, wherein said outputting the recognition result further comprises:

displaying a first trigger control;

and wherein, after the recognition result is displayed with the background image as the background, the method further comprises:

in response to a touch operation acting on the first trigger control, displaying the recognition result in a full-screen mode.
5. The method according to claim 2 or 3, wherein the method further comprises:

if the specified content is text content, using a scene image corresponding to the scene expressed by the semantics of the recognition result as the background image.
6. The method according to claim 2 or 3, wherein the method further comprises:

if the specified content is text content, using an image corresponding to a keyword in the recognition result as the background image.
7. The method according to claim 2 or 3, wherein said displaying the recognition result with the background image as the background comprises:

merging the recognition result and the background image into one image to obtain a fused image; and

displaying the fused image.
8. The method according to claim 2 or 3, wherein said displaying the recognition result with the background image as the background comprises:

displaying the background image, and displaying the recognition result floating over the displayed background image.
9. The method according to any one of claims 1-8, wherein said outputting the recognition result comprises:

displaying the recognition result;

and wherein, after the specified content is identified in response to the touch operation acting on the prompt mark, the method further comprises:

displaying a second trigger control; and

in response to a touch operation acting on the second trigger control, displaying a locking interface, the locking interface including the image displayed by the electronic device when the touch operation acted on the prompt mark, and the prompt mark corresponding to the specified content in the displayed image.
10. The method according to claim 9, wherein, after the locking interface is displayed in response to the touch operation acting on the second trigger control, the method further comprises:

in response to a first operation, resuming the real-time display of the collected image.
11. The method according to any one of claims 1-8, wherein said outputting the recognition result comprises:

displaying the recognition result;

and wherein, after the specified content is identified in response to the touch operation acting on the prompt mark, the method further comprises:

displaying a third trigger control;

in response to a touch operation acting on the third trigger control, displaying an operation menu, the operation menu including at least one operation control, each operation control corresponding to a different operation;

in response to a touch operation acting on an operation control, taking the operation corresponding to the touched operation control as a target operation; and

performing the target operation on the recognition result.
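The selection of a target operation from the operation menu can be sketched as a simple lookup. The operations `copy_text` and `share_text`, the control identifiers, and the function `on_operation_control_touched` are hypothetical names chosen for this example; the claim only requires that each operation control corresponds to a different operation.

```python
from typing import Callable, Dict


# Hypothetical operations used only for illustration.
def copy_text(result: str) -> str:
    return f"copied: {result}"


def share_text(result: str) -> str:
    return f"shared: {result}"


# Each operation control (keyed by an assumed identifier) maps to one operation.
OPERATION_MENU: Dict[str, Callable[[str], str]] = {
    "copy": copy_text,
    "share": share_text,
}


def on_operation_control_touched(control_id: str, recognition_result: str) -> str:
    """Take the touched control's operation as the target operation and apply it."""
    target_operation = OPERATION_MENU[control_id]
    return target_operation(recognition_result)
```

Keeping the mapping in one table means a further operation control can be added without changing the dispatch logic.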
12. The method according to claim 9 or 11, wherein said displaying the recognition result comprises:

if the specified content is text content, enlarging the recognition result in size and then displaying it.
13. The method according to any one of claims 1-12, wherein, before the specified content is identified in response to the touch operation acting on the prompt mark, the method further comprises:

in response to a zoom request, performing zoom processing on the captured image.
14. The method according to claim 13, wherein said performing zoom processing on the captured image in response to the zoom request comprises:

if an area selection operation is performed on the image displayed in real time, detecting whether the object in the selected area is completely displayed;

if it is completely displayed, generating a zoom request for increasing the size, the zoom request for increasing the size being used to make the object in the selected area be completely displayed at a first target size; and

in response to the zoom request for increasing the size, performing zoom processing on the captured image.
15. The method according to claim 14, wherein said performing zoom processing on the captured image in response to the zoom request further comprises:

if it is not completely displayed, generating a zoom request for reducing the size, the zoom request for reducing the size being used to display the object in the selected area at a second target size; and

in response to the zoom request for reducing the size, performing zoom processing on the captured image.
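The decision between the two kinds of zoom request can be sketched as follows. This is an illustrative sketch under stated assumptions: the names `ZoomRequest`, `is_fully_displayed`, and `make_zoom_request` are hypothetical, objects and the display area are modeled as axis-aligned rectangles, the target sizes are expressed as widths, and the first target width is assumed to be larger than the current width of a fully displayed object while the second target width is assumed to be smaller than the current width of a partially displayed one.

```python
from dataclasses import dataclass
from typing import Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


@dataclass
class ZoomRequest:
    direction: str  # "increase" or "decrease"
    factor: float   # scale factor to apply to the captured image


def is_fully_displayed(obj: Box, viewport: Box) -> bool:
    """True if the object's rectangle lies entirely inside the display area."""
    return (obj[0] >= viewport[0] and obj[1] >= viewport[1]
            and obj[2] <= viewport[2] and obj[3] <= viewport[3])


def make_zoom_request(obj: Box, viewport: Box,
                      first_target_width: float,
                      second_target_width: float) -> ZoomRequest:
    """Generate a size-increasing or size-reducing zoom request for the object."""
    width = obj[2] - obj[0]
    if width <= 0:
        raise ValueError("object box must have a positive width")
    if is_fully_displayed(obj, viewport):
        # Fully visible: enlarge the object to the first target size.
        return ZoomRequest("increase", first_target_width / width)
    # Partly cut off: shrink the object to the second target size.
    return ZoomRequest("decrease", second_target_width / width)
```

For example, an object 100 units wide that lies fully inside the display area and has a first target width of 300 yields an "increase" request with a factor of 3.0.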
16. The method according to any one of claims 1-15, wherein said displaying the collected image in real time comprises:

starting a camera program in response to a user operation; and

after the camera program is started, displaying the collected image in real time in the displayed interface of the camera program.
17. The method according to any one of claims 1-16, wherein the specified content includes text content or a target object.
18. A content identification device, running on an electronic device, the device comprising:

an image display unit, configured to display a collected image in real time;

a content identification unit, configured to display a prompt mark at specified content if the displayed image includes the specified content;

an identification unit, configured to identify the specified content in response to a touch operation acting on the prompt mark; and

a content output unit, configured to output a recognition result.
19. An electronic device, comprising one or more processors and a memory;

wherein one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs being configured to perform the method according to any one of claims 1-11.
20. A computer-readable storage medium storing program code executable by a processor, wherein the method according to any one of claims 1-11 is performed when the program code runs.