CN110971820B - Photographing method, photographing device, mobile terminal and computer readable storage medium - Google Patents

Info

Publication number
CN110971820B
CN110971820B (Application No. CN201911163621.0A)
Authority
CN
China
Prior art keywords
frame
preview
photographing
text
acquiring
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911163621.0A
Other languages
Chinese (zh)
Other versions
CN110971820A (en)
Inventor
吴恒刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN201911163621.0A priority Critical patent/CN110971820B/en
Publication of CN110971820A publication Critical patent/CN110971820A/en
Application granted granted Critical
Publication of CN110971820B publication Critical patent/CN110971820B/en
Legal status: Active

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/80Camera processing pipelines; Components thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40Document-oriented image-based pattern recognition

Abstract

The application belongs to the technical field of photographing, and provides a photographing method, a photographing device, a mobile terminal, and a computer-readable storage medium. The method includes: when the current photographing scene is a text scene, starting a text detection function; acquiring a plurality of first preview frames; performing text detection on each of the plurality of first preview frames through the text detection function to obtain a detection result for each first preview frame; when a photographing instruction is received, acquiring a photographing frame; detecting, through the text detection function, whether the photographed frame contains text; and if the photographed frame does not contain text, acquiring a target frame according to the detection results of the plurality of first preview frames. Text recognition accuracy can thereby be improved.

Description

Photographing method, photographing device, mobile terminal and computer readable storage medium
Technical Field
The present application belongs to the field of photographing technologies, and in particular, to a photographing method, a photographing apparatus, a mobile terminal, and a computer-readable storage medium.
Background
At present, most mobile terminals integrate a photographing function and can perform text recognition during photographing. However, the user usually has to switch manually to a dedicated text-recognition mode, and recognition is then performed only on the currently captured photo, so the recognition accuracy is low. For example, text may be present on the object being photographed yet go unrecognized in the captured photo.
Disclosure of Invention
The application provides a photographing method, a photographing device, a mobile terminal, and a computer-readable storage medium, so as to improve the recognition accuracy of text.
In a first aspect, an embodiment of the present application provides a photographing method, where the photographing method includes:
when the current photographing scene is a text scene, starting a text detection function;
acquiring a plurality of first preview frames;
performing text detection on each first preview frame in the plurality of first preview frames through the text detection function to obtain a detection result of each first preview frame;
when a photographing instruction is received, a photographing frame is obtained;
detecting whether the photographed frame contains a text or not through the text detection function;
and if the photographed frame does not contain texts, acquiring a target frame according to the detection results of the plurality of first preview frames.
In a second aspect, an embodiment of the present application provides a photographing apparatus, including:
the function starting module is used for starting a text detection function when the current photographing scene is a text scene;
the first frame acquisition module is used for acquiring a plurality of first preview frames;
the result acquisition module is used for performing text detection on each first preview frame in the plurality of first preview frames through the text detection function to acquire a detection result of each first preview frame;
the photographing frame acquiring module is used for acquiring a photographing frame when a photographing instruction is received;
the text detection module is used for detecting whether the photographed frame contains a text or not through the text detection function;
and the target frame acquisition module is used for acquiring a target frame according to the detection results of the plurality of first preview frames if the photographed frame does not contain a text.
In a third aspect, an embodiment of the present application provides a mobile terminal, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the photographing method according to the first aspect when executing the computer program.
In a fourth aspect, the present application provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the computer program implements the steps of the photographing method according to the first aspect.
In a fifth aspect, the present application provides a computer program product, which when run on a mobile terminal, causes the mobile terminal to execute the steps of the photographing method according to the first aspect.
Therefore, according to this scheme, when the photographing scene is a text scene, the text detection function is started automatically, which spares the user from manually switching to a text photographing mode, simplifies the operation of starting the text detection function, and improves its starting efficiency. The detection results of a plurality of preview frames and of the photographed frame are obtained through the text detection function; when the photographed frame's detection result is that no text was detected, the detection results of the plurality of preview frames are used to verify whether that result is wrong, improving the recognition accuracy of text during photographing.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the embodiments or the prior art descriptions will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise.
Fig. 1 is a schematic flow chart illustrating an implementation of a photographing method according to an embodiment of the present application;
fig. 2 is a schematic flow chart illustrating an implementation of a photographing method provided in the second embodiment of the present application;
fig. 3 is a schematic structural diagram of a photographing device according to a third embodiment of the present application;
fig. 4 is a schematic structural diagram of a mobile terminal according to a fourth embodiment of the present application;
fig. 5 is a schematic structural diagram of a mobile terminal according to a fifth embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
As used in the specification and throughout this application, the term "if" may be interpreted contextually as "when" or "upon" or "in response to a determination" or "in response to a detection". Similarly, the phrase "if it is determined" or "if a [ described condition or event ] is detected" may be interpreted contextually to mean "upon determining" or "in response to determining" or "upon detecting [ described condition or event ]" or "in response to detecting [ described condition or event ]".
Furthermore, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used for distinguishing between descriptions and not necessarily for describing or implying relative importance.
Reference throughout this specification to "one embodiment" or "some embodiments," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," or the like, in various places throughout this specification are not necessarily all referring to the same embodiment, but rather "one or more but not all embodiments" unless specifically stated otherwise. The terms "comprising," "including," "having," and variations thereof mean "including, but not limited to," unless expressly specified otherwise.
It should be understood that, the sequence numbers of the steps in this embodiment do not mean the execution sequence, and the execution sequence of each process should be determined by the function and the inherent logic of the process, and should not constitute any limitation to the implementation process of the embodiment of the present application.
In order to explain the technical solution described in the present application, the following description will be given by way of specific examples.
Referring to fig. 1, which is a schematic view of an implementation flow of a photographing method provided in an embodiment of the present application, where the photographing method is applied to a mobile terminal, and as shown in the figure, the photographing method may include the following steps:
and step S101, when the shooting scene is a text scene, starting a text detection function.
In the embodiment of the present application, the photographing scene may refer to the scene to which the photographing subject belongs, including but not limited to a portrait scene, a night scene, a text scene, an animal scene, a plant scene, and the like. For example, when the photographing subject is a person, the photographing scene is determined to be a portrait scene; when the photographing subject is a night view, the photographing scene is determined to be a night scene; when the photographing subject is text, the photographing scene is determined to be a text scene; when the photographing subject is an animal, the photographing scene is determined to be an animal scene; and when the photographing subject is a plant, the photographing scene is determined to be a plant scene.
In the embodiment of the present application, the text detection function may refer to a function of detecting whether text is contained in an image. When the mobile terminal detects that the current photographing scene is a text scene, the text detection function is started automatically. This spares the user from manually switching the photographing mode to a text-recognition mode, simplifies the operation of starting the text detection function, and improves its starting efficiency.
Optionally, before starting the text detection function, the embodiment of the present application further includes:
acquiring a plurality of second preview frames;
and acquiring the current photographing scene according to the plurality of second preview frames.
In this embodiment of the application, the plurality of second preview frames used for scene detection may be acquired in a preset photographing mode, which refers to a preset photographing mode in which the text detection function has not been started, for example the normal photographing mode of a mobile phone, as opposed to dedicated modes such as portrait or night-scene modes. A second preview frame is an image captured from the preview picture, specifically one captured by the mobile terminal in the preset photographing mode. When the camera of the mobile terminal is opened, it captures pictures in real time; the preview picture is what the camera captures, and it can be shown on the display screen of the mobile terminal so that the user can conveniently check the current photographing content and effect.
When acquiring the current photographing scene from the plurality of second preview frames, a scene recognition algorithm may be used to perform scene recognition on them, or the plurality of second preview frames may be input into a trained scene recognition model. The photographing scene recognized for each second preview frame is obtained, the photographing scene occurring most often among them is counted, and that scene is determined to be the current photographing scene. For example, if there are five second preview frames and their recognized scenes comprise three text scenes and two portrait scenes, the current photographing scene is determined to be a text scene. A scene recognition algorithm is an algorithm for performing scene recognition on an image, and a scene recognition model is a model for performing scene recognition on an image.
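The majority-vote scene selection described above can be sketched as follows; `classify_scene` and the frame values are hypothetical stand-ins for the trained scene-recognition model and the second preview frames, not names taken from the patent.

```python
from collections import Counter

def detect_current_scene(preview_frames, classify_scene):
    # Classify every second preview frame, then keep the scene label
    # that occurs most often among the recognized scenes.
    labels = [classify_scene(frame) for frame in preview_frames]
    scene, _count = Counter(labels).most_common(1)[0]
    return scene

# Five preview frames: three classified as text scenes, two as portrait.
frames = ["f1", "f2", "f3", "f4", "f5"]
fake_model = {"f1": "text", "f2": "portrait", "f3": "text",
              "f4": "text", "f5": "portrait"}.get
print(detect_current_scene(frames, fake_model))  # text
```

With three text votes against two portrait votes, the current photographing scene resolves to a text scene, matching the five-frame example in the text.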
Step S102, a plurality of first preview frames are acquired.
A first preview frame may refer to an image of the preview picture captured while the text detection function is active. After the text detection function is started, images of the preview picture may be captured at a preset time interval (for example, one image every 2 seconds) until a photographing instruction is received. If the preset time interval happens to elapse exactly when the photographing instruction arrives, one last image of the preview picture is captured and no further preview images are collected.
Step S103, performing text detection on each of the plurality of first preview frames through the text detection function, and obtaining a detection result of each of the plurality of first preview frames.
In the embodiment of the application, after the plurality of first preview frames are obtained, text detection may be performed on each of them through the text detection function to obtain a detection result for each first preview frame; alternatively, whenever a first preview frame is acquired, text detection is performed on it immediately to obtain its detection result. The detection result of a first preview frame is either that no text was detected or that text was detected; when text is detected, the result further includes the position of the text in the first preview frame, or an image of the region of the first preview frame where the text is located.
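A detection result as described above (text found or not, plus the text's position or region image when found) might be modeled like this; the field names and coordinate convention are illustrative, since the patent does not prescribe a data layout:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class DetectionResult:
    # Whether the text detection function found text in the frame.
    text_detected: bool
    # (left, top, right, bottom) of the text region when detected.
    region: Optional[Tuple[int, int, int, int]] = None

no_text = DetectionResult(text_detected=False)
with_text = DetectionResult(text_detected=True, region=(40, 80, 600, 220))
```

A result with `text_detected=False` carries no region; one with `text_detected=True` carries either the region coordinates, as here, or a crop of the region itself.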
In this embodiment of the application, after the detection result of each of the plurality of first preview frames is obtained, it may be cached in the mobile terminal, so that when no text is detected in the photographed frame, the detection results of the plurality of first preview frames can be obtained from the cache.
And step S104, acquiring a photographing frame when the photographing instruction is received.
The photographing instruction may be an instruction for instructing the mobile terminal to photograph, for example, when a click operation of a photographing button on a photographing interface by a user is detected, the photographing instruction is triggered. The photographing frame may refer to an image photographed when a photographing instruction is received.
Step S105, detecting whether the photographed frame contains text or not through the text detection function.
In the embodiment of the application, when the photographed frame is detected not to contain the text, the detection results of the plurality of first preview frames are obtained from the cache, and the target frame is obtained according to the detection results of the plurality of first preview frames; when the photographed frame is detected to contain the text, the image of the region where the text is located can be obtained from the photographed frame, and the image is determined to be the target frame.
And step S106, if the photographed frame does not contain texts, acquiring a target frame according to the detection results of the plurality of first preview frames.
In the embodiment of the application, when it is detected that the photographed frame does not contain text, text recognition can be performed again in combination with the text detection results of the plurality of preview frames, so as to further verify whether the text detection result of the photographed frame is wrong, improving the recognition rate of text during photographing. The target frame may refer to the one frame of image that is finally saved in the mobile terminal.
Optionally, after the target frame is obtained, the embodiment of the present application further includes:
detecting whether the target frame is the photographing frame;
if the target frame is not the photographing frame, acquiring the size of the target frame;
acquiring the size of the photographing frame;
and if the size of the target frame is different from the size of the photographing frame, adjusting the size of the target frame to the size of the photographing frame.
Here, the size of the target frame may refer to the length and width of the target frame, and the size of the photographed frame to the length and width of the photographed frame. The length and width of an image (e.g., the target frame and the photographed frame) may be measured in pixels or centimeters; the pixel is the most basic unit of a digital image.
In the embodiment of the present application, in order to ensure the accuracy of the target frame, after the target frame is acquired it may first be checked whether the target frame is the photographed frame; when it is not, the size of the target frame is adjusted to the size of the photographed frame, so that the two sizes are the same. For example, with image resolution in dots per inch (DPI) chosen as the size unit for the target frame and the photographed frame, if the size of the target frame is 640x480 and the size of the photographed frame is 1024x768, the target frame needs to be resized from 640x480 to 1024x768. DPI is a unit of measure for dot-matrix digital images, referring to the number of sampled, displayed, or output dots per inch of length.
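The size adjustment can be illustrated with a minimal nearest-neighbour resize over a 2-D pixel list; a real implementation would use the camera pipeline's scaler, and the function name here is hypothetical:

```python
def match_size(target, photo_size):
    # Resize the target frame to the photographed frame's size
    # (width, height) so the saved image has matching dimensions.
    new_w, new_h = photo_size
    old_h, old_w = len(target), len(target[0])
    if (old_w, old_h) == (new_w, new_h):
        return target  # sizes already match; no adjustment needed
    # Nearest-neighbour sampling from the original pixel grid.
    return [[target[y * old_h // new_h][x * old_w // new_w]
             for x in range(new_w)]
            for y in range(new_h)]

small = [[1, 2], [3, 4]]             # 2x2 stand-in for the target frame
resized = match_size(small, (4, 4))  # scaled up to a 4x4 "photo" size
```

The same call covers the patent's 640x480 to 1024x768 case; only the tuple passed as `photo_size` changes.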
According to the embodiments of the application, when the photographing scene is a text scene, the text detection function is started automatically, which spares the user from manually switching to a text photographing mode, simplifies the operation of starting the text detection function, and improves its starting efficiency. The detection results of the plurality of preview frames and of the photographed frame are obtained through the text detection function; when the photographed frame's detection result is that no text was detected, the detection results of the plurality of preview frames are used to verify whether that result is wrong, improving the recognition accuracy of text during photographing.
Referring to fig. 2, which is a schematic view of an implementation flow of a photographing method provided in the second embodiment of the present application, where the photographing method is applied to a mobile terminal, as shown in the figure, the photographing method may include the following steps:
step S201, when the current photo scene is a text scene, starting a text detection function.
The step is the same as step S101, and reference may be made to the related description of step S101, which is not repeated herein.
In step S202, a plurality of first preview frames are acquired.
The step is the same as step S102, and reference may be made to the related description of step S102, which is not repeated herein.
Optionally, the embodiment of the present application further includes:
acquiring a timestamp of each first preview frame in the plurality of first preview frames, wherein the timestamp of each first preview frame refers to the time for acquiring each first preview frame;
and storing the timestamp of each first preview frame as a key and the detection result of each first preview frame as a value in a hash map.
In the embodiment of the application, when each first preview frame is acquired, its timestamp is also acquired, and a correspondence between each first preview frame's timestamp and its detection result is established, so that the mobile terminal can retrieve the corresponding detection result by timestamp. Specifically, this correspondence may be established in a cache through a hash map. Optionally, it may instead be established in the cache through another storage structure, which is not limited here. HashMap is a hash-table-based implementation of the Map interface that provides all optional mapping operations and stores key-value mappings; in this embodiment the timestamp is the key and the detection result is the value.
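In Python terms, the timestamp-keyed cache amounts to a plain dict used as the hash map; the timestamps and result values below are made up for illustration:

```python
detection_cache = {}  # capture timestamp (key) -> detection result (value)

def cache_result(timestamp, result):
    # Store the preview frame's detection result under its capture time,
    # so it can later be looked up by timestamp.
    detection_cache[timestamp] = result

cache_result(1000.0, {"text_detected": False})
cache_result(1002.0, {"text_detected": True, "region": (10, 10, 90, 40)})
print(detection_cache[1002.0]["text_detected"])  # True
```

Any associative structure with timestamp keys would serve equally well, consistent with the patent's note that other storage structures are not excluded.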
Step S203, performing text detection on each of the plurality of first preview frames through the text detection function, and obtaining a detection result of each of the plurality of first preview frames.
The step is the same as step S103, and reference may be made to the related description of step S103, which is not described herein again.
In step S204, when the photographing instruction is received, a photographing frame is acquired.
The step is the same as step S104, and reference may be made to the related description of step S104, which is not repeated herein.
Optionally, the embodiment of the present application further includes:
and acquiring the time stamp of the photographing frame, wherein the time stamp of the photographing frame refers to the time of acquiring the photographing frame.
Step S205, detecting whether the photographed frame includes a text or not by the text detection function.
The step is the same as step S105, and reference may be made to the related description of step S105, which is not repeated herein.
Step S206, if the photographed frame does not include a text, detecting whether a first target preview frame matching the photographed frame exists in the plurality of first preview frames.
The first target preview frame is a first preview frame matched with the photographed frame in the plurality of first preview frames, and the detection result of the first target preview frame is a detected text.
Optionally, the detecting whether a first target preview frame matched with the photographed frame exists in the plurality of first preview frames includes:
acquiring the time stamp of the first preview frame closest to the time stamp of the photographing frame from the time stamps of the plurality of first preview frames;
step a, acquiring a detection result of a first preview frame closest to the timestamp of the photographed frame from the Hash mapping according to the timestamp of the first preview frame closest to the timestamp of the photographed frame;
b, if the detection result of the first preview frame closest to the timestamp of the photographing frame is the detected text, determining the first preview frame closest to the timestamp of the photographing frame as the first target preview frame;
step c, if the detection result of the first preview frame closest to the timestamp of the photographed frame is that no text is detected, acquiring the first preview frame with the timestamp closest to the timestamp of the photographed frame from the rest first preview frames;
repeatedly executing the steps a, b and c until the detection result of the first preview frame closest to the timestamp of the photographing frame is that a text is detected or the first preview frames are traversed;
and if the detection results of the first preview frames which are closest to the time stamps of the photographing frames are all undetected texts after traversing the plurality of first preview frames, determining that no first target preview frame matched with the photographing frames exists in the plurality of first preview frames.
In the embodiment of the application, according to the timestamps of the plurality of first preview frames and the timestamp of the photographed frame, the first preview frame whose timestamp is closest to that of the photographed frame is obtained. The closer a first preview frame's timestamp is to the photographed frame's, the more similar its content is to the photographed frame's content. Searching for the first target preview frame by timestamp therefore starts from the preview frames most similar to the photographed frame, yielding more accurate text content, because the content of the preview frames similar to the photographed frame is the content the user wants to capture.
Since the photographing frame is acquired when the photographing instruction is received, and the first preview frames are acquired before that instruction arrives, the timestamp of each of the plurality of first preview frames is smaller than or equal to the timestamp of the photographing frame. Subtracting each first preview frame's timestamp from the photographing frame's timestamp, the first preview frame whose timestamp corresponds to the minimum difference is the one whose timestamp is closest to the timestamp of the photographing frame.
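Steps a through c amount to walking the preview frames outward from the one nearest in time to the photographed frame until a text-bearing result is found; a sketch under the assumption that detection results are kept in a timestamp-keyed dict, with all names illustrative:

```python
def find_target_preview(preview_results, photo_ts):
    # preview_results maps capture timestamp -> detection result; every
    # preview timestamp is <= photo_ts because previews precede the shot.
    # Try candidates in order of increasing distance from photo_ts and
    # return the timestamp of the first frame whose result has text.
    for ts in sorted(preview_results, key=lambda t: photo_ts - t):
        if preview_results[ts]["text_detected"]:
            return ts
    return None  # traversed all frames; no first target preview frame

results = {
    1000.0: {"text_detected": False},
    1002.0: {"text_detected": True},
    1004.0: {"text_detected": False},  # nearest in time, but no text
}
print(find_target_preview(results, 1005.0))  # 1002.0
```

Here the nearest frame (1004.0) has no text, so the search falls back to the next-nearest (1002.0), whose result detected text; returning `None` corresponds to determining that no first target preview frame exists.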
Step S207, if yes, acquiring an image of an area where the text in the first target preview frame is located, and determining the image of the area where the text in the first target preview frame is located as a target frame.
In the embodiment of the application, when the detection result of the first target preview frame includes the position of the text in the first target preview frame, the position of the text in the first target preview frame is obtained from the detection result of the first target preview frame, and the image of the area where the text is located is obtained from the first target preview frame according to the position; and when the detection result of the first target preview frame comprises the image of the area where the text is located, acquiring the image of the area where the text is located in the first target preview frame from the detection result of the first target preview frame.
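Extracting the region image from the first target preview frame is a crop; below is a stdlib sketch over a 2-D pixel list, with a hypothetical (left, top, right, bottom) coordinate convention:

```python
def crop_text_region(frame, region):
    # Cut the detected text region out of the preview frame, where the
    # region coordinates come from the cached detection result.
    left, top, right, bottom = region
    return [row[left:right] for row in frame[top:bottom]]

# A 6x4 stand-in frame whose pixel value encodes (row, column).
frame = [[col + 10 * row for col in range(6)] for row in range(4)]
crop = crop_text_region(frame, (1, 1, 4, 3))  # 3 wide, 2 tall
```

When the detection result already stores the region image rather than its position, this crop step is skipped and the stored image is used directly, as the passage above describes.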
In step S208, if not, it is determined that the photographed frame is the target frame.
In the embodiment of the application, after traversing the plurality of first preview frames, if it is detected that the first target preview frame matched with the photographed frame does not exist in the plurality of first preview frames, it is determined that the detection result of the photographed frame is correct, and at this time, the photographed frame can be used as the target frame.
According to the embodiment of the application, the preview frame matched with the photographing frame can be rapidly acquired through the timestamp, whether the detection result of the photographing frame is wrong or not is verified by combining the preview frame matched with the photographing frame, and the identification accuracy rate of the text during photographing is improved.
Referring to fig. 3, a schematic diagram of a photographing apparatus provided in the third embodiment of the present application is shown, and for convenience of description, only the parts related to the third embodiment of the present application are shown.
The photographing apparatus includes:
the function starting module 31 is configured to start a text detection function when the current shooting scene is a text scene;
a first frame acquiring module 32, configured to acquire a plurality of first preview frames;
a result obtaining module 33, configured to perform text detection on each of the plurality of first preview frames through the text detection function, and obtain a detection result of each of the plurality of first preview frames;
a photographing frame acquiring module 34, configured to acquire a photographing frame when a photographing instruction is received;
a text detection module 35, configured to detect whether the photographed frame includes a text through the text detection function;
and a target frame obtaining module 36, configured to, if the photographed frame does not include a text, obtain a target frame according to the detection results of the plurality of first preview frames.
Optionally, the target frame acquiring module 36 includes:
the detection unit is used for detecting whether a first target preview frame matched with the photographed frame exists in the plurality of first preview frames or not, wherein the detection result of the first target preview frame is a detected text;
a first determining unit, configured to, if the detection result of the detecting unit is yes, obtain an image of an area where the text is located in the first target preview frame, and determine the image of the area where the text is located in the first target preview frame as the target frame;
and the second determining unit is configured to determine the photographed frame as the target frame if the detection result of the detecting unit is that no such first target preview frame exists.
Optionally, the photographing apparatus further includes:
a first time obtaining module, configured to obtain a timestamp of each of the plurality of first preview frames, where the timestamp of each of the plurality of first preview frames refers to a time when each of the plurality of first preview frames is obtained;
and the second time acquisition module is used for acquiring the time stamp of the photographing frame, wherein the time stamp of the photographing frame refers to the time for acquiring the photographing frame.
Optionally, the photographing apparatus further includes:
and the storage module is used for storing the timestamp of each first preview frame as a key and the detection result of each first preview frame as a value in a hash map.
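The storage module's hash map corresponds directly to a dictionary keyed by timestamp. A hedged sketch (the timestamp unit and the shape of a detection result are illustrative assumptions, not specified by the patent):

```python
# Hash map from preview-frame timestamp (key) to detection result (value),
# as described for the storage module; field names are illustrative.
detection_map = {}

def record_detection(timestamp_ms, result):
    detection_map[timestamp_ms] = result

record_detection(1000, {"has_text": True, "box": (1, 1, 3, 3)})
record_detection(1033, {"has_text": False, "box": None})
```

Keying by timestamp makes the later lookup of the detection result of the first preview frame closest to the timestamp of the photographed frame a constant-time dictionary access once that timestamp is known.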
Optionally, the detecting unit includes:
a first acquiring subunit, configured to acquire, from the timestamps of the plurality of first preview frames, the timestamp of the first preview frame closest to the timestamp of the photographed frame;
the second obtaining subunit is configured to obtain, from the hash mapping, a detection result of the first preview frame closest to the timestamp of the photographed frame according to the timestamp of the first preview frame closest to the timestamp of the photographed frame;
a first determining subunit, configured to determine, if a detection result of a first preview frame closest to the timestamp of the photographed frame is a detected text, that the first preview frame closest to the timestamp of the photographed frame is the first target preview frame;
a third obtaining subunit, configured to, if the detection result of the first preview frame closest to the timestamp of the photographed frame is that no text is detected, obtain, from the remaining first preview frames, the first preview frame whose timestamp is next closest to the timestamp of the photographed frame;
the processing subunit is configured to repeatedly execute the second acquiring subunit, the first determining subunit and the third acquiring subunit until the detection result of the acquired first preview frame closest to the timestamp of the photographed frame is that a text is detected or the plurality of first preview frames are traversed;
and the second determining subunit is configured to determine that no first target preview frame matched with the photographed frame exists in the plurality of first preview frames if all the detection results of the first preview frames obtained closest to the timestamp of the photographed frame are undetected texts after the plurality of first preview frames are traversed.
Optionally, the photographing apparatus further includes:
the frame detection module is used for detecting whether the target frame is the photographing frame;
a first size obtaining module, configured to obtain a size of the target frame if the target frame is not the photo frame;
the second size acquisition module is used for acquiring the size of the photographing frame;
and the size adjusting module is used for adjusting the size of the target frame to the size of the photographing frame if the size of the target frame is different from the size of the photographing frame.
The photographing device provided in the embodiment of the present application can be applied to the first method embodiment and the second method embodiment, and for details, reference is made to the description of the first method embodiment and the second method embodiment, and details are not repeated herein.
Fig. 4 is a schematic diagram of a mobile terminal according to a fourth embodiment of the present application. The mobile terminal as shown in the figure may include: one or more processors 401 (only one shown); one or more input devices 402 (only one shown), one or more output devices 403 (only one shown), and memory 404. The processor 401, the input device 402, the output device 403, and the memory 404 are connected by a bus 405. The memory 404 is used for storing instructions and the processor 401 is used for executing the instructions stored by the memory 404. Wherein:
the processor 401 is configured to start a text detection function when the current photographing scene is a text scene; acquiring a plurality of first preview frames; performing text detection on each first preview frame in the plurality of first preview frames through the text detection function to obtain a detection result of each first preview frame; when a photographing instruction is received, a photographing frame is obtained; detecting whether the photographed frame contains a text or not through the text detection function; and if the photographed frame does not contain texts, acquiring a target frame according to the detection results of the plurality of first preview frames.
Optionally, the processor 401 is specifically configured to:
detecting whether a first target preview frame matched with the photographed frame exists in the plurality of first preview frames or not, wherein the detection result of the first target preview frame is a detected text;
if such a first target preview frame exists, acquiring an image of the area where the text in the first target preview frame is located, and determining the image of the area where the text in the first target preview frame is located as the target frame;
and if no such first target preview frame exists, determining the photographed frame as the target frame.
Optionally, the processor 401 is further configured to:
acquiring a timestamp of each first preview frame in the plurality of first preview frames, wherein the timestamp of each first preview frame refers to the time for acquiring each first preview frame;
and acquiring the time stamp of the photographing frame, wherein the time stamp of the photographing frame refers to the time of acquiring the photographing frame.
Optionally, the processor 401 is further configured to:
and storing the timestamp of each first preview frame as a key and the detection result of each first preview frame as a value in a hash map.
Optionally, the processor 401 is specifically configured to:
acquiring the time stamp of the first preview frame closest to the time stamp of the photographing frame from the time stamps of the plurality of first preview frames;
step a, acquiring the detection result of the first preview frame closest to the timestamp of the photographed frame from the hash map according to the timestamp of that first preview frame;
step b, if the detection result of the first preview frame closest to the timestamp of the photographed frame is that a text is detected, determining the first preview frame closest to the timestamp of the photographed frame as the first target preview frame;
step c, if the detection result of the first preview frame closest to the timestamp of the photographed frame is that no text is detected, acquiring the first preview frame whose timestamp is next closest to the timestamp of the photographed frame from the remaining first preview frames;
repeatedly executing steps a, b and c until the detection result of the acquired first preview frame is that a text is detected or all of the plurality of first preview frames have been traversed;
and if the detection results of the first preview frames which are closest to the time stamps of the photographing frames are all undetected texts after traversing the plurality of first preview frames, determining that no first target preview frame matched with the photographing frames exists in the plurality of first preview frames.
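Steps a to c amount to visiting the preview frames in order of timestamp distance to the photographed frame and stopping at the first one whose detection result is a detected text. A sketch under the assumption that the hash map stores a boolean per timestamp (the real detection result would be richer):

```python
def find_matching_preview(photo_ts, detection_map):
    """Return the timestamp of the first target preview frame, or None.

    detection_map: {preview_timestamp: text_detected} -- a simplified
    stand-in for the hash map of detection results described above.
    """
    # Visit preview frames from nearest to farthest timestamp (steps a/c).
    for ts in sorted(detection_map, key=lambda t: abs(t - photo_ts)):
        if detection_map[ts]:  # step b: text detected -> matching frame
            return ts
    return None  # all frames traversed without a detected text

match = find_matching_preview(1040, {1000: False, 1033: True, 1066: False})
```

Returning `None` corresponds to the case where no first target preview frame matching the photographed frame exists, in which the photographed frame itself is used as the target frame.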
Optionally, the processor 401 is further configured to:
detecting whether the target frame is the photographing frame;
if the target frame is not the photographing frame, acquiring the size of the target frame;
acquiring the size of the photographing frame;
and if the size of the target frame is different from the size of the photographing frame, adjusting the size of the target frame to the size of the photographing frame.
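The size adjustment can be sketched as a nearest-neighbour resize. In practice the camera pipeline's scaler or an image library would do this work; the toy version below over a row-major pixel grid only illustrates the idea:

```python
def resize_to(frame, out_w, out_h):
    """Nearest-neighbour resize of a row-major pixel grid so that the
    target frame's size matches the photographed frame's size."""
    in_h, in_w = len(frame), len(frame[0])
    return [[frame[r * in_h // out_h][c * in_w // out_w]
             for c in range(out_w)]
            for r in range(out_h)]

# Upscale a 2x2 target frame to a 4x4 "photographed frame" size.
small = [[1, 2], [3, 4]]
big = resize_to(small, 4, 4)
```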
Optionally, the processor 401 is further configured to:
acquiring a plurality of second preview frames;
and acquiring the current photographing scene according to the plurality of second preview frames.
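The patent does not fix how the current photographing scene is derived from the plurality of second preview frames; one simple, purely illustrative policy is a majority vote over per-frame scene classifications:

```python
from collections import Counter

def current_scene(per_frame_scenes):
    """Majority vote over the scene classified for each second preview
    frame -- an assumed policy, not one specified by the patent."""
    return Counter(per_frame_scenes).most_common(1)[0][0]

# If most recent preview frames look like text, treat the current
# photographing scene as a text scene and start the text detection function.
scene = current_scene(["text", "text", "portrait"])
```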
It should be understood that, in the embodiment of the present application, the processor 401 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The input device 402 may include a touch pad, a fingerprint sensor (for collecting fingerprint information of a user and direction information of the fingerprint), a microphone, a data receiving interface, and the like. The output devices 403 may include a display (LCD, etc.), speakers, a data transmission interface, and the like.
The memory 404 may include a read-only memory and a random access memory, and provides instructions and data to the processor 401. A portion of the memory 404 may also include non-volatile random access memory. For example, the memory 404 may also store device type information.
In a specific implementation, the processor 401, the input device 402, the output device 403, and the memory 404 described in this embodiment of the present application may execute the implementation described in the embodiment of the photographing method provided in this embodiment of the present application, and may also execute the implementation described in the photographing apparatus described in the third embodiment, which is not described herein again.
Fig. 5 is a schematic diagram of a mobile terminal according to a fifth embodiment of the present application. As shown in fig. 5, the mobile terminal 5 of this embodiment includes: one or more processors 50 (only one shown), a memory 51, and a computer program 52 stored in the memory 51 and executable on the at least one processor 50. The processor 50 implements the steps of the above-described various photographing method embodiments when executing the computer program 52.
The mobile terminal 5 may be a desktop computer, a notebook, a palm computer, a cloud server, or another computing device. The mobile terminal may include, but is not limited to, the processor 50 and the memory 51. Those skilled in the art will appreciate that fig. 5 is merely an example of the mobile terminal 5 and does not constitute a limitation of it; the mobile terminal may include more or fewer components than those shown, or combine some of the components, or have different components; for example, it may also include input/output devices, network access devices, buses, and the like.
The processor 50 may be a central processing unit CPU, but may also be other general purpose processors, digital signal processors DSP, application specific integrated circuits ASIC, off-the-shelf programmable gate arrays FPGA or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 51 may be an internal storage unit of the mobile terminal 5, such as a hard disk or a memory of the mobile terminal 5. The memory 51 may also be an external storage device of the mobile terminal 5, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, provided on the mobile terminal 5. Further, the memory 51 may also include both an internal storage unit and an external storage device of the mobile terminal 5. The memory 51 is used for storing the computer program and other programs and data required by the mobile terminal. The memory 51 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/mobile terminal and method may be implemented in other ways. For example, the above-described apparatus/mobile terminal embodiments are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated modules/units, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer-readable storage medium. Based on such understanding, all or part of the flow in the methods of the embodiments described above can be realized by a computer program, which can be stored in a computer-readable storage medium and, when executed by a processor, realizes the steps of the method embodiments described above. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be subject to appropriate increase or decrease as required by legislation and patent practice in jurisdictions; for example, in some jurisdictions, computer-readable media do not include electrical carrier signals and telecommunications signals.
When the computer program product runs on a mobile terminal, the mobile terminal can realize the steps in the above method embodiments by executing it.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (10)

1. A photographing method, comprising:
when the current photographing scene is a text scene, starting a text detection function, wherein the text detection function is a function for detecting whether a text is contained in an image;
acquiring a plurality of first preview frames;
performing text detection on each first preview frame in the plurality of first preview frames through the text detection function to obtain a detection result of each first preview frame;
when a photographing instruction is received, a photographing frame is obtained;
detecting whether the photographed frame contains a text or not through the text detection function;
and if the photographed frame does not contain texts, acquiring a target frame according to the detection results of the plurality of first preview frames.
2. The photographing method of claim 1, wherein the acquiring a target frame according to the detection results of the plurality of first preview frames comprises:
detecting whether a first target preview frame matched with the photographed frame exists in the plurality of first preview frames or not, wherein the detection result of the first target preview frame is a detected text;
if such a first target preview frame exists, acquiring an image of the area where the text in the first target preview frame is located, and determining the image of the area where the text in the first target preview frame is located as the target frame;
and if no such first target preview frame exists, determining the photographed frame as the target frame.
3. The photographing method according to claim 2, wherein the photographing method further comprises:
acquiring a timestamp of each first preview frame in the plurality of first preview frames, wherein the timestamp of each first preview frame refers to the time for acquiring each first preview frame;
and acquiring the time stamp of the photographing frame, wherein the time stamp of the photographing frame refers to the time of acquiring the photographing frame.
4. The photographing method according to claim 3, wherein the photographing method further comprises:
and storing the timestamp of each first preview frame as a key and the detection result of each first preview frame as a value in a hash map.
5. The photographing method of claim 4, wherein the detecting whether there is a first target preview frame matching the photographed frame in the plurality of first preview frames comprises:
acquiring the time stamp of the first preview frame closest to the time stamp of the photographing frame from the time stamps of the plurality of first preview frames;
step a, acquiring the detection result of the first preview frame closest to the timestamp of the photographed frame from the hash map according to the timestamp of that first preview frame;
step b, if the detection result of the first preview frame closest to the timestamp of the photographed frame is that a text is detected, determining the first preview frame closest to the timestamp of the photographed frame as the first target preview frame;
step c, if the detection result of the first preview frame closest to the timestamp of the photographed frame is that no text is detected, acquiring the first preview frame whose timestamp is next closest to the timestamp of the photographed frame from the remaining first preview frames;
repeatedly executing steps a, b and c until the detection result of the acquired first preview frame is that a text is detected or all of the plurality of first preview frames have been traversed;
and if the detection results of the first preview frames which are closest to the time stamps of the photographing frames are all undetected texts after traversing the plurality of first preview frames, determining that no first target preview frame matched with the photographing frames exists in the plurality of first preview frames.
6. The photographing method of claim 1, further comprising, after acquiring the target frame:
detecting whether the target frame is the photographing frame;
if the target frame is not the photographing frame, acquiring the size of the target frame;
acquiring the size of the photographing frame;
and if the size of the target frame is different from the size of the photographing frame, adjusting the size of the target frame to the size of the photographing frame.
7. The photographing method according to any one of claims 1 to 6, further comprising, before starting the text detection function:
acquiring a plurality of second preview frames;
and acquiring the current photographing scene according to the plurality of second preview frames.
8. A photographing apparatus, comprising:
the function starting module is used for starting a text detection function when the current photographing scene is a text scene, wherein the text detection function is a function for detecting whether a text is contained in an image;
the first frame acquisition module is used for acquiring a plurality of first preview frames;
the result acquisition module is used for performing text detection on each first preview frame in the plurality of first preview frames through the text detection function to acquire a detection result of each first preview frame;
the photographing frame acquiring module is used for acquiring a photographing frame when a photographing instruction is received;
the text detection module is used for detecting whether the photographed frame contains a text or not through the text detection function;
and the target frame acquisition module is used for acquiring a target frame according to the detection results of the plurality of first preview frames if the photographed frame does not contain a text.
9. A mobile terminal comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the steps of the photographing method according to any one of claims 1 to 7 are implemented when the processor executes the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the photographing method according to any one of claims 1 to 7.
CN201911163621.0A 2019-11-25 2019-11-25 Photographing method, photographing device, mobile terminal and computer readable storage medium Active CN110971820B (en)

Publications (2)

Publication Number Publication Date
CN110971820A (en) 2020-04-07
CN110971820B (en) 2021-03-26



