CN111428445A - Document generation method and electronic equipment - Google Patents
- Publication number
- CN111428445A CN111428445A CN202010206740.6A CN202010206740A CN111428445A CN 111428445 A CN111428445 A CN 111428445A CN 202010206740 A CN202010206740 A CN 202010206740A CN 111428445 A CN111428445 A CN 111428445A
- Authority
- CN
- China
- Prior art keywords
- text content
- dynamic image
- text
- image
- document
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F16/1794—Details of file format conversion
- G06F16/3328—Query reformulation based on results of a preceding query using relevance feedback, using graphical result space presentation or visualisation
- G06V10/28—Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
- G06V10/40—Extraction of image or video features
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/287—Character recognition specially adapted to Kanji, Hiragana or Katakana characters
Abstract
An embodiment of the invention discloses a document generation method and an electronic device, aiming to solve the cumbersome, inefficient process of entering text recorded with pen and paper into an electronic device a second time. The method is applied to an electronic device comprising a camera apparatus and a device body, the camera apparatus being detachably connected to the device body. The method comprises: acquiring a dynamic image through the camera apparatus, where the dynamic image contains the text content a user is currently recording; recognizing the text content in the dynamic image; and generating and storing a document based on the recognized text content.
Description
Technical Field
Embodiments of the invention relate to the technical field of electronic devices, and in particular to a document generation method and an electronic device.
Background
With the increasing popularity of electronic devices (such as mobile phones, computers, and tablets), the carriers of text content are also becoming electronic. Reading and editing documents in formats such as DOC, PDF, and TXT has become part of users' daily habits.
In daily life, users often need to record text content on the spot, wherever they happen to be, for example jotting notes on paper or organizing ideas. However, because handwritten notes and electronic documents live on different carriers, such records are hard to locate later, and retrieval across paper and electronic content (paper notes or documents) is inefficient.
In practice, a user therefore has to enter the text recorded with pen and paper into an electronic device a second time, which is both cumbersome and inefficient. For example, when a user quickly takes notes with pen and paper during a meeting and later wants to forward them, the handwritten text must be typed into a computer again.
Disclosure of Invention
Embodiments of the invention provide a document generation method and an electronic device, aiming to solve the cumbersome, inefficient process of entering text recorded with pen and paper into an electronic device a second time.
In order to solve the above technical problem, the embodiment of the present invention is implemented as follows:
In a first aspect, a document generation method is provided, applied to an electronic device, where the electronic device includes a camera apparatus and a device body, and the camera apparatus is detachably connected to the device body. The method includes:
acquiring a dynamic image through the camera apparatus, where the dynamic image includes the text content currently being recorded by a user;
identifying text content in the dynamic image;
and generating and storing a document based on the identified text content.
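The three claimed steps can be sketched as a minimal pipeline. This is an illustrative assumption, not the patented implementation; `recognize_frame` is a hypothetical stand-in for whatever recognition engine is used:

```python
# Minimal sketch of the claimed flow: capture frames, recognize text
# in each frame, then assemble the recognized text into one document.
# All names here (recognize_frame, generate_document) are illustrative.

def recognize_frame(frame):
    """Stand-in for the OCR step; here each frame already carries its text."""
    return frame["text"]

def generate_document(frames):
    """Recognize text in each captured frame and join it into one document."""
    recognized = [recognize_frame(f) for f in frames]
    return "".join(recognized)

frames = [{"text": "Meeting notes: "}, {"text": "ship v2 on Friday."}]
document = generate_document(frames)
print(document)  # Meeting notes: ship v2 on Friday.
```

A real implementation would replace `recognize_frame` with an actual handwriting-recognition step and persist the result, but the data flow matches the three steps above.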
In a second aspect, an electronic device is provided, which includes a camera apparatus, a device body, a text recognition module, and a document generation module, where the camera apparatus is detachably connected to the device body; wherein,
the camera device is used for acquiring a dynamic image, and the dynamic image comprises text content currently recorded by a user;
the text recognition module is used for recognizing text contents in the dynamic images;
and the document generating module is used for generating and storing a document based on the identified text content.
In a third aspect, an electronic device is provided, comprising a processor, a memory and a computer program stored on the memory and executable on the processor, the computer program, when executed by the processor, implementing the steps of the method according to the first aspect.
In a fourth aspect, a computer-readable storage medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, realizes the steps of the method according to the first aspect.
In the embodiments of the invention, a dynamic image containing the text content currently being recorded by the user can be conveniently acquired in real time through the detachable camera apparatus. After obtaining the dynamic image, the electronic device can recognize the text content in it and generate and store a document based on the recognized text. Embodiments of the invention can thus generate and store a corresponding document while the user is still writing on paper or another carrier, requiring no second round of input from the user, and are therefore more efficient.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention and not to limit the invention. In the drawings:
FIG. 1 is a flowchart of a document generation method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the camera apparatus and the device body in a partially embedded state according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the camera apparatus and the device body in a separated state according to an embodiment of the present invention;
FIG. 4 is a flowchart of a document generation method according to another embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of the hardware structure of an electronic device implementing various embodiments of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As described above, in the related art the user has to type the text recorded with pen and paper into a computer a second time, word by word and sentence by sentence, which is cumbersome and inefficient.
To solve this technical problem, as shown in FIG. 1, an embodiment of the present invention provides a document generation method 100, which may be executed by an electronic device, that is, by software or hardware installed in the electronic device. The electronic device includes a camera apparatus and a device body, and the camera apparatus (which may also be referred to as a detachable or separate camera) is detachably (or embeddably) connected to the device body.
For the separate camera mentioned in the embodiments of the present invention, refer to FIG. 2 and FIG. 3. In FIG. 2, the separate camera and the device body are in a partially embedded state; in FIG. 3, they are in a separated state.
As shown in fig. 1, the method 100 includes the steps of:
s102: and acquiring a dynamic image through the separated camera, wherein the dynamic image comprises the text content currently recorded by the user.
Generally, users have an immediate need to record text content wherever they are, for example quickly writing on a paper carrier with a pen during a meeting.
In such a scenario, the user can take the separate camera out of the electronic device and fix it at a position such as the collar or glasses frame, so that the paper carrier on which text is being recorded lies within the camera's viewing angle.
Thus, while the user records (writes) text content on the paper carrier, the detachable camera acquires in real time a dynamic image (or image frames) containing the text content currently being recorded.
It will be appreciated that recording text on a paper carrier is a continuous process, so S102 is generally continuous as well: a relatively large number of images containing the text currently being recorded are captured, which is why this embodiment refers to a dynamic image. The dynamic image may also be understood as a video comprising multiple image frames, each containing the text content the user is currently recording.
It should be understood that embodiments of the present invention are not limited to the above application scenario. In another scenario, the text content in the dynamic image acquired by the separate camera is content being written on a blackboard. Here the blackboard lies within the camera's viewing angle, and the separate camera (held, for example, by a student) collects in real time a dynamic image of the text as a teacher writes on the blackboard with chalk.
S104: text content in the dynamic image is identified.
Generally, the text content currently being recorded comprises multiple characters, which the user records (writes) one by one; the characters therefore appear in the dynamic image in sequence, and this step can recognize them sequentially in their order of appearance.
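If the captured frames are treated as cumulative snapshots of the page, recognizing characters in order of appearance reduces to emitting the newly written suffix of each snapshot. A minimal sketch under that assumption (frame contents here are illustrative):

```python
# Sketch: recover characters in writing order from cumulative frame
# snapshots. Each snapshot shows everything written so far, so the new
# characters in a frame are its suffix beyond the previous snapshot.
def chars_in_writing_order(frames):
    result = []
    prev = ""
    for snapshot in frames:
        # Append only the newly written suffix of this snapshot.
        result.append(snapshot[len(prev):])
        prev = snapshot
    return "".join(result)

frames = ["h", "he", "hel", "hell", "hello"]
print(chars_in_writing_order(frames))  # hello
```

A real system would diff recognized regions rather than strings, but the ordering principle, new content is appended in the sequence it appears, is the same.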
In practice, the text a user records on a paper carrier may be irregular (e.g. abbreviated or cursive), in which case some characters may fail to be recognized.
Optionally, when recognition of a character fails, this step may further crop a text image containing only the unrecognized character(s), so that when the document is generated and stored the text image can be inserted at the corresponding position. This preserves the readability and continuity of the generated document and improves the user experience.
Optionally, the text image (i.e. the image of the character that failed recognition) is inserted into the document at the size (occupied area) of one character, so that the document layout stays orderly and the user experience is improved.
Optionally, S102 is performed by the separate camera, and the separate camera and the device body are connected via a short-range wireless technology (e.g. Bluetooth). The separate camera can then send the dynamic image to the device body after acquiring it, and the device body (e.g. its processor) executes S104 and S106.
S106: and generating and storing a document based on the identified text content.
As described for S104, the characters in the dynamic image may be recognized sequentially in their order of appearance. In this step, the document can therefore be generated based on the order in which the characters were recognized and then stored, so that the character order in the document matches the order in which the user recorded them, improving readability.
Optionally, if recognition of a target text content in the dynamic image fails in S104, this embodiment may further crop a text image containing only that target text content; S106 then generates and stores the document based on the recognized text content and the text image. After the document is generated and stored, an editing request from the user can be received and the text image in the document replaced with characters the user types in, making the document easy to edit and modify and improving the user experience.
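One way to sketch this is to represent the document as a token list in which failed characters become image placeholders that a later edit replaces. The token shape and function names are assumptions for illustration:

```python
# Sketch: a document as a token list. A character that failed recognition
# is stored as an ("IMG", image_ref) placeholder at its position, so the
# document stays readable and the user can later replace it with text.
def build_document(recognized):
    # recognized: list of (char_or_None, cropped_image_ref) pairs;
    # None means recognition failed for that character.
    return [ch if ch is not None else ("IMG", img)
            for ch, img in recognized]

def replace_image(doc, index, user_char):
    """Handle an edit request: swap an inserted text image for a typed character."""
    assert isinstance(doc[index], tuple) and doc[index][0] == "IMG"
    doc[index] = user_char
    return doc

doc = build_document([("h", None), (None, "crop_01.png"), ("t", None)])
doc = replace_image(doc, 1, "a")
print("".join(doc))  # hat
```

The placeholder keeps the failed character's position in sequence, which is what preserves the continuity described above.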
The document generation method provided by this embodiment is applied to an electronic device that includes a separate camera, so the separate camera can conveniently acquire, in real time, a dynamic image containing the text content the user is currently recording. After obtaining the dynamic image, the electronic device recognizes the text content in it and generates and stores a document based on the recognized text. A corresponding document can thus be generated and stored while the user is still writing on paper or another carrier, with no second round of input and higher efficiency.
Optionally, as an embodiment, before recognizing the text content in the dynamic image in S104, the method further includes: identifying the position of a target object in the dynamic image, where the target object includes at least one of: a hand of the user; a writing instrument (e.g. a pen) used to record the text content; and determining the position of the text content in the dynamic image based on the position of the target object.
By identifying the position of the target object, this embodiment can quickly locate the text the user is currently writing, so that recognition can focus on that position, improving recognition efficiency.
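Locating text from the target-object position can be sketched as cropping a region of interest near the detected pen tip. The box size and placement here are illustrative assumptions, not values from the patent:

```python
# Sketch: given the detected pen-tip position, return a crop box where
# the text currently being written is likely located. Text written
# left-to-right trails behind (to the left of) the pen tip.
def text_roi(pen_xy, frame_size, box=(200, 80)):
    """Return an (x0, y0, x1, y1) box just left of the pen tip,
    vertically centered on it, clamped to the frame bounds."""
    x, y = pen_xy
    w, h = box
    fw, fh = frame_size
    x0 = max(0, x - w)
    y0 = max(0, y - h // 2)
    return (x0, y0, min(fw, x), min(fh, y0 + h))

roi = text_roi(pen_xy=(640, 360), frame_size=(1280, 720))
print(roi)  # (440, 320, 640, 400)
```

Recognition then runs only on this crop rather than the full frame, which is the efficiency gain the paragraph above describes.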
Optionally, as an embodiment, recognizing the text content in the dynamic image in S104 includes: performing binarization on a local image within the dynamic image, where the local image contains only the text content; performing feature extraction on the text content in the local image and comparing the extracted features with the features of texts in a text library; and determining the recognized text content from the comparison result.
Since most images acquired by the separate camera are color images, which carry a large amount of information, this embodiment may first process the color image so that it contains only foreground and background information. The foreground can simply be defined as black and the background as white, which is the binarization of the image.
In addition, since the local image containing the text is much smaller than the full image, performing text recognition and subsequent operations on the local image greatly shortens processing time compared with operating on the entire dynamic image.
Of course, after binarizing the local image containing the text content, a noise-removal operation may also be applied to it to improve the accuracy of the finally recognized text.
Features are the key information used to recognize text content (e.g. characters); each character can be distinguished from the others by its features. For digits and English letters, feature extraction is relatively easy, because there are only 10 digits and 52 letters, a small character set. For Chinese characters, feature extraction is harder: first, Chinese is a large character set, with 3755 most commonly used first-level characters in the national standard; second, Chinese characters have complex structures and many similar shapes.
After deciding which features to use, feature dimensionality reduction can be performed. If the feature dimensionality is too high (features are generally represented as vectors, whose dimensionality is the number of components), the efficiency of the classifier suffers greatly; to increase the recognition rate, dimensionality reduction is therefore often applied, such that the reduced feature vectors still retain enough information to distinguish different characters.
After character features are extracted, whether statistical or structural, a reference character library or feature database is needed for comparison. The library can contain all the characters to be recognized, together with the feature sets obtained by extracting features from those characters.
The recognized character string is then determined from the result of the feature comparison in this step.
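The described pipeline, binarize, extract features, compare against a library, can be sketched end to end. The quadrant-density feature and the two-glyph library below are toy assumptions chosen only to make the flow concrete:

```python
# Sketch of the recognition pipeline: binarize a grayscale glyph
# (ink -> 1, background -> 0), extract a per-quadrant ink-density
# feature vector, and match it against a small feature library by
# nearest squared distance. The library's feature values are toy data.
def binarize(img, thresh=128):
    # Grayscale convention: low values are dark ink.
    return [[1 if px < thresh else 0 for px in row] for row in img]

def quadrant_density(bits):
    h, w = len(bits), len(bits[0])
    sums = [0, 0, 0, 0]  # top-left, top-right, bottom-left, bottom-right
    for y, row in enumerate(bits):
        for x, b in enumerate(row):
            sums[(y >= h // 2) * 2 + (x >= w // 2)] += b
    total = sum(sums) or 1
    return [s / total for s in sums]

def match(features, library):
    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return min(library, key=lambda ch: dist(features, library[ch]))

library = {"一": [0.5, 0.5, 0.0, 0.0],   # toy: ink only in the top half
           "丨": [0.5, 0.0, 0.5, 0.0]}  # toy: ink only in the left half
glyph = [[255, 255, 255, 255],
         [10, 10, 10, 10],    # dark horizontal stroke in the top half
         [255, 255, 255, 255],
         [255, 255, 255, 255]]
print(match(quadrant_density(binarize(glyph)), library))
```

Real systems use far richer features (and, as noted above, dimensionality reduction before matching), but the binarize / extract / compare structure is the same.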
To describe the document generation method provided by the above embodiment of the present invention in detail, a specific embodiment will be described below.
In this embodiment, dynamically changing physical text content is captured by the mobile phone's separate camera and converted into electronic visual information in real time.
For the user, it suffices to take the separate camera out of the mobile phone, fix it in place (e.g. on a glasses frame or collar) with a built-in accessory such as a fixing clip, and start the camera's detection mode.
The electronic device can then automatically capture the dynamically changing subject (i.e. recognize the handwritten characters and the trigger position), convert it into computer-encodable characters, automatically create a document in the mobile phone in real time, enter the recognized text into it, and finally store it. This greatly reduces the cost of re-entering text content and avoids the difficulty of finding and retrieving records scattered across materials.
As shown in fig. 4, this embodiment 400 includes the steps of:
s402: the separated camera identifies the handwritten fonts generated in the dynamic process through a dynamic capturing system and a character identification system.
Before this embodiment is executed, the user can take the separate camera out of the mobile phone; fix it at a position such as the collar or glasses frame using the fixing clip on its back, so that the paper the user writes on lies within the camera's viewing angle; and click the power button on top of the separate camera to switch on the dynamic capture system and the character recognition system together.
S404: the hand-written character is converted into the character which can be coded by the computer.
S406: a document is created.
This step can automatically create a new (electronic) document in the mobile phone based on the date and time specified on the phone.
It is understood that S402, S404, and S408 form a continuous process that runs for as long as the user writes on paper. The order of S406 is not limited in this embodiment; S406 may also precede S402.
S408: and synchronizing the characters obtained in the S404 to the document through Bluetooth.
S410: and the separated camera stops recording and informs the mobile phone to save the document.
In this step, the user can turn off the detection mode by clicking the button of the separate camera again.
Through the separate camera's detection mode, this embodiment supports dynamic character capture, converts the captured handwriting into computer-encodable characters synchronously and in real time, automatically creates a document in the mobile phone, enters the text into it, and can synchronize with any notepad on the phone. This realizes true paper-to-electronic integration, greatly reduces the cost of re-entering content, and avoids the difficulty of finding and retrieving records scattered across materials.
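The S402 to S410 flow above can be sketched as a capture loop. The queue here is a stand-in assumption for the Bluetooth link, and the snapshot inputs are illustrative:

```python
# Sketch of the S402-S410 loop: the camera side recognizes newly
# written text and sends it over a link (a Queue stands in for
# Bluetooth); the phone side appends each chunk to the document and
# saves when the stop signal arrives.
from queue import Queue

def capture_loop(snapshots, link):
    prev = ""
    for snap in snapshots:            # S402/S404: recognize handwriting
        link.put(snap[len(prev):])    # S408: sync the new characters
        prev = snap
    link.put(None)                    # S410: stop signal, save document

def phone_side(link):
    doc = []
    while (chunk := link.get()) is not None:
        doc.append(chunk)             # S406: document receives the input
    return "".join(doc)               # the saved document text

link = Queue()
capture_loop(["agen", "agenda: ", "agenda: demo"], link)
print(phone_side(link))  # agenda: demo
```

In a real device the two sides would run concurrently on separate hardware; running them sequentially here keeps the sketch self-contained while preserving the message order.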
The document generation method according to the embodiments of the present invention has been described above with reference to FIGS. 1 to 4. An electronic device according to an embodiment of the present invention is described below with reference to FIG. 5, a schematic structural diagram of that device. As shown in FIG. 5, the electronic device 500 includes: a camera 502, a device body (not shown), a text recognition module 504, and a document generation module 506, where the camera 502 is detachably connected to the device body; wherein,
the camera 502 may be configured to acquire a dynamic image, where the dynamic image includes text content currently recorded by a user;
the text recognition module 504 may be configured to recognize text content in the dynamic image;
the document generating module 506 may be configured to generate and store a document based on the identified text content.
The electronic device provided by this embodiment includes a separate camera through which a dynamic image containing the text content currently being recorded by the user can be conveniently collected in real time. After obtaining the dynamic image, the electronic device can recognize the text content in it and generate and store a document based on the recognized text. A corresponding document can thus be generated and stored while the user is still writing on paper or another carrier, with no second round of input and higher efficiency.
Optionally, as an embodiment, the text recognition module 504 may be configured to:
identifying a location of a target object in the dynamic image, the target object including at least one of: a hand of the user; a carrier for recording text content; determining a position of text content in the dynamic image based on the position of the target object in the dynamic image;
the text recognition module 504 may be further configured to recognize text content in the dynamic image based on a position of the text content in the dynamic image.
Optionally, as an embodiment, the text recognition module 504 is further configured to, in a case that recognition of a target text content in the dynamic image fails, obtain a text image including the target text content;
the document generating module 506 may be configured to generate and store a document based on the identified text content and the text image.
Optionally, as an embodiment, the text content obtained by the recognition includes a plurality of characters, and the document generating module 506 may be configured to generate and store a document based on a sequence recognized by the plurality of characters;
the text recognition module 504 may be configured to sequentially recognize the characters in the dynamic image according to the appearance sequence of the characters in the dynamic image.
Optionally, as an embodiment, the text recognition module 504 may be configured to:
performing binarization processing on a local image in the dynamic image, wherein the local image comprises text content;
carrying out feature extraction processing on the text content in the local image, and comparing the extracted features with the features of the texts in a text library;
and determining the text content obtained by recognition according to the comparison result.
For the electronic device according to the embodiments of the present invention, reference may be made to the flow of the corresponding document generation method; each unit/module of the electronic device, and its other operations and/or functions, implement the corresponding flow of that method and, for brevity, are not described again here.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts in the embodiments are referred to each other. For the embodiment of the electronic device, since it is basically similar to the embodiment of the method, the description is simple, and for the relevant points, reference may be made to the partial description of the embodiment of the method.
Fig. 6 is a schematic diagram of a hardware structure of an electronic device 600 for implementing various embodiments of the present invention, where the electronic device 600 includes, but is not limited to: a radio frequency unit 601, a network module 602, an audio output unit 603, an input unit 604, a sensor 605, a display unit 606, a user input unit 607, an interface unit 608, a memory 609, a processor 610, and a power supply 611. Those skilled in the art will appreciate that the electronic device configuration shown in fig. 6 does not constitute a limitation of the electronic device, and that the electronic device may include more or fewer components than shown, or some components may be combined, or a different arrangement of components. In the embodiment of the present invention, the electronic device includes, but is not limited to, a mobile phone, a tablet computer, a notebook computer, a palm computer, a vehicle-mounted terminal, a wearable device, a pedometer, and the like.
The input unit 604 includes a camera detachably connected to the device body of the electronic device, the camera being configured to capture a dynamic image, wherein the dynamic image includes text content currently being recorded by a user; and the processor 610 is configured to identify the text content in the dynamic image, and to generate and store a document based on the identified text content.
The electronic device provided by the embodiment of the present invention includes a detachable camera, through which a dynamic image containing the text content currently being recorded by the user can conveniently be captured in real time. After obtaining the dynamic image, the electronic device can identify the text content in the dynamic image and generate and store a document based on the identified text content. A corresponding document can thus be generated and stored while the user writes the text content on paper or another carrier, without requiring secondary input from the user, which improves efficiency.
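The capture-recognize-store flow described above can be sketched minimally as follows. Here `frames` stands in for the dynamic image stream from the detachable camera and `recognize_text` for the OCR step; both names are hypothetical placeholders, not the patent's own interfaces.

```python
# Minimal sketch of the capture-recognize-store flow: each frame of the
# dynamic image is passed to a recognizer, and recognized text accumulates
# into one stored document. All names are illustrative assumptions.

def generate_document(frames, recognize_text):
    """Accumulate recognized text from successive frames into one document."""
    lines = []
    for frame in frames:
        text = recognize_text(frame)
        if text:  # skip frames where no text was recognized
            lines.append(text)
    return "\n".join(lines)
```

In use, the caller would feed camera frames continuously, so the document grows while the user is still writing, with no secondary input step.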
It should be understood that, in the embodiment of the present invention, the radio frequency unit 601 may be used for receiving and sending signals during a message sending and receiving process or a call process; specifically, it receives downlink data from a base station and forwards the received downlink data to the processor 610 for processing, and it transmits uplink data to the base station. In general, the radio frequency unit 601 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like. Further, the radio frequency unit 601 may also communicate with a network and other devices through a wireless communication system.
The electronic device provides wireless broadband internet access to the user via the network module 602, such as assisting the user in sending and receiving e-mails, browsing web pages, and accessing streaming media.
The audio output unit 603 may convert audio data received by the radio frequency unit 601 or the network module 602 or stored in the memory 609 into an audio signal and output as sound. Also, the audio output unit 603 may also provide audio output related to a specific function performed by the electronic apparatus 600 (e.g., a call signal reception sound, a message reception sound, etc.). The audio output unit 603 includes a speaker, a buzzer, a receiver, and the like.
The input unit 604 is used to receive audio or video signals. The input unit 604 may include a Graphics Processing Unit (GPU) 6041 and a microphone 6042; the graphics processor 6041 processes image data of a still picture or video obtained by an image capturing apparatus (such as a camera) in a video capture mode or an image capture mode. The processed image frames may be displayed on the display unit 606. The image frames processed by the graphics processor 6041 may be stored in the memory 609 (or other storage medium) or transmitted via the radio frequency unit 601 or the network module 602. The microphone 6042 can receive sound and process it into audio data. In the phone call mode, the processed audio data may be converted into a format transmittable to a mobile communication base station via the radio frequency unit 601.
The electronic device 600 also includes at least one sensor 605, such as a light sensor, motion sensor, and other sensors. Specifically, the light sensor includes an ambient light sensor that can adjust the brightness of the display panel 6061 according to the brightness of ambient light, and a proximity sensor that can turn off the display panel 6061 and/or the backlight when the electronic apparatus 600 is moved to the ear. As one type of motion sensor, an accelerometer sensor can detect the magnitude of acceleration in each direction (generally three axes), detect the magnitude and direction of gravity when stationary, and can be used to identify the posture of an electronic device (such as horizontal and vertical screen switching, related games, magnetometer posture calibration), and vibration identification related functions (such as pedometer, tapping); the sensors 605 may also include fingerprint sensors, pressure sensors, iris sensors, molecular sensors, gyroscopes, barometers, hygrometers, thermometers, infrared sensors, etc., which are not described in detail herein.
The display unit 606 may include a display panel 6061, and the display panel 6061 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED) display, or the like.
The user input unit 607 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device. Specifically, the user input unit 607 includes a touch panel 6071 and other input devices 6072. The touch panel 6071, also referred to as a touch screen, may collect touch operations by a user on or near it (e.g., operations by a user on or near the touch panel 6071 using a finger, a stylus, or any suitable object or accessory). The touch panel 6071 may include two parts: a touch detection device and a touch controller. The touch detection device detects the user's touch position, detects the signal generated by the touch operation, and transmits the signal to the touch controller; the touch controller receives touch information from the touch detection device, converts it into touch point coordinates, sends the coordinates to the processor 610, and receives and executes commands from the processor 610. In addition, the touch panel 6071 can be implemented in various types such as resistive, capacitive, infrared, and surface acoustic wave. Besides the touch panel 6071, the user input unit 607 may include other input devices 6072. Specifically, the other input devices 6072 may include, but are not limited to, a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse, and a joystick, which are not described herein again.
Further, the touch panel 6071 can be overlaid on the display panel 6061, and when the touch panel 6071 detects a touch operation on or near the touch panel 6071, the touch operation is transmitted to the processor 610 to determine the type of the touch event, and then the processor 610 provides a corresponding visual output on the display panel 6061 according to the type of the touch event. Although the touch panel 6071 and the display panel 6061 are shown in fig. 6 as two separate components to implement the input and output functions of the electronic device, in some embodiments, the touch panel 6071 and the display panel 6061 may be integrated to implement the input and output functions of the electronic device, and this is not limited here.
The interface unit 608 is an interface for connecting an external device to the electronic apparatus 600. For example, the external device may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and the like. The interface unit 608 may be used to receive input (e.g., data information, power, etc.) from external devices and transmit the received input to one or more elements within the electronic device 600 or may be used to transmit data between the electronic device 600 and external devices.
The memory 609 may be used to store software programs as well as various data. The memory 609 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, a phonebook, etc.) created according to the use of the cellular phone, and the like. Further, the memory 609 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
The processor 610 is a control center of the electronic device, connects various parts of the whole electronic device by using various interfaces and lines, performs various functions of the electronic device and processes data by running or executing software programs and/or modules stored in the memory 609, and calling data stored in the memory 609, thereby performing overall monitoring of the electronic device. Processor 610 may include one or more processing units; preferably, the processor 610 may integrate an application processor, which mainly handles operating systems, user interfaces, application programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 610.
The electronic device 600 may further include a power supply 611 (e.g., a battery) for supplying power to the various components, and preferably, the power supply 611 may be logically connected to the processor 610 via a power management system, such that the power management system may be used to manage charging, discharging, and power consumption.
In addition, the electronic device 600 includes some functional modules that are not shown, and are not described in detail herein.
Preferably, an embodiment of the present invention further provides an electronic device, which includes a processor 610, a memory 609, and a computer program stored in the memory 609 and capable of running on the processor 610, where the computer program, when executed by the processor 610, implements each process of the above-mentioned document generation method embodiment, and can achieve the same technical effect, and in order to avoid repetition, details are not described here again.
The embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements each process of the embodiment of the document generation method, and can achieve the same technical effect, and in order to avoid repetition, details are not repeated here. The computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.
Claims (12)
1. A document generation method applied to an electronic device, the electronic device comprising a camera device and a device body, the camera device being detachably connected with the device body, the method comprising the following steps:
acquiring a dynamic image through the camera device, wherein the dynamic image comprises text content currently recorded by a user;
identifying text content in the dynamic image;
and generating and storing a document based on the identified text content.
2. The method of claim 1, wherein prior to identifying text content in the dynamic image, the method further comprises:
identifying a location of a target object in the dynamic image, the target object including at least one of: a hand of the user; a carrier for recording text content;
determining a position of text content in the dynamic image based on the position of the target object in the dynamic image;
wherein the identifying text content in the dynamic image comprises: identifying text content in the dynamic image based on a location of the text content in the dynamic image.
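The idea in claim 2 — using the detected position of a target object to bound where text is searched — can be sketched as below. The "text lies above the writing hand" heuristic, the box representation, and all names are illustrative assumptions, not the patent's actual localization method.

```python
# Hedged sketch of claim 2's idea: the detected position of a target object
# (here, the user's writing hand) bounds the region searched for text.
# Boxes are (left, top, right, bottom) tuples; everything here is assumed.

def text_search_region(hand_box, frame_box):
    """Restrict the text search to the frame strip above the hand."""
    _, hand_top, _, _ = hand_box
    frame_left, frame_top, frame_right, _ = frame_box
    return (frame_left, frame_top, frame_right, hand_top)
```

Restricting recognition to such a region keeps the OCR step from scanning the whole frame on every update, which is why locating the hand or the carrier first can pay off.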
3. The method of claim 1, wherein prior to generating and storing a document based on the identified textual content, the method further comprises:
under the condition that the identification of the target text content in the dynamic image fails, acquiring a text image containing the target text content;
wherein the generating and storing a document based on the recognized text content comprises: and generating and storing a document based on the identified text content and the text image.
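The fallback in claim 3 — keeping a text image for regions whose recognition failed — can be sketched as follows. The `(text, image)` pair representation and the tagged-parts document format are illustrative assumptions.

```python
# Sketch of claim 3's fallback: when recognition of a region fails (text is
# None), the document keeps the cropped text image instead of dropping the
# content. The pair representation is an illustrative assumption.

def build_document(regions):
    """regions: iterable of (recognized_text_or_None, cropped_image) pairs."""
    parts = []
    for text, image in regions:
        if text is not None:
            parts.append(("text", text))
        else:
            parts.append(("image", image))  # fall back to the raw text image
    return parts
```

The point of the design is that a recognition failure degrades the document's searchability for that region but never loses the user's content.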
4. The method of any of claims 1 to 3, wherein the recognized text content comprises a plurality of characters, and wherein generating and storing a document based on the recognized text content comprises:
generating and storing a document based on the sequence in which the plurality of characters are recognized;
wherein the identifying text content in the dynamic image comprises: and sequentially identifying the characters in the dynamic image according to the appearance sequence of the characters in the dynamic image.
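Claim 4's ordering rule — recognize characters in their order of appearance and preserve that order in the document — can be sketched as below. Representing detections as `(timestamp, char)` pairs is an illustrative assumption.

```python
# Sketch of claim 4's ordering: characters are recognized in the order they
# appear in the dynamic image, and the document preserves that sequence.
# Timestamped detections are an assumption, not the patent's data model.

def assemble_in_order(detections):
    """Sort per-character detections by appearance time and join them."""
    return "".join(char for _, char in sorted(detections))
```

Because handwriting is produced sequentially, appearance order in the video matches reading order, so a simple sort by detection time reconstructs the intended text.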
5. The method of claim 1, wherein the identifying text content in the dynamic image comprises:
performing binarization processing on a local image in the dynamic image, wherein the local image comprises text content;
carrying out feature extraction processing on the text content in the local image, and comparing the extracted features with the features of the texts in a text library;
and determining the text content obtained by recognition according to the comparison result.
6. An electronic device is characterized by comprising a camera device, a device body, a text recognition module and a document generation module, wherein the camera device is detachably connected with the device body; wherein,
the camera device is used for acquiring a dynamic image, and the dynamic image comprises text content currently recorded by a user;
the text recognition module is used for recognizing text contents in the dynamic images;
and the document generating module is used for generating and storing a document based on the identified text content.
7. The electronic device of claim 6, wherein the text recognition module is further configured to:
identifying a location of a target object in the dynamic image, the target object including at least one of: a hand of the user; a carrier for recording text content;
determining a position of text content in the dynamic image based on the position of the target object in the dynamic image;
the text recognition module is further configured to recognize text content in the dynamic image based on a position of the text content in the dynamic image.
8. The electronic device of claim 6,
the text recognition module is further configured to acquire a text image including the target text content under the condition that recognition of the target text content in the dynamic image fails;
and the document generating module is used for generating and storing a document based on the identified text content and the text image.
9. The electronic device according to any one of claims 6 to 8, wherein the recognized text content includes a plurality of characters, and the document generation module is configured to generate and store a document based on the sequence in which the plurality of characters are recognized;
the text recognition module is used for sequentially recognizing the characters in the dynamic image according to the appearance sequence of the characters in the dynamic image.
10. The electronic device of claim 6, wherein the text recognition module is configured to:
performing binarization processing on a local image in the dynamic image, wherein the local image comprises text content;
carrying out feature extraction processing on the text content in the local image, and comparing the extracted features with the features of the texts in a text library;
and determining the text content obtained by recognition according to the comparison result.
11. An electronic device, comprising: memory, processor and computer program stored on the memory and executable on the processor, the computer program, when executed by the processor, implementing a method of generating a document according to any one of claims 1 to 5.
12. A computer-readable storage medium, characterized in that a computer program is stored thereon, which computer program, when being executed by a processor, realizes the method of generating a document according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010206740.6A CN111428445A (en) | 2020-03-23 | 2020-03-23 | Document generation method and electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111428445A (en) | 2020-07-17 |
Family
ID=71548680
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010206740.6A (CN111428445A, pending) | Document generation method and electronic equipment | 2020-03-23 | 2020-03-23 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111428445A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105988567A (en) * | 2015-02-12 | 2016-10-05 | 北京三星通信技术研究有限公司 | Handwritten information recognition method and device |
CN107979727A (en) * | 2017-11-30 | 2018-05-01 | 努比亚技术有限公司 | A kind of document image processing method, mobile terminal and computer-readable storage medium |
2020-03-23: Application CN202010206740.6A filed in China; patent CN111428445A pending.
Non-Patent Citations (1)
Title |
---|
施威名研究室 et al.: "PC DIY数字相机玩家实战" (PC DIY: Digital Camera in Practice), 人民邮电出版社 (Posts & Telecom Press), pages 657-666 *
Similar Documents
Publication | Title |
---|---|
CN107943390B (en) | Character copying method and mobile terminal |
CN111586237B (en) | Image display method and electronic equipment |
WO2021136159A1 (en) | Screenshot method and electronic device |
CN111338530B (en) | Control method of application program icon and electronic equipment |
CN111274777B (en) | Thinking guide display method and electronic equipment |
CN110618969B (en) | Icon display method and electronic equipment |
WO2021073478A1 (en) | Bullet screen information recognition method, display method, server and electronic device |
CN109561211B (en) | Information display method and mobile terminal |
CN110083319B (en) | Note display method, device, terminal and storage medium |
CN111445927B (en) | Audio processing method and electronic equipment |
JP7408627B2 (en) | Character input method and terminal |
CN109240577A (en) | A kind of screenshot method and terminal |
WO2020220893A1 (en) | Screenshot method and mobile terminal |
CN109670507B (en) | Picture processing method and device and mobile terminal |
CN108563392B (en) | Icon display control method and mobile terminal |
CN110750368A (en) | Copying and pasting method and terminal |
CN110062281B (en) | Play progress adjusting method and terminal equipment thereof |
CN109471841B (en) | File classification method and device |
CN109669710B (en) | Note processing method and terminal |
CN110932964A (en) | Information processing method and device |
CN109063076B (en) | Picture generation method and mobile terminal |
CN110008884A (en) | A kind of literal processing method and terminal |
WO2020156167A1 (en) | Text copying method and electronic device |
CN112395524A (en) | Method, device and storage medium for displaying word annotation and paraphrase |
CN110045897B (en) | Information display method and terminal equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20200717 |