CN111147948A - Information processing method and device and electronic equipment - Google Patents

Info

Publication number
CN111147948A
Authority
CN
China
Prior art keywords
video
information
message
text information
window
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811301088.5A
Other languages
Chinese (zh)
Inventor
罗永浩
耿达维
田作辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kuailu Technology Co Ltd
Original Assignee
Beijing Kuailu Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kuailu Technology Co Ltd filed Critical Beijing Kuailu Technology Co Ltd
Priority to CN201811301088.5A priority Critical patent/CN111147948A/en
Publication of CN111147948A publication Critical patent/CN111147948A/en
Pending legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/478 Supplemental services, e.g. displaying phone caller identification, shopping application
    • H04N21/4788 Supplemental services, e.g. displaying phone caller identification, shopping application communicating with other users, e.g. chatting
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/141 Systems for two-way working between two video terminals, e.g. videophone

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • General Engineering & Computer Science (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The invention provides an information processing method, an information processing device, and electronic equipment. When a user records a video in order to send a video message, voice recognition is performed on the voice information acquired during recording to obtain text information, that is, the content of the speech. The recorded video information and the text information are then combined to obtain the video message to be sent, rather than using the recorded video information directly as the video message. This diversifies the content of the video message and meets the user's needs to watch the images, listen to the voice, and read the spoken content of the video message.

Description

Information processing method and device and electronic equipment
Technical Field
The present invention relates to the field of communications technologies, and in particular, to an information processing method and apparatus, and an electronic device.
Background
With the rapid development of the internet, instant messaging software on electronic devices such as mobile phones and notebook computers has become a main tool for daily communication. As long as the electronic devices of both parties are connected to the internet, voice and video communication can be carried out anytime and anywhere, which is very convenient.
In practice, a user of instant messaging software can directly initiate a voice or video communication request to talk with the other party. If the other party is currently offline, or is not near their electronic device and cannot answer the request in time, the user can instead edit a text message or record a voice or video message and send it. The receiving party's electronic device caches such messages so that the receiving user can view them later, which is convenient for the user's daily life and work.
Disclosure of Invention
In view of this, the present invention provides an information processing method, an information processing apparatus, and an electronic device, so as to implement diversity of video messages to be sent and meet communication requirements of different users.
In order to achieve the above purpose, the invention provides the following technical scheme:
the embodiment of the invention provides an information processing method, which is applied to electronic equipment and comprises the following steps:
acquiring voice information in the video recording process;
carrying out voice recognition processing on the voice information to obtain text information;
and combining the text information and the video information obtained by recording to obtain the video message to be sent.
Preferably, the merging the text information and the video information obtained by recording to obtain the video message to be sent includes:
establishing an incidence relation between the text information and the video information obtained by recording;
generating a data packet by the text information, the video information and the incidence relation, and taking the data packet as a video message to be sent;
and, in response to a message sending instruction, sending the video message to be sent to second electronic equipment which has established a communication connection with the electronic equipment.
Preferably, during the video recording process, the method further comprises:
outputting a video recording interface, wherein the video recording interface comprises a first window and a second window, the first window is used for displaying a video image to be recorded, and the second window is used for displaying the text information;
and caching the video image displayed by the first window and the text information displayed by the second window.
Preferably, the merging the text information and the video information obtained by recording includes:
according to the caching time, correlating the cached video images and the cached text information, and generating a data packet by all the cached video images and the cached text information;
or after the video recording is finished, associating the video file generated by the cached video image with the text file generated by the cached text information, and generating a data packet by the video file and the text file.
Preferably, the method further comprises:
receiving an editing instruction aiming at the currently displayed text information of the second window;
and responding to the editing instruction, and updating the cached corresponding text information by using the text information obtained by editing.
Preferably, in the video recording process, acquiring the voice information includes:
responding to a video recording instruction, recording a video in the current environment, and extracting voice information from the video data after the video data is obtained; or,
and in the video recording process, acquiring the voice information acquired by the voice acquisition device.
Preferably, the method further comprises:
receiving a video message;
responding to a playing instruction aiming at the video message, and outputting a video playing interface in a current session window;
and outputting the video image of the video message in a first display area of the video playing interface, and outputting the text information in the video message in a second display area of the video playing interface.
An embodiment of the present invention further provides an information processing apparatus, which is applied to an electronic device, and the apparatus includes:
the voice information acquisition module is used for acquiring voice information in the video recording process;
the text information acquisition module is used for carrying out voice recognition processing on the voice information to obtain text information;
and the information processing module is used for combining the text information and the video information obtained by recording to obtain a video message to be sent.
Preferably, the apparatus further comprises:
the information output module is used for outputting a video recording interface, the video recording interface comprises a first window and a second window, the first window is used for displaying a video image to be recorded, and the second window is used for displaying the text information;
and the cache module is used for caching the video image displayed by the first window and the text information displayed by the second window.
An embodiment of the present invention further provides an electronic device, where the electronic device includes:
audio and video acquisition equipment; a display screen; a communication interface;
a memory for storing a program for implementing the information processing method as described above;
a processor for loading and executing the memory-stored program, the program for:
acquiring voice information in the video recording process;
carrying out voice recognition processing on the voice information to obtain text information;
and combining the text information and the video information obtained by recording to obtain the video message to be sent.
Therefore, compared with the prior art, the invention provides an information processing method, an information processing device, and electronic equipment. When a user records a video in order to send a video message, voice recognition is performed on the voice information acquired during recording to obtain text information, that is, the content of the speech. The recorded video information and the text information are then combined to obtain the video message to be sent, rather than using the recorded video information directly as the video message. This diversifies the content of the video message and meets the user's needs to watch the images, listen to the voice, and read the spoken content of the video message.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 is a schematic view of a scene of recording a video message in an instant messaging process;
Fig. 2 is a schematic view of a scene in which a video message is received and output in the instant messaging process;
Fig. 3 is a schematic flowchart of an information processing method according to an embodiment of the present invention;
Fig. 4 is a schematic view of a video recording interface according to an embodiment of the present invention;
Fig. 5 is a schematic flowchart of another information processing method according to an embodiment of the present invention;
Fig. 6a is a schematic view of a video playing interface for outputting a received video message in the instant messaging process according to an embodiment of the present invention;
Fig. 6b is a schematic view of another video playing interface for outputting a received video message in the instant messaging process according to an embodiment of the present invention;
Fig. 7 is a schematic flowchart of another information processing method according to an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of an information processing apparatus according to an embodiment of the present invention;
Fig. 9 is a schematic structural diagram of another information processing apparatus according to an embodiment of the present invention;
Fig. 10 is a schematic structural diagram of another information processing apparatus according to an embodiment of the present invention;
Fig. 11 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present invention.
Detailed Description
The inventors of the present invention found the following. When a user sends a video message with instant messaging software (see the schematic diagram in fig. 1), the user generally clicks a shooting button on the session interface and enters the interface shown on the left of fig. 1 to record a video. The video file generated when recording completes can be sent automatically to the other party as a session message, as in the session window shown on the right of fig. 1, where the session message for the video file can display content such as the recording duration and one frame of the video. After receiving the video message, the other party's device may likewise display the duration of the video message, one frame of the video, and other content in the session window shown on the left of fig. 2, and the user may touch the message to play its content, as in the video playing interface shown on the right of fig. 2. Thus, after a user records a video with instant messaging software, the video is usually sent directly to the other party without any opportunity to edit or process it, so user requirements cannot be met.
Moreover, the inventors noticed that the existing way of outputting a video message is rather limited. The user's explanation of the filmed subject is usually captured as voice during video recording, so the resulting video file contains both video image information and voice information. When the user plays the video file, the voice is played while the recorded images are displayed, and this is how the message receiver learns about the filmed subject.
However, when the current environment requires silence, or the receiving user is a tinnitus patient, playing the video file directly lets the user see the video images but not hear the voice information, so the user may be confused because they cannot understand the filmed subject.
To solve this problem in video message communication, the inventors propose outputting a text description while the video plays; that is, the voice information in the video file is converted into text for the user to read. This diversifies video message output and meets the needs of different users.
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
Referring to fig. 3, which is a schematic flowchart of an information processing method according to an embodiment of the present invention, the method may be applied to an electronic device such as a mobile phone or a notebook computer. As shown in fig. 3, the method may include, but is not limited to, the following steps:
step S11, acquiring voice information in the video recording process;
In the application scenario described above, user A and user B communicate through instant messaging software, and user A needs to record a segment of video and send it to user B as a session message. The video recording function can be started in the manner described above to record the video, but other ways of starting it are also possible.
When video is recorded on the electronic device, video and audio signals can be acquired at the same time, and the voice information (that is, the audio data stream) can be extracted from the recorded video data (that is, the video data stream) after it is obtained. Alternatively, an independent voice collector may collect the voice signal during video recording, and the collected voice information is then obtained for subsequent processing. This embodiment does not limit which approach is used.
Step S12, carrying out voice recognition processing on the voice information to obtain text information;
When the recorded video content needs a verbal explanation, the user usually gives it while recording, so the electronic device can acquire the corresponding voice information while recording the video.
Optionally, in this embodiment, the voice recognition function of the instant messaging application may be used to recognize and convert the collected voice information; alternatively, other application software installed on the electronic device may perform the recognition and conversion and feed the resulting text information back to the instant messaging software. The specific implementation of step S12 is not described in detail in the present invention.
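As an illustration only, the choice between the instant messaging application's own recognizer and a recognizer provided by another installed application can be sketched as follows. The names `transcribe`, `builtin`, `external`, and `fake_recognizer` are hypothetical and do not come from the patent; the stand-in recognizer merely decodes bytes and performs no real speech recognition.

```python
from typing import Callable, Optional

# A recognizer is modelled as any callable that turns audio bytes into text.
Recognizer = Callable[[bytes], str]


def transcribe(audio: bytes,
               builtin: Optional[Recognizer] = None,
               external: Optional[Recognizer] = None) -> str:
    """Use the messaging app's recognizer if present, else another app's."""
    recognizer = builtin or external
    if recognizer is None:
        raise RuntimeError("no speech recognizer available")
    return recognizer(audio)


def fake_recognizer(audio: bytes) -> str:
    # Stand-in for a real engine (assumption): decodes UTF-8 bytes as "speech".
    return audio.decode("utf-8")
```

Passing the recognizer in as a parameter keeps the recording flow independent of which application actually performs the conversion, which matches the embodiment's statement that the implementation is not limited.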
In another optional embodiment of the present invention, referring to the schematic view of the video recording interface shown in fig. 4, in the video recording process, a display screen of the electronic device may output the video recording interface, and this embodiment may divide the video recording interface into two windows, that is, a first window and a second window, where the first window displays a video image to be recorded, and the second window displays text information obtained in the above manner.
Therefore, in the process of recording the video, the user can visually see the recorded video image and can also read the collected voice information content at the same time, namely the voice information content is output in a text form, and the video recording method is enriched.
Moreover, the user can edit the text information displayed in the second window of the video recording interface as needed, for example by changing the font, style, content, or layout of the characters. This refines the text obtained directly from voice recognition, ensures that the sent text reads smoothly, highlights important words, and helps the receiver grasp the video content quickly. The display mode and the editing mode of the text information are not limited.
And step S13, merging the text information and the recorded video information to obtain the video message to be sent.
Optionally, after the recorded video information and the corresponding text information are obtained in the above manner, the two may be kept as independent files and used to generate the video message to be sent. In this case the merging process generates a data packet from the independent files; that is, the text information and the video information are placed in one data packet, which is sent to the second electronic device as the video message to be sent.
Therefore, in this embodiment, after video recording is complete, an association relationship can be established between the obtained video information and text information, and the video information file, the text information file, and the association relationship between them are then generated directly into a data packet, which is sent to the second electronic device as the video message to be sent. The second electronic device can cache each received video message. If it caches each video message as a whole, the video information and the text information in the message can be output directly when the user views it. If the second electronic device instead parses the data packet and stores the video information and the text information separately, the association relationship between the two can be used when the video message is output, to ensure that the text information corresponding to the video information is output while the video information plays.
Of course, the merging process in step S13 may instead combine the text information and the video information into one file, so that the other party can output the combined text and video directly on the same interface. The present invention does not limit the specific way the two are combined. It may differ from the conventional way of generating video subtitles: the text obtained by the present invention is not superimposed on the corresponding video image but is kept separate from it, as in the video recording interface shown in fig. 4, although the display is not limited to that shown in fig. 4.
Based on this, because the video images and the text information may be cached synchronously and usually carry a corresponding caching time, this embodiment may associate the cached video images and text information according to the caching time. Specifically, the two may be associated synchronously: any frame of video image is associated with the text information converted from the voice information collected when that frame was captured. Alternatively, the video images collected within a period of time may be associated with the text information converted from the voice information collected within that same period, and so on. A data packet is then generated from all the cached video images and text information, which yields the video message to be sent.
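One possible reading of the time-based association is sketched below, assuming each cached text segment records the caching time at which it begins and each frame records its own caching time. The type `TextSegment`, the function `text_for_frame`, and the millisecond timestamps are hypothetical choices for illustration, not details given in the patent.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TextSegment:
    start_ms: int  # caching time at which this recognized text begins
    text: str


def text_for_frame(frame_ms: int,
                   segments: List[TextSegment]) -> Optional[str]:
    """Return the text cached at or before the frame's caching time.

    `segments` must be sorted by `start_ms`. A frame cached before the
    first segment has no associated text and yields None.
    """
    starts = [s.start_ms for s in segments]
    index = bisect_right(starts, frame_ms)
    return segments[index - 1].text if index > 0 else None
```

With this lookup, every frame cached during a segment's period maps to the same text, which corresponds to the period-based variant; using one segment per frame time would give the frame-by-frame variant.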
Of course, in this embodiment, after video recording ends, the video file generated from all the cached video images may instead be associated with the text file generated from all the cached text information, and the video file and the text file may then be generated into one data packet, which is sent as the video message to be sent.
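The patent does not specify a packet format. As a minimal sketch under that caveat, the file-based variant could be modelled as an archive holding the video file, the text file, and a small manifest that records the association between them. The archive layout and the names `pack_message`, `unpack_message`, `message.mp4`, `message.txt`, and `manifest.json` are assumptions made for this example.

```python
import io
import json
import zipfile
from typing import Tuple


def pack_message(video_file: bytes, text_file: str) -> bytes:
    """Bundle a video file and its text file into one data packet."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("message.mp4", video_file)
        archive.writestr("message.txt", text_file)
        # The manifest plays the role of the association relationship.
        manifest = {"video": "message.mp4", "text": "message.txt"}
        archive.writestr("manifest.json", json.dumps(manifest))
    return buffer.getvalue()


def unpack_message(packet: bytes) -> Tuple[bytes, str]:
    """Recover the video and its associated text on the receiving side."""
    with zipfile.ZipFile(io.BytesIO(packet)) as archive:
        manifest = json.loads(archive.read("manifest.json"))
        video = archive.read(manifest["video"])
        text = archive.read(manifest["text"]).decode("utf-8")
    return video, text
```

Because the receiver resolves both files through the manifest, it can either cache the packet whole or store the two files separately and still pair them at playback time.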
Step S14, in response to a message sending instruction, sending the video message to be sent to a second electronic device that has established a communication connection with the electronic device.
In this embodiment, referring to the video recording interface shown in fig. 4, the obtained video message to be sent can be sent directly to the other party after the user releases the video recording button. In this case, once the video message to be sent is obtained, the message sending instruction is responded to directly and the sending operation is completed.
Of course, after the video message to be sent is obtained, the invention may instead output a prompt on the session window interface asking whether to send it, and let the user choose. In this case, the message sending instruction is generated when the user confirms sending, for example by clicking a "confirm" button, and the instruction is then executed to complete the sending. Therefore, the invention does not limit how the message sending instruction is generated.
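The two ways of generating the message sending instruction described for step S14 can be summarized in a small decision function. This is only a sketch; `SendMode` and `message_send_instruction` are hypothetical names introduced here.

```python
from enum import Enum
from typing import Optional


class SendMode(Enum):
    ON_RELEASE = "on_release"  # send as soon as the record button is released
    ON_CONFIRM = "on_confirm"  # prompt first, send only after confirmation


def message_send_instruction(mode: SendMode,
                             button_released: bool,
                             user_confirmed: Optional[bool] = None) -> bool:
    """Return True when a message sending instruction should be generated."""
    if not button_released:
        return False  # recording has not finished, so nothing can be sent
    if mode is SendMode.ON_RELEASE:
        return True
    # ON_CONFIRM: an unanswered prompt (None) or a refusal sends nothing.
    return user_confirmed is True
```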
In summary, in this embodiment, when a user records a video in order to send a video message, voice recognition may be performed on the voice information acquired during recording to obtain text information, that is, the content of the speech. The recorded video information and the text information are combined to obtain the video message to be sent, instead of using the recorded video information directly as the video message. This diversifies the content of the video message and meets the user's needs to watch the images, listen to the voice, and read the spoken content.
Based on the above description of how the video message is generated, the video message to be sent obtained here differs in content from one obtained by direct recording: it includes not only the video images and the voice signal but also the text information converted from the voice signal. Accordingly, when the receiver of the video message outputs it, the output interface also differs from that of an existing video message. For this purpose, an embodiment of the present invention provides another information processing method. Referring to the flowchart shown in fig. 5 and the video playing interfaces shown in fig. 6a and 6b, the method may further include the following steps:
step S21, receiving a video message;
The video message may be a message sent by the second electronic device through the instant messaging software. The second electronic device may be any electronic device, and the video message may be generated by the information processing process described in the above embodiment, which is not repeated here.
Step S22, responding to the playing instruction of the video message, and outputting a video playing interface in the current session window;
As shown in the session window on the left of fig. 2, after Wang XX receives the video message sent by Li X, identification information of the video message may be output in the session window between Wang XX and Li X. The identification information shown in the figure is a frame of image carrying the video duration, but it is not limited to this. Wang XX may then click the image to view the message, which generates a play instruction for the video message. The electronic device executes the play instruction and may output a video playing interface in the current session window, such as the video playing interfaces shown in fig. 6a and 6b, but is not limited thereto.
And step S23, outputting the video image of the video message in the first display area of the video playing interface, and outputting the text information in the video message in the second display area of the video playing interface.
The video playing interface of this embodiment may be divided into a first display area, which outputs the video image, and a second display area, which outputs the text information; the two areas may be non-overlapping. As shown in fig. 6a and 6b, the video playing interface may be divided into an upper display area and a lower display area that display the two types of information, image information and text information, respectively, but it is not limited to this division.
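A minimal sketch of the upper/lower division follows, assuming pixel coordinates with the origin at the top-left corner. The `Rect` type, the function names, and the 70 percent default share for the video area are illustrative assumptions; the patent does not fix any proportions.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    width: int
    height: int


def split_interface(width: int, height: int,
                    video_percent: int = 70) -> Tuple[Rect, Rect]:
    """Split the interface into an upper video area and a lower text area."""
    video_height = height * video_percent // 100
    first_area = Rect(0, 0, width, video_height)
    second_area = Rect(0, video_height, width, height - video_height)
    return first_area, second_area


def overlaps(a: Rect, b: Rect) -> bool:
    """True if the two rectangles share any interior area."""
    return (a.x < b.x + b.width and b.x < a.x + a.width
            and a.y < b.y + b.height and b.y < a.y + a.height)
```

Because the second area starts exactly where the first one ends, the text never covers the video image, which is the property that distinguishes this layout from subtitles drawn on the image.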
Compared with the existing video playing interface shown on the right of fig. 2, the output content of this video playing interface is richer, so the user can learn about the video content in several ways. Even if the voice information collected during recording is not an introduction to the filmed subject but a description of something else, this approach still enriches the communicated content. It also avoids adding the text directly to the video image as subtitles, which would block part of the image, so the user can watch the video content more accurately and completely, improving the user experience.
Based on the above analysis, fig. 7 shows a schematic flowchart of another information processing method provided in an embodiment of the present invention. The method is applied to an electronic device and may include, but is not limited to, the following steps:
step S31, responding to a video recording instruction, and recording a video in the current environment to obtain a video data stream;
The video recording instruction may be generated after the user clicks the shooting button, as described above; this embodiment does not limit how it is generated.
It should be noted that, in this embodiment, the voice recognition system of the electronic device is started when the video recording instruction is responded to, so that the electronic device can collect voice. Alternatively, the voice recognition system may be on by default, so that the electronic device can collect audio and video directly once it enters the video recording state.
Step S32, extracting an audio data stream from the video data stream;
This embodiment may implement step S32 based on the type of each data stream, but is not limited to this implementation.
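Separating streams by type might look like the following sketch, which assumes the recorded data arrives as packets that each carry a stream-type label. The `Packet` structure, the labels "video" and "audio", and the function `split_streams` are hypothetical; a real container format would be parsed with a demultiplexing library instead.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Packet:
    stream_type: str   # "video" or "audio" (labels assumed for this sketch)
    timestamp_ms: int
    payload: bytes


def split_streams(packets: Iterable[Packet]) -> Tuple[List[Packet], List[Packet]]:
    """Separate recorded packets into video and audio streams by type."""
    video: List[Packet] = []
    audio: List[Packet] = []
    for packet in packets:
        if packet.stream_type == "audio":
            audio.append(packet)
        elif packet.stream_type == "video":
            video.append(packet)
        # Packets of any other type are ignored in this sketch.
    return video, audio
```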
Step S33, carrying out voice recognition processing on the audio data stream to obtain text information;
For step S33, refer to the description of step S12 above; it is not repeated in this embodiment.
Step S34, outputting the obtained video data stream and text information in different display areas of the video recording interface;
Referring to the video recording interface shown in fig. 4, the video data stream may be displayed as video images in the upper part of the video recording interface, and the text information may be displayed in the display area in the lower part, but the display is not limited to this arrangement.
Because video recording lasts for a period of time rather than capturing one or a few frames, the video data stream obtained in the above manner is continuous, and the text information converted from the audio data stream can also be continuous. Each newly obtained piece of text information can therefore replace the previous one in the corresponding display area of the video recording interface, so that the user sees the generated text immediately. Alternatively, the obtained text information may be kept in the display area, so that all generated text stays visible; the display is not limited to showing only the most recent piece of text.
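The two display behaviours, showing only the newest text or keeping all generated text, can be sketched with a small class. `TextWindow` and its `keep_all` flag are hypothetical names used for illustration.

```python
from typing import List


class TextWindow:
    """Models the text display area of the video recording interface."""

    def __init__(self, keep_all: bool) -> None:
        self.keep_all = keep_all
        self.lines: List[str] = []

    def push(self, new_text: str) -> None:
        """Show a newly recognized piece of text."""
        if self.keep_all:
            self.lines.append(new_text)   # keep everything generated so far
        else:
            self.lines = [new_text]       # replace the previous piece

    def render(self) -> str:
        return "\n".join(self.lines)
```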
Step S35, receiving a video recording ending instruction, and controlling a video recording interface to enter an editable state;
Step S36, receiving an editing instruction for the obtained text information;
Optionally, after the video recording interface enters the editable state, the user may edit the content output by the video recording interface. This embodiment mainly explains the editing of text information; of course, the user may also edit the video image, for example by applying a mosaic or adding decorations.
To edit text information, especially to edit only part of the text information, the user may first select the target text to be edited and then click an edit button to generate the corresponding editing instruction, but the embodiment is not limited to this processing method.
Step S37, responding to the editing instruction, and updating the cached corresponding text information by using the text information obtained by editing;
In practical application, while a user records a video with an electronic device, the electronic device generally caches the various collected information. Therefore, the text information is cached after it is first obtained according to the above method, and after the text information is edited, the cached text information needs to be updated to ensure that the text information in the sent message is the edited text information.
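A minimal sketch of the cache update in step S37 is given below, assuming that the cached text is stored as a list of segments and that an editing instruction identifies the segment by index. `TextCache`, `EditInstruction`, and their fields are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class EditInstruction:
    segment_index: int  # which cached segment the user selected (assumed)
    new_text: str       # the text obtained by editing


class TextCache:
    """Hypothetical cache of the text segments produced during recording."""

    def __init__(self) -> None:
        self._segments: List[str] = []

    def append(self, text: str) -> int:
        """Cache a newly recognized segment and return its index."""
        self._segments.append(text)
        return len(self._segments) - 1

    def apply_edit(self, edit: EditInstruction) -> None:
        """Replace the cached segment so the sent message carries the edit."""
        if not 0 <= edit.segment_index < len(self._segments):
            raise IndexError("no cached text segment at that index")
        self._segments[edit.segment_index] = edit.new_text

    def full_text(self) -> str:
        return " ".join(self._segments)


cache = TextCache()
cache.append("meet me at the sea")
cache.append("at nine")
cache.apply_edit(EditInstruction(segment_index=0, new_text="meet me at the cafe"))
```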
Step S38, generating a data packet from all the cached video data streams and the corresponding text information, and sending the data packet to the second electronic device as the video message to be sent.
For a specific generation process of the video message to be sent, that is, a generation process of the data packet, reference may be made to the description of the corresponding part of the foregoing method embodiment, which is not described again in this embodiment.
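The format of the data packet is not specified in this description. Purely as an illustration, the sketch below assumes a hypothetical layout: a 4-byte big-endian header length, a JSON header carrying the text information and the video length, followed by the video bytes. The function names `build_video_message` and `parse_video_message` are likewise assumptions.

```python
import json
import struct
from typing import Tuple


def build_video_message(video: bytes, text: str) -> bytes:
    """Pack cached video bytes and text into one packet (assumed layout)."""
    header = json.dumps(
        {"text": text, "video_length": len(video)}, ensure_ascii=False
    ).encode("utf-8")
    # 4-byte big-endian header length, then the header, then the video bytes.
    return struct.pack(">I", len(header)) + header + video


def parse_video_message(packet: bytes) -> Tuple[bytes, str]:
    """Recover the video bytes and the text on the receiving device."""
    (header_len,) = struct.unpack(">I", packet[:4])
    header = json.loads(packet[4:4 + header_len].decode("utf-8"))
    video = packet[4 + header_len:]
    if len(video) != header["video_length"]:
        raise ValueError("video payload length does not match the header")
    return video, header["text"]


packet = build_video_message(b"\x00\x01\x02", "edited text")
video_out, text_out = parse_video_message(packet)
```

Keeping the text in a header next to the video, rather than rendering it into the frames, matches the point made in this description that the text is not a subtitle and does not block the image.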
In summary, when a user records a video in order to send a video message, the voice information in the recorded video can be extracted and converted into text. After the recording is completed, the obtained text can be edited and corrected as required, and the corrected text is carried with the video data and sent as the video message, so that the recipient can see the corresponding text description while watching and listening to the audio and video data in the video message. This improves the diversity of the content of the video message and meets the requirements of different users. Moreover, because the text is not a subtitle overlaid on the video image, it does not block the image, which improves the viewing effect.
Referring to fig. 8, which is a schematic structural diagram of an information processing apparatus according to an embodiment of the present invention, the apparatus may be applied to an electronic device and, as shown in fig. 8, may include:
the voice information acquisition module 11 is used for acquiring voice information in the video recording process;
Optionally, the voice information acquisition module 11 may include:
the video data acquisition unit is used for responding to a video recording instruction and recording a video in the current environment to obtain video data;
and the voice information extraction unit is used for extracting the voice information from the video data.
Alternatively, the voice information acquisition module 11 may include:
the voice acquisition unit is used for acquiring, in the video recording process, the voice information collected by a voice acquisition device.
The text information acquisition module 12 is configured to perform voice recognition processing on the voice information to obtain text information;
the information processing module 13 is configured to combine the text information and the recorded video information to obtain a video message to be sent;
and the message sending module 14 is configured to respond to a message sending instruction, and send the video message to be sent to a second electronic device that has established a communication connection with the electronic device.
Optionally, the information processing module 13 may include:
the relation establishing unit is used for establishing an association relationship between the text information and the video information obtained by recording;
and the first data packet generating unit is used for generating a data packet from the text information, the video information and the association relationship, and taking the data packet as the video message to be sent.
As another alternative, as shown in fig. 9, the apparatus may further include:
the information output module 15 is configured to output a video recording interface, where the video recording interface includes a first window and a second window, the first window is used to display a video image to be recorded, and the second window is used to display the text information;
and the cache module 16 is configured to cache the video image displayed in the first window and the text information displayed in the second window.
Based on this, the information processing module 13 may include:
the second data packet generating unit is used for associating the cached video images with the cached text information according to the caching time, and generating a data packet from all the cached video images and the cached text information;
or, the third data packet generating unit is configured to associate, after the video recording is finished, a video file generated from the cached video images with a text file generated from the cached text information, and generate a data packet from the video file and the text file.
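One possible way to associate cached video images with cached text information according to the caching time is sketched below. It assumes that every cached item carries a millisecond timestamp and that each text segment is linked to the latest frame cached at or before it. The function name `associate_by_cache_time` and this matching rule are illustrative assumptions, not the method fixed by this embodiment.

```python
from bisect import bisect_right
from typing import List, Tuple


def associate_by_cache_time(
    frame_times_ms: List[int],
    text_entries: List[Tuple[int, str]],
) -> List[Tuple[int, str]]:
    """Link each cached text segment to a cached frame index.

    frame_times_ms must be sorted in ascending order. Each text entry is
    (cache_time_ms, text). A text segment is linked to the latest frame
    cached at or before its own caching time, or to frame 0 if it was
    cached before any frame.
    """
    associations = []
    for cache_time, text in text_entries:
        index = bisect_right(frame_times_ms, cache_time) - 1
        associations.append((max(index, 0), text))
    return associations


frame_times = [0, 40, 80, 120]
texts = [(50, "hello"), (120, "world")]
linked = associate_by_cache_time(frame_times, texts)
```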
Optionally, as shown in fig. 10, the apparatus may further include:
an editing instruction receiving module 17, configured to receive an editing instruction for the text information currently displayed in the second window;
and the information updating module 18 is used for responding to the editing instruction and updating the cached corresponding text information by using the text information obtained by editing.
Based on the above embodiments, the apparatus may further include:
the video message receiving module is used for receiving a video message;
the video playing interface output module is used for responding to a playing instruction aiming at the video message and outputting a video playing interface at the current conversation window;
and the information output module is used for outputting the video image of the video message in a first display area of the video playing interface and outputting the text information in the video message in a second display area of the video playing interface.
In summary, when a user records a video in order to send a video message, voice recognition processing can be performed on the voice information acquired in the video recording process to obtain text information, namely the content of the voice information. The recorded video information and the text information are then combined to obtain the video message to be sent, instead of the recorded video information being used directly as the video message to be sent. This diversifies the content of the video message and meets the user's various requirements for a video message, such as watching the image, listening to the voice, and reading the voice content.
Referring to fig. 11, which is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present invention, the electronic device may be, for example, a mobile phone or a notebook computer. As shown in fig. 11, the electronic device may include, but is not limited to: an audio and video capture device 21, a display screen 22, a communication interface 23, a memory 24 and a processor 25, wherein:
The audio and video capture device 21 may include an audio capture device and a video capture device, for example a voice capture device and a camera; the specific composition of the audio and video capture device 21 is not limited in this embodiment.
For the process of using the audio and video capture device 21 to obtain the voice information in this embodiment, reference may be made to the description of the corresponding parts of the above method embodiments.
The display screen 22 may be a touch display screen or a non-touch display screen; this embodiment does not limit the type and structure of the display screen. In practical application of this embodiment, the display screen may be used to display various interfaces, such as a video recording interface and a video playing interface, and a user may also perform a trigger operation on the display screen to generate a corresponding instruction to complete video recording, text editing, and the like, which is not described in detail herein.
The communication interface 23 may be used to implement communication connection with other electronic devices to implement message transmission, and the communication interface may specifically be an interface of a wireless communication module, such as an interface of a GPRS module, an interface of a WIFI module, and the like, and the type of the communication interface is not limited in this embodiment.
The memory 24 may store a program that implements the above-described information processing method, which may refer to the description of the above-described method embodiments.
The memory may include, among computer readable media, volatile memory such as Random Access Memory (RAM) and/or nonvolatile memory such as Read Only Memory (ROM) or flash memory (flash RAM), and the memory includes at least one memory chip.
The processor 25 may load the program stored in the memory 24, and mainly implement the following steps:
acquiring voice information in the video recording process;
carrying out voice recognition processing on the voice information to obtain text information;
and combining the text information and the video information obtained by recording to obtain the video message to be sent.
When executing the program, the processor 25 may also implement the other steps of the information processing method; for details, reference may be made to the description of the above method embodiments, which is not repeated in this embodiment.
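The three main steps loaded by the processor can be sketched end to end as follows. Speech recognition is injected as a callable because this description does not name a recognition engine; `make_video_message` and `fake_recognizer` are hypothetical, and the dictionary layout is only one assumed way of combining the information.

```python
from typing import Callable, Dict


def make_video_message(
    video_info: bytes,
    voice_info: bytes,
    recognize: Callable[[bytes], str],
) -> Dict[str, object]:
    """Combine recorded video, its voice information, and recognized text."""
    # Step 1: the voice information acquired during recording is passed in.
    # Step 2: voice recognition processing yields the text information.
    text_info = recognize(voice_info)
    # Step 3: combine text and video into the video message to be sent.
    return {"video": video_info, "voice": voice_info, "text": text_info}


def fake_recognizer(voice_info: bytes) -> str:
    """Stand-in for a real speech recognizer, for illustration only."""
    return voice_info.decode("utf-8").upper()


message = make_video_message(b"frames", b"hello", fake_recognizer)
```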
Finally, it should be noted that, in the embodiments, relational terms such as first and second may be used solely to distinguish one operation, unit or module from another operation, unit or module, without necessarily requiring or implying any actual relationship or order between such units, operations or modules. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method or system that comprises the element.
The embodiments in this description are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments may be referred to one another. Because the apparatus and the electronic device disclosed in the embodiments correspond to the method disclosed in the embodiments, their description is relatively brief, and for relevant details reference may be made to the description of the method.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. An information processing method applied to an electronic device, the method comprising:
acquiring voice information in the video recording process;
carrying out voice recognition processing on the voice information to obtain text information;
and combining the text information and the video information obtained by recording to obtain a video message to be sent, wherein the video message comprises the voice information.
2. The method according to claim 1, wherein the combining the text information and the video information obtained by recording to obtain a video message to be sent comprises:
establishing an association relationship between the text information and the video information obtained by recording;
generating a data packet from the text information, the video information and the association relationship, and taking the data packet as the video message to be sent;
and responding to a message sending instruction, and sending the video message to be sent to a second electronic device which has established a communication connection with the electronic device.
3. The method of claim 1, wherein during video recording, the method further comprises:
outputting a video recording interface, wherein the video recording interface comprises a first window and a second window, the first window is used for displaying a video image to be recorded, and the second window is used for displaying the text information;
and caching the video image displayed by the first window and the text information displayed by the second window.
4. The method of claim 3, wherein the combining the text information and the video information obtained by recording comprises:
according to the caching time, associating the cached video images with the cached text information, and generating a data packet from all the cached video images and the cached text information;
or after the video recording is finished, associating the video file generated from the cached video images with the text file generated from the cached text information, and generating a data packet from the video file and the text file.
5. The method according to claim 3 or 4, characterized in that the method further comprises:
receiving an editing instruction aiming at the currently displayed text information of the second window;
and responding to the editing instruction, and updating the cached corresponding text information by using the text information obtained by editing.
6. The method of claim 1, wherein the obtaining voice information during the video recording process comprises:
responding to a video recording instruction, recording a video in the current environment, and extracting voice information from the video data after the video data is obtained; or,
and in the video recording process, acquiring the voice information acquired by the voice acquisition device.
7. The method according to any one of claims 1 to 4, further comprising:
receiving a video message;
responding to a playing instruction aiming at the video message, and outputting a video playing interface in a current session window;
and outputting the video image of the video message in a first display area of the video playing interface, and outputting the text information in the video message in a second display area of the video playing interface.
8. An information processing apparatus, applied to an electronic device, the apparatus comprising:
the voice information acquisition module is used for acquiring voice information in the video recording process;
the text information acquisition module is used for carrying out voice recognition processing on the voice information to obtain text information;
and the information processing module is used for combining the text information and the video information obtained by recording to obtain a video message to be sent.
9. The apparatus of claim 8, further comprising:
the information output module is used for outputting a video recording interface, the video recording interface comprises a first window and a second window, the first window is used for displaying a video image to be recorded, and the second window is used for displaying the text information;
and the cache module is used for caching the video image displayed by the first window and the text information displayed by the second window.
10. An electronic device, characterized in that the electronic device comprises:
audio and video acquisition equipment; a display screen; a communication interface;
a memory for storing a program for implementing the information processing method according to any one of claims 1 to 7;
a processor for loading and executing the memory-stored program, the program for:
acquiring voice information in the video recording process;
carrying out voice recognition processing on the voice information to obtain text information;
and combining the text information and the video information obtained by recording to obtain the video message to be sent.
CN201811301088.5A 2018-11-02 2018-11-02 Information processing method and device and electronic equipment Pending CN111147948A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811301088.5A CN111147948A (en) 2018-11-02 2018-11-02 Information processing method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811301088.5A CN111147948A (en) 2018-11-02 2018-11-02 Information processing method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN111147948A true CN111147948A (en) 2020-05-12

Family

ID=70516244

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811301088.5A Pending CN111147948A (en) 2018-11-02 2018-11-02 Information processing method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN111147948A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103841268A (en) * 2014-03-17 2014-06-04 联想(北京)有限公司 Information processing method and information processing device
CN104184870A (en) * 2014-07-29 2014-12-03 小米科技有限责任公司 Call log marking method and device and electronic equipment
CN104469542A (en) * 2014-11-07 2015-03-25 重庆晋才富熙科技有限公司 Device used for full video marking
CN106997764A (en) * 2016-01-26 2017-08-01 阿里巴巴集团控股有限公司 A kind of instant communicating method and instantaneous communication system based on speech recognition
CN106067310A (en) * 2016-06-27 2016-11-02 乐视控股(北京)有限公司 Recording data processing method and processing device
CN108063722A (en) * 2017-12-20 2018-05-22 北京时代脉搏信息技术有限公司 Video data generating method, computer readable storage medium and electronic equipment

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767501A (en) * 2020-06-29 2020-10-13 百度在线网络技术(北京)有限公司 Information processing method, information output method, information processing apparatus, information output apparatus, electronic device, and storage medium
CN111814732A (en) * 2020-07-23 2020-10-23 上海优扬新媒信息技术有限公司 Identity verification method and device
CN111814732B (en) * 2020-07-23 2024-02-09 度小满科技(北京)有限公司 Identity verification method and device
CN111970577A (en) * 2020-08-25 2020-11-20 北京字节跳动网络技术有限公司 Subtitle editing method and device and electronic equipment
CN111970577B (en) * 2020-08-25 2023-07-25 北京字节跳动网络技术有限公司 Subtitle editing method and device and electronic equipment
CN112188266A (en) * 2020-09-24 2021-01-05 北京达佳互联信息技术有限公司 Video generation method and device and electronic equipment
CN112533052A (en) * 2020-11-27 2021-03-19 北京字跳网络技术有限公司 Video sharing method and device, electronic equipment and storage medium
US20230300452A1 (en) * 2020-11-27 2023-09-21 Beijing Zitiao Network Technology Co., Ltd. Video sharing method and apparatus, electronic device, and storage medium
US11956531B2 (en) * 2020-11-27 2024-04-09 Beijing Zitiao Network Technology Co., Ltd. Video sharing method and apparatus, electronic device, and storage medium
EP4236328A4 (en) * 2020-11-27 2024-04-24 Beijing Zitiao Network Technology Co Ltd Video sharing method and apparatus, electronic device, and storage medium
CN113593567A (en) * 2021-06-23 2021-11-02 荣耀终端有限公司 Method for converting video and sound into text and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200512