CN113596352B

CN113596352B - Video processing method, processing device and electronic equipment

Info

Publication number: CN113596352B
Application number: CN202110867170.XA
Authority: CN
Inventors: 刘洋
Original assignee: Beijing Dajia Internet Information Technology Co Ltd
Current assignee: Beijing Dajia Internet Information Technology Co Ltd
Priority date: 2021-07-29
Filing date: 2021-07-29
Publication date: 2023-07-25
Anticipated expiration: 2041-07-29
Also published as: CN113596352A

Abstract

The disclosure relates to a video processing method, a processing device and an electronic device. The method comprises: determining a keyword in a target video in response to a determining operation of a target object; acquiring a target definition corresponding to the keyword in response to an acquiring operation of the target object; and sending the target definition and the target video to a server, so that the target video displays the target definition when played. Because the target definition corresponding to the keyword is displayed while the target video is played, a user can rely on it to understand the content of the target video. This helps avoid the misunderstandings of the played content that arise from differences in the industry the user works in and the like, and better ensures that the user understands the video content accurately.

Description

Video processing method, processing device and electronic equipment

Technical Field

The disclosure relates to the field of videos, and in particular relates to a video processing method, a processing device and electronic equipment.

Background

In the related art, short video applications have become an important channel for information transmission. More and more people produce short videos on short video platforms to share their lives, skills, experience or expertise, and more and more people watch short videos on these platforms to obtain the information they need. However, because short video producers and short video viewers work in different industries, the two parties may understand the same content to different degrees, and this is especially common in specialized fields.

Thus, there is a need for a method of helping users understand short video content.

Disclosure of Invention

The present disclosure provides a video processing method, a processing apparatus, and an electronic device, so as to at least solve the problem in the related art that a method for helping a user to understand short video content is lacking. The technical scheme of the present disclosure is as follows:

according to a first aspect of an embodiment of the present disclosure, there is provided a video processing method, including: determining keywords in the target video in response to the determining operation of the target object; responding to the acquisition operation of the target object, and acquiring a target paraphrasing corresponding to the keyword; and sending the target paraphrasing and the target video to a server, so that the target video displays the target paraphrasing when being played.

Optionally, the step of determining the keywords in the target video in response to the determination operation of the target object includes: responding to the determining operation, processing the audio data of the target video to obtain a plurality of candidate keywords; and determining the keywords from a plurality of candidate keywords.

Optionally, the determining operation includes a first sub-operation and a second sub-operation, and the determining the keyword in the target video in response to the determining operation of the target object includes: responding to the first sub-operation, and displaying a plurality of candidate keywords on a display interface; and in response to the second sub-operation, determining at least part of the candidate keywords as the keywords.

Optionally, after the step of determining the keyword in the target video in response to the determining operation of the target object, and before the step of acquiring the target paraphrasing corresponding to the keyword in response to the acquiring operation of the target object, the method further includes: sending the keyword to the server. The step of acquiring the target paraphrasing corresponding to the keyword in response to the acquiring operation of the target object includes: receiving, in response to the acquiring operation, the target paraphrasing sent by the server, wherein the server obtains the target paraphrasing by querying whether a first word stock includes a target entry corresponding to the keyword, and the target paraphrasing is either the paraphrasing corresponding to the target entry or is obtained by the server from network side equipment; the first word stock includes a plurality of first entries, the target entry is the first entry corresponding to the keyword, and each first entry includes a first word and a corresponding first paraphrasing.

Optionally, the step of acquiring the target paraphrasing corresponding to the keyword in response to the acquiring operation of the target object includes: responding to the obtaining operation, judging whether a second word stock comprises target entries corresponding to the keywords, wherein the second word stock comprises a plurality of second entries, the target entries are the second entries corresponding to the keywords, and the second entries comprise second words and corresponding second paraphrases; and under the condition that the second word stock comprises the target entry, acquiring the second paraphrasing corresponding to the target entry, and taking the second paraphrasing as the target paraphrasing.

Optionally, the step of acquiring the target paraphrasing corresponding to the keyword in response to the acquiring operation of the target object further includes: when the second word stock does not include the target entry, acquiring the target paraphrasing of the keyword from network side equipment or the server.

Optionally, the method further comprises at least one of: receiving a second candidate entry, and storing the second candidate entry in the second word stock when the second candidate entry meets a first preset condition; and deleting a second low-frequency entry from the second word stock when it is determined that the second word stock includes the second low-frequency entry, wherein a second low-frequency entry is a second entry that has been searched for fewer than a first preset number of times.

Optionally, the second entry further includes at least one of: corresponding source information of the second paraphrasing, and domain information of the second word.

Optionally, the step of sending the target paraphrasing and the target video to a server includes: integrating the target video and the target paraphrasing to obtain integrated data; and sending the integrated data to the server side so that the server side forwards the integrated data to a playing side, and when a target frame image is played by the playing side, displaying the target paraphrasing on the target frame image, wherein the target frame image is the frame image in which the keyword is positioned.

According to a second aspect of an embodiment of the present disclosure, there is provided a video processing method, including: receiving a target definition and a target video sent by a server, wherein the target definition is a definition corresponding to a keyword in the target video; and playing the target video, and displaying the target paraphrasing when the target video is played.

According to a third aspect of the embodiments of the present disclosure, there is provided a video processing apparatus, including a determining unit, a first acquiring unit and a first sending unit, wherein the determining unit is configured to determine keywords in the target video in response to a determining operation of a target object; the first acquiring unit is configured to acquire a target paraphrasing corresponding to the keyword in response to an acquiring operation of the target object; and the first sending unit is configured to send the target paraphrasing and the target video to a server, so that the target video displays the target paraphrasing when played.

According to a fourth aspect of embodiments of the present disclosure, there is provided an electronic device, comprising: a processor; a memory for storing the processor-executable instructions; wherein the processor is configured to execute the instructions to implement any of the processing methods.

According to a fifth aspect of embodiments of the present disclosure, there is provided a system, including a server side, a playback side, and an editing side, wherein the editing side is configured to perform any one of the processing methods; the server is in communication connection with the editing end and is configured to at least receive the target paraphrasing and the target video; the playing end is in communication connection with the server end and is configured to receive the target video sent by the server end and display target paraphrasing corresponding to the keywords when the target video is played.

According to a sixth aspect of embodiments of the present disclosure, there is provided a computer readable storage medium storing instructions which, when executed by a processor of an electronic device, cause the electronic device to perform any one of the processing methods described above.

According to a seventh aspect of embodiments of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements any of the processing methods described.

The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:

In the video processing method, keywords are first determined; a target definition corresponding to the keyword is then acquired, the target definition being the definition corresponding to the keyword in the target video; finally, the target definition and the target video are sent to a server so that the target video displays the target definition when played. Because the target definition is acquired and sent to the server together with the target video, it is displayed when the target video is played, and a user can rely on it to understand the content of the target video. This helps avoid the misunderstandings of the played content that arise from differences in industry and the like, and better ensures that the user understands the video content accurately.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure and do not constitute an undue limitation on the disclosure.

FIG. 1 is an architecture diagram of an implementation environment, shown in accordance with an exemplary embodiment.

Fig. 2 is a flowchart illustrating a method of processing video applied to an editing side according to an exemplary embodiment.

Fig. 3 is a flowchart illustrating a method of processing video applied to an editing side according to an exemplary embodiment.

Fig. 4 is a flowchart illustrating a method of processing video applied to a playback end according to an exemplary embodiment.

Fig. 5 is a flowchart illustrating a method of processing video according to an exemplary embodiment.

Fig. 6 is a block diagram of a video processing apparatus according to an exemplary embodiment.

Fig. 7 is a block diagram of a video processing apparatus according to an exemplary embodiment.

FIG. 8 is a schematic diagram of a system according to an exemplary embodiment.

The reference numerals in the drawings are as follows:

01. electronic device; 02. server side.

Detailed Description

In order to enable those skilled in the art to better understand the technical solutions of the present disclosure, the technical solutions of the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.

It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the foregoing figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the disclosure described herein may be capable of operation in sequences other than those illustrated or described herein. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as detailed in the accompanying claims.

Fig. 1 is an architecture diagram of an implementation environment in which the following video processing method may be applied, as shown in fig. 1, according to an exemplary embodiment. The implementation environment includes an electronic device 01 and a server 02. Wherein the electronic device 01 and the server 02 may be interconnected and communicate via a network.

The electronic device 01 includes an editing end of a video and/or a playing end of the video. The editing end generally refers to an electronic device, such as a mobile phone or tablet, used by a video creator. The playing end generally refers to an electronic device, such as a mobile phone or tablet, used by a video viewer, that is, by a person consuming videos. The electronic device 01 may be any electronic product that can interact with a user through one or more of a keyboard, a touch pad, a touch screen, a remote controller, voice interaction or a handwriting device, such as a mobile phone, a tablet computer, a palmtop computer, a personal computer (PC), a wearable device, a smart television, and the like.

The server 02 may be a server, a server cluster formed by a plurality of servers, or a cloud computing service center. The server 02 may include a processor, memory, network interface, and the like.

Those skilled in the art will appreciate that the above-described electronic devices and servers are merely examples, and that other existing or future-occurring electronic devices or servers are applicable to the present disclosure and are intended to be within the scope of the present disclosure and are incorporated herein by reference.

As described above, a method of helping a user understand short video content is lacking in the related art.

The data referred to in this disclosure may be data authorized by the user or sufficiently authorized by the parties.

Based on the above, the embodiment of the disclosure provides a video processing method, a processing device and an electronic device.

Fig. 2 is a flowchart illustrating a video processing method according to an exemplary embodiment, and as shown in fig. 2, the video processing method may be used in an editing end, and includes the following steps:

in step S11, keywords in the target video are determined in response to the determination operation of the target object;

in step S12, in response to the obtaining operation of the target object, a target paraphrasing corresponding to the keyword is obtained;

in step S13, the target paraphrasing and the target video are sent to a server, so that the target video displays the target paraphrasing when playing.
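Steps S11 to S13 can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the implementation of the disclosure: the function names, the in-memory `FakeServer`, the dictionary used as a word stock and the sample data are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FakeServer:
    """Hypothetical stand-in for the server side; it only records uploads."""
    uploads: list = field(default_factory=list)

    def receive(self, video_id: str, definitions: dict) -> None:
        self.uploads.append((video_id, definitions))


def determine_keywords(candidates: list, chosen: set) -> list:
    # S11: keep the candidates the editor confirmed, in their original order.
    return [word for word in candidates if word in chosen]


def acquire_definitions(keywords: list, lexicon: dict) -> dict:
    # S12: look up a definition for each keyword; skip keywords without one.
    return {word: lexicon[word] for word in keywords if word in lexicon}


def send_to_server(server: FakeServer, video_id: str, definitions: dict) -> None:
    # S13: upload the definitions together with a reference to the video.
    server.receive(video_id, definitions)


# Made-up sample data for illustration only.
lexicon = {"tachycardia": "A resting heart rate above the normal range."}
server = FakeServer()
keywords = determine_keywords(["tachycardia", "today"], {"tachycardia"})
definitions = acquire_definitions(keywords, lexicon)
send_to_server(server, "video-001", definitions)
```

In this sketch the video is represented only by an identifier; a real editing end would upload the video data itself.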

In a specific embodiment, the keywords may be terms in the fields of medicine, science, sports, astronomy, etc., or may be other academic or news terms.

The target object may be an editor of the video. The determining operation and the acquiring operation may each be a click operation, a slide operation, a double click operation or the like; they are not limited to these operations and may be any other suitable operation.

In a specific embodiment of the present application, the step of determining the keywords in the target video in response to the determination operation of the target object includes: responding to the determining operation, and processing the audio data of the target video to obtain a plurality of candidate keywords; the keyword is determined from among the plurality of candidate keywords. In the scheme, an editing end processes audio data of a target video to obtain a plurality of candidate keywords, and then corresponding keywords are determined from the plurality of candidate keywords, namely, in the scheme, the editing end determines the corresponding keywords based on understanding of video content. The scheme can more efficiently and quickly determine the corresponding keywords.

The specific implementation process can be that an identification for acquiring the keywords is displayed on a display interface of the editing end, the editor performs a determining operation on the identification, and the editing end responds to the determining operation to determine the keywords by itself.
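The disclosure does not specify how the audio data is processed. One possible reading, sketched below, assumes that speech recognition (not shown) has already turned the audio into a transcript, and that candidate keywords are found by matching transcript words against a set of known terms. The function name `candidate_keywords`, the term set and the sample transcript are hypothetical.

```python
import re


def candidate_keywords(transcript: str, known_terms: set) -> list:
    """Return the known terms that occur in the transcript, in order of first appearance."""
    seen = []
    for token in re.findall(r"[a-z]+", transcript.lower()):
        if token in known_terms and token not in seen:
            seen.append(token)
    return seen


# Made-up transcript and term set for illustration only.
transcript = "Inflation erodes savings, so central banks watch inflation and liquidity."
terms = {"inflation", "liquidity", "arbitrage"}
candidates = candidate_keywords(transcript, terms)
```

The simple word matching here only handles single-word, lower-case Latin terms; multi-word or non-Latin terms would need a different tokenizer.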

Of course, the keywords may be obtained by other processes and are not limited to the process described above. In another embodiment of the present application, the determining operation includes a first sub-operation and a second sub-operation, and the step of determining the keywords in the target video in response to the determining operation of the target object includes: in response to the first sub-operation, displaying a plurality of candidate keywords on a display interface; and in response to the second sub-operation, determining at least some of the candidate keywords as the keywords. In this scheme, in response to the first sub-operation of the editor, the editing end first determines candidate keywords according to the content of the target video and displays them on the interface; it then receives the second sub-operation of the editor, which selects the keywords from the plurality of candidate keywords. The determination of the keywords therefore does not depend entirely on the editing end: the editor takes part in it, and because the editor is comparatively familiar with the actual situation, the determined keywords are more accurate.

The first sub-operation and the second sub-operation may be selected from a click operation, a slide operation, or a double click operation, and may be the same or different. Of course, in the actual application process, the method is not limited to the specific operation described above, but may be other suitable operations, and those skilled in the art may set suitable operations as the first sub-operation and the second sub-operation according to the actual situation.
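The two sub-operations can be modelled as two calls on a small state object, as in the sketch below. The class `KeywordPicker` and its method names are hypothetical, and the user interface is reduced to plain lists: the first call stands for displaying the candidates, the second for the editor selecting some of them by position.

```python
class KeywordPicker:
    """Hypothetical editing-end state for the two-step keyword selection."""

    def __init__(self, candidates):
        self._candidates = list(candidates)
        self.displayed = []
        self.keywords = []

    def first_sub_operation(self):
        # For example, the editor taps a "find keywords" control: show the candidates.
        self.displayed = list(self._candidates)
        return self.displayed

    def second_sub_operation(self, selected_indices):
        # For example, the editor taps some of the displayed candidates to confirm them.
        if not self.displayed:
            raise RuntimeError("candidates must be displayed before selection")
        self.keywords = [self.displayed[i] for i in sorted(set(selected_indices))]
        return self.keywords


picker = KeywordPicker(["inflation", "liquidity", "arbitrage"])
shown = picker.first_sub_operation()
chosen = picker.second_sub_operation([2, 0])
```

Selecting by index keeps the chosen keywords in display order regardless of the order in which the editor tapped them.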

According to a specific embodiment of the present application, after the step of determining the keyword in the target video in response to the determining operation of the target object, and before the step of acquiring the target paraphrasing corresponding to the keyword in response to the acquiring operation of the target object, the method further includes: sending the keyword to the server. The step of acquiring the target paraphrasing corresponding to the keyword in response to the acquiring operation of the target object includes: receiving, in response to the acquiring operation, the target paraphrasing sent by the server, wherein the server obtains the target paraphrasing by querying whether the first word stock includes a target entry corresponding to the keyword, and the target paraphrasing is either the paraphrasing corresponding to the target entry or is obtained by the server from network side equipment; the first word stock includes a plurality of first entries, the target entry is the first entry corresponding to the keyword, and each first entry includes a first word and a corresponding first paraphrasing. In this way the target paraphrasing can be obtained simply, quickly and accurately, the obtained target paraphrasing can then be conveniently sent to the client, and the user is further helped to better understand the content of the target video.

In still another specific embodiment of the present application, the step of acquiring the target paraphrasing corresponding to the keyword in response to the acquiring operation of the target object includes: in response to the acquiring operation, judging whether a second word stock includes a target entry corresponding to the keyword, wherein the second word stock includes a plurality of second entries, the target entry is the second entry corresponding to the keyword, and each second entry includes a second word and a corresponding second paraphrasing; and when the second word stock includes the target entry, acquiring the second paraphrasing corresponding to the target entry and taking it as the target paraphrasing. In this scheme, the editing end acquires the target paraphrasing from the second word stock and does not need to go through the server or the like to acquire it, so the target paraphrasing can be acquired more simply and efficiently.

In practice, the second word stock may not include the target entry. To further ensure that the target paraphrasing can still be acquired accurately and quickly in this case, in another specific embodiment of the present application, the step of acquiring the target paraphrasing corresponding to the keyword in response to the acquiring operation of the target object further includes: when the second word stock does not include the target entry, acquiring the target paraphrasing of the keyword from network side equipment or the server. This scheme compensates for the difficulty of accurately acquiring the target paraphrasing when the second word stock does not include the target entry, so that the method can still accurately acquire the target paraphrasing corresponding to the keyword.
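The lookup order described in the preceding embodiments (word stock first, then a remote source) can be sketched as below. This is a minimal sketch under assumptions: the word stock is a plain dictionary, and `remote_lookup` is a hypothetical callable standing in for the server or the network-side device; the sample entries are made up.

```python
from typing import Callable, Dict, Optional, Tuple


def lookup_definition(
    keyword: str,
    local_lexicon: Dict[str, str],
    remote_lookup: Callable[[str], Optional[str]],
) -> Tuple[Optional[str], str]:
    """Return (definition, origin); origin is 'local', 'remote' or 'missing'."""
    if keyword in local_lexicon:
        # The word stock contains the target entry: use its definition directly.
        return local_lexicon[keyword], "local"
    # Otherwise fall back to the remote source (server or network-side device).
    definition = remote_lookup(keyword)
    if definition is not None:
        return definition, "remote"
    return None, "missing"


# Made-up sample data; dict.get returns None for unknown keys.
local = {"offside": "A positional foul in football."}
remote = {"parsec": "A unit of astronomical distance."}.get
```

For example, `lookup_definition("parsec", local, remote)` falls through the local word stock and is answered by the remote stand-in. The same function shape would also fit the server-side case, with the first word stock as `local_lexicon`.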

To keep the second word stock both rich and practical, in another specific embodiment of the present application, the method further includes at least one of the following: receiving a second candidate entry, and storing the second candidate entry in the second word stock when the second candidate entry meets a first preset condition; and deleting a second low-frequency entry from the second word stock when it is determined that the second word stock includes the second low-frequency entry, wherein a second low-frequency entry is a second entry that has been searched for fewer than a second preset number of times. Storing the second candidate entry when it meets the first preset condition enriches the second word stock, and deleting second low-frequency entries cleans out rarely used entries, so that the second word stock is more practical.
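The two maintenance operations can be sketched as follows. The disclosure does not say what the first preset condition is, so the check used here (non-empty word and definition, word not already stored) is purely an assumption, as are the `Entry` class, the function names and the threshold value.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    word: str
    definition: str
    lookups: int = 0  # how many times the entry has been searched for


def add_if_acceptable(lexicon: dict, candidate: Entry) -> bool:
    # Assumed "first preset condition": non-empty fields and not already stored.
    ok = bool(candidate.word.strip()) and bool(candidate.definition.strip())
    if ok and candidate.word not in lexicon:
        lexicon[candidate.word] = candidate
        return True
    return False


def prune_low_frequency(lexicon: dict, min_lookups: int) -> list:
    # Delete entries searched fewer than min_lookups times; return the removed words.
    removed = [word for word, entry in lexicon.items() if entry.lookups < min_lookups]
    for word in removed:
        del lexicon[word]
    return removed


# Made-up sample data for illustration only.
word_stock = {}
added_first = add_if_acceptable(word_stock, Entry("quasar", "A very luminous active galactic nucleus.", lookups=5))
added_empty = add_if_acceptable(word_stock, Entry("", "No word given."))
added_second = add_if_acceptable(word_stock, Entry("meme", "An idea that spreads online.", lookups=1))
removed = prune_low_frequency(word_stock, min_lookups=3)
```

Collecting the words to remove before deleting avoids modifying the dictionary while iterating over it.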

In another embodiment of the present application, the server may also perform the corresponding method described above on the first word stock, so as to keep the first word stock both rich and practical. The details are not repeated here.

Specifically, the above entries may be collected and expanded in the following ways: first, collecting hot words, segments and the like from platforms on which trending events spread; second, acquiring current hot words through authorized channels such as social media; third, obtaining professional entries and their paraphrasing from authorized websites and the like; fourth, collecting words manually entered by users through the platform together with their explanations. Of course, those skilled in the art may expand the first word stock and/or the second word stock in other ways.

In practice, once a plurality of entries have been collected, they need to be screened, filtered, classified, sorted, stored and maintained, and a word stock is finally built to serve a plurality of platforms. Screening may apply to paraphrases, for example selecting the most accurate paraphrase from several paraphrases corresponding to one word, or to words, for example selecting words that are strongly academic or professional. Filtering may apply to entries, for example filtering out non-terminology. Classifying means grouping the entries by category, such as biology, medicine, astronautics and the like. Sorting means arranging the entries in a certain order, for example by frequency of occurrence. Storing means putting the finished entries into the word stock, and maintaining means updating or deleting entries in the word stock after it has been established.
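The screening, filtering, classifying and sorting operations can be chained as in the sketch below. The record layout, the per-definition accuracy score and the `is_term` flag are hypothetical; the disclosure does not say how accuracy or terminology status is judged, so the sketch simply takes them as given inputs.

```python
from collections import defaultdict


def build_lexicon(raw_entries):
    """Screen, filter, classify and sort raw entries; return the word stock grouped by domain."""
    by_domain = defaultdict(list)
    for raw in raw_entries:
        if not raw["is_term"]:  # filtering: drop non-terminology
            continue
        # Screening: keep the highest-scored definition for the word.
        best_text, _ = max(raw["definitions"], key=lambda pair: pair[1])
        # Classifying: group the entry under its domain.
        by_domain[raw["domain"]].append(
            {"word": raw["word"], "definition": best_text, "frequency": raw["frequency"]}
        )
    for entries in by_domain.values():  # sorting: most frequent first
        entries.sort(key=lambda entry: entry["frequency"], reverse=True)
    return dict(by_domain)


# Made-up sample data; scores and frequencies are illustrative only.
raw = [
    {"word": "quasar", "domain": "astronomy", "frequency": 3, "is_term": True,
     "definitions": [("A star.", 0.2), ("A very luminous active galactic nucleus.", 0.9)]},
    {"word": "parsec", "domain": "astronomy", "frequency": 7, "is_term": True,
     "definitions": [("A unit of astronomical distance.", 0.8)]},
    {"word": "today", "domain": "general", "frequency": 50, "is_term": False,
     "definitions": [("The present day.", 0.9)]},
]
built = build_lexicon(raw)
```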

In practice, the second entry further includes at least one of the following: source information of the corresponding second definition, and domain information of the second word. Of course, the second entry may also include other information, and those skilled in the art may add or remove information in the second entry according to actual needs.

In another specific embodiment, the first entry further includes at least one of the following: source information of the corresponding first paraphrasing, and domain information of the first word. This further ensures that the information in the first entry is more comprehensive. Of course, the first entry may also include other information, and those skilled in the art may add or remove information in the first entry according to actual needs.

According to still another specific embodiment of the present application, the step of sending the target paraphrasing and the target video to the server includes: integrating the target video and the target definition to obtain integrated data; and sending the integrated data to the server side, so that the server side forwards the integrated data to a playing side, and the playing side displays the target definition on the target frame image when playing the target frame image, wherein the target frame image is the frame image where the keyword is located. Therefore, when the playing end plays the target frame image, the target definition is displayed, synchronization of the target definition and the target frame image is further ensured, and better experience of a user is further ensured.

Any other method known in the art may be employed by those skilled in the art to achieve integration of the target video with the target paraphrasing.
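One simple way to integrate the two, shown here only as an assumption and not as the method of the disclosure, is to bundle a reference to the video with a list of annotations, each tying a keyword and its definition to the range of frames in which the keyword appears. The field names, the JSON encoding and the sample data are hypothetical.

```python
import json


def integrate(video_id: str, annotations: list) -> str:
    """Bundle a video reference and its keyword definitions into one JSON payload."""
    payload = {
        "video_id": video_id,
        # Order annotations by where they start so the player can scan them in sequence.
        "annotations": sorted(annotations, key=lambda a: a["start_frame"]),
    }
    return json.dumps(payload, ensure_ascii=False)


# Made-up sample data for illustration only.
integrated = integrate(
    "video-001",
    [
        {"keyword": "parsec", "definition": "A unit of astronomical distance.",
         "start_frame": 300, "end_frame": 360},
        {"keyword": "quasar", "definition": "A very luminous active galactic nucleus.",
         "start_frame": 120, "end_frame": 180},
    ],
)
decoded = json.loads(integrated)
```

Keeping the definitions as side data, rather than drawing them into the frames, leaves the display to the playing end; burning the text into the video would be another possible integration.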

Fig. 3 is a flowchart of a video processing method applied to an editing end according to an embodiment of the present application. The editing end first uploads a target video and determines whether a target definition of a keyword needs to be supplemented in the target video. When it determines that a target definition needs to be supplemented, the editing end may supplement the target definition manually, or may receive the target definition sent by a server so that the target definition is supplemented automatically. The editing end then integrates the target video and the target definition to obtain integrated data and sends the integrated data to the server, so that the server forwards the integrated data to a playing end and the playing end displays the target definition on the target frame image when playing the target frame image.

Fig. 4 is a flowchart illustrating a video processing method according to an exemplary embodiment, and as shown in fig. 4, the video processing method may be used in a playback end, and includes the following steps.

In step S21, receiving a target definition and a target video sent by a server, where the target definition is a definition corresponding to a keyword in the target video;

In step S22, the target video is played, and the target paraphrasing is displayed when the target video is played.

In the video processing method, a target video and a target definition sent by a server are first received; the target video is then played in a display interface, and the target definition is displayed in the display interface. Because the target definition corresponding to the keyword is displayed while the target video is played, a user can rely on it to understand the content of the target video. This helps avoid the misunderstandings of the played content that arise from differences in the industry the user works in and the like, and better ensures that the user understands the video content accurately.
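On the playing end, deciding what to display for a given frame can be as simple as the sketch below. It assumes, hypothetically, that each definition arrives with the inclusive frame range in which its keyword appears; the field names and sample data are made up, and actual rendering over the video is not shown.

```python
def definitions_for_frame(annotations: list, frame: int) -> list:
    """Return the definitions whose inclusive frame range contains the given frame index."""
    return [
        a["definition"]
        for a in annotations
        if a["start_frame"] <= frame <= a["end_frame"]
    ]


# Made-up sample data for illustration only.
received = [
    {"keyword": "quasar", "definition": "A very luminous active galactic nucleus.",
     "start_frame": 120, "end_frame": 180},
    {"keyword": "parsec", "definition": "A unit of astronomical distance.",
     "start_frame": 170, "end_frame": 360},
]
at_frame_175 = definitions_for_frame(received, 175)
at_frame_10 = definitions_for_frame(received, 10)
```

A player would call such a function as playback advances and show or hide the overlay accordingly; overlapping ranges, as at frame 175 here, yield more than one definition.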

In a specific embodiment of the present application, the target paraphrasing is determined according to a keyword in the target video, and the keyword is determined in response to a determination operation of the target object, and the specific process may include: the editing end responds to the determining operation and processes the audio data of the target video to obtain a plurality of candidate keywords; the editing end determines the keywords from a plurality of candidate keywords. In the scheme, an editing end processes audio data of a target video to obtain a plurality of candidate keywords, and then corresponding keywords are determined from the plurality of candidate keywords, namely, in the scheme, the editing end determines the corresponding keywords based on understanding of video content. The scheme can more efficiently and quickly determine the corresponding keywords.

The specific keyword determining process may be as follows: an identifier for obtaining the keyword is displayed on the display interface of the editing end, the editor performs the determining operation on the identifier, and the editing end determines the keyword automatically in response to the determining operation.

Of course, the process of acquiring the keyword in the present application is not limited to the one described above. In another embodiment of the present application, the determining operation includes a first sub-operation and a second sub-operation, and the process of determining the keyword in response to the determining operation of the target object is as follows: in response to the first sub-operation, the editing end displays a plurality of candidate keywords on a display interface; in response to the second sub-operation, the editing end determines at least some of the candidate keywords as the keywords. In this scheme, in response to the editor's first sub-operation, the editing end determines candidate keywords according to the content of the target video and displays them on the interface; it then receives the editor's second sub-operation, which selects the keywords from the plurality of candidate keywords. The determination of the keyword therefore does not depend entirely on the editing end: the editor also takes part, and because the editor knows the actual situation comparatively well, the determined keyword is more accurate.
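As an illustrative, non-limiting sketch, the two sub-operations described above might be modeled as follows. The sketch assumes that a text transcript of the audio data is already available (speech recognition is out of scope here) and ranks candidate keywords by simple word frequency; the function names `extract_candidates` and `confirm_keywords` and the `STOP_WORDS` list are hypothetical and are not taken from the present application.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real editing end would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on",
              "with", "this", "that", "we", "it"}


def extract_candidates(transcript: str, top_n: int = 5) -> list[str]:
    """First sub-operation: propose candidate keywords from a transcript.

    Assumption: candidates are simply the most frequent non-stop-words.
    """
    words = re.findall(r"[a-z]+", transcript.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(top_n)]


def confirm_keywords(candidates: list[str], selected: list[int]) -> list[str]:
    """Second sub-operation: keep only the candidates the editor selected.

    `selected` holds indices into `candidates`; out-of-range indices are ignored.
    """
    return [candidates[i] for i in selected if 0 <= i < len(candidates)]
```

In this sketch the editing end would display the result of `extract_candidates` on the interface, and the editor's selection would then be passed to `confirm_keywords` as a list of indices.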

According to a specific embodiment of the present application, the step of receiving the target video and the target paraphrasing includes: receiving the target paraphrasing and the target video sent by the server. The target paraphrasing is obtained by the server querying whether a first word stock includes a target entry corresponding to the keyword, in which case the target paraphrasing is the paraphrasing corresponding to the target entry; or the target paraphrasing is obtained by the server from a network side device. The first word stock includes a plurality of first entries, the target entry is the first entry corresponding to the keyword, and each first entry includes a first word and a corresponding first paraphrasing. This ensures that the received target paraphrasing is accurate.
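As an illustrative, non-limiting sketch, the server-side lookup with a fallback to a network side device might look like the following. The first word stock is modeled here as a plain dictionary from word to paraphrasing, and the network side device as a callable; both modeling choices, and the names `resolve_paraphrasing` and `fetch_from_network`, are assumptions made only for illustration.

```python
from typing import Callable, Optional


def resolve_paraphrasing(
    keyword: str,
    first_word_stock: dict[str, str],
    fetch_from_network: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return the paraphrasing for `keyword`.

    The first word stock is queried first; only when it has no target
    entry is the (hypothetical) network side lookup called.
    """
    paraphrasing = first_word_stock.get(keyword)
    if paraphrasing is not None:
        return paraphrasing
    return fetch_from_network(keyword)
```

Querying the local word stock before the network keeps the common case fast and leaves the network side device as a fallback only.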

According to another specific embodiment of the present application, before the step of receiving the target paraphrasing, the method further includes: acquiring, in real time, the audio data produced when the target video is played; and extracting the keywords in the audio data and sending them to the server. Because the audio data is acquired in real time during playback and the keywords extracted from it are sent to the server, the keywords can be obtained simply, quickly and accurately, and the user can subsequently rely on the target paraphrasing corresponding to the keywords to understand the video content accurately.

In practical applications, the keywords may also be obtained in one of the following ways: the server determines the keywords from the received target video; or the server receives the keywords sent by the editing end. The way of obtaining the keywords can thus be chosen flexibly according to the actual situation, which ensures a better user experience.

According to another specific embodiment of the present application, the step of receiving the target paraphrasing and the target video sent by the server includes: receiving first integrated data or second integrated data sent by the server, where the first integrated data is data obtained by integrating the target video and the target paraphrasing by the server, and the second integrated data is data obtained by integrating the target video and the target paraphrasing by the editing end and sent to the server. The server or the editing end integrates the target video and the target paraphrasing, so that the target paraphrasing is displayed when the subsequent playing end plays the target frame image, namely, the synchronization of the target paraphrasing and the target frame image is ensured.
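The present application does not specify a format for the integrated data. As an illustrative, non-limiting sketch, one possible form is a JSON document that pairs a video identifier with time-stamped paraphrasing cues; the `ParaphrasingCue` structure, its field names and the `integrate` function are hypothetical.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ParaphrasingCue:
    keyword: str       # keyword found in the target video
    paraphrasing: str  # target paraphrasing to display
    start_s: float     # time (seconds) at which the keyword appears
    end_s: float       # time (seconds) at which the display should end


def integrate(video_id: str, cues: list[ParaphrasingCue]) -> str:
    """Bundle a video reference and its cues into one JSON string.

    Cues are sorted by start time so that a playing end can walk
    through them in playback order.
    """
    ordered = sorted(cues, key=lambda cue: cue.start_s)
    payload = {"video_id": video_id, "cues": [asdict(c) for c in ordered]}
    return json.dumps(payload, ensure_ascii=False)
```

Either the server or the editing end could produce such a payload; the playing end would then only need to parse it, regardless of which side performed the integration.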

According to another specific embodiment of the present application, the step of playing the target video and displaying the target paraphrasing when the target video is played includes: displaying the target paraphrasing on a target frame image when the target frame image is played, where the target frame image is the frame image in which the keyword is located. This further ensures that the target paraphrasing stays synchronized with the target frame image, avoids the two falling out of step, and further ensures a better user experience.
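As an illustrative, non-limiting sketch, synchronization between the target paraphrasing and the target frame image can be expressed as a lookup from a frame index to the paraphrasings active at that moment. The sketch assumes a constant frame rate and cues given as `(start_s, end_s, text)` tuples; the function name `paraphrasings_for_frame` is hypothetical.

```python
def paraphrasings_for_frame(
    frame_index: int,
    fps: float,
    cues: list[tuple[float, float, str]],
) -> list[str]:
    """Return the paraphrasing texts to overlay on the given frame.

    A cue is active when the frame's timestamp falls in the half-open
    interval [start_s, end_s). A constant frame rate is assumed.
    """
    timestamp = frame_index / fps
    return [text for start, end, text in cues if start <= timestamp < end]
```

A playing end could call this once per rendered frame and draw the returned texts over the frame image, so that the overlay appears exactly on the frames where the keyword occurs.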

In another specific embodiment of the present application, the process by which the editing end obtains the target paraphrasing includes: in response to the obtaining operation, judging whether a second word stock includes a target entry corresponding to the keyword, where the second word stock includes a plurality of second entries, the target entry is the second entry corresponding to the keyword, and each second entry includes a second word and a corresponding second paraphrasing; and when the second word stock includes the target entry, acquiring the second paraphrasing corresponding to the target entry and taking it as the target paraphrasing.

In practical applications, the second word stock may not include the target entry. For this case, in order to further ensure that the target paraphrasing can be obtained accurately and quickly, according to another specific embodiment of the present application, the step of obtaining the target paraphrasing by the editing end further includes: when the second word stock does not include the target entry, acquiring the target paraphrasing from a network side device or the server. This further facilitates the subsequent transmission of the acquired target paraphrasing to the client, and further helps the user to better understand the content played by the target video.

To keep the second word stock both rich and practical, in another specific embodiment of the present application, the method executed by the editing end further includes at least one of the following: receiving a second candidate entry, and storing the second candidate entry in the second word stock when the second candidate entry meets a first preset condition; and, when the second word stock is determined to include second low-frequency entries, deleting the second low-frequency entries from the second word stock, where a second low-frequency entry is a second entry that has been searched for fewer than a second preset number of times. Storing qualifying candidate entries enriches the second word stock, while deleting low-frequency entries cleans it up and keeps it practical.
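As an illustrative, non-limiting sketch, the two maintenance operations on the second word stock might be implemented as below. The present application does not define the first preset condition, so the sketch assumes one (a non-empty word and paraphrasing, and no existing entry for the word); the class and method names are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    word: str
    paraphrasing: str
    search_count: int = 0  # how many times this entry has been looked up


class WordStock:
    def __init__(self, min_searches: int) -> None:
        # Entries searched fewer than `min_searches` times count as low-frequency.
        self.min_searches = min_searches
        self.entries: dict[str, Entry] = {}

    def add_candidate(self, word: str, paraphrasing: str) -> bool:
        """Store a candidate entry if it meets the assumed preset condition."""
        if not word.strip() or not paraphrasing.strip() or word in self.entries:
            return False
        self.entries[word] = Entry(word, paraphrasing)
        return True

    def lookup(self, word: str) -> Optional[str]:
        """Return the paraphrasing for `word` and record the search."""
        entry = self.entries.get(word)
        if entry is None:
            return None
        entry.search_count += 1
        return entry.paraphrasing

    def prune_low_frequency(self) -> list[str]:
        """Delete low-frequency entries and return the removed words."""
        removed = [w for w, e in self.entries.items()
                   if e.search_count < self.min_searches]
        for word in removed:
            del self.entries[word]
        return removed
```

In practice the pruning would presumably run periodically, so that newly added entries have time to accumulate searches before being judged low-frequency; the sketch leaves that scheduling out.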

Fig. 5 is a flowchart of a video processing method applied to a playing end according to an embodiment of the present application. While the target video is played, the audio data of the target video is acquired in real time and processed using natural language processing technology; the keywords in the audio data are extracted and sent to a server; the server obtains the target paraphrasing corresponding to the received keywords; and the playing end receives the target paraphrasing sent by the server and displays it during playback of the target video.

Fig. 6 is a block diagram of a processing device for video shown according to an exemplary embodiment. Referring to fig. 6, the apparatus includes a determining unit 10, a first acquiring unit 20, and a first transmitting unit 30.

The determining unit 10 is configured to determine keywords in the target video in response to a determining operation of the target object;

the first obtaining unit 20 is configured to obtain, in response to an obtaining operation of the target object, a target paraphrasing corresponding to the keyword;

the first transmitting unit 30 is configured to send the target paraphrasing and the target video to a server, so that the target paraphrasing is displayed when the target video is played.

In this video processing device, the determining unit determines the keyword, and the first obtaining unit obtains the target paraphrasing, which is the paraphrasing corresponding to the keyword in the target video; the first transmitting unit then sends the target paraphrasing and the target video to a server, so that the target paraphrasing is displayed when the target video is played. Because the device obtains the target paraphrasing and sends it to the server together with the target video, the target paraphrasing corresponding to the keyword is displayed while the target video plays, and the user can rely on it to understand the content of the target video. This helps avoid the misunderstandings that arise from, for example, differences in the industries in which users work, and better ensures that the user understands the video content accurately.

In a specific embodiment of the present application, the determining unit includes: the processing module is configured to execute processing on the audio data of the target video in response to the determining operation to obtain a plurality of candidate keywords; the first determining module is configured to determine the keyword from a plurality of candidate keywords. In the scheme, an editing end processes audio data of a target video to obtain a plurality of candidate keywords, and then corresponding keywords are determined from the plurality of candidate keywords, namely, in the scheme, the editing end determines the corresponding keywords based on understanding of video content. The scheme can more efficiently and quickly determine the corresponding keywords.

Of course, the process of obtaining the keyword in the present application is not limited to the one described above. In another embodiment of the present application, the determining operation includes a first sub-operation and a second sub-operation, and the determining unit includes: a display module configured to display a plurality of candidate keywords on a display interface in response to the first sub-operation; and a second determining module configured to determine at least some of the plurality of candidate keywords as the keywords in response to the second sub-operation. In this scheme, in response to the editor's first sub-operation, the editing end determines candidate keywords according to the content of the target video and displays them on the interface; it then receives the editor's second sub-operation, which selects the keywords from the plurality of candidate keywords. The determination of the keyword therefore does not depend entirely on the editing end: the editor also takes part, and because the editor knows the actual situation comparatively well, the determined keyword is more accurate.

According to a specific embodiment of the present application, the above apparatus further includes a second sending unit configured to send the keyword to the server after the step of determining the keyword in the target video in response to the determining operation of the target object and before the step of acquiring the target paraphrasing corresponding to the keyword in response to the acquiring operation of the target object. The first acquiring unit is configured to perform: receiving, in response to the obtaining operation, the target paraphrasing sent by the server. The target paraphrasing is obtained by the server querying whether the first word stock includes a target entry corresponding to the keyword, in which case the target paraphrasing is the paraphrasing corresponding to the target entry; or the target paraphrasing is obtained by the server from a network side device. The first word stock includes a plurality of first entries, the target entry is the first entry corresponding to the keyword, and each first entry includes a first word and a corresponding first paraphrasing. The target paraphrasing can therefore be obtained simply, quickly and accurately, which further facilitates sending it to the client and further helps the user to better understand the content of the target video.

In still another specific embodiment of the present application, the first obtaining unit includes: a judging module configured to judge, in response to the acquiring operation, whether a second word stock includes a target entry corresponding to the keyword, where the second word stock includes a plurality of second entries, the target entry is the second entry corresponding to the keyword, and each second entry includes a second word and a corresponding second paraphrasing; and a first obtaining module configured to obtain, when the second word stock includes the target entry, the second paraphrasing corresponding to the target entry and take it as the target paraphrasing. In this scheme, the editing end acquires the target paraphrasing directly from the second word stock and does not need to turn to the server or the like to acquire it, so the target paraphrasing can be acquired more simply and efficiently.

In practical applications, the second word stock may not include the target entry. For this case, in order to further ensure that the target paraphrasing can be obtained accurately and quickly, in another specific embodiment of the present application, the first obtaining unit further includes: a second obtaining module configured to obtain the target paraphrasing of the keyword from a network side device or the server when the second word stock does not include the target entry. This scheme compensates for the difficulty of accurately obtaining the target paraphrasing when the second word stock does not include the target entry, and thus further ensures that the device accurately obtains the target paraphrasing corresponding to the keyword.

To keep the second word stock both rich and practical, in another specific embodiment of the present application, the apparatus is further configured to perform at least one of the following: receiving a second candidate entry, and storing the second candidate entry in the second word stock when the second candidate entry meets a first preset condition; and, when the second word stock is determined to include second low-frequency entries, deleting the second low-frequency entries from the second word stock, where a second low-frequency entry is a second entry that has been searched for fewer than a second preset number of times. Storing qualifying candidate entries enriches the second word stock, while deleting low-frequency entries cleans it up and keeps it practical.

According to still another specific embodiment of the present application, the first transmitting unit includes: an integration module configured to integrate the target video and the target paraphrasing to obtain integrated data; and a sending module configured to send the integrated data to the server, so that the server forwards the integrated data to the playing end, and the playing end displays the target paraphrasing on the target frame image when playing the target frame image, where the target frame image is the frame image in which the keyword is located. The target paraphrasing is thus displayed when the playing end plays the target frame image, which further ensures that the target paraphrasing stays synchronized with the target frame image and further ensures a better user experience.

Any other means known in the art may be employed by those skilled in the art to achieve integration of the target video with the target paraphrasing.

Fig. 7 is a block diagram of a processing device for video shown according to an exemplary embodiment. Referring to fig. 7, the apparatus includes a receiving unit 40 and a playing unit 50.

The receiving unit 40 is configured to receive a target paraphrasing and a target video sent by a server, where the target paraphrasing is the paraphrasing corresponding to a keyword in the target video;

the playing unit 50 is configured to play the target video and to display the target paraphrasing when the target video is played.

In this video processing device, the receiving unit receives the target video and the target paraphrasing, and the playing unit plays the target video in a display interface and displays the target paraphrasing in that interface. Because the target paraphrasing corresponding to the keyword in the target video is displayed while the target video plays, the user can rely on it to understand the content of the target video. This helps avoid the misunderstandings that arise when, for example, the viewer works in a different industry from the producer, and better ensures that the user understands the video content accurately.

The specific keyword determining process may be: the display interface of the editing end displays an identifier for acquiring the keywords, the editor performs a determining operation on the identifier, and the editing end responds to the determining operation to determine the keywords by itself.

According to a specific embodiment of the present application, the receiving unit is configured to perform: receiving the target paraphrasing and the target video sent by the server. The target paraphrasing is obtained by the server querying whether the first word stock includes a target entry corresponding to the keyword, in which case the target paraphrasing is the paraphrasing corresponding to the target entry; or the target paraphrasing is obtained by the server from a network side device. The first word stock includes a plurality of first entries, the target entry is the first entry corresponding to the keyword, and each first entry includes a first word and a corresponding first paraphrasing. This ensures that the received target paraphrasing is accurate.

According to another specific embodiment of the present application, the above device further comprises: a second acquisition unit configured to acquire, in real time and before the step of receiving the target paraphrasing, the audio data produced when the target video is played; and a third transmitting unit configured to extract the keywords in the audio data and send them to the server. Because the device acquires the audio data in real time during playback and sends the keywords extracted from it to the server, the keywords can be obtained simply, quickly and accurately, and the user can subsequently rely on the target paraphrasing corresponding to the keywords to understand the video content accurately.

According to another specific embodiment of the present application, the above-mentioned receiving unit is configured to perform: receiving first integrated data or second integrated data sent by the server, where the first integrated data is data obtained by integrating the target video and the target paraphrasing by the server, and the second integrated data is data obtained by integrating the target video and the target paraphrasing by the editing end and sent to the server. The server or the editing end integrates the target video and the target paraphrasing, so that the target paraphrasing is displayed when the subsequent playing end plays the target frame image, namely, the synchronization of the target paraphrasing and the target frame image is ensured.

According to a further specific embodiment of the present application, the above-mentioned playing unit is configured to perform: displaying the target paraphrasing on the target frame image when the target frame image is played, where the target frame image is the frame image in which the keyword is located. This further ensures that the target paraphrasing stays synchronized with the target frame image, avoids the two falling out of step, and further ensures a better user experience.

To keep the second word stock both rich and practical, in another specific embodiment of the present application, the editing end is further configured to perform at least one of the following: receiving a second candidate entry, and storing the second candidate entry in the second word stock when the second candidate entry meets a first preset condition; and, when the second word stock is determined to include second low-frequency entries, deleting the second low-frequency entries from the second word stock, where a second low-frequency entry is a second entry that has been searched for fewer than a second preset number of times. Storing qualifying candidate entries enriches the second word stock, while deleting low-frequency entries cleans it up and keeps it practical.

The specific manner in which the various modules of the apparatus in the above embodiments perform their operations has been described in detail in connection with the method embodiments, and will not be repeated here.

In an exemplary embodiment, there is also provided an electronic device including a processor and a memory for storing instructions executable by the processor, where the processor is configured to execute the instructions to implement any one of the above-mentioned processing methods for the server, any one of the above-mentioned processing methods for the playing end, or any one of the above-mentioned processing methods for the editing end.

In an exemplary embodiment, a system is further provided. The system includes a server, a playing end and an editing end, where the server is configured to execute any one of the above processing methods; the playing end is communicatively connected to the server and configured to execute any one of the above processing methods; and the editing end is communicatively connected to the server and configured to execute any one of the above processing methods.

In this system, which comprises the server, the playing end and the editing end, the target paraphrasing corresponding to the keyword in the target video is displayed when the target video is played, so the user can rely on it to understand the content of the target video. This helps avoid the misunderstandings that arise when, for example, the viewer works in a different industry from the producer, and better ensures that the user understands the video content accurately.

In a specific embodiment, a schematic diagram of the above system is shown in Fig. 8. The editing end includes content editing and automatic paraphrasing alignment technology, where the automatic paraphrasing alignment technology includes audio processing, text formatting, text analysis and keyword extraction. The server includes a word stock system and a paraphrasing interface, where the word stock system covers adding, deleting, modifying and querying words and their corresponding paraphrasings, automatic capacity expansion, low-frequency word cleaning, and the screening, filtering, classifying, sorting and warehousing of words and their corresponding paraphrasings. The playing end includes content presentation and natural language processing technology, where the natural language processing technology includes audio processing, text formatting, text analysis and keyword extraction.

In an exemplary embodiment, a computer readable storage medium comprising instructions is also provided; the instructions, when executed by a processor of an electronic device, enable the electronic device to perform any one of the above-described processing methods. Optionally, the computer readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.

In an exemplary embodiment, a computer program product is also provided, comprising a computer program which, when executed by a processor, implements any of the above-described processing methods.

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure that follow the general principles of the disclosure and include such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with the true scope and spirit of the disclosure being indicated by the following claims.

It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims

1. A method for processing video, comprising:

determining keywords in the target video in response to the determining operation of the target object;

responding to the acquisition operation of the target object, and acquiring a target paraphrasing corresponding to the keyword;

sending the target paraphrasing and the target video to a server so that the target video displays the target paraphrasing when being played,

after the step of determining the keyword in the target video in response to the determination operation of the target object, before the step of acquiring the target paraphrasing corresponding to the keyword in response to the acquisition operation of the target object, the method further includes: sending the keywords to the server side,

the step of obtaining the target paraphrasing corresponding to the keyword in response to the obtaining operation of the target object comprises the following steps:

receiving the target paraphrasing sent by the server in response to the obtaining operation, wherein the target paraphrasing is obtained by the server inquiring whether a first word stock comprises target entries corresponding to the keywords or not, and the target paraphrasing is the paraphrasing corresponding to the target entries or is obtained by the server from network side equipment; the first word library comprises a plurality of first vocabulary entries, the target vocabulary entry is the first vocabulary entry corresponding to the keyword, and the first vocabulary entry comprises a first word and a corresponding first paraphrasing;

the step of determining keywords in the target video in response to the determination operation of the target object comprises the following steps:

responding to the determining operation, processing the audio data of the target video to obtain a plurality of candidate keywords;

determining the keyword from a plurality of candidate keywords;

the step of sending the target paraphrasing and the target video to a server side comprises the following steps: integrating the target video and the target paraphrasing to obtain integrated data; and sending the integrated data to the server side so that the server side forwards the integrated data to a playing side, and when a target frame image is played by the playing side, displaying the target paraphrasing on the target frame image, wherein the target frame image is the frame image in which the keyword is positioned.

2. The processing method of claim 1, wherein the determining operation comprises a first sub-operation and a second sub-operation,

responding to the first sub-operation, and displaying a plurality of candidate keywords on a display interface;

and in response to the second sub-operation, determining at least part of the candidate keywords as the keywords.

3. The processing method according to claim 1 or 2, wherein the step of acquiring the target paraphrasing corresponding to the keyword in response to the acquisition operation of the target object includes:

responding to the obtaining operation, judging whether a second word stock comprises target entries corresponding to the keywords, wherein the second word stock comprises a plurality of second entries, the target entries are the second entries corresponding to the keywords, and the second entries comprise second words and corresponding second paraphrases;

in a case where the second word stock includes the target entry, acquiring the second paraphrasing corresponding to the target entry and taking the second paraphrasing as the target paraphrasing.

4. The processing method according to claim 3, wherein the step of acquiring the target paraphrasing corresponding to the keyword in response to the acquisition operation of the target object further comprises:

in a case where the second word stock does not include the target entry, acquiring the target paraphrasing of the keyword from network-side equipment or the server.
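The lookup order described in claims 3 and 4 (second word stock first, then network-side equipment or the server) can be sketched as below. This is a minimal sketch under stated assumptions: the word stock is modelled as a plain dictionary from word to paraphrasing, and `remote_lookup` is a hypothetical callable standing in for whatever request the editing end would make to the network-side equipment or the server.

```python
from typing import Callable, Dict, Optional


def acquire_target_paraphrasing(
    keyword: str,
    second_word_stock: Dict[str, str],
    remote_lookup: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return the target paraphrasing for a keyword.

    second_word_stock: assumed mapping of second word -> second paraphrasing.
    remote_lookup: hypothetical stand-in for a query to the network-side
    equipment or the server; it is only called on a local miss.
    """
    if keyword in second_word_stock:
        # The second word stock includes the target entry: use its paraphrasing.
        return second_word_stock[keyword]
    # Otherwise fall back to the remote source.
    return remote_lookup(keyword)
```

Passing the remote query in as a parameter keeps the sketch independent of any particular network API, none of which is specified in the claims.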

5. The processing method of claim 4, further comprising at least one of:

receiving a second candidate entry, and storing the second candidate entry in the second word stock in a case where the second candidate entry meets a first preset condition; and

in a case where the second word stock includes second low-frequency entries, deleting the second low-frequency entries from the second word stock, wherein a second low-frequency entry is a second entry that has been searched fewer than a first preset number of times.
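The two maintenance operations of claim 5 can be illustrated with the sketch below. The claim does not define the first preset condition or how search counts are tracked, so the `Entry` class, its `search_count` field, and the `meets_condition` callable are assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Entry:
    """A second entry: a word, its paraphrasing, and an assumed search counter."""
    word: str
    paraphrasing: str
    search_count: int = 0


def add_candidate(
    stock: Dict[str, Entry],
    candidate: Entry,
    meets_condition: Callable[[Entry], bool],
) -> bool:
    # Store the candidate only if it meets the (hypothetical) preset condition.
    if meets_condition(candidate):
        stock[candidate.word] = candidate
        return True
    return False


def prune_low_frequency(stock: Dict[str, Entry], preset_times: int) -> List[str]:
    # Delete entries searched fewer than `preset_times` times and
    # return the words that were removed.
    removed = [w for w, e in stock.items() if e.search_count < preset_times]
    for w in removed:
        del stock[w]
    return removed
```

For example, `meets_condition` could require a non-empty paraphrasing; that particular condition is only a placeholder and is not taken from the claims.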

6. The processing method of claim 4, wherein the second entry further comprises at least one of: source information of the corresponding second paraphrasing, and domain information of the second word.

7. A video processing apparatus, comprising:

a determining unit configured to determine a keyword in a target video in response to a determining operation of a target object;

a first acquisition unit configured to acquire a target paraphrasing corresponding to the keyword in response to an acquisition operation of the target object;

a first transmitting unit configured to send the target paraphrasing and the target video to a server, so that the target paraphrasing is displayed when the target video is played,

wherein the apparatus further comprises a second transmitting unit configured to transmit the keyword to the server after the keyword in the target video is determined in response to the determining operation of the target object and before the target paraphrasing corresponding to the keyword is acquired in response to the acquisition operation of the target object; the first acquisition unit is configured to: receive, in response to the acquisition operation, the target paraphrasing sent by the server, wherein the target paraphrasing is obtained by the server querying whether a first word stock includes a target entry corresponding to the keyword, and the target paraphrasing is either the paraphrasing corresponding to the target entry or is obtained by the server from network-side equipment; wherein the first word stock comprises a plurality of first entries, the target entry is the first entry corresponding to the keyword, and each first entry comprises a first word and a corresponding first paraphrasing;

the determining unit includes: a processing module configured to process audio data of the target video in response to the determining operation to obtain a plurality of candidate keywords; and a first determination module configured to determine the keyword from the plurality of candidate keywords;

the first transmitting unit includes: an integration module configured to integrate the target video and the target paraphrasing to obtain integrated data; and a sending module configured to send the integrated data to the server, so that the server forwards the integrated data to a playing end and, when the playing end plays a target frame image, the target paraphrasing is displayed on the target frame image, wherein the target frame image is the frame image in which the keyword is located.

8. An electronic device, comprising:

a processor;

a memory for storing instructions executable by the processor;

wherein the processor is configured to execute the instructions to implement the processing method of any of claims 1 to 6.

9. A system, comprising:

an editing end configured to perform the processing method of any one of claims 1 to 6;

a server communicatively connected to the editing end and configured to at least receive the target paraphrasing and the target video; and

a playing end communicatively connected to the server and configured to receive the target video sent by the server and to display the target paraphrasing corresponding to the keyword when playing the target video.

10. A computer-readable storage medium, wherein instructions in the computer-readable storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the processing method of any one of claims 1 to 6.