CN115474089A - Audio and video online examination method and related equipment - Google Patents

Audio and video online examination method and related equipment

Info

Publication number
CN115474089A
CN115474089A
Authority
CN
China
Prior art keywords
audio
instruction
video
annotation
image
Prior art date
Legal status
Pending
Application number
CN202210971679.3A
Other languages
Chinese (zh)
Inventor
谭熙
Current Assignee
Shenzhen Big Head Brothers Technology Co Ltd
Original Assignee
Shenzhen Big Head Brothers Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Big Head Brothers Technology Co Ltd filed Critical Shenzhen Big Head Brothers Technology Co Ltd
Priority to CN202210971679.3A priority Critical patent/CN115474089A/en
Publication of CN115474089A publication Critical patent/CN115474089A/en
Pending legal-status Critical Current

Classifications

    • H04N 21/44008 — analysing video streams, e.g. detecting features or characteristics in the video stream
    • H04N 21/435 — processing of additional data, e.g. decrypting of additional data
    • H04N 21/44 — processing of video elementary streams
    • H04N 21/47205 — end-user interface for manipulating displayed content, e.g. interacting with MPEG-4 objects, editing locally
    • H04N 21/8133 — additional data specifically related to the content, e.g. detailed information about an article seen in a video program
    • H04N 21/8455 — structuring of content involving pointers to the content, e.g. pointers to the I-frames of the video stream
    • H04N 21/8547 — content authoring involving timestamps for synchronizing content
    • H04N 5/262 — studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects

Abstract

The invention discloses an audio and video online review method and related equipment. The method comprises: acquiring an audio/video file to be processed; parsing the audio/video file to obtain a time axis corresponding to the audio/video file; when a positioning instruction for the time axis is detected, displaying target information in the audio/video file according to the positioning instruction; and when an annotation instruction for the target information is detected, generating annotation information corresponding to the target information according to the annotation instruction. The invention makes it convenient for users to annotate and modify audio/video files and improves efficiency.

Description

Audio and video online examination method and related equipment
Technical Field
The invention relates to the technical field of multimedia processing, and in particular to an audio/video online review method and related equipment.
Background
As the barriers to video shooting and processing fall, more and more users can easily shoot and produce videos. During video editing, team collaboration is often needed to improve the quality of the video in post-production. Current video post-processing typically operates frame by frame, but a video contains a large number of frames, so after an issue is raised the corresponding frames must be repeatedly located during revision, which is inefficient. Direct offline communication, on the other hand, requires all team members to be present at the same time, making frequent and efficient discussion of revisions impossible.
Disclosure of Invention
In view of the deficiencies of the prior art, the invention aims to solve the technical problem of inefficient annotation and modification of audio/video, and provides an audio/video online review method and related equipment.
In order to solve the technical problems, the technical scheme adopted by the invention is as follows:
An audio/video online review method, comprising:
acquiring an audio/video file to be processed;
analyzing the audio and video file to obtain a time axis corresponding to the audio and video file;
when a positioning instruction for the time axis is detected, target information in the audio and video file is displayed according to the positioning instruction;
and when an annotation instruction for the target information is detected, generating annotation information corresponding to the target information according to the annotation instruction.
In the audio/video online review method, the time axis comprises an image axis and/or an audio axis, and the positioning instruction comprises an image instruction for the image axis and an audio instruction for the audio axis; displaying the corresponding target information in the audio/video file according to the positioning instruction comprises:
when the image instruction is detected, displaying the image information corresponding to the audio/video file according to the image instruction;
and when the audio instruction is detected, displaying the audio information corresponding to the audio/video file according to the audio instruction.
In the audio/video online review method, the image instruction comprises a single-frame instruction and a multi-frame instruction; displaying the target information in the audio/video file according to the positioning instruction comprises:
when the positioning instruction is a single-frame instruction, taking the corresponding frame image in the audio/video file as target information and displaying it according to the timestamp corresponding to the single-frame instruction;
and when the positioning instruction is a multi-frame instruction, taking the corresponding image set in the audio/video file as target information and displaying it according to the start time and end time corresponding to the multi-frame instruction.
In the audio/video online review method, taking the corresponding image set in the audio/video file as target information and displaying it according to the start time and end time corresponding to the multi-frame instruction comprises:
determining a start image and an end image in the audio/video file according to the start frame corresponding to the start time and the end frame corresponding to the end time;
taking the video images between the start image and the end image as an image set;
and displaying the image set according to a preset preview rule.
In the audio/video online review method, the annotation instruction comprises a start instruction and annotation text; generating the annotation information according to the annotation instruction comprises:
when a start instruction corresponding to the target information is detected, activating a preset annotation area;
and when annotation text in the annotation area is detected, generating annotation information according to the positioning instruction and the annotation text.
In the audio/video online review method, generating annotation information according to the positioning instruction and the annotation text comprises:
generating time information according to the positioning instruction;
determining an annotation object according to a time axis corresponding to the positioning instruction;
and generating annotation information according to the time information, the annotation object and the annotation text.
The audio/video online review method further comprises:
when a modification instruction for the audio/video file is detected, determining, according to the timestamp corresponding to the modification instruction, whether the modification instruction corresponds to the annotation information;
and if so, generating a modification remark according to the modification instruction.
In the audio/video online review method, before acquiring the audio/video file to be processed, the method further comprises:
acquiring a file to be processed;
performing shot recognition on the file to be processed to obtain boundary frames corresponding to different shots;
and splitting the file to be processed according to the boundary frames to obtain a plurality of audio/video files.
A computer-readable storage medium storing one or more programs executable by one or more processors to implement the steps of any of the audio/video online review methods described above.
A terminal device, comprising: a processor, a memory, and a communication bus, wherein the memory stores a computer-readable program executable by the processor;
the communication bus realizes connection and communication between the processor and the memory;
and the processor, when executing the computer-readable program, implements the steps of the audio/video online review method described above.
Beneficial effects: the invention provides an audio/video online review method and related equipment. When a user needs to annotate the audio/video at a certain moment or during a certain time period, the user first issues a positioning instruction, and the target information is displayed according to that instruction so the user can confirm whether the content needs annotation. After confirming, the user inputs an annotation instruction, and annotation information corresponding to that moment or time period is generated. This allows users to conveniently annotate and modify audio/video files and improves working efficiency.
Drawings
Fig. 1 is a flowchart of the audio/video online review method provided by the present invention.
Fig. 2 is a schematic diagram of boundary-frame determination in the audio/video online review method provided by the present invention.
Fig. 3 is a schematic diagram of a first target information display in the audio/video online review method provided by the present invention.
Fig. 4 is a schematic diagram of a second target information display in the audio/video online review method provided by the present invention.
Fig. 5 is a schematic structural diagram of a terminal device provided in the present invention.
Detailed Description
The invention provides an audio/video online review method. To make the purpose, technical scheme and effect of the invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit it.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
As shown in fig. 1, this embodiment provides an audio/video online review method. For convenience of description, a common server is used as the execution subject; the server may be replaced with a tablet, a computer, or another device with data processing capability. The audio/video online review method includes the following steps:
and S10, acquiring an audio and video file to be processed.
Specifically, an audio/video file to be processed is obtained first; an audio/video file may contain audio or video. For example, audio/video file 1 is a video file and audio/video file 2 is an audio file.
Generally, video files are large, and annotating directly on a full video file requires considerable computing power. To improve efficiency, in this embodiment the file to be processed may be split into a plurality of audio/video files, each a fragment of the file to be processed, so that processing efficiency is improved. Further, a video generally consists of a plurality of shots; if it is cut at arbitrary points, a shot is likely to be split apart, which disrupts subsequent viewing. Therefore, in this embodiment, before the audio/video file is acquired, the method further includes:
and A10, acquiring a file to be processed.
Specifically, a file to be processed is first obtained, and the file to be processed is a video file.
And A20, carrying out shot identification on the file to be processed to obtain boundary frames corresponding to different shots.
Specifically, a video file contains a plurality of shots. As shown in fig. 2, the frame image at which the video switches from one shot to another, i.e. the boundary frame, can be determined.
For example, an optical flow analysis algorithm is used to analyze the file to be processed and determine the boundary moments. Optical flow is the instantaneous velocity of the pixel motion of a spatially moving object projected onto the imaging plane. Between frame images shot within the same shot, a consistent instantaneous velocity can be computed; between frame images from different shots, matching pixels are hard to find, so the computed instantaneous velocity becomes abnormal. A normal range of optical flow values, i.e. an optical flow threshold, can therefore be preset.
An optical flow value is calculated between each frame image in the video file and its preceding frame image; when the optical flow value falls outside the optical flow threshold range, that frame image is taken as a boundary frame.
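The boundary-frame check above can be sketched as follows. This is a minimal, hypothetical stand-in: the mean absolute difference between consecutive frames plays the role of the optical flow value (a real implementation would use a dense optical flow algorithm), frames are simple lists of pixel intensities, and all names are illustrative.

```python
# Hypothetical sketch of boundary-frame detection; the mean absolute pixel
# difference stands in for the optical-flow value described in the text.

def mean_abs_diff(frame_a, frame_b):
    """Mean absolute pixel difference between two equally sized frames."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)

def find_boundary_frames(frames, threshold):
    """Return indices of frames whose difference from the previous frame
    exceeds the preset threshold (the 'optical flow threshold')."""
    boundaries = []
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i - 1], frames[i]) > threshold:
            boundaries.append(i)
    return boundaries

# Two synthetic "shots": dark frames followed by bright frames.
frames = [[10, 10, 10]] * 3 + [[200, 200, 200]] * 3
print(find_boundary_frames(frames, threshold=50))  # -> [3]
```

The same loop structure applies if the similarity measure mentioned below is used instead of a difference measure; only the comparison function changes.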
Besides optical flow analysis, boundary frames can also be determined from the similarity between adjacent frame images.
After a boundary frame is determined, the boundary frame and several frames before and after it can be shown on the display screen; if the user finds on inspection that the boundary frame is wrong, a suitable boundary frame can be manually reselected.
And A30, splitting the file to be processed according to the boundary frame to obtain a plurality of audio and video files.
Specifically, after the boundary frames are obtained, they are used as dividing points to split the file to be processed: the frame images before the first boundary frame form one group, the frame images after the first boundary frame and before the second boundary frame form a second group, and so on, until all frame images in the file to be processed are divided into several image groups. The audio of the file to be processed is split at the same time according to the moments corresponding to the boundary frames, yielding several audio segments. Each audio segment is then combined with the image group of the same time period in the file to be processed, giving the audio/video files that can subsequently be annotated.
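The splitting step can be sketched as a small function (names illustrative; frames are stand-in strings here). Each boundary frame starts a new group, and the case of zero boundary frames keeps the file whole.

```python
# Minimal sketch of splitting a frame list into groups using boundary-frame
# indices as dividing points, as described above.

def split_by_boundaries(frames, boundary_indices):
    """Each boundary frame starts a new group; frames before the first
    boundary form the first group. No boundaries -> one group."""
    if not boundary_indices:
        return [frames]
    groups, start = [], 0
    for b in boundary_indices:
        groups.append(frames[start:b])
        start = b
    groups.append(frames[start:])
    return groups

frames = ["f0", "f1", "f2", "f3", "f4", "f5"]
print(split_by_boundaries(frames, [3]))  # -> [['f0','f1','f2'], ['f3','f4','f5']]
print(split_by_boundaries(frames, []))   # -> [['f0','f1','f2','f3','f4','f5']]
```

The corresponding audio would be cut at the same time offsets, then paired with each group.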
If no boundary frame exists, i.e. the number of boundary frames is zero, the file to be processed is used directly as the audio/video file.
Further, teamwork requires multiple people to cooperate; for example, one user manages one shot. A management account corresponding to each shot can therefore be preset, with each management account corresponding to one shot. After the file is split into audio/video files, the corresponding management account is determined according to the shot to which each audio/video file corresponds.
For example, a management account A corresponding to the first shot is preset, and the first audio/video file of the file to be processed is sent to management account A.
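The routing of split files to accounts reduces to a preset mapping; a tiny sketch follows (account names and file names are made up for illustration).

```python
# Sketch: a preset mapping from shot index to management account, used to
# route each split audio/video file to its manager.

shot_accounts = {0: "account_A", 1: "account_B"}  # preset per shot

def route_files(av_files, shot_accounts):
    """Pair each audio/video file (ordered by shot) with its account."""
    return [(shot_accounts.get(i), f) for i, f in enumerate(av_files)]

print(route_files(["clip0.mp4", "clip1.mp4"], shot_accounts))
# -> [('account_A', 'clip0.mp4'), ('account_B', 'clip1.mp4')]
```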
And S20, analyzing the audio and video file to obtain a time axis corresponding to the audio and video file.
Specifically, both audio files and video files are played along a time axis. To make it convenient for the user to pinpoint the moments that need modification, the acquired audio/video file is parsed to obtain the time axis corresponding to it.
To distinguish them, the time axis corresponding to an audio file is called the audio axis, and the time axis corresponding to a video file is called the image axis.
And S30, when a positioning instruction for the time axis is detected, displaying the target information in the audio/video file according to the positioning instruction.
Specifically, the user can send a positioning instruction to the computer through an external device, for example by clicking the time axis with the mouse. Each position on the time axis corresponds to a moment of the audio/video file; when a positioning instruction is detected, its corresponding moment can be determined from the coordinate of the instruction, and the information at that moment in the audio/video file is taken as the target information and displayed.
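The coordinate-to-moment mapping just described is linear; a minimal sketch follows (the axis origin, width, and duration are illustrative parameters, not values from the patent).

```python
# Sketch: map a click's x coordinate on the rendered time axis to a media
# timestamp in seconds, clamped to the clip's duration.

def coord_to_time(click_x, axis_x0, axis_width, duration_s):
    """Linearly map a pixel offset along the time axis to seconds."""
    frac = (click_x - axis_x0) / axis_width
    return max(0.0, min(1.0, frac)) * duration_s

# A 600-px time axis starting at x=100 for a 120 s clip:
print(coord_to_time(400, axis_x0=100, axis_width=600, duration_s=120))  # -> 60.0
```

Clamping keeps clicks slightly outside the drawn axis from producing out-of-range moments.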
In the above example, the target information to be annotated corresponds to a single moment. The positioning instruction may instead correspond to a time period, for example by dragging the mouse while holding the button: the moment on the time axis corresponding to the start point of the drag is taken as the start time, and the moment corresponding to the end point as the end time. A time period is determined from the start time and end time, and the corresponding target information in the audio/video file can then be determined and displayed.
If the audio/video file is a video file, a positioning instruction on the image axis is called an image instruction. When an image instruction is detected, the one or more images corresponding to it are displayed. For a video file, positioning instructions are further divided into single-frame instructions and multi-frame instructions according to whether they correspond to a single moment or a time period. Since a single audio frame does not correspond to a perceivable moment, in this embodiment a positioning instruction on an audio file can only correspond to a time period.
When the positioning instruction is a single-frame instruction, the corresponding frame image in the audio/video file is taken as the target information and displayed according to the timestamp corresponding to the single-frame instruction. For example, as shown in fig. 3, a display area for the target information is set above the time axis, and the frame image is displayed there.
When the positioning instruction is a multi-frame instruction, the corresponding image set in the audio/video file is taken as the target information and displayed according to the start time and end time corresponding to the multi-frame instruction: the start image and end image in the audio/video file are determined from the start frame corresponding to the start time and the end frame corresponding to the end time, and the video images between them are taken as the image set. The start and end images themselves may or may not be included in the image set. The image set is then displayed according to a preset preview rule.
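Resolving single-frame and multi-frame instructions into target information can be sketched as below, assuming a fixed frame rate and purely illustrative instruction tuples (the patent does not fix a data format). This variant includes both the start and end images in the set.

```python
# Sketch: turn a positioning instruction into the target frame(s),
# assuming a constant frame rate.

def time_to_frame(t_s, fps):
    """Nearest frame index for a timestamp in seconds."""
    return round(t_s * fps)

def resolve_target(frames, fps, instruction):
    """instruction: ('single', t) or ('multi', t_start, t_end)."""
    if instruction[0] == "single":
        return [frames[time_to_frame(instruction[1], fps)]]
    _, t_start, t_end = instruction
    start, end = time_to_frame(t_start, fps), time_to_frame(t_end, fps)
    return frames[start:end + 1]  # include both start and end images

frames = [f"frame{i}" for i in range(10)]  # 10 frames at 2 fps = 5 s clip
print(resolve_target(frames, 2, ("single", 1.0)))      # -> ['frame2']
print(resolve_target(frames, 2, ("multi", 1.0, 2.5)))  # -> frame2..frame5
```

Whether the boundary images belong to the set is a design choice, as noted above; excluding them would simply change the slice bounds.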
Because a multi-frame instruction corresponds to multiple frame images, one preview rule, shown in fig. 3, presets multiple display frames; when several frame images need to be shown, they are imported into the display frames in sequence so the image set is displayed. In another preview rule, shown in fig. 4, a thumbnail strip is generated containing a thumbnail for each frame image in the image set; when a click on one of the thumbnails is detected, the corresponding frame image is displayed. The former lets the user see many images of the target information at once, while the latter lets the user check frame by frame, so the two preview modes can be switched between.
And S40, when an annotation instruction aiming at the target information is detected, generating annotation information corresponding to the target information according to the annotation instruction.
Specifically, after the target information is determined, the user may input an annotation instruction for the target information, where the annotation instruction includes content to be annotated, and when the annotation instruction is detected, generate annotation information corresponding to the target information according to the content in the annotation instruction.
As shown in fig. 3, after generating the annotation information, the annotation information may be disposed on the left side of the display interface, and the annotation information may include information about a time or a time period, content of the annotation, time of the annotation, and the like.
Further, the annotation instruction may include a start instruction and annotation text: the start instruction starts the annotation, and the annotation text is the content the user wants to annotate. To make it convenient for the user to input and confirm the annotation text, a preset annotation area is activated when the start instruction is detected, and the user can then input the content to annotate, i.e. the annotation text. When the annotation text is detected, annotation information is generated from the positioning instruction and the annotation text. For example, time information is generated from the moment on the time axis corresponding to the positioning instruction. At the same time, the annotation object is determined from the time axis the positioning instruction refers to; the annotation object may be an image, audio, or video. For example, the annotation text "volume when a person is talking is adjusted" has audio as its object, "exposure is increased" has an image as its object, and "video is not harmonized with music tempo" has video as its object. For the first annotation text, the corresponding annotation information may combine the time information, the object (audio) and the annotation text.
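Assembling annotation information from these three parts can be sketched as a small record builder; field names and the tuple format are assumptions for illustration, not from the patent.

```python
# Sketch: build annotation information from the positioning instruction's
# time information, the annotation object (which axis was used), and the text.

def make_annotation(positioning, annotation_text, axis):
    """positioning: (t_start, t_end) in seconds; axis: 'audio'|'image'|'video'."""
    t_start, t_end = positioning
    return {
        "time": (t_start, t_end),  # time information from the instruction
        "object": axis,            # annotation object from the axis used
        "text": annotation_text,
    }

note = make_annotation((12.0, 15.5),
                       "volume when a person is talking is adjusted", "audio")
print(note["object"], note["time"])  # -> audio (12.0, 15.5)
```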
In addition, to make it convenient for the user to see the position corresponding to an annotation instruction, a preset prompt tag may be displayed, while the annotation information is generated, in the region of the time axis corresponding to the annotation information; for example, the background color of the corresponding time region on the time axis is changed, with a green background serving as the prompt tag. In fig. 4, for example, a region of the time axis is shown in light gray, indicating that annotation information exists for that time period.
Further, when the user modifies the audio/video file, a modification instruction for the audio/video file is sent to the server, and whether the modification instruction corresponds to annotation information is determined according to the timestamp of the modification instruction. For example, if the user modifies audio at a time that falls within the time period of an existing annotation, the modification instruction corresponds to that annotation information, and a modification remark is generated according to the modification instruction.
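The timestamp match can be sketched as an interval lookup over stored annotations (data shapes assumed, matching the illustrative record format rather than anything the patent specifies):

```python
# Sketch: check whether a modification instruction's timestamp falls within
# the time period of an existing annotation; a hit triggers a remark.

def matching_annotation(annotations, mod_time):
    """Return the first annotation whose [t_start, t_end] contains mod_time,
    or None if the modification matches no annotation."""
    for note in annotations:
        t_start, t_end = note["time"]
        if t_start <= mod_time <= t_end:
            return note
    return None

annotations = [{"time": (12.0, 15.5), "text": "adjust dialogue volume"}]
print(matching_annotation(annotations, 13.0) is not None)  # -> True
print(matching_annotation(annotations, 30.0))              # -> None
```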
Based on the above, the user can annotate content at a certain moment or in a certain time period through a positioning instruction, so the region that needs modification can be quickly located afterwards, improving working efficiency.
Based on the above audio/video online review method, the present invention further provides a terminal device. As shown in fig. 5, the terminal device includes at least one processor 20; a display screen 21; a memory 22; and may further include a communication interface 23 and a bus 24. The processor 20, the display screen 21, the memory 22 and the communication interface 23 can communicate with one another through the bus 24. The display screen 21 is configured to display a user guidance interface preset in the initial setting mode. The communication interface 23 can transmit information. The processor 20 can call logic instructions in the memory 22 to perform the method in the above embodiments.
In addition, the logic commands in the memory 22 may be implemented as software functional units and stored in a computer-readable storage medium when sold or used as a standalone product.
The memory 22, as a computer-readable storage medium, may be configured to store software programs and computer-executable programs, such as the program commands or modules corresponding to the methods in the embodiments of the present disclosure. The processor 20 executes functional applications and performs data processing by running the software programs, commands or modules stored in the memory 22, thereby implementing the method in the above embodiments.
The memory 22 may include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created according to the use of the terminal device. Further, the memory 22 may include a high-speed random access memory and may also include a non-volatile memory. For example, a variety of media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk, may also serve as the computer-readable storage medium.
In addition, the specific processes loaded and executed by the computer-readable storage medium and by the plurality of commands processed in the terminal device are described in detail in the method above and are not repeated here.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced, and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. An on-line examination method for audio and video is characterized by comprising the following steps:
acquiring an audio and video file to be processed;
analyzing the audio and video file to obtain a time axis corresponding to the audio and video file;
when a positioning instruction for the time axis is detected, target information in the audio and video file is displayed according to the positioning instruction;
and when an annotation instruction for the target information is detected, generating annotation information corresponding to the target information according to the annotation instruction.
2. The audio and video online examination method according to claim 1, wherein the time axis comprises an image axis and/or an audio axis, and the positioning instruction comprises an image instruction for the image axis and an audio instruction for the audio axis; the step of displaying corresponding target information in the audio and video file according to the positioning instruction comprises the following steps:
when the image instruction is detected, displaying, according to the image instruction, the image information corresponding to the audio and video file;
and when the audio instruction is detected, displaying, according to the audio instruction, the audio information corresponding to the audio and video file.
3. The audio and video online examination method according to claim 2, wherein the image instruction comprises a single-frame instruction and a multi-frame instruction; the displaying the target information in the audio and video file according to the positioning instruction comprises the following steps:
when the positioning instruction is a single-frame instruction, taking, according to the timestamp corresponding to the single-frame instruction, the corresponding frame image in the audio and video file as target information and displaying the target information;
and when the positioning instruction is a multi-frame instruction, taking, according to the starting time and the ending time corresponding to the multi-frame instruction, the corresponding image set in the audio and video file as target information and displaying the image set.
4. The audio and video online examination method according to claim 3, wherein the taking, according to the starting time and the ending time corresponding to the multi-frame instruction, the corresponding image set in the audio and video file as target information and displaying the image set comprises the following steps:
determining a starting image and an ending image in the audio and video file according to a starting frame corresponding to the starting time and an ending frame corresponding to the ending time;
taking the video images between the starting image and the ending image as the image set;
and displaying the image set according to a preset preview rule.
5. The audio and video online examination method according to claim 1, wherein the annotation instruction comprises a start instruction and an annotation text; the generating annotation information according to the annotation instruction comprises:
when a start instruction corresponding to the audio and video information is detected, activating a preset annotation area;
and when the annotation text aiming at the annotation area is detected, generating annotation information according to the positioning instruction and the annotation text.
6. The audio and video online examination method according to claim 5, wherein the generating annotation information according to the positioning instruction and the annotation text comprises:
generating time information according to the positioning instruction;
determining an annotation object according to a time axis corresponding to the positioning instruction;
and generating annotation information according to the time information, the annotation object and the annotation text.
7. The audio and video online examination method according to claim 1, wherein the method further comprises:
when a modification instruction for the audio/video file is detected, determining whether the modification instruction corresponds to the annotation information or not according to a timestamp corresponding to the modification instruction;
and if so, generating a modification remark according to the modification instruction.
8. The audio and video online examination method according to claim 1, wherein the acquiring an audio and video file to be processed comprises:
acquiring a file to be processed;
performing shot recognition on the file to be processed to obtain boundary frames corresponding to different shots;
and splitting the file to be processed according to the boundary frame to obtain a plurality of audio and video files.
9. A computer readable storage medium, characterized in that the computer readable storage medium stores one or more programs which can be executed by one or more processors to implement the steps in the method for on-line examination of audio and video according to any one of claims 1 to 8.
10. A terminal device, comprising: a processor, a memory, and a communication bus; the memory has stored thereon a computer readable program executable by the processor;
the communication bus implements connection and communication between the processor and the memory;
the processor realizes the steps of the audio and video online examination method according to any one of claims 1 to 8 when executing the computer readable program.
CN202210971679.3A 2022-08-12 2022-08-12 Audio and video online examination method and related equipment Pending CN115474089A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210971679.3A CN115474089A (en) 2022-08-12 2022-08-12 Audio and video online examination method and related equipment


Publications (1)

Publication Number Publication Date
CN115474089A (en) 2022-12-13

Family

ID=84365995


Country Status (1)

Country Link
CN (1) CN115474089A (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090290847A1 (en) * 2008-05-20 2009-11-26 Honeywell International Inc. Manual voice annotations for cctv reporting and investigation
CN103024602A (en) * 2011-09-23 2013-04-03 华为技术有限公司 Method and device for adding annotations to videos
CN110012358A (en) * 2019-05-08 2019-07-12 腾讯科技(深圳)有限公司 Review of a film by the censor information processing method, device
CN110381382A (en) * 2019-07-23 2019-10-25 腾讯科技(深圳)有限公司 Video takes down notes generation method, device, storage medium and computer equipment
US20210287718A1 (en) * 2020-03-10 2021-09-16 Sony Corporation Providing a user interface for video annotation tools


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
吴平颐 et al.: "Design of a teaching-skill training and review system based on video annotation", Education and Teaching Forum (《教育教学论坛》), no. 2020, pp. 78-81 *
王少珠 et al.: "A mobile review system based on a multi-network environment", Video Engineering (《电视技术》), vol. 38, no. 24, pp. 95-97 *
董守斌 et al.: "Network Information Retrieval" (《网络信息检索》), Xidian University Press, 2010, pp. 291-293 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116824107A (en) * 2023-07-13 2023-09-29 北京万物镜像数据服务有限公司 Processing method, device and equipment for three-dimensional model review information
CN116824107B (en) * 2023-07-13 2024-03-19 北京万物镜像数据服务有限公司 Processing method, device and equipment for three-dimensional model review information


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 518000 Building 1901, 1902, 1903, Qianhai Kexing Science Park, Labor Community, Xixiang Street, Bao'an District, Shenzhen, Guangdong Province

Applicant after: Shenzhen Flash Scissor Intelligent Technology Co.,Ltd.

Address before: 518000 Unit 9ABCDE, Building 2, Haihong Industrial Plant Phase II, Haihong Industrial Plant, West Side of Xixiang Avenue, Labor Community, Xixiang Street, Bao'an District, Shenzhen, Guangdong

Applicant before: Shenzhen big brother Technology Co.,Ltd.