WO2022105419A1 - Video processing method and apparatus - Google Patents

Video processing method and apparatus

Info

Publication number
WO2022105419A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
target
processed
video frame
recognition image
Prior art date
Application number
PCT/CN2021/119991
Other languages
English (en)
French (fr)
Inventor
唐君行
Original Assignee
上海哔哩哔哩科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海哔哩哔哩科技有限公司 filed Critical 上海哔哩哔哩科技有限公司
Priority to US18/037,750 priority Critical patent/US20240013811A1/en
Publication of WO2022105419A1 publication Critical patent/WO2022105419A1/zh

Classifications

    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10 Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • G11B27/34 Indicating arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/74 Browsing; Visualisation therefor
    • G06F16/748 Hypervideo
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/7867 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, title and artist information, manually generated time, location and usage information, user ratings
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/167 Position within a video image, e.g. region of interest [ROI]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field

Definitions

  • the present application relates to the field of Internet technologies, and in particular, to a video processing method.
  • the present application also relates to a video processing apparatus, a computing device, a computer-readable storage medium and a computer program product.
  • Video uploaders can upload self-made videos through video websites. When a video uploader wants to share some content with the public, they usually embed a two-dimensional code, barcode, mini program code, or QR code in the video and indicate the link address of the content to be shared in the description information outside the video.
  • If users watching the video want to view the content shared by the video uploader in real time, they need to scan the code with another terminal device, or take a screenshot and then recognize the screenshot on a mobile phone to obtain the content, which is very inconvenient. If they instead click the link address in the description information outside the video, they need to exit the video, and when there are many link addresses they cannot quickly and accurately find the content they want, which is also time-consuming and labor-intensive and leads to the loss of users.
  • embodiments of the present application provide a video processing method.
  • The present application also relates to a video processing apparatus, a computing device, a computer-readable storage medium and a computer program product, so as to solve the technical defect in the prior art that, when a video uploader adds a link in the description information of a video, it is time-consuming and laborious for a user to obtain the content shared by the video uploader while watching the video, resulting in a poor user experience.
  • According to a first aspect of the embodiments of the present application, a video processing method is provided, including:
  • acquiring a video to be processed;
  • decoding the to-be-processed video to obtain a target video frame set, wherein the target video frame set includes a target recognition image;
  • determining the video position information of the target video frame set in the to-be-processed video, the image position information of the target recognition image in the target video frame set, and the target link corresponding to the target recognition image;
  • storing the video position information, the image position information and the target link correspondingly, and binding them with the video to be processed.
  • a video processing apparatus including:
  • an acquisition module configured to acquire the video to be processed
  • a decoding module configured to decode the to-be-processed video to obtain a target video frame set, wherein the target video frame set includes a target recognition image
  • a determining module configured to determine the video position information of the target video frame set in the to-be-processed video, the image position information of the target recognition image in the target video frame set, and the target link corresponding to the target recognition image;
  • a storage binding module is configured to store the video position information, the image position information and the target link correspondingly, and bind them with the video to be processed.
  • a computing device including:
  • the memory is used to store computer-executable instructions
  • the processor is used to execute the computer-executable instructions to: acquire the video to be processed; decode the to-be-processed video to obtain the target video frame set, wherein the target video frame set includes the target recognition image; determine the video position information of the target video frame set in the to-be-processed video, the image position information of the target recognition image in the target video frame set, and the target link corresponding to the target recognition image; and store the video position information, the image position information and the target link correspondingly and bind them with the video to be processed.
  • a computer-readable storage medium which stores computer-executable instructions, and when the instructions are executed by a processor, implements any of the steps of the video processing method.
  • a computer program product wherein when the computer program product is executed in a computer, the computer is caused to execute the steps of the above video processing method.
  • The video processing method provided by the present application acquires a video to be processed; decodes the to-be-processed video to obtain a target video frame set, wherein the target video frame set includes a target recognition image; determines the video position information of the target video frame set in the to-be-processed video, the image position information of the target recognition image in the target video frame set, and the target link corresponding to the target recognition image; and stores the video position information, the image position information and the target link correspondingly and binds them with the to-be-processed video.
  • An embodiment of the present application realizes that, in the processing stage after the video upload is completed, the link address corresponding to the target recognition image is stored in correspondence with the position information of the target recognition image in the video.
  • When the video is played, the user can directly click the target recognition image in the video and thereby directly open the link address corresponding to it, without performing a code-scanning operation. This better helps the user jump to the content that the video uploader wants to share, simplifies the tedious code-scanning process, and improves the user experience.
  • FIG. 1 is a flowchart of a video processing method provided by an embodiment of the present application.
  • FIG. 2 is a process flow diagram of a video processing method applied to a video website provided by an embodiment of the present application
  • FIG. 3 is a schematic structural diagram of a video processing apparatus provided by an embodiment of the present application.
  • FIG. 4 is a structural block diagram of a computing device provided by an embodiment of the present application.
  • Hypertext: web-like text that organizes text information from different spaces together. Hypertext is also a user interface paradigm used to display text and the content related between texts. In practice, hypertext generally exists in the form of electronic documents, in which the text contains links that can point to other locations or documents, allowing a reader to switch directly from the current reading position to the location pointed to by a hypertext link. There are many hypertext formats; the most commonly used at present are the Hypertext Markup Language (an application under the Standard Generalized Markup Language) and the Rich Text Format.
  • Hypertext technology: embedding link information in ordinary text so that the text becomes interactive; the text can be clicked and jumped from, thereby connecting all resources on the Internet, which is convenient for users.
  • QR code: a type of two-dimensional barcode. "QR" is the abbreviation of "Quick Response", reflecting the inventor's hope that the content of a QR code could be decoded quickly. A QR code can store more information than an ordinary barcode, and it does not need to be aligned directly with the scanner when scanned, as an ordinary barcode does.
  • In the present application, a video processing method is provided, and the present application also relates to a video processing apparatus, a computing device, a computer-readable storage medium and a computer program product, which are described in detail one by one in the following embodiments.
  • FIG. 1 shows a flowchart of a video processing method provided according to an embodiment of the present application, which specifically includes the following steps:
  • Step 102 Acquire the video to be processed.
  • the video to be processed is the video uploaded by the video uploader to the video website. After the video upload is completed, the video website needs to do some processing on the video, such as adding a watermark to the video, adding introduction information to the video, etc.
  • the video website needs to perform corresponding processing on the uploaded video, and the video is the video to be processed.
  • In a specific implementation provided by the present application, the to-be-processed video M uploaded by a certain video uploader is acquired.
  • Step 104 Decode the video to be processed to obtain a target video frame set, wherein the target video frame set includes target recognition images.
  • The target video frame is a video frame that includes the target recognition image.
  • In practical applications, a to-be-processed video may contain a large number of video frames, and the video uploader sometimes embeds a target recognition image during a certain time period of the video; the user can scan the target recognition image to jump to the link address corresponding to it. On this basis, the target video frame is a video frame containing the target recognition image, and within one to-be-processed video the target video frames are continuous.
  • the target recognition image is an image containing the content that the video uploader wants to share.
  • the user can obtain the content by scanning and recognizing the target recognition image.
  • In practical applications, the target recognition image includes any one of a two-dimensional code, a barcode, a mini program code, and a QR code; the specific form of the target recognition image is not limited in this application and is subject to the actual application.
  • Specifically, decoding the to-be-processed video to obtain the target video frame set, wherein the target video frame set includes the target recognition image, includes:
  • decoding the to-be-processed video to generate an initial video frame set;
  • screening, in the initial video frame set, the video frames that include the target recognition image to generate the target video frame set.
  • the to-be-processed video is usually decoded first, and an initial video frame set is generated according to all video frames corresponding to the to-be-processed video, and the initial video frame set includes target video frames and non-target video frames.
  • the video frames including the target recognition image can be screened in the initial video frame set by image recognition technology, and the video frames including the target recognition image can be grouped together to generate the target video frame set,
  • the target set of video frames is a subset of the initial set of video frames.
  • In specific implementations, the number of video frames obtained after decoding a to-be-processed video may be very large.
  • For example, if a to-be-processed video is an ultra-high-definition video with a frame rate of 60 frames per second, there are 3600 video frames in one minute. If the duration of the to-be-processed video is relatively long, the number of video frames becomes very large, the computation and processing speed becomes slow, and the requirements on the server become high. Therefore, in order to improve efficiency, decoding the to-be-processed video includes: decoding the to-be-processed video according to a preset time interval.
  • In order to recognize the target recognition image in the to-be-processed video, the framing precision during decoding does not need to be accurate to every single video frame; the to-be-processed video can be decoded at a preset time interval, for example one frame per second. In this way, only 60 video frames are needed for one minute of video. Compared with decoding all frames of the to-be-processed video, the number of video frames is greatly reduced and the video processing efficiency is correspondingly higher.
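As an illustration of the interval-based decoding and screening described above, the following is a minimal sketch in Python. It is not part of the patent disclosure: the use of OpenCV, the one-second sampling interval and the function name detect_code_frames are assumptions made for this example, and cv2.QRCodeDetector only covers the two-dimensional/QR code case, not barcodes or mini program codes.

```python
import cv2  # OpenCV, assumed here purely for illustration


def detect_code_frames(video_path, interval_s=1.0):
    """Sample the video at a preset time interval and return, for each sampled
    frame in which a QR code is detected, its timestamp, bounding box and
    decoded link text."""
    cap = cv2.VideoCapture(video_path)
    detector = cv2.QRCodeDetector()
    detections = []
    t = 0.0
    while True:
        cap.set(cv2.CAP_PROP_POS_MSEC, t * 1000.0)  # seek to the sample point
        ok, frame = cap.read()
        if not ok:
            break
        text, points, _ = detector.detectAndDecode(frame)
        if text and points is not None:
            xs, ys = points[0][:, 0], points[0][:, 1]
            # (x_upper_left, y_upper_left, length, width), matching the rectangle
            # representation used in the examples below
            bbox = (float(xs.min()), float(ys.min()),
                    float(xs.max() - xs.min()), float(ys.max() - ys.min()))
            detections.append({"time_s": t, "bbox": bbox, "link": text})
        t += interval_s
    cap.release()
    return detections
```

Sampling one frame per second matches the efficiency argument above: a one-minute clip yields 60 samples instead of the 3600 frames contained in a 60 fps video.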
  • In a specific implementation provided by the present application, following the above example, the to-be-processed video M is decoded at a time interval of 1 second.
  • The to-be-processed video is 360 seconds long in total, so 360 video frames are obtained; image recognition then determines that 60 of the 360 video frames are target video frames, and the target video frames include the two-dimensional code to be recognized.
  • Step 106 Determine the video position information of the target video frame set in the to-be-processed video, the image position information of the target recognition image in the target video frame set, and the target link corresponding to the target recognition image.
  • In practical applications, the target video frames in the target video frame set are continuous, and the video position information of the target video frame set in the to-be-processed video is the start and end positions of the target video frames in the to-be-processed video; for example, the target video frame set starts at the 30th second of the video and ends at the 60th second.
  • the image position information of the target recognition image in the target video frame set is the position information of the target recognition image appearing in the target video frame, such as a certain coordinate area in the target video frame.
  • the target link corresponding to the target recognition image is the link information corresponding to the target recognition image, and the user can jump to the address corresponding to the target link by scanning the target recognition image.
  • In the prior art, the target link address is stored in the video introduction.
  • Specifically, determining the video position information of the target video frame set in the to-be-processed video includes:
  • determining the time point of the first target video frame of the target video frame set in the to-be-processed video as a start time point;
  • determining the time point of the last target video frame of the target video frame set in the to-be-processed video as an end time point;
  • determining the video position information of the target video frame set in the to-be-processed video according to the end time point and the start time point.
  • In practical applications, the target video frames are usually continuous, so the time point corresponding to the first target video frame in the target video frame set can be used as the start time at which the target recognition image appears in the to-be-processed video, and the time point corresponding to the last target video frame can be used as the end time at which the target recognition image appears in the to-be-processed video. Therefore, the start time point of the first target video frame in the to-be-processed video and the end time point of the last target video frame in the to-be-processed video are determined, and the video position information of the target video frame set in the to-be-processed video can then be determined from the end time point and the start time point.
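Continuing the hypothetical sketch above, the start and end time points can simply be taken from the first and last detections; the helper below assumes the output of detect_code_frames from the previous example.

```python
def video_position(detections):
    """Derive the video position information (start, end), in seconds, from the
    timestamps of the first and last target video frames."""
    if not detections:
        return None  # the video contains no target recognition image
    start_s = detections[0]["time_s"]   # first target video frame
    end_s = detections[-1]["time_s"]    # last target video frame
    return start_s, end_s
```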
  • In a specific implementation provided by the present application, following the above example, the to-be-processed video M includes 60 target video frames, wherein the first target video frame appears at the 60th second of the to-be-processed video and the last target video frame appears at the 120th second of the to-be-processed video; the video position information can therefore be determined as the target video frame set appearing in the 60th to 120th seconds of the to-be-processed video.
  • On the other hand, determining the image position information of the target recognition image in the target video frame set includes:
  • determining a target video frame in the target video frame set;
  • determining the coordinate information of the target recognition image in the target video frame as the image position information.
  • In practical applications, the position of the target recognition image in the to-be-processed video can be fixed or dynamic.
  • Correspondingly, there can be one piece of image position information of the target recognition image, or several. If the position of the target recognition image in the to-be-processed video is fixed, it is only necessary to determine the image position information of the target recognition image in one target video frame; if the position of the target recognition image in the to-be-processed video is dynamic, the image position information of the target recognition image in each target video frame needs to be determined.
  • Specifically, determining the coordinate information of the target recognition image in the target video frame as the image position information includes:
  • in the case that the target recognition image is a rectangle, determining the coordinate information of the target recognition image in the target video frame according to the coordinates of any vertex of the target recognition image and its length and width; or
  • in the case that the target recognition image is a circle, determining the coordinate information of the target recognition image in the target video frame according to the coordinates of the circle center and the radius of the target recognition image; or
  • in the case that the target recognition image is a triangle, determining the coordinate information of the target recognition image in the target video frame according to the coordinates of the three vertices of the target recognition image.
  • In practical applications, target recognition images come in various shapes, which can be regular shapes such as rectangles, circles and triangles, or irregular shapes. When the target recognition image is a rectangle, the coordinates of one of its vertices (upper left, lower left, upper right or lower right) together with the length and width of the rectangle can represent the coordinate information of the target recognition image in the target video frame. When the target recognition image is a circle, the coordinate information of the target recognition image in the target video frame can be represented by the coordinates of the circle center and the radius. When the target recognition image is a triangle, the coordinate information of the target recognition image in the target video frame can be represented by the coordinates of its three vertices. When the target recognition image has an irregular shape, the coordinates of several points on the target recognition image can be taken, according to the actual situation, to represent the position of the target recognition image in the target video frame. This application does not limit how the coordinate information of the target recognition image in the target video frame is determined as the image position information.
  • In a specific implementation provided by the present application, following the above example, the video position information of the target video frame set of the to-be-processed video M is "the 60th second to the 120th second". The position at which the two-dimensional code to be recognized appears in the to-be-processed video is fixed, and the two-dimensional code is a rectangle, so the image position information of the two-dimensional code in the target video frame set is determined to be (x_upper_left, y_upper_left, a, b), where (x_upper_left, y_upper_left) are the coordinates of the upper-left vertex of the two-dimensional code, a is the length of the two-dimensional code, and b is the width of the two-dimensional code. At the same time, the target link corresponding to the two-dimensional code is obtained as "https://www.******".
  • Step 108 Store the video location information, the image location information and the target link correspondingly, and bind them with the video to be processed.
  • the video location information, the image location information and the target link are stored correspondingly and bound with the to-be-processed video, and the above information can be read when the video is played.
  • Specifically, storing the video position information, the image position information and the target link correspondingly and binding them with the to-be-processed video includes: storing the video position information, the image position information and the target link correspondingly in a meta information file; and binding the meta information file with the to-be-processed video.
  • Meta information, also known as intermediate data or metadata, is data that describes data, mainly information describing the attributes of data, and is used to support functions such as indicating storage locations, historical data, resource searching and file recording. Meta information is an electronic catalog that records the content or characteristics of data so as to assist data retrieval.
  • The video position information, the image position information and the target link are stored correspondingly in the meta information file, and the meta information file is bound with the to-be-processed video; when the video is played, the meta information file is read, and the time at which the target recognition image appears in the video, its position in the video and the target link are restored in the form of a link.
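As a purely illustrative example of such a meta information file, the three pieces of information could be serialized as follows and bound to the video, for instance through a shared identifier or file name. The JSON layout and field names are assumptions of this sketch, not a format mandated by the disclosure.

```python
import json


def write_meta_info(path, video_position, image_position, target_link):
    """Store the video position information, the image position information and
    the target link together, so that a player can read them back when the
    bound video is played."""
    start_s, end_s = video_position
    x, y, length, width = image_position  # rectangle: upper-left vertex plus length and width
    meta = {
        "video_position": {"start_s": start_s, "end_s": end_s},
        "image_position": {"x": x, "y": y, "length": length, "width": width},
        "target_link": target_link,
    }
    with open(path, "w", encoding="utf-8") as f:
        json.dump(meta, f, ensure_ascii=False, indent=2)


# e.g. write_meta_info("M.meta.json", (60.0, 120.0), (40.0, 30.0, 120.0, 120.0),
#                      "https://www.******")  # coordinates are made-up sample values
```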
  • Optionally, the method further includes: in the case that the to-be-processed video is played to the video position information, receiving a click instruction of the user and acquiring the click position information of the click instruction; and in the case that the click position information satisfies the image position information, acquiring and jumping to the target link.
  • When watching a video, the user can send a click instruction by clicking the screen. During playback of the to-be-processed video, different click instructions correspond to different operations, such as pausing or shrinking the picture. When the to-be-processed video is played to the video position information and a click instruction sent by the user is received, the click position information corresponding to the click instruction needs to be acquired. If the click position does not match the image position information, the user has clicked a position outside the target recognition image, and the corresponding operation can be performed directly; if the click position information matches the image position information, the user has clicked the target recognition image, and the target link bound with the to-be-processed video is acquired and jumped to.
  • In practical applications, in order to prevent misoperation by the user, before acquiring and jumping to the target link, the method further includes: sending query information to the user in response to the click instruction, wherein the query information is used to determine whether to jump to the target link; and acquiring and jumping to the target link in the case of receiving a confirmation instruction sent by the user according to the query information.
  • In practical applications, a user may occasionally click the target recognition image by mistake, and jumping directly would harm the user experience. Query information can therefore be sent to the user before jumping to the target link, asking whether to jump: if the user chooses to continue or confirm, the jump to the target link is performed; if the user chooses to cancel or not to continue, the video continues to play.
  • In a specific implementation provided by the present application, following the above example, the to-be-processed video M is bound with the video position information "the 60th second to the 120th second", the image position information (x_upper_left, y_upper_left, a, b) and the target link "https://www.******" corresponding to the two-dimensional code.
  • When the user plays the video M and playback is between the 60th and 120th second, a click instruction sent by the user is received and the click position information (x_click, y_click) of the click instruction is acquired. If the click position information (x_click, y_click) falls within the region described by the image position information (x_upper_left, y_upper_left, a, b), the target link "https://www.******" is acquired and jumped to.
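The playback-side check in the example above can be sketched as a simple hit test; the function below and its integration with a player are assumptions of this illustration, and a real player would additionally show the query information described next before opening the link.

```python
def resolve_click(meta, playback_time_s, click_x, click_y):
    """Return the target link if the click lands inside the stored code region
    while playback is within the stored time range; otherwise return None."""
    pos, img = meta["video_position"], meta["image_position"]
    in_time = pos["start_s"] <= playback_time_s <= pos["end_s"]
    in_box = (img["x"] <= click_x <= img["x"] + img["length"]
              and img["y"] <= click_y <= img["y"] + img["width"])
    return meta["target_link"] if (in_time and in_box) else None
```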
  • In the video processing method provided by the embodiments of the present application, in the processing stage after the video upload is completed, the link address corresponding to the target recognition image is stored in correspondence with the position information of the target recognition image in the video. When the video is played, the user can directly click the target recognition image in the video and thereby directly open the link address corresponding to it, without performing a code-scanning operation. This better helps the user jump to the content that the video uploader wants to share, simplifies the tedious code-scanning process, and improves the user experience.
  • a query message is sent to the user to ask the user whether to open the link, so as to prevent the user from misoperation and further improve the user experience.
  • FIG. 2 shows a processing flowchart of a video processing method applied to a video website provided by an embodiment of the present application, which specifically includes the following steps:
  • Step 202 Acquire the video to be processed.
  • the user uploads the video T to the video website B, and the video T is the video to be processed.
  • Step 204 Decode the video to be processed according to a preset time interval to generate an initial set of video frames.
  • the to-be-processed video T is decoded at intervals of one second to obtain an initial video frame set, and the initial video frame set includes 600 video frames in total.
  • Step 206 Screen the video frames including the target recognition image in the initial video frame set to generate a target video frame set.
  • In the embodiment provided in this application, the target recognition image is a two-dimensional code.
  • Video frames that include the two-dimensional code are screened out of the 600 video frames through image recognition to generate a target video frame set; the target video frame set contains 73 target video frames in total, and the target video frames are continuous.
  • Step 208 Determine the video position information of the target video frame set in the to-be-processed video, the image position information of the target recognition image in the target video frame set, and the target link corresponding to the target recognition image.
  • In the embodiment provided in this application, it is determined that the time point of the first target video frame in the to-be-processed video T is the 51st second and the time point of the last target video frame is the 124th second, so the video position information of the target video frame set in the to-be-processed video is (the 51st second to the 124th second).
  • The two-dimensional code is at a fixed position in the to-be-processed video T. Any target video frame is selected from the target video frames, and it is determined that the coordinates of the lower-left vertex of the two-dimensional code are (50, 550) and the side lengths of the two-dimensional code are (100, 100); the image position information of the target recognition image in the target video frame set is therefore (50, 550, 100, 100).
  • the target link corresponding to the two-dimensional code is "www.****.com”.
  • Step 210 correspondingly store the video location information, the image location information and the target link in a meta information file.
  • In the embodiment provided in this application, the video position information (the 51st second to the 124th second), the image position information (50, 550, 100, 100) and the target link "www.****.com" are stored in the meta information file F.
  • Step 212 Bind the meta-information file with the video to be processed.
  • the meta-information file F is bound with the video T to be processed.
  • Step 214 In the case where the video to be processed is played to the video location information, receive a click instruction from the user, and obtain the click location information of the click instruction.
  • In the embodiment provided in this application, the video website B publishes the video T, and users can watch the video T through the video website B.
  • When the video T is played to the 73rd second, a click instruction sent by the user by clicking the screen is received, and the click position information of the click instruction on the screen is acquired as (73, 600).
  • Step 216 In the case that the click position information satisfies the image position information, send inquiry information to the user in response to the click instruction.
  • In the embodiment provided in this application, after it is judged that the click position information (73, 600) falls within the image position information (50, 550, 100, 100), that is, the user has clicked the two-dimensional code, query information is sent to the user in response to the click instruction, asking the user whether to jump to the target link corresponding to the two-dimensional code.
  • Step 218 Obtain and jump to the target link in the case of receiving a confirmation instruction sent by the user according to the query information.
  • In the embodiment provided in this application, upon receiving the confirmation instruction sent by the user according to the query information, it is determined that the user wishes to jump to the target link corresponding to the two-dimensional code, and the target link "www.****.com" is acquired and jumped to.
  • In the video processing method provided by the embodiments of the present application, in the processing stage after the video upload is completed, the link address corresponding to the target recognition image is stored in correspondence with the position information of the target recognition image in the video. When the video is played, the user can directly click the target recognition image in the video and thereby directly open the link address corresponding to it, without performing a code-scanning operation. This better helps the user jump to the content that the video uploader wants to share, simplifies the tedious code-scanning process, and improves the user experience.
  • a query message is sent to the user to ask the user whether to open the link, so as to prevent the user from misoperation and further improve the user experience.
  • FIG. 3 shows a schematic structural diagram of a video processing apparatus provided by an embodiment of the present application.
  • the device includes:
  • the obtaining module 302 is configured to obtain the video to be processed.
  • the decoding module 304 is configured to decode the to-be-processed video to obtain a target video frame set, wherein the target video frame set includes target recognition images.
  • the determining module 306 is configured to determine the video position information of the target video frame set in the to-be-processed video, the image position information of the target recognition image in the target video frame set and the target recognition image corresponding target link.
  • the storage binding module 308 is configured to store the video position information, the image position information and the target link correspondingly, and bind them with the to-be-processed video.
  • the decoding module 304 is further configured to:
  • the target video frame set is generated by screening the video frames including the target recognition image in the initial video frame set.
  • the decoding module 304 is further configured to:
  • the to-be-processed video is decoded according to a preset time interval.
  • the determining module 306 is further configured to:
  • Video position information of the target video frame set in the to-be-processed video is determined according to the end time point and the start time point.
  • the determining module 306 is further configured to:
  • the determining module 306 is further configured to:
  • in the case that the target recognition image is a rectangle, determine the coordinate information of the target recognition image in the target video frame according to the coordinates of any vertex of the target recognition image and its length and width; or
  • in the case that the target recognition image is a circle, determine the coordinate information of the target recognition image in the target video frame according to the coordinates of the circle center and the radius of the target recognition image; or
  • in the case that the target recognition image is a triangle, determine the coordinate information of the target recognition image in the target video frame according to the coordinates of the three vertices of the target recognition image.
  • Optionally, the storage binding module 308 is further configured to: store the video position information, the image position information and the target link correspondingly in a meta information file; and bind the meta information file with the to-be-processed video.
  • Optionally, the target recognition image includes any one of a two-dimensional code, a barcode, a mini program code, and a QR code.
  • the device further includes:
  • a receiving module configured to receive a user's click instruction when the to-be-processed video is played to the video location information, and obtain the click location information of the click instruction
  • the jumping module is configured to acquire and jump to the target link when the click position information satisfies the image position information.
  • Optionally, the jump module is further configured to: send query information to the user in response to the click instruction, wherein the query information is used to determine whether to jump to the target link; and acquire and jump to the target link in the case of receiving a confirmation instruction sent by the user according to the query information.
  • In the video processing apparatus provided by the embodiments of the present application, in the processing stage after the video upload is completed, the link address corresponding to the target recognition image is stored in correspondence with the position information of the target recognition image in the video.
  • When the video is played, the user can directly click the target recognition image in the video and thereby directly open the link address corresponding to it, without performing a code-scanning operation. This better helps the user jump to the content that the video uploader wants to share, simplifies the tedious code-scanning process, and improves the user experience.
  • a query message is sent to the user to ask the user whether to open the link, so as to prevent the user from misoperation and further improve the user experience.
  • The above is a schematic solution of the video processing apparatus of this embodiment. It should be noted that the technical solution of the video processing apparatus and the technical solution of the above video processing method belong to the same concept; for details not described in detail in the technical solution of the video processing apparatus, reference may be made to the description of the technical solution of the above video processing method.
  • FIG. 4 shows a structural block diagram of a computing device 400 according to an embodiment of the present application.
  • Components of the computing device 400 include, but are not limited to, memory 410 and processor 420 .
  • the processor 420 is connected with the memory 410 through the bus 430, and the database 450 is used for saving data.
  • Computing device 400 also includes access device 440 that enables computing device 400 to communicate via one or more networks 460 .
  • networks include a public switched telephone network (PSTN), a local area network (LAN), a wide area network (WAN), a personal area network (PAN), or a combination of communication networks such as the Internet.
  • The access device 440 may include one or more of any type of network interface (e.g., a network interface card (NIC)), wired or wireless, such as an IEEE 802.11 wireless local area network (WLAN) wireless interface, a Worldwide Interoperability for Microwave Access (Wi-MAX) interface, an Ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a Bluetooth interface, a Near Field Communication (NFC) interface, and the like.
  • the above-described components of the computing device 400 and other components not shown in FIG. 4 may also be connected to each other, eg, through a bus. It should be understood that the structural block diagram of the computing device shown in FIG. 4 is only for the purpose of example, rather than limiting the scope of the present application. Those skilled in the art can add or replace other components as required.
  • Computing device 400 may be any type of stationary or mobile computing device, including mobile computers or mobile computing devices (eg, tablet computers, personal digital assistants, laptop computers, notebook computers, netbooks, etc.), mobile phones (eg, smartphones ), wearable computing devices (eg, smart watches, smart glasses, etc.) or other types of mobile devices, or stationary computing devices such as desktop computers or PCs.
  • Computing device 400 may also be a mobile or stationary server.
  • The processor 420 is configured to execute computer-executable instructions for: acquiring the video to be processed; decoding the to-be-processed video to obtain the target video frame set, wherein the target video frame set includes the target recognition image; determining the video position information of the target video frame set in the to-be-processed video, the image position information of the target recognition image in the target video frame set, and the target link corresponding to the target recognition image; and storing the video position information, the image position information and the target link correspondingly and binding them with the video to be processed.
  • the above is a schematic solution of a computing device according to this embodiment. It should be noted that the technical solution of the computing device and the technical solution of the above-mentioned video processing method belong to the same concept, and the details not described in detail in the technical solution of the computing device can be referred to the description of the technical solution of the above-mentioned video processing method.
  • An embodiment of the present application further provides a computer-readable storage medium which stores computer instructions that, when executed by a processor, are used for: acquiring the video to be processed; decoding the to-be-processed video to obtain the target video frame set, wherein the target video frame set includes the target recognition image; determining the video position information of the target video frame set in the to-be-processed video, the image position information of the target recognition image in the target video frame set, and the target link corresponding to the target recognition image; and storing the video position information, the image position information and the target link correspondingly and binding them with the video to be processed.
  • the above is a schematic solution of a computer-readable storage medium of this embodiment. It should be noted that the technical solution of the storage medium and the technical solution of the above-mentioned video processing method belong to the same concept. For details not described in detail in the technical solution of the storage medium, refer to the description of the technical solution of the above-mentioned video processing method.
  • An embodiment of the present specification also provides a computer program product, wherein, when the computer program product is executed in a computer, the computer is caused to execute the steps of the above-mentioned video processing method.
  • the computer instructions include computer program product code, which may be in source code form, object code form, an executable file, some intermediate form, or the like.
  • The computer-readable medium may include: any entity or apparatus capable of carrying the computer program product code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like.
  • It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunication signals.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Library & Information Science (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The present application provides a video processing method and apparatus. The video processing method includes: acquiring a video to be processed; decoding the video to be processed to obtain a target video frame set, wherein the target video frame set includes a target recognition image; determining video position information of the target video frame set in the video to be processed, image position information of the target recognition image in the target video frame set, and a target link corresponding to the target recognition image; and storing the video position information, the image position information and the target link correspondingly and binding them with the video to be processed. The video processing method provided by the present application allows a user to jump to the target link more quickly and conveniently, simplifies a tedious process, and improves the user experience.

Description

视频处理方法及装置
本申请要求于2020年11月19日提交中国专利局、申请号为202011302351.X、发明名称为“视频处理方法及装置”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及互联网技术领域,特别涉及一种视频处理方法。本申请同时涉及一种视频处理装置,一种计算设备,一种计算机可读存储介质以及一种计算机程序产品。
背景技术
随着互联网技术的发展,视频已经从单纯的电影电视剧等向多元化发展,越来越多的自制视频面向公众。
视频上传者可以通过视频网站上传自制视频,若视频上传者在视频中想向大众分享一些内容时,通常会选择在视频中嵌入二维码、条形码、小程序码、QR码的方式,并在视频外部的描述信息中表示想要分享内容的链接地址,用户在观看视频时,若想实时观看视频上传者分享的内容时,就需要用另外的终端设备进行扫码识别,或通过截图,再通过手机对截图进行识别的方式获取内容,操作十分不便,若通过点击视频外部的描述信息中的链接地址的方式,则需要退出视频,并且若链接地址比较多的情况下,无法快速准确的识别自己想要的内容,也比较费时费力,进而造成用户的流失。
发明内容
有鉴于此,本申请实施例提供了一种视频处理方法。本申请同时涉及一种视频 处理装置,一种计算设备,一种计算机可读存储介质以及一种计算机程序产品,以解决现有技术中存在的视频上传者在视频的描述信息中添加链接,用户在观看视频过程中想要获取视频上传者分享的内容时费时费力,用户体验差的技术缺陷。
根据本申请实施例的第一方面,提供了一种视频处理方法,包括:
获取待处理视频;
解码所述待处理视频,获得目标视频帧集合,其中,所述目标视频帧集合中包括目标识别图像;
确定所述目标视频帧集合在所述待处理视频中的视频位置信息、所述目标识别图像在所述目标视频帧集合中的图像位置信息和所述目标识别图像对应的目标链接;
将所述视频位置信息、所述图像位置信息和所述目标链接对应存储,并与所述待处理视频绑定。
根据本申请实施例的第二方面,提供了一种视频处理装置,包括:
获取模块,被配置为获取待处理视频;
解码模块,被配置为解码所述待处理视频,获得目标视频帧集合,其中,所述目标视频帧集合中包括目标识别图像;
确定模块,被配置为确定所述目标视频帧集合在所述待处理视频中的视频位置信息、所述目标识别图像在所述目标视频帧集合中的图像位置信息和所述目标识别图像对应的目标链接;
存储绑定模块,被配置为将所述视频位置信息、所述图像位置信息和所述 目标链接对应存储,并与所述待处理视频绑定。
根据本申请实施例的第三方面,提供了一种计算设备,包括:
存储器和处理器;
所述存储器用于存储计算机可执行指令,所述处理器用于执行所述计算机可执行指令:
获取待处理视频;
解码所述待处理视频,获得目标视频帧集合,其中,所述目标视频帧集合中包括目标识别图像;
确定所述目标视频帧集合在所述待处理视频中的视频位置信息、所述目标识别图像在所述目标视频帧集合中的图像位置信息和所述目标识别图像对应的目标链接;
将所述视频位置信息、所述图像位置信息和所述目标链接对应存储,并与所述待处理视频绑定。
根据本申请实施例的第四方面,提供了一种计算机可读存储介质,其存储有计算机可执行指令,该指令被处理器执行时实现任意所述视频处理方法的步骤。
根据本说明书实施例的第五方面,提供了一种计算机程序产品,其中,当所述计算机程序产品在计算机中执行时,令计算机执行上述视频处理方法的步骤。
本申请提供的视频处理方法,获取待处理视频;解码所述待处理视频,获得目标视频帧集合,其中,所述目标视频帧集合中包括目标识别图像;确定所 述目标视频帧集合在所述待处理视频中的视频位置信息、所述目标识别图像在所述目标视频帧集合中的图像位置信息和所述目标识别图像对应的目标链接;将所述视频位置信息、所述图像位置信息和所述目标链接对应存储,并与所述待处理视频绑定。
本申请一实施例实现了在视频上传完成后的处理阶段,将目标识别图像对应的链接地址与目标识别图像在视频中的位置信息对应存储,当视频播放时,用户可以直接点击视频中的目标识别图像,从而直接打开目标识别图像对应的链接地址,无需执行扫描操作,可以更好的帮助用户跳转至视频上传者想要分享的内容,简化扫码的繁琐流程,提高用户体验。
附图说明
图1是本申请一实施例提供的一种视频处理方法的流程图;
图2是本申请一实施例提供的一种应用于某视频网站的视频处理方法的处理流程图;
图3是本申请一实施例提供的一种视频处理装置的结构示意图;
图4是本申请一实施例提供的一种计算设备的结构框图。
具体实施方式
在下面的描述中阐述了很多具体细节以便于充分理解本申请。但是本申请能够以很多不同于在此描述的其它方式来实施,本领域技术人员可以在不违背本申请内涵的情况下做类似推广,因此本申请不受下面公开的具体实施的限制。
在本申请一个或多个实施例中使用的术语是仅仅出于描述特定实施例的目的,而非旨在限制本申请一个或多个实施例。在本申请一个或多个实施例和 所附权利要求书中所使用的单数形式的“一种”、“所述”和“该”也旨在包括多数形式,除非上下文清楚地表示其他含义。还应当理解,本申请一个或多个实施例中使用的术语“和/或”是指并包含一个或多个相关联的列出项目的任何或所有可能组合。
应当理解,尽管在本申请一个或多个实施例中可能采用术语第一、第二等来描述各种信息,但这些信息不应限于这些术语。这些术语仅用来将同一类型的信息彼此区分开。例如,在不脱离本申请一个或多个实施例范围的情况下,第一也可以被称为第二,类似地,第二也可以被称为第一。取决于语境,如在此所使用的词语“如果”可以被解释成为“在……时”或“当……时”或“响应于确定”。
首先,对本申请一个或多个实施例涉及的名词术语进行解释。
超级文本:将各种不同空间的文字信息组织在一起的网状文本,超文本更是一种用户界面范式,用以显示文本及文本之间相关的内容,现实超文本普遍以电子文档的方式存在,其中的文字包含有可以链接到其他位置或文档的连接,允许从当前阅读位置直接切换到超文本链接所指向的位置,超文本的格式有很多,目前最常使用的是超文本标记语言(标准通用标记语言下的一个应用)及富文本格式。
超级文本技术:在普通文本里嵌入链接信息,使得文本具备交互能力,可以点击文本、跳转,从而连接互联网中的所有资源,方便用户。
QR码:二维条码的一种,QR来自英文“Quick Response”的缩写,即快速反应的意思,源自发明者希望QR码可以让其内容快速被解码,QR码比普 通条码可存储更多的资料,亦无需像普通条码般在扫描时直接对准扫描器。
在本申请中,提供了一种视频处理方法,本申请同时涉及一种视频处理装置,一种计算设备,一种计算机可读存储介质以及一种计算机程序产品,在下面的实施例中逐一进行详细说明。
图1示出了根据本申请一实施例提供的一种视频处理方法的流程图,具体包括以下步骤:
步骤102:获取待处理视频。
待处理视频为视频上传者上传到视频网站的视频,视频上传完成后,视频网站要对视频做一些处理,比如在视频中添加水印,为视频添加简介信息等等,例如,某视频上传者在视频网站上传视频,视频网站需要对已经完成上传的视频做相应处理,则视频即为待处理视频。
在本申请提供的一具体实施方式中,获取某视频上传者上传的待处理视频M。
步骤104:解码所述待处理视频,获得目标视频帧集合,其中,所述目标视频帧集合中包括目标识别图像。
在获得待处理视频后,对所述待处理视频做解码处理,获得所述待处理视频中的每个目标视频帧,目标视频帧为包括有目标识别图像的视频帧。在实际应用中,一个待处理视频的视频帧会有很多,而有时视频上传者会在视频中的某一个时间段嵌入一个目标识别图像,用户可以通过扫描该目标识别图像跳转至目标识别图像对应的链接地址,基于此,目标视频帧为包含有目标识别图像的视频帧,在一个待处理视频中,目标视频帧是连续的。
目标识别图像为包含有视频上传者想要分享的内容的图像,用户可以通过扫描识别所述目标识别图像获取内容,在实际应用中,所述目标识别图像包括二维码、条形码、小程序码、QR码中的任意一个,目标识别图像的具体表现形式在本申请中不做限制,以实际应用为准。
具体的,解码所述待处理视频,获得目标视频帧集合,其中,所述目标视频帧集合中包括目标识别图像,包括:
解码所述待处理视频,生成初始视频帧集合;
在所述初始视频帧集合中筛选包括目标识别图像的视频帧生成目标视频帧集合。
在实际应用中,通常先对所述待处理视频做解码,根据所述待处理视频对应的全部视频帧生成初始视频帧集合,在初始视频帧集合中包括目标视频帧和非目标视频帧。
在获得初始视频帧集合后,可以通过图像识别技术,在所述初始视频帧集合中筛选包括有目标识别图像的视频帧,将包含有目标识别图像的视频帧组成到一起生成目标视频帧集合,目标视频帧集合是初始视频帧集合的子集。
在具体实施时,对一个待处理视频进行解码后获得的视频帧的数量可能会非常大,比如一个待处理视频为超高清视频,待处理视频的帧率为60帧,即一秒有60个视频帧,一分钟就有3600个视频帧,若待处理视频的时长比较长的话,视频帧的数量非常庞大,计算处理速度就会比较慢,对服务器的要求也比较高,因此,为了提高效率,解码所述待处理视频,包括:根据预设的时间间隔解码所述待处理视频。
为了识别待处理视频中的目标识别图像,对视频进行解码处理时,分帧的精度无需精确到每一帧视频帧,可以将待处理视频按照预设的时间间隔进行解码,比如按照一秒一帧的时间间隔对待处理视频进行解码,这样,一分钟只需要60个视频帧,相比于对待处理视频全部解码得到的视频帧而言,视频帧的数量就会大大减少,视频处理效率也会更高。
在本申请提供的一具体实施方式中,沿用上例,以1秒的时间间隔为单位对待处理视频M进行解码处理,待处理视频共计360秒,获得360个视频帧,再通过图像识别法获得360个视频帧中有60个目标视频帧,目标视频帧中包括识别二维码。
步骤106:确定所述目标视频帧集合在所述待处理视频中的视频位置信息、所述目标识别图像在所述目标视频帧集合中的图像位置信息和所述目标识别图像对应的目标链接。
在实际应用中,目标视频帧集合中的目标视频帧都是连续的,目标视频帧集合在所述待处理视频中的视频位置信息为目标视频帧在所述待处理视频帧中的起始结束位置,比如目标视频帧集合在视频的第30秒开始,第60秒结束。
所述目标识别图像在所述目标视频帧集合中的图像位置信息为目标识别图像出现在目标视频帧中的位置信息,如在目标视频帧中的某个坐标区域。
目标识别图像对应的目标链接为目标识别图像对应的链接信息,用户通过扫描该目标识别图像即可跳转至目标链接对应的地址,在现有技术中,所述目标链接地址保存在视频简介中。
具体的,确定所述目标视频帧集合在所述待处理视频中的视频位置信息, 包括:
确定所述目标视频帧集合中第一个目标视频帧在所述待处理视频中的时间点作为起始时间点;
确定所述目标视频帧集合中最后一个目标视频帧在所述待处理视频中的时间点为结束时间点;
根据所述结束时间点和所述起始时间点确定所述目标视频帧集合在所述待处理视频中的视频位置信息。
在实际应用中,目标视频帧通常都是连续的,因此目标视频帧集合中的第一个目标视频帧对应的时间点即可作为目标识别图像在待处理视频中出现的开始时间,最后一个目标视频帧对应的时间点即可作为目标识别图像在待处理视频中出现的结束时间,因此,确定第一个目标视频帧在所述待处理视频中的起始时间点,确定最后一个目标视频帧在所述待处理视频中的结束时间点,根据结束时间点和起始时间点即可确定目标视频帧集合在待处理视频中的视频位置信息。
在本申请提供的一具体实施方式中,沿用上例,待处理视频M包括60个目标视频帧,其中第一个目标视频帧出现在待处理视频的第60秒,最后一个目标视频帧出现在所述待处理视频的第120秒,则可以确定视频位置信息为目标视频帧集合出现在待处理视频中的第60-120秒。
另一方面,确定所述目标识别图像在所述目标视频帧集合中的图像位置信息,包括:
在所述目标视频帧集合中确定目标视频帧;
确定所述目标识别图像在所述目标视频帧中的坐标信息作为图像位置信息。
在实际应用中,目标识别图像在待处理视频中的位置可以是固定的,也可以是动态的,相应的,目标识别图像的图像位置信息可以是一个,也可以是多个,若目标识别图像在待处理视频中的位置是固定的,则只需确定目标识别图像在某一个目标视频帧中的图像位置信息即可,若目标识别图像在待处理视频中的位置是动态的,则需要确定目标识别图像在每个目标视频帧中的图像位置信息。
具体的,确定所述目标识别图像在所述目标视频帧中的坐标信息作为图像位置信息,包括:
在所述目标识别图像为矩形的情况下,根据所述目标识别图像的任一顶点坐标、长和宽确定所述目标识别图像在所述目标视频帧中的坐标信息;或
在所述目标识别图像为圆形的情况下,根据所述目标识别图像的圆心坐标和半径确定所述目标识别图像在所述目标视频帧中的坐标信息;或
在所述目标识别图像为三角形的情况下,格局所述目标识别图像的三个顶点坐标确定所述目标识别图像在所述目标视频帧中的坐标信息。
在实际应用中,目标识别图像的形状各种各样,可以为矩形、圆形、三角形等规则形状,也可以为不规则形状,当目标识别图像为矩形的情况下,可以根据目标识别图像的某一个顶点(左上、左下、右上、右下)的坐标和矩形的长和宽来表示目标识别图像在目标视频帧中的坐标信息,当目标识别图像为原型的情况下,可以根据目标识别图像的圆心坐标和半径表示目标识别图像在目 标视频帧中的坐标信息,当目标识别图像为三角形的情况下,可以根据目标识别图像三个顶点的坐标来表示目标识别图像在目标视频帧中的坐标信息,当目标识别图像为不规则形状时,可以根据实际情况取目标识别图像上若干个点的坐标来表示目标识别图像在目标视频帧中的位置,本申请中不对确定所述目标识别图像在所述目标视频帧中的坐标信息作为图像位置信息做限定。
在本申请提供的一具体实施方式中,沿用上例,待处理视频M中的目标视频帧集合在所述待处理视频中的视频位置信息为“第60秒至第120秒”,识别二维码在待处理视频中出现的位置是固定的,且识别二维码为矩形,确定识别二维码在所述目标视频帧集合中的图像位置信息为(x 左上,y 左上,a,b),其中,(x 左上,y 左上)为识别二维码左上角顶点的坐标,a为识别二维码的长,b为识别二维码的宽;同时获得二维码对应的目标链接为“https://www.******”。
步骤108:将所述视频位置信息、所述图像位置信息和所述目标链接对应存储,并与所述待处理视频绑定。
将所述视频位置信息、所述图像位置信息和所述目标链接对应存储起来,并与所述待处理视频绑定,当视频播放时,可以读取上述信息。
具体的,将所述视频位置信息、所述图像位置信息和所述目标链接对应存储,并与所述待处理视频绑定,包括:
将所述视频位置信息、所述图像位置信息和所述目标链接对应存储至元信息文件;
将所述元信息文件与所述待处理视频绑定。
元信息(Metadta)又称为中介数据、元数据,为描述数据的数据,主要 是描述数据属性的信息,用来支持如指示存储位置、历史数据、资源查找、文件记录等功能,元信息是一种电子式目录,记录数据的内容或特色,达成协助数据检索的目的。
将所述视频位置信息、所述图像位置信息和所述目标链接对应存储至元信息文件,并将元信息文件与所述待处理视频进行绑定,当播放视频时,读取元信息文件,用链接的形式还原目标识别图像在视频中出现的时间、在视频中的位置和目标链接。
可选的,所述方法还包括:
在所述待处理视频播放至所述视频位置信息的情况下,接收用户的点击指令,并获取所述点击指令的点击位置信息;
在所述点击位置信息满足所述图像位置信息的情况下,获取并跳转至所述目标链接。
用户在观看视频时,可以通过点击屏幕的方式发送点击指令,在待处理视频播放过程中,不同的点击指令对应不同的操作,比如暂停、缩小画面等等,当待处理视频播放至视频位置信息的情况下,接收到用户发送的点击指令,就需要获取点击指令对应的点击位置信息,如点击位置不符合所述图像位置信息的情况下,则说明用户点击了目标识别图像之外的位置,可以直接执行对应的操作;若点击位置信息符合所述图像位置信息的情况下,说明用户点击了目标识别图像,则获取于所述待处理视频绑定的目标链接,并跳转至所述目标链接。
在实际应用中,为了防止用户误操作,在获取并跳转至所述目标链接之前,所述方法还包括:
响应于所述点击指令向所述用户发送询问信息,其中,所述询问信息用于确定是否跳转至所述目标链接;
在接收到用户根据所述询问信息发送的确定指令的情况下,获取并跳转至所述目标链接。
在实际应用中,有时用户可能会有误操作,不小心点击到目标识别图像,若直接跳转则会影响用户体验,还可以在跳转至目标链接之前向用户发送询问信息,询问用户是否以跳转,若用户选择继续或确认,则跳转至目标链接,若用户选择取消或不继续,则继续播放视频。
在本申请提供的一具体实施方式中,沿用上例,将待处理视频M与视频位置信息“第60秒至第120秒”、图像位置信息(x 左上,y 左上,a,b)和识别二维码对应的目标链接“https://www.******”绑定。
用户播放视频M,当播放至第60-120秒之间时,接收用户发送的点击指令,并获取点击指令的点击位置信息(x 点击,y 点击),若点击位置信息(x 点击,y 点击)落在图像位置信息(x 左上,y 左上,a,b)的区域范围内,则获取并跳转至所述目标链接“https://www.******”。
本申请实施例提供的视频处理方法,在视频上传完成后的处理阶段,将目标识别图像对应的链接地址与目标识别图像在视频中的位置信息对应存储,当视频播放时,用户可以直接点击视频中的目标识别图像,从而直接打开目标识别图像对应的链接地址,无需执行扫描操作,可以更好的帮助用户跳转至视频上传者想要分享的内容,简化扫码的繁琐流程,提高用户体验。
其次,在打开链接地址之前,向用户发送询问信息,询问用户是否打开链 接,防止用户误操作,进一步提高了用户的使用体验。
下述结合附图2,以本申请提供的视频处理方法在某视频网站的应用为例,对所述视频处理方法进行进一步说明。其中,图2示出了本申请一实施例提供的一种应用于某视频网站的视频处理方法的处理流程图,具体包括以下步骤:
步骤202:获取待处理视频。
在本申请提供的实施例中,用户将视频T上传至视频网站B,视频T即为待处理视频。
步骤204:根据预设的时间间隔解码所述待处理视频生成初始视频帧集合。
在本申请提供的实施例中,按照一秒的时间间隔解码所述待处理视频T,获得初始视频帧集合,初始视频帧集合中共有600个视频帧。
步骤206:在所述初始视频帧集合中筛选包括目标识别图像的视频帧生成目标视频帧集合。
在本申请提供的实施例中,目标识别图像为二维码,通过图像识别在600个视频帧中筛选出包括二维码的视频帧,生成目标视频帧集合,目标视频帧集合中共有73个目标视频帧,且目标视频帧是连续的。
步骤208:确定所述目标视频帧集合在所述待处理视频中的视频位置信息、所述目标识别图像在所述目标视频帧集合中的图像位置信息和所述目标识别图像对应的目标链接。
在本申请提供的实施例中,确定第一个目标视频帧在待处理视频T中的时间点为第51秒,最后一个目标视频帧在待处理视频T中的时间点是124秒,则目标视频帧集合在待处理视频中的视频位置信息为(第51秒至第124秒)。
所述二维码在所述待处理视频T中是固定位置的,在所述目标视频帧中任选一个目标视频帧,确定二维码左下角顶点的坐标为(50,550),二维码的边长为(100,100),则所述目标识别图像在所述目标视频帧集合中的图像位置信息为(50,550,100,100)。
所述二维码对应的目标链接为“www.****.com”。
步骤210:将所述视频位置信息、所述图像位置信息和所述目标链接对应存储至元信息文件。
在本申请提供的实施例中,将视频位置信息(第51秒至第124秒)、图像位置信息为(50,550,100,100)和目标链接为“www.****.com”存储至元信息文件F中。
步骤212:将所述元信息文件与所述待处理视频绑定。
在本申请提供的实施例中,将元信息文件F与所述待处理视频T绑定。
步骤214:在所述待处理视频播放至所述视频位置信息的情况下,接收用户的点击指令,并获取所述点击指令的点击位置信息。
在本申请提供的实施例中,视频网站B将所述视频T发布,用户可以通过视频网站B观看视频T,在视频T播放至第73秒时,接收到用户通过点击屏幕发送的点击指令,同时获取所述点击指令点击屏幕的点击位置信息为(73,600)。
步骤216:在所述点击位置信息满足所述图像位置信息的情况下,响应于所述点击指令向所述用户发送询问信息。
在本申请提供的实施例中,经过判断点击位置信息(73,600)的位置符 合图像位置信息(50,550,100,100),即用户点击到了二维码,则响应于所述点击指令向用户发送询问信息,询问用户是否跳转至所述二维码对应的目标链接。
步骤218:在接收到用户根据所述询问信息发送的确定指令的情况下,获取并跳转至所述目标链接。
在本申请提供的实施例中,在接收到用户根据所述询问信息发送的确定指令的情况下,确定用户希望跳转至所述二维码对应的目标连接,则获取并跳转至目标连接“www.****.com”。
本申请实施例提供的视频处理方法,在视频上传完成后的处理阶段,将目标识别图像对应的链接地址与目标识别图像在视频中的位置信息对应存储,当视频播放时,用户可以直接点击视频中的目标识别图像,从而直接打开目标识别图像对应的链接地址,无需执行扫描操作,可以更好的帮助用户跳转至视频上传者想要分享的内容,简化扫码的繁琐流程,提高用户体验。
其次,在打开链接地址之前,向用户发送询问信息,询问用户是否打开链接,防止用户误操作,进一步提高了用户的使用体验。
与上述方法实施例相对应,本申请还提供了视频处理装置实施例,图3示出了本申请一实施例提供的一种视频处理装置的结构示意图。如图3所示,该装置包括:
获取模块302,被配置为获取待处理视频。
解码模块304,被配置为解码所述待处理视频,获得目标视频帧集合,其中,所述目标视频帧集合中包括目标识别图像。
确定模块306,被配置为确定所述目标视频帧集合在所述待处理视频中的视频位置信息、所述目标识别图像在所述目标视频帧集合中的图像位置信息和所述目标识别图像对应的目标链接。
存储绑定模块308,被配置为将所述视频位置信息、所述图像位置信息和所述目标链接对应存储,并与所述待处理视频绑定。
可选的,所述解码模块304,进一步被配置为:
解码所述待处理视频,生成初始视频帧集合;
在所述初始视频帧集合中筛选包括目标识别图像的视频帧生成目标视频帧集合。
可选的,所述解码模块304,进一步被配置为:
根据预设的时间间隔解码所述待处理视频。
可选的,所述确定模块306,进一步被配置为:
确定所述目标视频帧集合中第一个目标视频帧在所述待处理视频中的时间点作为起始时间点;
确定所述目标视频帧集合中最后一个目标视频帧在所述待处理视频中的时间点为结束时间点;
根据所述结束时间点和所述起始时间点确定所述目标视频帧集合在所述待处理视频中的视频位置信息。
可选的,所述确定模块306,进一步被配置为:
在所述目标视频帧集合中确定目标视频帧;
确定所述目标识别图像在所述目标视频帧中的坐标信息作为图像位置信息。
可选的,所述确定模块306,进一步被配置为:
在所述目标识别图像为矩形的情况下,根据所述目标识别图像的任一顶点坐标、长和宽确定所述目标识别图像在所述目标视频帧中的坐标信息;或
在所述目标识别图像为圆形的情况下,根据所述目标识别图像的圆心坐标和半径确定所述目标识别图像在所述目标视频帧中的坐标信息;或
在所述目标识别图像为三角形的情况下,格局所述目标识别图像的三个顶点坐标确定所述目标识别图像在所述目标视频帧中的坐标信息。
可选的,所述存储绑定模块308,进一步被配置为:
将所述视频位置信息、所述图像位置信息和所述目标链接对应存储至元信息文件;
将所述元信息文件与所述待处理视频绑定。
可选的,所述目标识别图像包括二维码、条形码、小程序码、QR码中的任意一个。
可选的,所述装置还包括:
接收模块,被配置为在所述待处理视频播放至所述视频位置信息的情况下,接收用户的点击指令,并获取所述点击指令的点击位置信息;
跳转模块,被配置为在所述点击位置信息满足所述图像位置信息的情况下,获取并跳转至所述目标链接。
可选的,所述跳转模块,进一步被配置为:
响应于所述点击指令向所述用户发送询问信息,其中,所述询问信息用于确定是否跳转至所述目标链接;
在接收到用户根据所述询问信息发送的确定指令的情况下,获取并跳转至所述目标链接。
本申请实施例提供的视频处理装置,在视频上传完成后的处理阶段,将目标识别图像对应的链接地址与目标识别图像在视频中的位置信息对应存储,当视频播放时,用户可以直接点击视频中的目标识别图像,从而直接打开目标识别图像对应的链接地址,无需执行扫描操作,可以更好的帮助用户跳转至视频上传者想要分享的内容,简化扫码的繁琐流程,提高用户体验。
其次,在打开链接地址之前,向用户发送询问信息,询问用户是否打开链接,防止用户误操作,进一步提高了用户的使用体验。
上述为本实施例的一种视频处理装置的示意性方案。需要说明的是,该视频处理装置的技术方案与上述的视频处理方法的技术方案属于同一构思,视频处理装置的技术方案未详细描述的细节内容,均可以参见上述视频处理方法的技术方案的描述。
图4示出了根据本申请一实施例提供的一种计算设备400的结构框图。该计算设备400的部件包括但不限于存储器410和处理器420。处理器420与存储器410通过总线430相连接,数据库450用于保存数据。
计算设备400还包括接入设备440,接入设备440使得计算设备400能够经由一个或多个网络460通信。这些网络的示例包括公用交换电话网(PSTN)、 局域网(LAN)、广域网(WAN)、个域网(PAN)或诸如因特网的通信网络的组合。接入设备440可以包括有线或无线的任何类型的网络接口(例如,网络接口卡(NIC))中的一个或多个,诸如IEEE802.11无线局域网(WLAN)无线接口、全球微波互联接入(Wi-MAX)接口、以太网接口、通用串行总线(USB)接口、蜂窝网络接口、蓝牙接口、近场通信(NFC)接口,等等。
在本申请的一个实施例中,计算设备400的上述部件以及图4中未示出的其他部件也可以彼此相连接,例如通过总线。应当理解,图4所示的计算设备结构框图仅仅是出于示例的目的,而不是对本申请范围的限制。本领域技术人员可以根据需要,增添或替换其他部件。
计算设备400可以是任何类型的静止或移动计算设备,包括移动计算机或移动计算设备(例如,平板计算机、个人数字助理、膝上型计算机、笔记本计算机、上网本等)、移动电话(例如,智能手机)、可佩戴的计算设备(例如,智能手表、智能眼镜等)或其他类型的移动设备,或者诸如台式计算机或PC的静止计算设备。计算设备400还可以是移动式或静止式的服务器。
其中,处理器420用于执行如下计算机可执行指令:
获取待处理视频;
解码所述待处理视频,获得目标视频帧集合,其中,所述目标视频帧集合中包括目标识别图像;
确定所述目标视频帧集合在所述待处理视频中的视频位置信息、所述目标识别图像在所述目标视频帧集合中的图像位置信息和所述目标识别图像对应的目标链接;
将所述视频位置信息、所述图像位置信息和所述目标链接对应存储,并与所述待处理视频绑定。
上述为本实施例的一种计算设备的示意性方案。需要说明的是,该计算设备的技术方案与上述的视频处理方法的技术方案属于同一构思,计算设备的技术方案未详细描述的细节内容,均可以参见上述视频处理方法的技术方案的描述。
本申请一实施例还提供一种计算机可读存储介质,其存储有计算机指令,该指令被处理器执行时以用于:
获取待处理视频;
解码所述待处理视频,获得目标视频帧集合,其中,所述目标视频帧集合中包括目标识别图像;
确定所述目标视频帧集合在所述待处理视频中的视频位置信息、所述目标识别图像在所述目标视频帧集合中的图像位置信息和所述目标识别图像对应的目标链接;
将所述视频位置信息、所述图像位置信息和所述目标链接对应存储,并与所述待处理视频绑定。
上述为本实施例的一种计算机可读存储介质的示意性方案。需要说明的是,该存储介质的技术方案与上述的视频处理方法的技术方案属于同一构思,存储介质的技术方案未详细描述的细节内容,均可以参见上述视频处理方法的技术方案的描述。
本说明书一实施例还提供一种计算机程序产品,其中,当所述计算机程序 产品在计算机中执行时,令计算机执行上述视频处理方法的步骤。
上述为本实施例的一种计算机程序产品的示意性方案。需要说明的是,该计算机程序产品的技术方案与上述的视频处理方法的技术方案属于同一构思,计算机程序产品的技术方案未详细描述的细节内容,均可以参见上述视频处理方法的技术方案的描述。
上述对本申请特定实施例进行了描述。其它实施例在所附权利要求书的范围内。在一些情况下,在权利要求书中记载的动作或步骤可以按照不同于实施例中的顺序来执行并且仍然可以实现期望的结果。另外,在附图中描绘的过程不一定要求示出的特定顺序或者连续顺序才能实现期望的结果。在某些实施方式中,多任务处理和并行处理也是可以的或者可能是有利的。
所述计算机指令包括计算机程序产品代码,所述计算机程序产品代码可以为源代码形式、对象代码形式、可执行文件或某些中间形式等。所述计算机可读介质可以包括:能够携带所述计算机程序产品代码的任何实体或装置、记录介质、U盘、移动硬盘、磁碟、光盘、计算机存储器、只读存储器(ROM,Read-Only Memory)、随机存取存储器(RAM,Random Access Memory)、电载波信号、电信信号以及软件分发介质等。需要说明的是,所述计算机可读介质包含的内容可以根据司法管辖区内立法和专利实践的要求进行适当的增减,例如在某些司法管辖区,根据立法和专利实践,计算机可读介质不包括电载波信号和电信信号。
需要说明的是,对于前述的各方法实施例,为了简便描述,故将其都表述为一系列的动作组合,但是本领域技术人员应该知悉,本申请并不受所描述的 动作顺序的限制,因为依据本申请,某些步骤可以采用其它顺序或者同时进行。其次,本领域技术人员也应该知悉,说明书中所描述的实施例均属于优选实施例,所涉及的动作和模块并不一定都是本申请所必须的。
在上述实施例中,对各个实施例的描述都各有侧重,某个实施例中没有详述的部分,可以参见其它实施例的相关描述。
以上公开的本申请优选实施例只是用于帮助阐述本申请。可选实施例并没有详尽叙述所有的细节,也不限制该发明仅为所述的具体实施方式。显然,根据本申请的内容,可作很多的修改和变化。本申请选取并具体描述这些实施例,是为了更好地解释本申请的原理和实际应用,从而使所属技术领域技术人员能很好地理解和利用本申请。本申请仅受权利要求书及其全部范围和等效物的限制。

Claims (14)

  1. A video processing method, comprising:
    acquiring a video to be processed;
    decoding the video to be processed to obtain a target video frame set, wherein the target video frame set includes a target recognition image;
    determining video position information of the target video frame set in the video to be processed, image position information of the target recognition image in the target video frame set, and a target link corresponding to the target recognition image;
    storing the video position information, the image position information and the target link correspondingly, and binding them with the video to be processed.
  2. The video processing method according to claim 1, wherein decoding the video to be processed to obtain the target video frame set, wherein the target video frame set includes the target recognition image, comprises:
    decoding the video to be processed to generate an initial video frame set;
    screening, in the initial video frame set, video frames that include the target recognition image to generate the target video frame set.
  3. The video processing method according to claim 2, wherein decoding the video to be processed comprises:
    decoding the video to be processed according to a preset time interval.
  4. The video processing method according to any one of claims 1 to 3, wherein determining the video position information of the target video frame set in the video to be processed comprises:
    determining a time point of the first target video frame of the target video frame set in the video to be processed as a start time point;
    determining a time point of the last target video frame of the target video frame set in the video to be processed as an end time point;
    determining the video position information of the target video frame set in the video to be processed according to the end time point and the start time point.
  5. The video processing method according to any one of claims 1 to 4, wherein determining the image position information of the target recognition image in the target video frame set comprises:
    determining a target video frame in the target video frame set;
    determining coordinate information of the target recognition image in the target video frame as the image position information.
  6. The video processing method according to claim 5, wherein determining the coordinate information of the target recognition image in the target video frame as the image position information comprises:
    in the case that the target recognition image is a rectangle, determining the coordinate information of the target recognition image in the target video frame according to the coordinates of any vertex of the target recognition image and its length and width; or
    in the case that the target recognition image is a circle, determining the coordinate information of the target recognition image in the target video frame according to the coordinates of the circle center and the radius of the target recognition image; or
    in the case that the target recognition image is a triangle, determining the coordinate information of the target recognition image in the target video frame according to the coordinates of the three vertices of the target recognition image.
  7. The video processing method according to any one of claims 1 to 6, wherein storing the video position information, the image position information and the target link correspondingly and binding them with the video to be processed comprises:
    storing the video position information, the image position information and the target link correspondingly in a meta information file;
    binding the meta information file with the video to be processed.
  8. The video processing method according to any one of claims 1 to 7, wherein the target recognition image includes any one of a two-dimensional code, a barcode, a mini program code, and a QR code.
  9. The video processing method according to any one of claims 1 to 8, further comprising:
    in the case that the video to be processed is played to the video position information, receiving a click instruction of a user, and acquiring click position information of the click instruction;
    in the case that the click position information satisfies the image position information, acquiring and jumping to the target link.
  10. The video processing method according to claim 9, wherein before acquiring and jumping to the target link, the method further comprises:
    sending query information to the user in response to the click instruction, wherein the query information is used to determine whether to jump to the target link;
    in the case of receiving a confirmation instruction sent by the user according to the query information, acquiring and jumping to the target link.
  11. A video processing apparatus, comprising:
    an acquisition module configured to acquire a video to be processed;
    a decoding module configured to decode the video to be processed to obtain a target video frame set, wherein the target video frame set includes a target recognition image;
    a determining module configured to determine video position information of the target video frame set in the video to be processed, image position information of the target recognition image in the target video frame set, and a target link corresponding to the target recognition image;
    a storage binding module configured to store the video position information, the image position information and the target link correspondingly, and bind them with the video to be processed.
  12. A computing device, comprising:
    a memory and a processor;
    wherein the memory is configured to store computer-executable instructions, and the processor is configured to execute the computer-executable instructions to implement the following method:
    acquiring a video to be processed;
    decoding the video to be processed to obtain a target video frame set, wherein the target video frame set includes a target recognition image;
    determining video position information of the target video frame set in the video to be processed, image position information of the target recognition image in the target video frame set, and a target link corresponding to the target recognition image;
    storing the video position information, the image position information and the target link correspondingly, and binding them with the video to be processed.
  13. A computer-readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the video processing method according to any one of claims 1 to 10.
  14. A computer program product which, when executed in a computer, causes the computer to execute the steps of the video processing method according to any one of claims 1 to 10.
PCT/CN2021/119991 2020-11-19 2021-09-23 视频处理方法及装置 WO2022105419A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/037,750 US20240013811A1 (en) 2020-11-19 2021-11-23 Video processing method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011302351.X 2020-11-19
CN202011302351.XA CN112418058A (zh) 2020-11-19 2020-11-19 视频处理方法及装置

Publications (1)

Publication Number Publication Date
WO2022105419A1 true WO2022105419A1 (zh) 2022-05-27

Family

ID=74774602

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/119991 WO2022105419A1 (zh) 2020-11-19 2021-09-23 视频处理方法及装置

Country Status (3)

Country Link
US (1) US20240013811A1 (zh)
CN (1) CN112418058A (zh)
WO (1) WO2022105419A1 (zh)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112418058A (zh) * 2020-11-19 2021-02-26 上海哔哩哔哩科技有限公司 视频处理方法及装置
CN113157160B (zh) * 2021-04-20 2023-08-15 北京百度网讯科技有限公司 用于识别误导播放按钮的方法和设备
CN113691729B (zh) * 2021-08-27 2023-08-22 维沃移动通信有限公司 图像处理方法及装置
CN114173154B (zh) * 2021-12-14 2024-03-19 上海哔哩哔哩科技有限公司 视频处理方法及系统

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103702222A (zh) * 2013-12-20 2014-04-02 惠州Tcl移动通信有限公司 移动终端的互动信息生成方法及其视频文件播放方法
CN106937176A (zh) * 2017-04-01 2017-07-07 福建中金在线信息科技有限公司 视频处理方法、装置以及视频交互方法和装置
US20190026617A1 (en) * 2015-02-06 2019-01-24 Lawrence F. Glaser Method of identifying, locating, tracking, acquiring and selling tangible and intangible objects utilizing predictive transpose morphology
CN109769133A (zh) * 2019-02-19 2019-05-17 上海七牛信息技术有限公司 视频播放过程中的二维码解析方法、装置及可读存储介质
CN109819340A (zh) * 2019-02-19 2019-05-28 上海七牛信息技术有限公司 视频播放过程中的网址解析方法、装置及可读存储介质
CN110399574A (zh) * 2018-04-19 2019-11-01 腾讯科技(深圳)有限公司 信息跳转方法、装置及电子装置
CN112418058A (zh) * 2020-11-19 2021-02-26 上海哔哩哔哩科技有限公司 视频处理方法及装置

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103686454A (zh) * 2012-09-24 2014-03-26 腾讯科技(深圳)有限公司 一种信息获取方法和装置
CN111954078A (zh) * 2020-08-24 2020-11-17 上海连尚网络科技有限公司 针对直播的视频生成方法和装置


Also Published As

Publication number Publication date
US20240013811A1 (en) 2024-01-11
CN112418058A (zh) 2021-02-26

Similar Documents

Publication Publication Date Title
WO2022105419A1 (zh) 视频处理方法及装置
US10402483B2 (en) Screenshot processing device and method for same
US10311877B2 (en) Performing tasks and returning audio and visual answers based on voice command
US20150001287A1 (en) Method, apparatus, and mobile terminal for obtaining information
WO2019134587A1 (zh) 视频数据处理方法、装置、电子设备和存储介质
US9411839B2 (en) Index configuration for searchable data in network
US8787985B2 (en) Screen capture method of mobile communication terminal
CN113395605B (zh) 视频笔记生成方法及装置
US20110125731A1 (en) Information processing apparatus, information processing method, program, and information processing system
WO2022001600A1 (zh) 信息解析方法及装置、设备、存储介质
JP7399999B2 (ja) 情報表示方法および装置
WO2017008646A1 (zh) 一种在触控终端上选择多个目标的方法和设备
CN112449250A (zh) 一种视频资源的下载方法、装置、设备和介质
CN114727143A (zh) 多媒体资源展示方法及装置
JP2007104671A (ja) UPnPによらない個体をUPnPデバイスまたはコンテンツで表現する方法及び装置
CN104837065A (zh) 电视终端与移动终端间的二维码信息共享方法及系统
WO2015000433A1 (zh) 一种多媒体查找方法、终端、服务器及系统
CA3078190A1 (en) Apparatus and method for automatic generation of croudsourced news media from captured contents
CN114173154B (zh) 视频处理方法及系统
WO2023045430A1 (zh) 基于二维码的数据处理方法、装置及系统
WO2023020093A1 (zh) 虚拟礼物展示方法及装置
CN111447490A (zh) 流媒体文件处理方法及装置
CN113992866B (zh) 视频制作方法及装置
US20230368533A1 (en) Method and system for automatically creating loop videos
WO2023179590A1 (zh) 信息处理方法、装置、用户终端、程序产品和存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21893570

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 18037750

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 26.10.2023)