WO2023093339A1 - Video processing method and apparatus based on intelligent digital retina - Google Patents

Video processing method and apparatus based on intelligent digital retina Download PDF

Info

Publication number
WO2023093339A1
Authority
WO
WIPO (PCT)
Application number
PCT/CN2022/124876
Other languages
French (fr)
Chinese (zh)
Inventor
滕波
王琪
向国庆
周东东
洪一帆
张羿
焦立欣
Original Assignee
浙江智慧视频安防创新中心有限公司
Application filed by 浙江智慧视频安防创新中心有限公司
Publication of WO2023093339A1

Classifications

    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04N — PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 — Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 — characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/70 — characterised by syntax aspects related to video coding, e.g. related to compression standards

Definitions

  • the invention relates to the technical field of video processing, in particular to a video processing method and device based on an intelligent digital retina.
  • digital retina technology was the first to propose an intelligent image sensor integrating video compression and video analysis.
  • the digital retina is characterized by the ability to obtain video compression data and video feature data at the same time, and transmit them to the cloud through data streams for later playback and retrieval.
  • the digital retina technology introduces the concept of a model stream, which means that the image acquisition front end can apply different feature extraction models according to its needs, and these models can be stored in the cloud and sent back to the image acquisition front end.
  • because the digital retina framework fuses the video-related aspects of feature recognition and data compression, it creates a new paradigm in which a technique can no longer be evaluated by a single-target comprehensive evaluation method. This is also a valuable insight drawn from the biological structure of the retina.
  • the retina is not simply transmitting or compressing image data, but an intelligent front-end device that serves various complex tasks of the brain.
  • on the one hand, the cloud server needs to store video stream data; on the other hand, it also needs to store feature stream data; and at the same time, it needs to store model data.
  • the embodiment of the present application provides a video processing method based on intelligent digital retina, the method comprising:
  • the video stream and the corresponding feature stream are divided into time slices according to a preset division method to obtain corresponding time slice division results; the time slice division results include the timestamp, the corresponding video data slice, and the corresponding feature data slice for each time slice;
  • the data slices to be deleted include video data slices to be deleted and/or feature data slices to be deleted, so as to obtain the processed video stream and the corresponding feature stream.
  • the data slices to be deleted include the first video data slice to be deleted in the target time window and the first feature data slice to be deleted in the target time window;
  • determining and deleting the data slices to be deleted in the target time window according to the amount of data to be deleted corresponding to the target time window and the attention count of each time slice includes:
  • determining and deleting the first video data slice to be deleted and the first feature data slice to be deleted according to the amount of data to be deleted corresponding to the target time window and the attention count of each time slice.
  • when the data slices to be deleted include the second video data slice to be deleted in the target time window, determining and deleting the data slices to be deleted in the target time window according to the amount of data to be deleted corresponding to the target time window and the attention count of each time slice includes:
  • determining and deleting the second video data slice to be deleted according to the amount of data to be deleted corresponding to the target time window and the attention count of each time slice.
  • the method also includes:
  • the video data pieces to be deleted in the target time window are deleted according to a preset deletion manner.
  • deleting the video data slices to be deleted in the target time window according to the preset deletion method includes:
  • the feature data slices of the target time window and the reconstruction feature data are stored.
  • the method also includes:
  • based on the video reconstruction model, the undeleted video data, the feature data slices of the target time window, and the reconstruction feature data, reconstruction processing is performed on the deleted data slices in the target time window to generate corresponding reconstructed video data.
  • the method also includes:
  • matching the corresponding video reconstruction model based on the type of the depth model includes:
  • the matched corresponding video reconstruction model is a reconstructed depth model with a second preset resolution range
  • the matched corresponding video reconstruction model is a decoder of an autoencoder
  • the depth model is a model for extracting reconstruction features based on a generative adversarial model
  • the corresponding video reconstruction model that is matched is a generative adversarial network.
  • the embodiment of the present application provides a video processing device based on intelligent digital retina, the device includes:
  • An acquisition module configured to acquire video streams and corresponding feature streams
  • a division module configured to divide the video stream and the corresponding feature stream acquired by the acquisition module into time slices according to a preset division method to obtain corresponding time slice division results, the time slice division results including each time slice Corresponding timestamp, corresponding video data slice and corresponding feature data slice;
  • An association analysis module for associating and analyzing each time slice in the time slice division result obtained by the division module with the number of searches and/or the number of playbacks, to obtain the number of times of attention for each time slice;
  • a determination and deletion module, configured to determine and delete the data slices to be deleted in the target time window according to the amount of data to be deleted corresponding to the target time window and the attention count of each time slice obtained by the association analysis module;
  • the data slices to be deleted include video data slices to be deleted and/or feature data slices to be deleted, and a processed video stream and corresponding feature stream are obtained.
  • the embodiment of the present application provides an electronic device, including a memory, a processor, and a computer program stored in the memory and operable on the processor; the processor runs the computer program to realize the method steps described above.
  • the embodiment of the present application provides a computer-readable storage medium, on which a computer program is stored, and the program is executed by a processor to implement the above-mentioned method steps.
  • the video stream and the corresponding feature stream are obtained; the video stream and the corresponding feature stream are divided into time slices according to the preset division method, and the corresponding time slice division results are obtained.
  • the time slice division results include the timestamp, the video data slice, and the feature data slice corresponding to each time slice. Each time slice in the time slice division results is associated and analyzed with the number of searches and/or playbacks to obtain an attention count for each time slice. Then, according to the amount of data to be deleted corresponding to the target time window and the attention count of each time slice, the data slices to be deleted in the target time window are determined and deleted; the data slices to be deleted include video data slices to be deleted and/or feature data slices to be deleted, so as to obtain the processed video stream and the corresponding feature stream.
  • the video processing method based on the intelligent digital retina provided by the embodiment of the present application can accurately determine and delete the data to be deleted in the target time window according to the amount of data to be deleted corresponding to the target time window and the attention count of each time slice. In this way, the storage overhead in intelligent digital retina-based video processing can be effectively reduced.
  • FIG. 1 is a schematic flow diagram of a video processing method based on an intelligent digital retina provided by an embodiment of the present application
  • FIG. 2 is a schematic diagram of a storage judgment working mechanism in a specific application scenario provided by an embodiment of the present application
  • FIG. 3 is a schematic diagram of video data slices and feature data slices corresponding to time windows in a specific application scenario provided by an embodiment of the present application;
  • FIG. 4 is a schematic diagram of a deleted video data piece and a deleted feature data piece corresponding to a time window in a specific application scenario provided by an embodiment of the present application;
  • Fig. 5 is a schematic diagram of storing and reconstructing feature data and storing feature data slices in a specific application scenario provided by an embodiment of the present application;
  • Fig. 6 is a schematic structural diagram of a video processing device based on an intelligent digital retina provided by an embodiment of the present application
  • Fig. 7 shows a schematic diagram of a connection structure of an electronic device according to an embodiment of the present application.
  • FIG. 1 is a schematic flowchart of a video processing method based on an intelligent digital retina provided by an embodiment of the present application; as shown in FIG. 1, the method specifically includes the following steps:
  • the video processing method provided in the embodiment of the present application is based on the intelligent digital retina technology, and the principle of the intelligent digital retina technology is as follows:
  • front-end devices carry both video compression and deep models for video feature extraction. Since the back end can deploy different models to the front end through transmission, the front-end device can be understood as being able to adaptively acquire any depth model. Therefore, as long as a model with special feature extraction capabilities is trained offline, it can be deployed to the front-end device through the intelligent digital retina model stream.
  • the main purpose of the feature stream is to perform image retrieval. After the user obtains the retrieval results, a common linkage requirement is to play back the corresponding images or videos.
  • a large city has millions of front-end acquisition devices.
  • although cloud servers and high-speed communication networks can realize data transmission and cloud processing, storage space is still a bottleneck.
  • the amygdala stores data based on the importance of events; the memory data mode of the cerebellum is based on implicit information rather than direct information; the prefrontal cortex is responsible for processing and memorizing processed semantic information. In addition, the brain also performs long-term and short-term memory conversion, and this work is done through the hippocampus.
  • the intelligent digital retina used in the video processing method provided by the embodiment of the present application can not only provide simultaneous acquisition and transmission of video streams and feature streams, but also store data and utilize newly added feature stream data.
  • S104: Divide the video stream and the corresponding feature stream into time slices according to the preset division method to obtain the corresponding time slice division results.
  • the time slice division results include the timestamp, the corresponding video data slice, and the corresponding feature data slice for each time slice.
  • the preset division method may be: the time slices are divided into slices of equal length.
  • alternatively, the preset division method may be: the time slices are divided according to encoded GOP (group of pictures) segments.
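The equal-length division described above can be sketched as follows. This is an illustrative Python sketch, not code from the application; the frame representation and all names are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class TimeSlice:
    timestamp: float    # start time of the slice
    video_data: list    # video frames falling in this interval
    feature_data: list  # feature vectors falling in this interval

def divide_equal_length(video_frames, feature_frames, duration, slice_len):
    """Divide a video stream and its feature stream into equal-length time
    slices. Frames are (time, payload) pairs -- a stand-in for encoded data."""
    slices = []
    t = 0.0
    while t < duration:
        end = t + slice_len
        slices.append(TimeSlice(
            timestamp=t,
            video_data=[p for ts, p in video_frames if t <= ts < end],
            feature_data=[p for ts, p in feature_frames if t <= ts < end],
        ))
        t = end
    return slices
```

A GOP-based division would instead cut at the boundaries of encoded groups of pictures; only the slicing criterion changes, the per-slice result (timestamp, video data slice, feature data slice) stays the same.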
  • S106: Associate each time slice in the time slice division results with the number of searches and/or playbacks to obtain an attention count for each time slice.
  • determine and delete the data slices to be deleted in the target time window according to the amount of data to be deleted corresponding to the target time window and the attention count of each time slice; the data slices to be deleted include the video data slices to be deleted and/or the feature data slices to be deleted, to obtain the processed video stream and the corresponding feature stream.
  • when the data slices to be deleted include the first video data slice to be deleted in the target time window and the first feature data slice to be deleted in the target time window, determining and deleting the data slices to be deleted in the target time window according to the amount of data to be deleted corresponding to the target time window and the attention count of each time slice includes the following steps:
  • when the data slices to be deleted include the second video data slice to be deleted in the target time window, determining and deleting the data slices to be deleted in the target time window according to the amount of data to be deleted corresponding to the target time window and the attention count of each time slice includes the following steps:
  • FIG. 2 is a schematic diagram of a storage judgment working mechanism in a specific application scenario provided by the embodiment of the present application.
  • a search engine is used to provide users with video retrieval.
  • the search engine matches search results from the feature storage and feeds them back to the user. If the user further needs video playback, the playback engine decodes the video stream information according to the timestamp information and plays the video.
  • a storage decision module is provided, which generates decisions to delete or retain feature data and video stream data according to the search results of the search engine and/or the playback results of the playback engine. This is because the search engine receives the user's feature input, which means that the searched feature is one the user pays attention to.
  • the user operations of the playback engine are likewise highly correlated with the content the user cares about. Therefore, the storage decision module judges which features and video data to preserve based on the user's feedback.
  • the time slices are divided into time slices of equal length.
  • the time slice is divided according to encoded GOP segments.
  • each time slice is associated with its number of searches and playbacks to obtain an attention count corresponding to each time slice.
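The association step can be sketched as follows. This is illustrative Python; summing the two counts is one possible aggregation — the embodiment does not fix a formula, so the names and the sum are assumptions.

```python
def attention_counts(slices, search_log, playback_log):
    """Compute an attention count per time slice: the number of search hits
    plus the number of playbacks whose timestamps fall inside the slice.
    `slices` is a list of (start, end) pairs; the logs are timestamp lists."""
    counts = {}
    for start, end in slices:
        searches = sum(1 for t in search_log if start <= t < end)
        playbacks = sum(1 for t in playback_log if start <= t < end)
        counts[start] = searches + playbacks
    return counts
```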
  • the amount of data to be deleted is calculated, and data slices are deleted in order of attention count from low to high.
  • the time window is updated periodically, therefore, data deletion is performed periodically.
  • each video data slice corresponds to the feature data slice of the same time period and is associated with an attention count. Note that the attention counts corresponding to the video data slice and the feature data slice of the same time period are not necessarily equal.
  • the time window covers the video data slices and feature data slices from timestamp 3 to timestamp 8.
  • in an example, the storage decision module will delete the video data slice corresponding to timestamp 4 and the feature data slice corresponding to timestamp 5. Specifically, suppose the total amount of data in the time window is D_t and the maximum allocatable storage space is D_max; when D_t exceeds D_max, the amount of data to be deleted is D = D_t − D_max.
  • the deletion judgment calculates and deletes data slices according to D. The result can be directly calculated according to the attention counts in Figure 2, and the video processing method provided in the embodiment of the present application allows any algorithm to be used to calculate the data slices to be deleted.
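The deletion judgment above — free D = D_t − D_max by removing slices in ascending order of attention count — can be sketched with a simple greedy pass. The embodiment allows any algorithm here; this Python sketch and its names are illustrative only.

```python
def select_slices_to_delete(slices, d_max):
    """Greedy deletion decision. `slices` maps timestamp -> (size, attention).
    If the total size D_t exceeds the allocatable storage D_max, free at
    least D = D_t - D_max by deleting slices in ascending attention order."""
    d_t = sum(size for size, _ in slices.values())
    to_free = d_t - d_max
    if to_free <= 0:
        return []  # the window already fits; nothing to delete
    deleted, freed = [], 0
    # sort timestamps by attention count, lowest attention first
    for ts, (size, _att) in sorted(slices.items(), key=lambda kv: kv[1][1]):
        if freed >= to_free:
            break
        deleted.append(ts)
        freed += size
    return deleted
```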
  • the deleted data is shown in Figure 4. Since the amount of feature data depends on the feature extraction model used, in some cases the amount of data in the feature stream is very small, and deleting the corresponding feature data slices does not save much storage space. Therefore, in one embodiment, the deletion process may be performed on only the video data slices or only the feature data slices.
  • the video processing method provided in the embodiment of the present application further includes the following steps:
  • the video data pieces to be deleted in the target time window are deleted according to a preset deletion method.
  • the deletion of video or feature data is a direct deletion, that is, all related data will be discarded.
  • although this method is relatively easy to implement, it has some disadvantages.
  • the distribution of attention counts cannot represent all potential user needs: once a data slice is deleted, the user can no longer get any feedback for the video within that timestamp. Therefore, if other methods can satisfy and respond to such low-probability needs, it will bring a qualitative breakthrough compared with traditional store-and-delete methods. While storage resources in the cloud are limited, computing resources can be coordinated at any time. Therefore, in one embodiment, data deletion adopts an "incomplete" deletion method.
  • the preset deletion method is the "incomplete" deletion method, and the specific steps are as follows:
  • S1: Calculate the video data slices to be deleted according to the time window and the allocated storage resources.
  • S2: Delete the video data to be deleted, and generate feature data that can be used to reconstruct the video data.
  • S3: Retain the feature data slices and the reconstruction feature data generated in S2.
  • the video data slice in Figure 4 may be a closed GOP, which means that deleting a video data slice deletes not only the B frames and P frames with small amounts of data but also the I frame with a large amount of data. Since a closed GOP is encoded independently, all video data in the data slice will completely disappear. Therefore, the feature data described in S2 does not depend on any block-based coded video data. In a more feasible method, the feature data for reconstructing the video data in S2 is obtained through a deep learning model.
  • Figure 5 shows a schematic diagram of the above process. Based on the storage decision, a video data slice is input to a deep model that extracts reconstruction feature data, while the original encoded data is discarded. Ultimately, only the reconstruction feature data and the feature data slices are kept on the storage side.
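The S1–S3 "incomplete" deletion flow of Figure 5 can be sketched as follows. This is illustrative Python; a real system would replace the callable with a trained deep model for reconstruction-feature extraction.

```python
def incomplete_delete(video_slice, feature_slice, extract_features):
    """'Incomplete' deletion (steps S1-S3): run the video data slice through
    a model that extracts compact reconstruction features, discard the
    encoded video, and retain the feature data slice plus the reconstruction
    features. `extract_features` stands in for a trained deep model."""
    reconstruction_features = extract_features(video_slice)  # S2: extract
    video_slice = None  # S2: the encoded video data is discarded
    return feature_slice, reconstruction_features  # S3: retain both
```

The reconstruction features are typically far smaller than the closed GOP they replace, which is what lets the time window stay within its storage budget while still allowing approximate playback later.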
  • deleting the video data piece to be deleted in the target time window according to a preset deletion method includes the following steps:
  • the video processing method provided in the embodiment of the present application further includes the following steps:
  • the video processing method provided in the embodiment of the present application further includes the following steps:
  • matching the corresponding video reconstruction model based on the type of the depth model includes the following steps:
  • the depth model is a model that generates an image with a first preset resolution range
  • the corresponding video reconstruction model that is matched is a reconstructed depth model with a second preset resolution range.
  • the first preset resolution range is often an ultra-low resolution range
  • the corresponding second preset resolution range of the reconstructed depth model is an ultra-high resolution range.
  • the depth model is a model capable of generating ultra-low-resolution images;
  • the generated ultra-low-resolution images are obtained by coding consecutive images in a residual-based coding manner.
  • matching the corresponding video reconstruction model based on the type of the depth model includes the following steps:
  • the matched corresponding video reconstruction model is a decoder of an autoencoder.
  • the depth model is a feature extraction model, for example, the encoder of an autoencoder
  • the corresponding video reconstruction model is a decoder of an auto-encoder
  • matching the corresponding video reconstruction model based on the type of the depth model includes the following steps:
  • the depth model is a model that extracts reconstruction features based on a generative adversarial model
  • the corresponding video reconstruction model that is matched is a generative adversarial network.
  • the depth model is a model for reconstruction feature extraction based on a generative adversarial model;
  • the feature extraction model is mainly used to extract memorable features such as human skeleton features and appearance attribute information;
  • the corresponding video reconstruction model is a generative adversarial network (GAN).
  • the GAN takes feature values as input and reconstructs video data according to a trained generative model.
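The one-to-one matching between depth-model types and reconstruction models described above can be sketched as a simple lookup. This is illustrative Python; the type names are placeholders, not identifiers from the application.

```python
# Mapping from the depth-model type used at deletion time to the video
# reconstruction model used at playback time, mirroring the three matched
# pairs in the description.
RECONSTRUCTION_MATCH = {
    "low_res_generator": "super_resolution_model",  # ultra-low-res -> ultra-high-res
    "autoencoder_encoder": "autoencoder_decoder",   # encoder -> matching decoder
    "gan_feature_extractor": "gan_generator",       # GAN features -> generative model
}

def match_reconstruction_model(depth_model_type):
    """Return the reconstruction model type matched one-to-one to the given
    depth-model type, as in the correspondence described for Fig. 5."""
    try:
        return RECONSTRUCTION_MATCH[depth_model_type]
    except KeyError:
        raise ValueError(f"no reconstruction model for {depth_model_type!r}")
```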
  • the storage decision unit combined with the depth model can still guarantee the storage space consumption within the time window.
  • since the feature data slices are completely preserved, the user may still have a playback requirement for the deleted video data slices.
  • the playback engine will utilize the depth model to reconstruct the deleted data from the reconstructed feature data.
  • the depth model used for video reconstruction corresponds one-to-one with the depth model used for reconstruction feature extraction in Fig. 5.
  • the video stream and the corresponding feature stream are obtained; the video stream and the corresponding feature stream are divided into time slices according to the preset division method, and the corresponding time slice division results are obtained.
  • the time slice division results include the timestamp, the video data slice, and the feature data slice corresponding to each time slice. Each time slice in the time slice division results is associated and analyzed with the number of searches and/or playbacks to obtain an attention count for each time slice. Then, according to the amount of data to be deleted corresponding to the target time window and the attention count of each time slice, the data slices to be deleted in the target time window are determined and deleted; the data slices to be deleted include video data slices to be deleted and/or feature data slices to be deleted, so as to obtain the processed video stream and the corresponding feature stream.
  • the intelligent digital retina-based video processing method provided in the embodiment of the present application can accurately determine and delete the data to be deleted in the target time window according to the amount of data to be deleted corresponding to the target time window and the attention count of each time slice. In this way, the storage overhead in intelligent digital retina-based video processing can be effectively reduced.
  • the following is an embodiment of video processing based on intelligent digital retina in the embodiment of the present application, which can be used to implement the embodiment of the video processing method based on intelligent digital retina in the embodiment of the present application.
  • the details not disclosed in the embodiment of the intelligent digital retina-based video processing device in the embodiment of the present application please refer to the embodiment of the intelligent digital retina-based video processing method in the embodiment of the present application.
  • FIG. 6 shows a schematic structural diagram of an intelligent digital retina-based video processing device provided by an exemplary embodiment of the present invention.
  • the intelligent digital retina-based video processing device can be implemented as all or a part of the terminal through software, hardware or a combination of the two.
  • the intelligent digital retina-based video processing device includes an acquisition module 602 , a division module 604 , an association analysis module 606 and a determination and deletion module 608 .
  • the acquisition module 602 is configured to acquire video streams and corresponding feature streams
  • the division module 604 is configured to divide the video stream and the corresponding feature stream acquired by the acquisition module 602 into time slices according to a preset division method, and obtain corresponding time slice division results; the time slice division results include the timestamp, the corresponding video data slice, and the corresponding feature data slice for each time slice;
  • the association analysis module 606 is used to associate and analyze each time slice in the time slice division result obtained by the division module 604 with the search quantity and/or playback quantity, and obtain the number of times of attention of each time slice;
  • the determination and deletion module 608 is configured to determine and delete the data slices to be deleted in the target time window according to the amount of data to be deleted corresponding to the target time window and the attention count of each time slice obtained by the association analysis module 606; the data slices to be deleted include video data slices to be deleted and/or feature data slices to be deleted, and a processed video stream and corresponding feature stream are obtained.
  • the data slices to be deleted include the first video data slice to be deleted of the target time window and the first feature data slice to be deleted of the target time window, and the determination and deletion module 608 is used for:
  • the data slice to be deleted includes the second video data slice to be deleted in the target time window, and the determination and deletion module 608 is used for:
  • the device also includes:
  • a deletion module (not shown in FIG. 6 ), configured to delete the video data pieces to be deleted in the target time window according to a preset deletion method.
  • the obtaining module 602 is also used to: obtain the video reconstruction model, the characteristic data slice of the target time window and the reconstructed characteristic data;
  • the device also includes:
  • the video data reconstruction module (not shown in Fig. 6) is configured to perform reconstruction processing on the deleted data slices in the target time window based on the video reconstruction model obtained by the acquisition module 602, the undeleted video data, the feature data slices of the target time window, and the reconstruction feature data, to generate corresponding reconstructed video data.
  • the device also includes:
  • the reconstruction model matching module (not shown in FIG. 6 ) is configured to match the corresponding video reconstruction model based on the type of the depth model.
  • the reconstruction model matching module is specifically used for:
  • the depth model is a model that generates an image with a first preset resolution range
  • the corresponding video reconstruction model that is matched is a reconstructed depth model with a second preset resolution range
  • the matched corresponding video reconstruction model is a decoder of an autoencoder
  • the depth model is a model that extracts reconstruction features based on a generative adversarial model
  • the corresponding video reconstruction model that is matched is a generative adversarial network.
  • when the intelligent digital retina-based video processing device provided in the above embodiments executes the intelligent digital retina-based video processing method, the division of the above functional units is used only as an example for illustration. In practical applications, the above functions may be allocated to different functional units as required; that is, the internal structure of the device may be divided into different functional units to complete all or part of the functions described above.
  • the intelligent digital retina-based video processing device and the intelligent digital retina-based video processing method embodiments provided in the above embodiments belong to the same concept; for the implementation process, see the embodiment of the intelligent digital retina-based video processing method, which is not repeated here.
  • the acquisition module is configured to acquire the video stream and the corresponding feature stream;
  • the division module is configured to divide the video stream and the corresponding feature stream acquired by the acquisition module into time slices according to the preset division method to obtain the corresponding time slice division results; the time slice division results include the timestamp, the corresponding video data slice, and the corresponding feature data slice for each time slice;
  • the association analysis module is configured to associate and analyze each time slice in the time slice division results obtained by the division module with the number of searches and/or playbacks to obtain an attention count for each time slice;
  • the determination and deletion module is configured to determine and delete the data slices to be deleted in the target time window according to the amount of data to be deleted corresponding to the target time window and the attention count of each time slice obtained by the association analysis module; the data slices to be deleted include video data slices to be deleted and/or feature data slices to be deleted, and the processed video stream and corresponding feature stream are obtained.
  • the intelligent digital retina-based video processing device provided in the embodiment of the present application can accurately determine and delete the data to be deleted in the target time window according to the amount of data to be deleted corresponding to the target time window and the attention count of each time slice. In this way, the storage overhead in intelligent digital retina-based video processing can be effectively reduced.
  • this embodiment provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and runnable on the processor; the processor runs the computer program to implement the above-mentioned method steps.
  • An embodiment of the present application provides a computer-readable storage medium on which a computer program is stored; the program is executed by a processor to implement the above-mentioned method steps.
  • FIG. 7 shows a schematic structural diagram of an electronic device suitable for implementing the embodiments of the present application.
  • the terminal devices in the embodiments of the present application may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), and vehicle-mounted terminals (such as car navigation terminals), as well as fixed terminals such as digital TVs and desktop computers.
  • the electronic device shown in FIG. 7 is only an example, and should not limit the functions and scope of use of this embodiment of the present application.
  • an electronic device may include a processing device 701 (such as a central processing unit, a graphics processing unit, etc.), which can execute various appropriate actions and processes according to a program stored in a read-only memory (ROM) 702 or a program loaded into a random access memory (RAM) 703.
  • in the RAM 703, various programs and data necessary for the operation of the electronic device are also stored.
  • the processing device 701, ROM 702, and RAM 703 are connected to each other through a bus 704.
  • An input/output (I/O) interface 705 is also connected to the bus 704 .
  • the following devices can be connected to the I/O interface 705: input devices 706 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 707 including, for example, a liquid crystal display (LCD), speaker, vibrator, etc.; storage devices 708 including, for example, a magnetic tape, a hard disk, etc.; and a communication device 709.
  • the communication means 709 may allow the electronic device to communicate with other devices wirelessly or by wire to exchange data. While FIG. 7 shows an electronic device having various means, it is to be understood that implementing or having all of the means shown is not a requirement. More or fewer means may alternatively be implemented or provided.
  • embodiments of the present disclosure include a computer program product, which includes a computer program carried on a computer-readable medium, where the computer program includes program codes for executing the methods shown in the flowcharts.
  • the computer program may be downloaded and installed from a network via the communication device 709, or from the storage device 708, or from the ROM 702.
  • when the computer program is executed by the processing device 701, the above-mentioned functions defined in the methods of the embodiments of the present application are performed.
  • the computer-readable medium mentioned above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium or any combination of the two.
  • a computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of computer-readable storage media may include, but are not limited to, an electrical connection with one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave carrying computer-readable program code therein. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable medium may be transmitted by any appropriate medium, including but not limited to wires, optical cables, RF (radio frequency), etc., or any suitable combination of the above.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device, or may exist independently without being incorporated into the electronic device.
  • Computer program code for carrying out the operations of the present disclosure can be written in one or more programming languages, or combinations thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments described in the present application may be implemented by software or by hardware. In some cases, the name of a unit does not constitute a limitation of the unit itself.


Abstract

Disclosed in the present invention are a video processing method and apparatus based on an intelligent digital retina. The method comprises: associating each time slice in a time slice division result with the number of searches and/or the number of playbacks and performing analysis to obtain the number of interests in the time slice; according to the data volume of data to be deleted corresponding to a target time window and the number of interests in each time slice, determining and deleting a data slice to be deleted of the target time window, wherein the data slice to be deleted comprises a video data slice to be deleted and/or a feature data slice to be deleted, to obtain a processed video stream and a corresponding feature stream. According to the video processing method provided in the embodiments of the present application, a data slice to be deleted of a target time window can be accurately determined and deleted according to the data volume of data to be deleted corresponding to the target time window and the number of interests in each time slice; in this way, the storage overhead in a video processing process based on an intelligent digital retina can be effectively reduced.

Description

A video processing method and device based on an intelligent digital retina

Technical Field

The present invention relates to the technical field of video processing, and in particular to a video processing method and device based on an intelligent digital retina.
Background Art
Since the concept of the digital retina was proposed, it has attracted considerable attention in fields such as video coding/decoding and video surveillance. In traditional image processing, video compression and video analysis belong to two different fields. Inspired by the biological functions of the human retina, digital retina technology was the first to propose an intelligent image sensor that integrates video compression and video analysis. Specifically, the digital retina is characterized by the ability to obtain video compression data and video feature data at the same time and transmit them to the cloud as data streams, which facilitates later playback and retrieval. To obtain the feature stream of an image, digital retina technology introduces the concept of a model stream: the image acquisition front end can apply different feature extraction models as required, and these models can be sent to the image acquisition front end through cloud storage and reverse transmission.

In video compression, the basic idea is to remove the spatio-temporal redundancy of the video through computation. The basic paradigm of video compression has not changed significantly over the past decades: block-based video compression and codec technology has matured, offering moderate computational complexity, a high compression rate, and high reconstruction quality, and has therefore been very widely used for decades. The current mainstream codecs, including H.264/H.265/H.266 and MPEG2/MPEG4, are mainly based on block-based video coding. Since the early days of video coding, the paradigm of coding theory has not changed: each new generation of coding standards improves the compression ratio by "trading computation for space". For example, the evolution from H.264 to H.265 increased the compression rate by 50%, but it also brought greater computing requirements, because more flexible coding units and more flexible reference frames allow motion-compensation-based compression to tap more compression potential.

Because the digital retina framework integrates the two video-related aspects of feature recognition and data compression, it creates a new paradigm: rather than evaluating a technique by a single parameter, it adopts a comprehensive evaluation method oriented toward complex objectives. This is precisely the valuable insight obtained from the biological structure of the retina: the retina does not simply transmit or compress image data, but is an intelligent front-end device that serves the various complex tasks of the brain.

However, although digital retina technology brings integration and intelligence to video acquisition and analysis, it also places higher demands on storage space. The cloud server must store video stream data on the one hand and feature stream data on the other, and it must store model data as well.

How to reduce the storage overhead of a video processing method based on an intelligent digital retina is a technical problem to be solved.
Summary of the Invention

Based on this, it is necessary to provide a video processing method, device, electronic device, and storage medium based on an intelligent digital retina, to address the problem that existing intelligent digital retina-based video processing methods consume a large amount of storage overhead.
In a first aspect, an embodiment of the present application provides a video processing method based on an intelligent digital retina, the method comprising:

acquiring a video stream and a corresponding feature stream;

dividing the video stream and the corresponding feature stream into time slices according to a preset division method to obtain a corresponding time slice division result, where the time slice division result includes the timestamp, the video data slice, and the feature data slice corresponding to each time slice;

associating and analyzing each time slice in the time slice division result with the number of searches and/or the number of playbacks to obtain the number of attentions of each time slice;

determining and deleting the data slices to be deleted in a target time window according to the amount of data to be deleted corresponding to the target time window and the number of attentions of each time slice, where the data slices to be deleted include video data slices to be deleted and/or feature data slices to be deleted, to obtain a processed video stream and a corresponding feature stream.
In one embodiment, the data slices to be deleted include a first video data slice to be deleted and a first feature data slice to be deleted in the target time window; determining and deleting the data slices to be deleted in the target time window according to the amount of data to be deleted corresponding to the target time window and the number of attentions of each time slice includes:

determining and deleting the first video data slice to be deleted and the first feature data slice to be deleted according to the amount of data to be deleted corresponding to the target time window and the number of attentions of each time slice.

In one embodiment, the data slices to be deleted include a second video data slice to be deleted in the target time window; determining and deleting the data slices to be deleted in the target time window according to the amount of data to be deleted corresponding to the target time window and the number of attentions of each time slice includes:

determining and deleting the second video data slice to be deleted according to the amount of data to be deleted corresponding to the target time window and the number of attentions of each time slice.
In one embodiment, the method further comprises:

deleting the video data slices to be deleted in the target time window according to a preset deletion method.
In one embodiment, deleting the video data slices to be deleted in the target time window according to the preset deletion method includes:

obtaining the total amount of data within the target time window, and obtaining the maximum amount of storage data to be allocated;

calculating the difference between the total amount of data within the target time window and the maximum amount of storage data to be allocated;

determining the video data slices to be deleted in the target time window based on the difference;

deleting the video data slices to be deleted, and generating reconstruction feature data for reconstructing the video data;

storing the feature data slices of the target time window and the reconstruction feature data.
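The arithmetic behind the first three of these steps is straightforward. As a hedged sketch (the function name is hypothetical, not from the application), the deletion budget for the target time window is the excess of its total data over the maximum storage that can be allocated to it:

```python
def deletion_budget(total_bytes_in_window, max_allocatable_bytes):
    """Amount of data (in bytes) that must be deleted from the target time
    window; zero if the window already fits in the allocatable storage."""
    return max(0, total_bytes_in_window - max_allocatable_bytes)
```

The budget then drives the selection of which video data slices to delete, after which reconstruction feature data is generated and stored alongside the retained feature data slices.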
In one embodiment, the method further comprises:

obtaining a video reconstruction model, the undeleted video data, the feature data slices of the target time window, and the reconstruction feature data;

performing reconstruction processing on the deleted data slices of the target time window based on the video reconstruction model, the undeleted video data, the feature data slices of the target time window, and the reconstruction feature data, to generate corresponding reconstructed video data.
In one embodiment, the method further comprises:

matching a corresponding video reconstruction model based on the type of the depth model.

In one embodiment, matching the corresponding video reconstruction model based on the type of the depth model includes:

if the depth model is a model that generates images within a first preset resolution range, the matched video reconstruction model is a reconstruction depth model with a second preset resolution range; or,

if the depth model is a feature extraction model, the matched video reconstruction model is the decoder of an autoencoder; or,

if the depth model is a model that extracts reconstruction features based on a generative adversarial model, the matched video reconstruction model is a generative adversarial network.
In a second aspect, an embodiment of the present application provides a video processing device based on an intelligent digital retina, the device comprising:

an acquisition module, configured to acquire a video stream and a corresponding feature stream;

a division module, configured to divide the video stream and the corresponding feature stream acquired by the acquisition module into time slices according to a preset division method to obtain a corresponding time slice division result, where the time slice division result includes the timestamp, the video data slice, and the feature data slice corresponding to each time slice;

an association analysis module, configured to associate and analyze each time slice in the time slice division result obtained by the division module with the number of searches and/or the number of playbacks to obtain the number of attentions of each time slice;

a determination and deletion module, configured to determine and delete the data slices to be deleted in a target time window according to the amount of data to be deleted corresponding to the target time window and the number of attentions of each time slice obtained by the association analysis module, where the data slices to be deleted include video data slices to be deleted and/or feature data slices to be deleted, to obtain a processed video stream and a corresponding feature stream.
In a third aspect, an embodiment of the present application provides an electronic device, including a memory, a processor, and a computer program stored in the memory and runnable on the processor, where the processor runs the computer program to implement the method steps described above.

In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, where the program is executed by a processor to implement the method steps described above.
The technical solutions provided by the embodiments of the present application may include the following beneficial effects:

In the embodiments of the present application, a video stream and a corresponding feature stream are acquired; the video stream and the corresponding feature stream are divided into time slices according to a preset division method to obtain a corresponding time slice division result, where the time slice division result includes the timestamp, the video data slice, and the feature data slice corresponding to each time slice; each time slice in the time slice division result is associated and analyzed with the number of searches and/or the number of playbacks to obtain the number of attentions of each time slice; and the data slices to be deleted in a target time window are determined and deleted according to the amount of data to be deleted corresponding to the target time window and the number of attentions of each time slice, where the data slices to be deleted include video data slices to be deleted and/or feature data slices to be deleted, to obtain a processed video stream and a corresponding feature stream. The video processing method based on an intelligent digital retina provided by the embodiments of the present application can accurately determine and delete the data slices to be deleted in a target time window according to the amount of data to be deleted corresponding to the target time window and the number of attentions of each time slice; in this way, the storage overhead of intelligent digital retina-based video processing can be effectively reduced.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention.
Description of the Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and, together with the description, serve to explain the principles of the invention.
FIG. 1 is a schematic flowchart of a video processing method based on an intelligent digital retina provided by an embodiment of the present application;

FIG. 2 is a schematic diagram of the storage decision working mechanism in a specific application scenario provided by an embodiment of the present application;

FIG. 3 is a schematic diagram of the video data slices and feature data slices corresponding to a time window in a specific application scenario provided by an embodiment of the present application;

FIG. 4 is a schematic diagram of the deleted video data slices and deleted feature data slices corresponding to a time window in a specific application scenario provided by an embodiment of the present application;

FIG. 5 is a schematic diagram of storing the reconstruction feature data and the feature data slices in a specific application scenario provided by an embodiment of the present application;

FIG. 6 is a schematic structural diagram of a video processing device based on an intelligent digital retina provided by an embodiment of the present application;

FIG. 7 is a schematic diagram of the connection structure of an electronic device according to an embodiment of the present application.
Detailed Description

The following description and the accompanying drawings sufficiently illustrate specific embodiments of the invention to enable those skilled in the art to practice them.

It should be clear that the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without creative effort fall within the protection scope of the present invention.

Optional embodiments of the present disclosure will be described in detail below in conjunction with the accompanying drawings.
FIG. 1 is a schematic flowchart of a video processing method based on an intelligent digital retina provided by an embodiment of the present application. As shown in FIG. 1, the method specifically includes the following steps:

S102: Acquire a video stream and a corresponding feature stream.

The video processing method provided by the embodiments of the present application is based on intelligent digital retina technology, the principle of which is as follows:
The front-end device has both video compression and a depth model for video feature extraction. Since the back end can deploy different models to the front end by transmission, the front-end device can be understood to have the ability to adaptively acquire any depth model. Therefore, as long as a model with special feature extraction capabilities is trained offline, it can be deployed to the front-end device through the intelligent digital retina model stream. In the cloud, the main purpose of the feature stream is image retrieval; after a user obtains retrieval results, a common linked requirement is to play back the image or video. However, in application fields such as smart cities, a large city may have millions of front-end acquisition devices. Although cloud servers and high-speed communication networks can realize data transmission and cloud processing, storage space is still a bottleneck, because the amount of video data generated in real time is enormous; in the past, cloud storage could only hold video data streams for a limited time. Since traditional video coding is pixel-based, the storage side can only choose which video data to retain according to the timestamp, for example keeping only the data of the past 7 days; in other words, the system automatically "forgets" data older than 7 days.

However, this is completely different from the way the human brain memorizes data. The human brain's memory of video data is almost never pixel-based. For example, the amygdala is responsible for emotion-based memory: under the action of stress hormones it forms important memories, that is, it stores data based on the importance of events. The cerebellum memorizes data based on implicit rather than direct information; the prefrontal cortex is responsible for processing and memorizing processed semantic information; in addition, the brain converts between long-term and short-term memory, a task performed by the hippocampus.
The intelligent digital retina used in the video processing method provided by the embodiments of the present application can not only acquire and transmit the video stream and the feature stream simultaneously, but can also store the data and make use of the newly added feature stream data.

S104: Divide the video stream and the corresponding feature stream into time slices according to a preset division method to obtain a corresponding time slice division result, where the time slice division result includes the timestamp, the video data slice, and the feature data slice corresponding to each time slice.

In one embodiment, the preset division method may be: dividing the streams into time slices of equal length. In another embodiment, the preset division method may be: dividing the time slices according to the coded GOP segments.
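As an illustrative sketch only (the names and data shapes below are hypothetical, not part of the claimed method), equal-length time slice division of a video stream and its parallel feature stream might look like this:

```python
from dataclasses import dataclass

@dataclass
class TimeSlice:
    timestamp: float   # start time of the slice (seconds)
    video_data: bytes  # video data slice for this interval
    feature_data: bytes  # feature data slice for this interval

def divide_into_time_slices(video, features, slice_len, total_len):
    """Divide parallel video/feature streams into equal-length time slices.

    `video` and `features` are hypothetical callables returning the raw
    bytes for a [start, end) interval; a real system would cut at GOP
    boundaries rather than arbitrary offsets.
    """
    slices = []
    start = 0.0
    while start < total_len:
        end = min(start + slice_len, total_len)
        slices.append(TimeSlice(
            timestamp=start,
            video_data=video(start, end),
            feature_data=features(start, end),
        ))
        start = end
    return slices
```

Each resulting record carries exactly the three items named by the time slice division result: the timestamp, the video data slice, and the feature data slice.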
S106: Associate and analyze each time slice in the time slice division result with the number of searches and/or the number of playbacks to obtain the number of attentions of each time slice.
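For illustration, one plausible (hypothetical) realization of this association step: each search or playback event carries a timestamp, and the attention count of a time slice is the number of events that fall within its interval.

```python
def count_attentions(slice_starts, slice_len, event_times):
    """Count search/playback events per time slice.

    slice_starts: start timestamps of the time slices (seconds);
    slice_len:    length of each time slice (seconds);
    event_times:  timestamps of user search and/or playback events.
    Returns a dict mapping each slice start to its attention count.
    """
    counts = {start: 0 for start in slice_starts}
    for t in event_times:
        for start in slice_starts:
            if start <= t < start + slice_len:
                counts[start] += 1
                break
    return counts
```

A production system would index events by slice instead of scanning linearly, but the mapping from user activity to per-slice attention counts is the same.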
S108: Determine and delete the data slices to be deleted in a target time window according to the amount of data to be deleted corresponding to the target time window and the number of attentions of each time slice; the data slices to be deleted include video data slices to be deleted and/or feature data slices to be deleted, yielding a processed video stream and a corresponding feature stream.
In a possible implementation, the data slices to be deleted include a first video data slice to be deleted and a first feature data slice to be deleted in the target time window, and determining and deleting the data slices to be deleted in the target time window according to the amount of data to be deleted corresponding to the target time window and the number of attentions of each time slice includes the following step:

determining and deleting the first video data slice to be deleted and the first feature data slice to be deleted according to the amount of data to be deleted corresponding to the target time window and the number of attentions of each time slice.

In a possible implementation, the data slices to be deleted include a second video data slice to be deleted in the target time window, and determining and deleting the data slices to be deleted in the target time window according to the amount of data to be deleted corresponding to the target time window and the number of attentions of each time slice includes the following step:

determining and deleting the second video data slice to be deleted according to the amount of data to be deleted corresponding to the target time window and the number of attentions of each time slice.
FIG. 2 is a schematic diagram of the working mechanism of the storage decision in a specific application scenario provided by an embodiment of this application.

As shown in FIG. 2, the search engine provides video retrieval for users: it matches search results from the feature store and returns them to the user. If the user further requests video playback, the playback engine decodes the video stream according to the timestamp information and plays the video. Further, FIG. 3 shows the storage decision module, which generates delete/retain decisions for the feature data and the video stream data according to the search results of the search engine and/or the playback results of the playback engine. The rationale is that the search engine receives the user's feature input, which means the searched features are features the user cares about, and the playback engine's user operations are likewise highly correlated with the content the user cares about. The storage decision module therefore uses this user feedback to decide which features and video data to retain.

The working mechanism of the storage decision is described in detail below.
First, the video stream and the corresponding feature stream are divided into time slices. In one implementation, the time slices are of equal length; in another, the division follows the coded GOP segments. Each time slice is then associated with the number of searches and playbacks, yielding an attention count for each time slice. Within a time window, an amount of data to delete is computed, and that amount is deleted in ascending order of attention count. The time window is updated periodically, so data deletion is likewise performed periodically. As shown in FIG. 3, each video data slice corresponds to the feature data slice covering the same time, and each is associated with an attention count. Note that the attention counts of the video data slice and the feature data slice for the same period are not necessarily equal. The time window covers the video data slices and feature data slices from timestamp 3 to timestamp 8.
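The association between user activity and time slices can be sketched as follows. The event representation (one plain timestamp per search or playback hit) is an assumption made for illustration; any record that localizes the hit in time would serve.

```python
from collections import Counter

def attention_counts(slice_bounds, events):
    """slice_bounds: list of (start, end) timestamps, one pair per time slice.
    events: timestamps of search and/or playback hits on stored content.
    Returns a Counter mapping slice index -> attention count."""
    counts = Counter()
    for t in events:
        for i, (start, end) in enumerate(slice_bounds):
            if start <= t < end:
                counts[i] += 1   # this slice was searched or played back once more
                break
    return counts
```

Slices that receive no events keep an implicit count of zero, which places them first in line for deletion.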
Further, according to the attention counts within the time window and the planned amount of data deletion, in one example the storage decision module deletes the video data slice corresponding to timestamp 4 and the feature data slice corresponding to timestamp 5. Specifically, suppose the total amount of data within the time window is D_t and the maximum allocatable storage space is D_max, with

D_t − D_max = D_D > 0.

In this case, the deletion decision computes the data slices to delete according to D_D. In the scenario of FIG. 2 the result can be obtained directly from the attention counts, but the video processing method provided in this embodiment admits any algorithm for computing the required data slices to delete. The data remaining after deletion is shown in FIG. 4. Since the amount of feature data depends on the feature extraction model used, in some cases the feature stream is very small and deleting the corresponding feature data slices saves little storage space. Therefore, in one implementation, deletion may be performed only on the video data slices or only on the feature data slices.
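Under these definitions, a minimal selection policy that frees at least D_D bytes by deleting the least-watched slices first might look like the following. The dictionary-based bookkeeping is illustrative, not this application's implementation; as the text notes, any algorithm may be substituted here.

```python
def select_slices_to_delete(slice_sizes, attention, d_max):
    """slice_sizes: {slice_id: size in bytes}; attention: {slice_id: count};
    d_max: maximum allocatable storage for the window.
    Returns slice ids to delete, in ascending order of attention count."""
    d_t = sum(slice_sizes.values())
    d_d = d_t - d_max                 # D_D = D_t - D_max
    if d_d <= 0:                      # window already fits: nothing to delete
        return []
    to_delete, freed = [], 0
    for sid in sorted(slice_sizes, key=lambda s: attention.get(s, 0)):
        if freed >= d_d:
            break
        to_delete.append(sid)
        freed += slice_sizes[sid]
    return to_delete
```

Because the time window is updated periodically, this selection would be re-run once per window.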
In a possible implementation, the video processing method provided in this embodiment further includes the following step:

Delete the video data slices to be deleted in the target time window according to a preset deletion method.

In the video processing method described so far, deletion of video or feature data is direct deletion: all related data is discarded. Although this is easy to implement, it has drawbacks. Within a limited time, the distribution of attention counts cannot represent all potential user needs, and once a piece of data is deleted, users can no longer obtain any feedback on the video within that timestamp range. A method that can still respond to such low-probability requests would therefore be a qualitative improvement over traditional store-and-delete approaches. Cloud storage resources are limited, but computing resources can be coordinated at any time. Therefore, in one implementation, data deletion adopts an "incomplete" deletion method.

In this embodiment, the preset deletion method is the "incomplete" deletion method, with the following specific steps:
S1: Compute the video data slices to be deleted according to the time window and the allocated storage resources.

S2: Delete the video data to be deleted, and generate feature data that can be used to reconstruct the video data.

S3: Retain the feature data slices and the reconstruction feature data generated in S2.

As mentioned above, a video data slice in FIG. 4 may be a closed GOP, which means that deleting a video data slice removes not only the B-frames and P-frames carrying little data but also the I-frames carrying large amounts of data. Since a closed GOP is coded independently, all video data within that slice disappears completely. Therefore, the feature data described in S2 must not depend on any block-coded video data. In one feasible approach, the feature data for reconstructing the video data in S2 is obtained through a deep learning model. FIG. 5 shows a schematic diagram of this process: according to the storage decision result, a video data slice is input to a deep model to extract reconstruction feature data, while the original coded data is discarded. Finally, only the reconstruction feature data and the feature data slices are retained on the storage side.
In a possible implementation, deleting the video data slices to be deleted in the target time window according to the preset deletion method includes the following steps:

Obtain the total amount of data within the target time window, and obtain the maximum amount of storage to be allocated;

Compute the difference between the total amount of data within the target time window and the maximum amount of storage to be allocated;

Determine the video data slices to be deleted in the target time window based on the difference;

Delete the video data slices to be deleted, and generate reconstruction feature data for reconstructing the video data;

Store the feature data slices and the reconstruction feature data of the target time window.
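The five steps above can be sketched as one routine. This is a simplified illustration: the slice-selection order is left as plain id order here (in practice it would follow the attention counts), and `extract_features` is a hypothetical stand-in for the deep model of FIG. 5.

```python
def incomplete_delete(window_slices, d_max, extract_features):
    """window_slices: {slice_id: encoded video bytes} for the target time window.
    d_max: maximum storage to be allocated; extract_features: deep model
    producing reconstruction feature data from a video slice (assumed callable).
    Returns (kept video slices, reconstruction feature data)."""
    d_t = sum(len(v) for v in window_slices.values())   # total data in window
    d_d = d_t - d_max                                   # difference: amount to free
    kept, recon_features = dict(window_slices), {}
    freed = 0
    for sid in sorted(window_slices):                   # selection policy is pluggable
        if freed >= d_d:
            break
        # delete the video data, but first extract reconstruction features from it
        recon_features[sid] = extract_features(kept.pop(sid))
        freed += len(window_slices[sid])
    return kept, recon_features                         # feature data is retained
```

The caller would then persist `kept` together with `recon_features` and the untouched feature data slices of the window.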
In a possible implementation, the video processing method provided in this embodiment further includes the following steps:

Obtain a video reconstruction model, the undeleted video data, the feature data slices of the target time window, and the reconstruction feature data;

Based on the video reconstruction model, the undeleted video data, the feature data slices of the target time window, and the reconstruction feature data, reconstruct the deleted data slices of the target time window to generate corresponding reconstructed video data.

In a possible implementation, the video processing method provided in this embodiment further includes the following step:

Match the corresponding video reconstruction model based on the type of the deep model.
In a possible implementation, matching the corresponding video reconstruction model based on the type of the deep model includes the following step:

If the deep model is a model that generates images within a first preset resolution range, the matched video reconstruction model is a reconstruction deep model with a second preset resolution range.

In this step, the first preset resolution range is typically an ultra-low-resolution range, and the second preset resolution range of the corresponding reconstruction deep model is an ultra-high-resolution range. If the deep model can generate ultra-low-resolution images, the generated ultra-low-resolution images are produced by coding consecutive images with a residual-based coding method.
In a possible implementation, matching the corresponding video reconstruction model based on the type of the deep model includes the following step:

If the deep model is a feature extraction model, the matched video reconstruction model is the decoder of an autoencoder.

In this step, if the deep model is a feature extraction model, for example the encoder of an autoencoder, the corresponding video reconstruction model is the decoder of that autoencoder.
In a possible implementation, matching the corresponding video reconstruction model based on the type of the deep model includes the following step:

If the deep model extracts reconstruction features based on a generative adversarial model, the matched video reconstruction model is a generative adversarial network.

In this step, if the deep model extracts reconstruction features based on a generative adversarial model (such a model is mainly used to extract skeleton features of the human body and memory features of appearance attribute information), the corresponding video reconstruction model is a generative adversarial network. The generative adversarial network takes feature values as input and reconstructs the video data using a trained generative model.
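The three matching rules above can be summarized as a simple dispatch table. The type names and returned model identifiers below are purely illustrative placeholders, not names defined by this application:

```python
def match_reconstruction_model(depth_model_type):
    """Maps the type of the feature-extraction deep model to its paired
    video reconstruction model, per the three cases above."""
    table = {
        # ultra-low-resolution image generator -> ultra-high-resolution reconstructor
        "low_res_generator": "super_resolution_model",
        # autoencoder encoder -> the matching autoencoder decoder
        "autoencoder_encoder": "autoencoder_decoder",
        # GAN-based reconstruction feature extractor -> generative adversarial network
        "gan_feature_extractor": "generative_adversarial_network",
    }
    if depth_model_type not in table:
        raise ValueError(f"no reconstruction model registered for {depth_model_type!r}")
    return table[depth_model_type]
```

Keeping the mapping in one table mirrors the one-to-one correspondence between extraction and reconstruction models described below.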
After the processing shown in FIG. 5, the storage decision unit combined with the deep model still guarantees the storage-space budget within the time window. Meanwhile, because the feature data slices are fully preserved, users may still request playback of deleted video data slices. In that case, the playback engine uses a deep model to reconstruct the deleted data from the reconstruction feature data. The deep model used for video reconstruction corresponds one-to-one with the deep model used for reconstruction feature extraction in FIG. 5.
In the embodiments of this application, the method obtains a video stream and a corresponding feature stream; divides them into time slices according to a preset division method to obtain a time slice division result, which includes, for each time slice, the corresponding timestamp, video data slice, and feature data slice; associates each time slice in the division result with the number of searches and/or playbacks and analyzes the association to obtain an attention count for each time slice; and, according to the amount of data to be deleted in the target time window and the attention count of each time slice, determines and deletes the data slices to be deleted in the target time window, where the data slices to be deleted include video data slices and/or feature data slices, yielding a processed video stream and a corresponding processed feature stream. The intelligent digital retina-based video processing method provided by the embodiments of this application can accurately determine and delete the data slices to be deleted in the target time window according to the amount of data to be deleted and the attention count of each time slice, thereby effectively reducing the storage overhead of intelligent digital retina-based video processing.
The following is an apparatus embodiment of intelligent digital retina-based video processing, which can be used to carry out the method embodiments of this application. For details not disclosed in the apparatus embodiment, refer to the method embodiments of this application.

Refer to FIG. 6, which shows a schematic structural diagram of an intelligent digital retina-based video processing apparatus provided by an exemplary embodiment of the present invention. The apparatus can be implemented as all or part of a terminal through software, hardware, or a combination of the two, and includes an acquisition module 602, a division module 604, an association analysis module 606, and a determination and deletion module 608.
Specifically, the acquisition module 602 is configured to obtain a video stream and a corresponding feature stream;

the division module 604 is configured to divide the video stream and the corresponding feature stream obtained by the acquisition module 602 into time slices according to a preset division method to obtain a time slice division result, which includes, for each time slice, the corresponding timestamp, video data slice, and feature data slice;

the association analysis module 606 is configured to associate each time slice in the time slice division result obtained by the division module 604 with the number of searches and/or playbacks and analyze the association to obtain an attention count for each time slice;

the determination and deletion module 608 is configured to determine and delete the data slices to be deleted in the target time window according to the amount of data to be deleted in the target time window and the attention counts obtained by the association analysis module 606; the data slices to be deleted include video data slices and/or feature data slices, yielding a processed video stream and a corresponding processed feature stream.

Optionally, the data slices to be deleted include a first video data slice to be deleted and a first feature data slice to be deleted in the target time window, and the determination and deletion module 608 is configured to:

determine and delete the first video data slice to be deleted and the first feature data slice to be deleted according to the amount of data to be deleted in the target time window and the attention count of each time slice.

Optionally, the data slices to be deleted include a second video data slice to be deleted in the target time window, and the determination and deletion module 608 is configured to:

determine and delete the second video data slice to be deleted according to the amount of data to be deleted in the target time window and the attention count of each time slice.
Optionally, the apparatus further includes:

a deletion module (not shown in FIG. 6), configured to delete the video data slices to be deleted in the target time window according to a preset deletion method.

Optionally, the deletion module is specifically configured to:

obtain the total amount of data within the target time window, and obtain the maximum amount of storage to be allocated;

compute the difference between the total amount of data within the target time window and the maximum amount of storage to be allocated;

determine the video data slices to be deleted in the target time window based on the difference;

delete the video data slices to be deleted, and generate reconstruction feature data for reconstructing the video data;

store the feature data slices and the reconstruction feature data of the target time window.
Optionally, the acquisition module 602 is further configured to obtain a video reconstruction model, the feature data slices of the target time window, and the reconstruction feature data.

Optionally, the apparatus further includes:

a video data reconstruction module (not shown in FIG. 6), configured to reconstruct the deleted data slices of the target time window based on the video reconstruction model obtained by the acquisition module 602, the undeleted video data, the feature data slices of the target time window, and the reconstruction feature data, to generate corresponding reconstructed video data.

Optionally, the apparatus further includes:

a reconstruction model matching module (not shown in FIG. 6), configured to match the corresponding video reconstruction model based on the type of the deep model.

Optionally, the reconstruction model matching module is specifically configured to:

if the deep model is a model that generates images within a first preset resolution range, match a reconstruction deep model with a second preset resolution range; or,

if the deep model is a feature extraction model, match the decoder of an autoencoder; or,

if the deep model extracts reconstruction features based on a generative adversarial model, match a generative adversarial network.
It should be noted that when the intelligent digital retina-based video processing apparatus provided by the above embodiments executes the video processing method, the division into the above functional units is only illustrative. In practical applications, the above functions may be assigned to different functional units as needed; that is, the internal structure of the device may be divided into different functional units to complete all or part of the functions described above. In addition, the apparatus embodiment and the method embodiment provided above belong to the same concept; for details of the implementation process, refer to the method embodiment, which is not repeated here.

In the embodiments of this application, the acquisition module obtains a video stream and a corresponding feature stream; the division module divides them into time slices according to a preset division method to obtain a time slice division result, which includes, for each time slice, the corresponding timestamp, video data slice, and feature data slice; the association analysis module associates each time slice in the division result with the number of searches and/or playbacks and analyzes the association to obtain an attention count for each time slice; and the determination and deletion module determines and deletes the data slices to be deleted in the target time window according to the amount of data to be deleted and the attention counts, where the data slices to be deleted include video data slices and/or feature data slices, yielding a processed video stream and a corresponding processed feature stream. The intelligent digital retina-based video processing apparatus provided by the embodiments of this application can accurately determine and delete the data slices to be deleted in the target time window according to the amount of data to be deleted and the attention count of each time slice, thereby effectively reducing the storage overhead of intelligent digital retina-based video processing.
As shown in FIG. 7, this embodiment provides an electronic device that includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor runs the computer program to implement the method steps described above.

An embodiment of this application provides a storage medium storing computer-readable instructions, on which a computer program is stored; when the program is executed by a processor, the method steps described above are implemented.
Referring now to FIG. 7, it shows a schematic structural diagram of an electronic device suitable for implementing the embodiments of this application. Terminal devices in the embodiments of this application may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), and vehicle-mounted terminals (for example, vehicle navigation terminals), as well as fixed terminals such as digital TVs and desktop computers. The electronic device shown in FIG. 7 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of this application.

As shown in FIG. 7, the electronic device may include a processing device 701 (for example, a central processing unit or a graphics processing unit), which can execute various appropriate actions and processing according to a program stored in a read-only memory (ROM) 702 or a program loaded from a storage device 708 into a random access memory (RAM) 703. The RAM 703 also stores various programs and data required for the operation of the electronic device. The processing device 701, the ROM 702, and the RAM 703 are connected to one another through a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.

Generally, the following devices may be connected to the I/O interface 705: input devices 706 including, for example, a touch screen, a touch pad, a keyboard, a mouse, a camera, a microphone, an accelerometer, and a gyroscope; output devices 707 including, for example, a liquid crystal display (LCD), a speaker, and a vibrator; storage devices 708 including, for example, a magnetic tape and a hard disk; and a communication device 709. The communication device 709 may allow the electronic device to communicate wirelessly or by wire with other devices to exchange data. Although FIG. 7 shows an electronic device with various devices, it should be understood that implementing or providing all of the devices shown is not required; more or fewer devices may alternatively be implemented or provided.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, the embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, where the computer program contains program code for executing the methods shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network through the communication device 709, or installed from the storage device 708, or installed from the ROM 702. When the computer program is executed by the processing device 701, the above functions defined in the methods of the embodiments of this application are performed.

It should be noted that the computer-readable medium described above in the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in combination with an instruction execution system, apparatus, or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and it can send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. The program code contained on a computer-readable medium may be transmitted by any appropriate medium, including but not limited to an electric wire, an optical cable, RF (radio frequency), or any suitable combination of the above.

The above computer-readable medium may be included in the above electronic device, or it may exist separately without being assembled into the electronic device.
可以以一种或多种程序设计语言或其组合来编写用于执行本公开的操作的计算机程序代码,上述程序设计语言包括面向对象的程序设计语言—诸如Java、Smalltalk、C++,还包括常规的过程式程序设计语言—诸如“C”语言或类似的程序设计语言。程序代码可以完全地在用户计算机上执行、部分地在用户计算机上执行、作为一个独立的软件包执行、部分在用户计算机上部分在远程计算机上执行、或者完全在远程计算机或服务器上执行。在涉及远程计算机的情形中,远程计算机可以通过任意种类的网络——包括局域网(LAN)或广域网(WAN)—连接到用户计算机,或者,可以连接到外部计算机(例如利用因特网服务提供 商来通过因特网连接)。Computer program code for carrying out the operations of the present disclosure can be written in one or more programming languages, or combinations thereof, including object-oriented programming languages—such as Java, Smalltalk, C++, and conventional Procedural Programming Language - such as "C" or a similar programming language. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In cases involving a remote computer, the remote computer can be connected to the user computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (such as through an Internet service provider). Internet connection).
附图中的流程图和框图,图示了按照本公开各种实施例的系统、方法和计算机程序产品的可能实现的体系架构、功能和操作。在这点上,流程图或框图中的每个方框可以代表一个模块、程序段、或代码的一部分,该模块、程序段、或代码的一部分包含一个或多个用于实现规定的逻辑功能的可执行指令。也应当注意,在有些作为替换的实现中,方框中所标注的功能也可以以不同于附图中所标注的顺序发生。例如,两个接连地表示的方框实际上可以基本并行地执行,它们有时也可以按相反的顺序执行,这依所涉及的功能而定。也要注意的是,框图和/或流程图中的每个方框、以及框图和/或流程图中的方框的组合,可以用执行规定的功能或操作的专用的基于硬件的系统来实现,或者可以用专用硬件与计算机指令的组合来实现。The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more logical functions for implementing specified executable instructions. It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved. It should also be noted that each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by a dedicated hardware-based system that performs the specified functions or operations , or may be implemented by a combination of dedicated hardware and computer instructions.
描述于本申请实施例中所涉及到的单元可以通过软件的方式实现,也可以通过硬件的方式来实现。其中,单元的名称在某种情况下并不构成对该单元本身的限定。The units involved in the embodiments described in the present application may be implemented by means of software or by means of hardware. Wherein, the name of a unit does not constitute a limitation of the unit itself under certain circumstances.
以上所揭露的仅为本申请较佳实施例而已,当然不能以此来限定本申请之权利范围,因此依本申请权利要求所作的等同变化,仍属本申请所涵盖的范围。The above disclosures are only preferred embodiments of the present application, which certainly cannot limit the scope of the present application. Therefore, equivalent changes made according to the claims of the present application still fall within the scope of the present application.

Claims (11)

  1. A video processing method based on an intelligent digital retina, characterized in that the method comprises:
    acquiring a video stream and a corresponding feature stream;
    dividing the video stream and the corresponding feature stream into time slices according to a preset division manner to obtain a corresponding time-slice division result, the time-slice division result comprising, for each time slice, a corresponding timestamp, a corresponding video data slice, and a corresponding feature data slice;
    associating each time slice in the time-slice division result with a search count and/or a playback count and analyzing the association, to obtain an attention count for each time slice;
    determining and deleting data slices to be deleted in a target time window according to the amount of data to be deleted corresponding to the target time window and the attention count of each time slice, the data slices to be deleted comprising video data slices to be deleted and/or feature data slices to be deleted, to obtain a processed video stream and a corresponding feature stream.
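The selection step recited in claim 1 — free the required amount of data by removing the least-attended slices in the target time window — can be sketched as follows. This is an illustrative reading of the claim, not the patented implementation; all identifiers (`TimeSlice`, `select_slices_to_delete`, the byte counts) are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class TimeSlice:
    timestamp: float
    video_bytes: int      # size of the video data slice
    feature_bytes: int    # size of the feature data slice
    attention: int        # search count + playback count for this slice

def select_slices_to_delete(slices, bytes_to_delete):
    """Pick the least-attended slices until the deletion budget is met."""
    deleted, freed = [], 0
    # One plausible policy: slices with the lowest attention count go first.
    for s in sorted(slices, key=lambda s: s.attention):
        if freed >= bytes_to_delete:
            break
        deleted.append(s)
        freed += s.video_bytes + s.feature_bytes
    return deleted

window = [
    TimeSlice(0.0, 100, 10, attention=5),
    TimeSlice(1.0, 100, 10, attention=0),
    TimeSlice(2.0, 100, 10, attention=2),
]
victims = select_slices_to_delete(window, bytes_to_delete=200)
print([s.timestamp for s in victims])  # [1.0, 2.0]
```

Ranking strictly by attention count is one policy consistent with the claim; the claim itself fixes only the inputs (data amount to delete, per-slice attention counts), not the ordering.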
  2. The method according to claim 1, characterized in that the data slices to be deleted comprise a first video data slice to be deleted in the target time window and a first feature data slice to be deleted in the target time window; and the determining and deleting the data slices to be deleted in the target time window according to the amount of data to be deleted corresponding to the target time window and the attention count of each time slice comprises:
    determining and deleting the first video data slice to be deleted and the first feature data slice to be deleted according to the amount of data to be deleted corresponding to the target time window and the attention count of each time slice.
  3. The method according to claim 1, characterized in that the data slices to be deleted comprise a second video data slice to be deleted in the target time window, and the determining and deleting the data slices to be deleted in the target time window according to the amount of data to be deleted corresponding to the target time window and the attention count of each time slice comprises:
    determining and deleting the second video data slice to be deleted according to the amount of data to be deleted corresponding to the target time window and the attention count of each time slice.
  4. The method according to claim 1, characterized in that the method further comprises:
    deleting the video data slices to be deleted in the target time window according to a preset deletion manner.
  5. The method according to claim 4, characterized in that the deleting the video data slices to be deleted in the target time window according to the preset deletion manner comprises:
    acquiring the total amount of data within the target time window, and acquiring a maximum amount of storage data to be allocated;
    calculating the difference between the total amount of data within the target time window and the maximum amount of storage data to be allocated;
    determining the video data slices to be deleted in the target time window based on the difference;
    deleting the video data slices to be deleted, and generating reconstruction feature data for reconstructing the video data;
    storing the feature data slices of the target time window and the reconstruction feature data.
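The budget computation in the first three steps of claim 5 amounts to a single subtraction. A minimal sketch, with hypothetical names:

```python
def deletion_budget(total_bytes_in_window, max_allocatable_bytes):
    """Claim 5 reading: the amount of video data to delete is the excess of
    the window's total data over the maximum storage that can be allocated."""
    return max(0, total_bytes_in_window - max_allocatable_bytes)

print(deletion_budget(330, 200))  # 130 bytes must be freed
print(deletion_budget(150, 200))  # 0: the window already fits
```

Clamping at zero covers the case where the window already fits within the allocatable storage, so nothing needs to be deleted.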
  6. The method according to claim 5, characterized in that the method further comprises:
    acquiring a video reconstruction model, undeleted video data, the feature data slices of the target time window, and the reconstruction feature data;
    performing reconstruction processing on the data slices deleted from the target time window based on the video reconstruction model, the undeleted video data, the feature data slices of the target time window, and the reconstruction feature data, to generate corresponding reconstructed video data.
  7. The method according to claim 6, characterized in that the method further comprises:
    matching the corresponding video reconstruction model based on the type of a deep model.
  8. The method according to claim 7, characterized in that the matching the corresponding video reconstruction model based on the type of the deep model comprises:
    if the deep model is a model that generates images within a first preset resolution range, the matched video reconstruction model is a reconstruction deep model with a second preset resolution range; or,
    if the deep model is a feature extraction model, the matched video reconstruction model is the decoder of an autoencoder; or,
    if the deep model is a model that extracts reconstruction features based on a generative adversarial model, the matched video reconstruction model is a generative adversarial network.
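Claim 8 enumerates a three-way dispatch from the deep-model type to a reconstruction model. A minimal lookup-table sketch, where the string keys are invented labels for the three recited cases rather than terms from the patent:

```python
def match_reconstruction_model(deep_model_type):
    """Paraphrase of claim 8's three branches as a dispatch table."""
    table = {
        # generates images in a first preset resolution range
        "resolution_generator": "reconstruction deep model (second preset resolution range)",
        # plain feature extractor
        "feature_extractor": "autoencoder decoder",
        # extracts reconstruction features via a generative adversarial model
        "gan_feature_extractor": "generative adversarial network",
    }
    return table[deep_model_type]

print(match_reconstruction_model("feature_extractor"))  # autoencoder decoder
```

A table keeps the pairing between the feature-extraction side and the reconstruction side explicit, which matters because the decoder must invert whatever encoding produced the stored features.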
  9. A video processing apparatus based on an intelligent digital retina, characterized in that the apparatus comprises:
    an acquisition module, configured to acquire a video stream and a corresponding feature stream;
    a division module, configured to divide the video stream and the corresponding feature stream acquired by the acquisition module into time slices according to a preset division manner to obtain a corresponding time-slice division result, the time-slice division result comprising, for each time slice, a corresponding timestamp, a corresponding video data slice, and a corresponding feature data slice;
    an association and analysis module, configured to associate each time slice in the time-slice division result obtained by the division module with a search count and/or a playback count and analyze the association, to obtain an attention count for each time slice;
    a determination and deletion module, configured to determine and delete data slices to be deleted in a target time window according to the amount of data to be deleted corresponding to the target time window and the attention count of each time slice obtained by the association and analysis module, the data slices to be deleted comprising video data slices to be deleted and/or feature data slices to be deleted, to obtain a processed video stream and a corresponding feature stream.
  10. An electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor runs the computer program to implement the video processing method according to any one of claims 1 to 8.
  11. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the video processing method according to any one of claims 1 to 8.
PCT/CN2022/124876 2021-11-26 2022-10-12 Video processing method and apparatus based on intelligent digital retina WO2023093339A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111418776.1 2021-11-26
CN202111418776.1A CN113840147B (en) 2021-11-26 2021-11-26 Video processing method and device based on intelligent digital retina

Publications (1)

Publication Number Publication Date
WO2023093339A1 true WO2023093339A1 (en) 2023-06-01

Family

ID=78971696

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/124876 WO2023093339A1 (en) 2021-11-26 2022-10-12 Video processing method and apparatus based on intelligent digital retina

Country Status (2)

Country Link
CN (1) CN113840147B (en)
WO (1) WO2023093339A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113840147B (en) * 2021-11-26 2022-04-05 浙江智慧视频安防创新中心有限公司 Video processing method and device based on intelligent digital retina
CN114157863B (en) * 2022-02-07 2022-07-22 浙江智慧视频安防创新中心有限公司 Video coding method, system and storage medium based on digital retina

Citations (8)

Publication number Priority date Publication date Assignee Title
CN102232220A (en) * 2010-10-29 2011-11-02 华为技术有限公司 Method and system for extracting and correlating video interested objects
CN107846576A (en) * 2017-09-30 2018-03-27 北京大学 Method and system for visual signature data encoding and decoding
CN110035330A (en) * 2019-04-16 2019-07-19 威比网络科技(上海)有限公司 Video generation method, system, equipment and storage medium based on online education
CN111092926A (en) * 2019-08-28 2020-05-01 北京大学 Digital retina multivariate data rapid association method
CN111787218A (en) * 2020-06-18 2020-10-16 安徽超清科技股份有限公司 Monitoring camera based on digital retina technology
CN113110421A (en) * 2021-03-23 2021-07-13 特斯联科技集团有限公司 Tracking linkage method and system for scenic spot river visual identification mobile ship
CN113269722A (en) * 2021-04-22 2021-08-17 北京邮电大学 Training method for generating countermeasure network and high-resolution image reconstruction method
CN113840147A (en) * 2021-11-26 2021-12-24 浙江智慧视频安防创新中心有限公司 Video processing method and device based on intelligent digital retina

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US10891019B2 (en) * 2016-02-29 2021-01-12 Huawei Technologies Co., Ltd. Dynamic thumbnail selection for search results

Also Published As

Publication number Publication date
CN113840147A (en) 2021-12-24
CN113840147B (en) 2022-04-05

Similar Documents

Publication Publication Date Title
WO2023093339A1 (en) Video processing method and apparatus based on intelligent digital retina
CN110933429B (en) Video compression sensing and reconstruction method and device based on deep neural network
WO2022105597A1 (en) Method and apparatus for playing back video at speed multiples , electronic device, and storage medium
US10965948B1 (en) Hierarchical auto-regressive image compression system
WO2022111110A1 (en) Virtual video livestreaming processing method and apparatus, storage medium, and electronic device
US20150181217A1 (en) Object archival systems and methods
WO2023116233A1 (en) Video stutter prediction method and apparatus, device and medium
WO2022206200A1 (en) Point cloud encoding method and apparatus, point cloud decoding method and apparatus, and computer-readable medium, and electronic device
CN111385576B (en) Video coding method and device, mobile terminal and storage medium
KR102612528B1 (en) Interruptible video transcoding
CN112714273A (en) Screen sharing display method, device, equipment and storage medium
EP4343614A1 (en) Information processing method and apparatus, device, readable storage medium and product
CN114679607B (en) Video frame rate control method and device, electronic equipment and storage medium
CN111883107A (en) Speech synthesis and feature extraction model training method, device, medium and equipment
WO2023071578A1 (en) Text-voice alignment method and apparatus, device and medium
CN116233445B (en) Video encoding and decoding processing method and device, computer equipment and storage medium
CN110852801B (en) Information processing method, device and equipment
US11095901B2 (en) Object manipulation video conference compression
WO2019047663A1 (en) Video format-based end-to-end automatic driving data storage method and device
Zhu Big data-based multimedia transcoding method and its application in multimedia data mining-based smart transportation and telemedicine
CN112866715A (en) Universal video compression coding system supporting man-machine hybrid intelligence
CN111800649A (en) Method and device for storing video and method and device for generating video
US11831698B2 (en) Data streaming protocols in edge computing
Zhou et al. Subjective and Objective Quality-of-Experience Assessment for 3D Talking Heads
US20240129047A1 (en) Method for creating sparse isobmff haptics tracks

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22897423

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE