CN113711619B - Multimedia data storage method, device, equipment, storage medium and program product - Google Patents

Multimedia data storage method, device, equipment, storage medium and program product

Info

Publication number
CN113711619B
Authority
CN
China
Prior art keywords: event, multimedia data, storage, media, data segment
Prior art date
Legal status
Active
Application number
CN202080000516.XA
Other languages
Chinese (zh)
Other versions
CN113711619A
Inventor
周江鲤
Current Assignee
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Publication of CN113711619A
Application granted
Publication of CN113711619B


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/433 Content storage operation, e.g. storage operation in response to a pause request, caching operations
    • H04N21/4335 Housekeeping operations, e.g. prioritizing content for deletion because of storage space restrictions
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845 Structuring of content, e.g. decomposing content into time segments

Abstract

A multimedia data storage method, a device, equipment, a storage medium and a program product belong to the technical field of data processing. Different event levels correspond to different storage policies. One or more multimedia data segments can be determined from the acquired multimedia data; each multimedia data segment records a media event, each media event corresponds to event information, and the event information includes the event level of the corresponding media event. According to the event level of the media event corresponding to each multimedia data segment, the storage policy corresponding to that event level can be selected to store the one or more multimedia data segments, which improves storage flexibility.

Description

Multimedia data storage method, device, equipment, storage medium and program product
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a multimedia data storage method, apparatus, device, storage medium, and program product.
Background
A multimedia data acquisition system, such as a monitoring system, mainly comprises a front-end device and a back-end device. The front-end device is used to collect multimedia data and transmit the collected multimedia data to the back-end device, and the back-end device is used to display and store the multimedia data collected by the front-end device.
In the related art, taking the case where the multimedia data collected by the front-end device is a video as an example, after receiving the video, the back-end device divides the video into a plurality of video segments according to a set duration or a set file size and stores each of the plurality of video segments according to a set storage policy. The storage policy may be: a video segment is moved to a second storage area after being stored in a first storage area for M days, moved to a third storage area after being stored in the second storage area for N days, and deleted after being stored in the third storage area for S days, where the access frequencies of the first storage area, the second storage area and the third storage area decrease in that order, and M, N and S are integers greater than 1.
However, this storage policy is a one-size-fits-all data storage scheme, which makes the method inflexible.
Disclosure of Invention
The application provides a multimedia data storage method, a device, equipment, a storage medium and a program product, which can improve the flexibility of a storage scheme. The technical scheme is as follows:
in a first aspect, a multimedia data storage method is provided, the method comprising:
acquiring multimedia data; determining one or more multimedia data segments from the multimedia data, where each multimedia data segment records a media event, each media event corresponds to event information, and the event information includes the event level of the corresponding media event; selecting, according to the event level of the media event corresponding to the one or more multimedia data segments, the storage policy corresponding to that event level, where the storage policies corresponding to different event levels are different; and storing the one or more multimedia data segments using the selected storage policy.
Because different event levels correspond to different storage policies, selecting the storage policy corresponding to the determined event level of the media event for each multimedia data segment and storing the one or more multimedia data segments with it can improve storage flexibility.
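As an illustration only, the steps of the first aspect could be organized as in the following Python sketch; the names used here (semantic_model.analyze, multimedia_data.cut, policy_by_level, storage.save) are assumptions made for readability and are not part of the claimed solution:

# Hypothetical sketch of the first-aspect method; names and data shapes are
# assumptions, not part of the claimed scheme.
def store_multimedia_data(multimedia_data, semantic_model, policy_by_level, storage):
    """Acquire -> determine segments -> select policy by event level -> store."""
    # Semantic analysis yields one or more media events, each carrying event
    # information that includes at least an event level and start/end times.
    media_events = semantic_model.analyze(multimedia_data)
    for event in media_events:
        # Determine the multimedia data segment that records this media event.
        segment = multimedia_data.cut(event.start_time, event.end_time)
        # Different event levels map to different storage policies.
        policy = policy_by_level[event.level]
        storage.save(segment, policy)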
In the present application, the front-end device may collect multimedia data, such as video, image sequences and audio, and send the collected multimedia data to the back-end device. A semantic analysis model is deployed on the back-end device; following the order of the collection times of the data frames included in the multimedia data, the back-end device can perform semantic analysis segment by segment through the semantic analysis model to obtain one or more media events, where each media event corresponds to event information. The back-end device may then determine one or more multimedia data segments from the multimedia data based on the one or more media events. That is, determining one or more multimedia data segments from the multimedia data includes: performing semantic analysis on the multimedia data through a semantic analysis model to obtain one or more media events; and determining the one or more multimedia data segments from the multimedia data based on the one or more media events.
Since the multimedia data in the present application may be a video, an image sequence, and so on, implementations of determining one or more multimedia data segments from the multimedia data according to the one or more media events are introduced below, taking the multimedia data as a video and as an image sequence respectively as examples.
(1) The multimedia data being video
In the case that the multimedia data is a video, the event information further includes an event start time and an event end time. Determining one or more multimedia data segments from the multimedia data based on the one or more media events includes: determining, according to the event start time and the event end time of a first media event, a video segment from the video as the multimedia data segment corresponding to the first media event, where the first media event is one of the one or more media events.
In the application, event lists corresponding to the event levels are stored in the back-end device, and each event list includes a plurality of event keywords. By performing semantic analysis on the video through the semantic analysis model, the back-end device can also obtain the event description information corresponding to each media event; that is, the event information may also include event description information. The back-end device may match the event description information of the first media event against the event keywords included in each event list, and if an event keyword included in one event list matches the event description information, the event level corresponding to the matched event list may be determined as the event level of the first media event. Then, the back-end device may intercept, from the video, the video segment from the event start time to the event end time corresponding to the first media event as the multimedia data segment corresponding to the first media event.
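A minimal sketch of the keyword-list matching described above is given below; the example event lists, the substring matching rule and the function name are assumptions chosen purely for illustration:

from typing import Optional

# Hypothetical event lists per level; a real deployment would configure its
# own keywords, and the matching rule here is a simple substring test.
EVENT_LISTS = {
    1: ["fire", "robbery", "traffic accident"],
    2: ["crowd gathering", "illegal parking"],
    3: ["pedestrian passing", "vehicle passing"],
}

def event_level_for(event_description: str) -> Optional[int]:
    """Return the event level whose event list contains a matching keyword."""
    for level, keywords in EVENT_LISTS.items():
        if any(keyword in event_description for keyword in keywords):
            return level
    return None  # no event list matched; the caller decides a default level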
It should be noted that, since the back-end device obtains one or more media events by performing semantic analysis on the video through the semantic analysis model, the multimedia data segment corresponding to each media event other than the first media event can be determined according to a processing procedure similar to that of the first media event, which is not repeated here.
(2) The multimedia data being a sequence of images
In the case that the multimedia data is an image sequence, the image sequence includes multiple frames of images whose collection times increase sequentially, and the event information further includes an event start time and an event end time. Determining one or more multimedia data segments from the multimedia data based on the one or more media events includes: determining, according to the event start time and the event end time of a first media event, an image sub-sequence from the image sequence as the multimedia data segment corresponding to the first media event, where the first media event is one of the one or more media events.
In the application, by performing semantic analysis on the image sequence through the semantic analysis model, the back-end device can also obtain the event description information corresponding to each media event. The back-end device may match the event description information of the first media event against the event keywords included in each event list, and if an event keyword included in one event list matches the event description information, the event level corresponding to the matched event list may be determined as the event level of the first media event. Then, the back-end device may select, from the image sequence, the multiple frames of images between the event start time and the event end time corresponding to the first media event as an image sub-sequence, and use the image sub-sequence as the multimedia data segment corresponding to the first media event.
It should be noted that, since the backend device performs semantic analysis on the image sequence through the semantic analysis model, one or more media events may be obtained, and other media events except the first media event may determine the multimedia data segment corresponding to the corresponding media event according to a similar processing procedure to the first media event, which is not described herein again.
Optionally, for the above two cases, when intercepting the multimedia data segment from the multimedia data according to the event start time and event end time of the first media event, the back-end device may make the start time of the intercepted multimedia data segment earlier than the event start time of the first media event and the end time of the intercepted multimedia data segment later than the event end time of the first media event. In this way, all the multimedia data of a complete event can be saved as far as possible, so that the event can be viewed more completely during subsequent case investigation.
In this application, the backend device is provided with a storage policy corresponding to each event level, and after obtaining one or more multimedia data segments and determining the event level of the media event corresponding to each multimedia data segment, the backend device may select a storage policy corresponding to the event level according to a correspondence between the event level and the storage policy, and store each multimedia data segment of the one or more multimedia data segments using the selected storage policy.
Optionally, the storage policy comprises a combination of one or more of the following policies: a storage time policy, a compression rate policy, a deletion policy and a redundancy policy. The storage time policy is used to indicate the storage duration of the multimedia data segment, the compression rate policy is used to indicate the compression rate of the stored multimedia data segment, the deletion policy is used to indicate the deletion condition of the stored multimedia data segment, and the redundancy policy is used to indicate the redundancy level of the stored multimedia data segment.
The storage time policy corresponding to each event level may be: a multimedia data segment whose event level is level one is stored permanently; a multimedia data segment whose event level is level two is stored for X days; a multimedia data segment whose event level is level three is stored for Y days; and a multimedia data segment whose event level is level four is stored for Z days, where X, Y and Z are integers greater than 1, X is greater than Y, and Y is greater than Z.
The compression rate policy corresponding to each event level may be: the compression rate of a multimedia data segment whose event level is level one is 0 (uncompressed), the compression rate of a level-two segment is x%, the compression rate of a level-three segment is y%, and the compression rate of a level-four segment is z%, where x, y and z are all greater than 0, x is less than y, and y is less than z.
The deletion policy corresponding to each event level may be: a multimedia data segment whose event level is level one is not automatically deleted; a level-two segment is automatically deleted after being stored for I days; a level-three segment is automatically deleted after being stored for J days; and a level-four segment is automatically deleted after being stored for K days, where I, J and K are integers greater than 1, I is greater than J, and J is greater than K.
The redundancy policy corresponding to each event level may be: a multimedia data segment whose event level is level one is stored as a mirror (two identical copies of the multimedia data segment are kept); a level-two segment is stored using 3+2 redundancy (the data segment is divided into data blocks, and every 3 data blocks are protected with 2 parity blocks); and a level-three segment is stored using 3+1 redundancy (the data segment is divided into data blocks, and every 3 data blocks are protected with 1 parity block).
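The four policy types could, for example, be grouped into one record per event level, as in the following sketch; the concrete numbers simply mirror the X/Y/Z-style placeholders above and are illustrative only, since the method only requires that different event levels map to different policies:

from dataclasses import dataclass
from typing import Optional

@dataclass
class StoragePolicy:
    # None means "keep forever" / "never auto-delete"; 0% means uncompressed.
    retention_days: Optional[int]     # storage time policy
    compression_pct: int              # compression rate policy
    delete_after_days: Optional[int]  # deletion policy
    redundancy: str                   # redundancy policy: "mirror", "3+2", "3+1", ...

# Illustrative mapping only; X, Y, Z, x%, y%, z% above are product choices.
POLICY_BY_LEVEL = {
    1: StoragePolicy(None, 0,  None, "mirror"),
    2: StoragePolicy(90,   30, 90,   "3+2"),
    3: StoragePolicy(60,   60, 60,   "3+1"),
    4: StoragePolicy(30,   80, 30,   "3+1"),
}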
Optionally, before determining the one or more multimedia data segments from the multimedia data, the method further includes: acquiring a plurality of sample multimedia data segments and the event information of the media event corresponding to each of the plurality of sample multimedia data segments; and inputting the plurality of sample multimedia data segments into an initial semantic analysis model and training the initial semantic analysis model, so that the output of the initial semantic analysis model is the event information of the media event corresponding to the respective sample multimedia data segment, thereby obtaining the semantic analysis model.
In a second aspect, there is provided a multimedia data storage apparatus having functionality for performing the acts of the method of storing multimedia data of the first aspect. The multimedia data storage device comprises one or more modules, and the one or more modules are used for realizing the multimedia data storage method provided by the first aspect.
That is, the present application provides a multimedia data storage apparatus, comprising:
the first acquisition module is used for acquiring multimedia data;
the determining module is used for determining one or more multimedia data segments from the multimedia data, the multimedia data segments record media events, the media events correspond to event information, and the event information comprises event grades of the corresponding media events;
a selection module, configured to select a storage policy corresponding to an event level according to the event level of the media event corresponding to the one or more multimedia data segments, where the storage policies corresponding to different event levels are different;
a storage module for storing the one or more multimedia data segments using the selected storage policy.
Because different event levels correspond to different storage policies, the multimedia data storage apparatus provided by the application can select the storage policy corresponding to the determined event level of the media event for each multimedia data segment and store the one or more multimedia data segments accordingly, which improves storage flexibility.
Optionally, the determining module includes:
the semantic analysis unit is used for performing semantic analysis on the multimedia data through a semantic analysis model to obtain one or more media events;
a determining unit for determining the one or more multimedia data segments from the multimedia data based on the one or more media events.
Optionally, the multimedia data is a video, and the event information further includes an event start time and an event end time;
the determination unit includes:
the first determining subunit is configured to determine, according to the event start time and the event end time of the first media event, a video segment from the video as a multimedia data segment corresponding to the first media event, where the first media event is one of the one or more media events.
Optionally, the multimedia data is an image sequence, the image sequence includes multiple frames of images, the acquisition times corresponding to the multiple frames of images are sequentially increased, and the event information further includes an event start time and an event end time;
the determination unit includes:
and the second determining subunit is used for determining an image subsequence from the image sequence as a multimedia data segment corresponding to the first media event according to the event starting time and the event ending time of the first media event, wherein the first media event is one of the one or more media events.
Optionally, the storage policy comprises a combination of one or more of the following policies:
a storage time policy for indicating a storage time duration of the multimedia data segment;
a compression rate policy for indicating a compression rate of the stored multimedia data segment;
a redundancy policy for indicating a redundancy level of the stored multimedia data segments;
a deletion policy for indicating a deletion condition of the stored multimedia data segment.
Optionally, the apparatus further comprises:
the second acquisition module is used for acquiring a plurality of sample multimedia data segments and event information of a media event corresponding to each sample multimedia data segment in the plurality of sample multimedia data segments;
and the training module is used for respectively inputting the plurality of sample multimedia data segments into an initial semantic analysis model and training the initial semantic analysis model so as to enable the output of the initial semantic analysis model to be the event information of the media event corresponding to the corresponding sample multimedia data segment in the plurality of sample multimedia data segments, thereby obtaining the semantic analysis model.
Optionally, the number of the multimedia data segments determined from the multimedia data is multiple, and the multiple multimedia data segments correspond to multiple event levels.
In a third aspect, a computer device, which may also be referred to as a multimedia data storage device, is provided. The computer device includes a processor and a memory for storing one or more multimedia data segments. Program code may be integrated in the processor, or stored in the memory and executed by the processor, and when the program code is executed, the multimedia data storage method according to the first aspect is implemented. The computer device may further comprise a communication bus for establishing a connection between the processor and the memory.
The technical effect obtained by the multimedia data storage device provided by the application is similar to the technical effect obtained by the corresponding technical means in the first aspect. That is, because different event levels correspond to different storage policies, the multimedia data storage device provided by the present application may select a storage policy corresponding to an event level according to the determined event level of the media event corresponding to each multimedia data segment, and store each multimedia data segment, so as to improve storage flexibility.
In a fourth aspect, a computer-readable storage medium is provided, in which a computer program is stored, and the computer program, when executed by a processor, implements the multimedia data storage method of the first aspect.
The technical effect obtained by the computer-readable storage medium provided by the present application is similar to the technical effect obtained by the corresponding technical means in the first aspect. That is, since different event levels correspond to different storage policies, the computer-readable storage medium provided by the present application stores each multimedia data segment using a storage policy corresponding to an event level according to the determined event level of the media event corresponding to each multimedia data segment, which can improve storage flexibility.
In a fifth aspect, there is provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the method of multimedia data storage of the first aspect described above.
The technical effect obtained by the computer program product provided by the present application is similar to the technical effect obtained by the corresponding technical means in the first aspect. That is, because different event ranks correspond to different storage policies, the computer program product provided by the present application may determine the event rank of a media event corresponding to each multimedia data segment, and select a storage policy corresponding to the event rank to store the one or more multimedia data segments, which may improve storage flexibility.
The technical scheme provided by the application at least comprises the following beneficial effects:
In this application, different event levels correspond to different storage policies. One or more multimedia data segments can be determined from the acquired multimedia data; each multimedia data segment records a media event, the media event corresponds to event information, and the event information includes the event level of the corresponding media event. The storage policy corresponding to the event level can then be selected according to the event level of the media event corresponding to each multimedia data segment, and the one or more multimedia data segments are stored using the selected storage policy, which improves storage flexibility.
Drawings
Fig. 1 is a diagram of a system architecture according to a multimedia data storage method provided in an exemplary embodiment of the present application;
FIG. 2 is a schematic block diagram of a computer device provided in an exemplary embodiment of the present application;
FIG. 3 is a flow chart of a method for storing multimedia data provided by an exemplary embodiment of the present application;
FIG. 4 is a flow chart of a method for storing multimedia data provided by another exemplary embodiment of the present application;
FIG. 5 is a flow chart of a multimedia data storage method provided by yet another exemplary embodiment of the present application;
FIG. 6 is a flow chart of a method for storing multimedia data provided by yet another exemplary embodiment of the present application;
fig. 7 is a schematic structural diagram of a multimedia data storage device according to an exemplary embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
It is to be understood that reference herein to "at least one" means one or more, and "a plurality" means two or more. In the description of this application, "/" indicates an "or" relationship; for example, A/B may indicate A or B. "And/or" herein merely describes an association between associated objects and means that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, to facilitate a clear description of the technical solutions of the embodiments of the present application, terms such as "first" and "second" are used to distinguish identical or similar items having substantially the same functions and actions. Those skilled in the art will appreciate that the terms "first", "second" and the like do not limit quantity or execution order and do not denote importance.
Fig. 1 is a system architecture diagram according to an embodiment of the present application, which provides a multimedia data storage method. Referring to fig. 1, the system architecture includes a front-end device 101 and a back-end device 102. The front-end device 101 and the back-end device 102 may communicate through a wired or wireless connection.
The front-end device 101 is configured to collect multimedia data, such as video, image sequence, and audio, and transmit the collected multimedia data to the back-end device 102.
The backend device 102 is deployed with a semantic analysis model, and is configured to perform semantic analysis on the received multimedia data through the semantic analysis model, determine one or more multimedia data segments from the multimedia data, determine an event level of a media event corresponding to each multimedia data segment, and select a corresponding storage policy to store each multimedia data segment according to the storage policy corresponding to each event level provided by the present application. The back-end device can determine the multimedia data segment according to the semantic analysis model, and can also determine the multimedia data segment from the multimedia data according to manual analysis or data quantity detection.
The backend device 102 may implement the multimedia data storage method provided in the embodiment of the present application in an online manner, that is, perform semantic analysis on the received multimedia data in real time, determine each multimedia data segment, and store the multimedia data segment according to a corresponding storage policy. Alternatively, the backend device 102 may also implement the multimedia data storage method provided in the embodiment of the present application in an offline manner. That is, the backend device 102 may first divide the received multimedia data into a plurality of initial data segments according to a preset duration or a preset file size, and temporarily store the initial data segments according to the collection time sequence, and then the backend device 102 may splice the plurality of temporarily stored initial data segments according to the collection time sequence, perform semantic analysis, determine each multimedia data segment, and store the multimedia data segment according to a corresponding storage policy.
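One way to read the offline mode is as a two-stage pipeline: temporarily store fixed-length initial data segments, then splice them in collection order and analyze the result. The sketch below is an assumption about how that could look; the method and attribute names (split_every, concat, cut) are hypothetical:

def offline_pipeline(received_stream, semantic_model, policy_by_level, storage,
                     chunk_seconds=60):
    """Offline mode: buffer fixed-length initial segments, then splice and analyze."""
    # Stage 1: divide the received data into initial data segments by a preset
    # duration and keep them temporarily in collection-time order.
    initial_segments = list(received_stream.split_every(chunk_seconds))
    if not initial_segments:
        return

    # Stage 2 (run later, e.g. when the load is low): splice the temporarily
    # stored initial segments back together in collection order and analyze them.
    spliced = initial_segments[0].concat(initial_segments[1:])
    for event in semantic_model.analyze(spliced):
        segment = spliced.cut(event.start_time, event.end_time)
        storage.save(segment, policy_by_level[event.level])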
For example, in a case that the amount of data collected by the front-end device 101 is large, the processing pressure of the back-end device 102 may be large, and in this case, the back-end device 102 may implement the multimedia data storage method provided in the embodiment of the present application in an offline manner.
In the embodiment of the present application, the front-end device 101 may include a sensor, an encoder, an acquisition card, and the like. The front-end device 101 may be a camera, for example an analog camera, a web camera or a high-definition camera. The encoder is configured to encode the collected multimedia data; when the front-end device 101 is a camera, the encoder may be located inside the camera or may be independent of the camera. In other embodiments, the front-end device 101 may also include a sound pickup, which may also be referred to as a listening head and includes a microphone, an amplifier, and the like.
In the embodiment of the present application, the back-end device 102 is a storage apparatus and may include a processor, a memory (storage device), and a controller. The storage apparatus can be a storage server, a Network Video Recorder (NVR), an IP SAN, cloud storage, or the like; the memory can be, for example, disks, solid state drives, and the like; the controller may include a multimedia distributor, a control keypad, an integrated manager, etc. In other embodiments, the back-end device 102 may also include a display, such as a monitor or a large-screen video wall.
In the embodiment of the present application, the system architecture further includes a transmission device, configured to transmit the multimedia data collected and encoded by the front-end device 101 to the back-end device 102. The transmission device may be a video line, an audio line, a network cable, an optical fiber, an optical transceiver, a switch, a router, etc.
In some possible cases, the front-end device 101 may also deploy a semantic analysis model (software) for performing semantic analysis on the collected multimedia data to obtain the media events in the multimedia data and mark the start time and end time of each media event, that is, to obtain a semantic analysis result. The front-end device 101 sends the obtained semantic analysis result to the back-end device 102; the back-end device 102 determines one or more multimedia data segments from the multimedia data according to the semantic analysis result and stores the one or more multimedia data segments according to the corresponding storage policies. Alternatively, the back-end device 102 may also perform semantic analysis on the multimedia data through its own semantic analysis model and then determine the one or more multimedia data segments according to the semantic analysis result obtained by the front-end device 101.
It should be noted that the semantic analysis model deployed in the front-end device 101 may be the same as or different from the semantic analysis model deployed in the back-end device 102. For example, in a case where the processing capability of the front-end device 101 is weak, limited by the processing capability of the front-end device 101, the semantic analysis model deployed in the front-end device 101 may be a simpler model, and the model has a lower algorithm complexity, a smaller amount of computation, and a smaller number of types of events that can be analyzed. The backend device 102 generally has a relatively high processing capability, and can deploy a relatively optimal semantic analysis model, which may have a relatively high algorithm complexity, a relatively large computation amount, and a relatively large number of types of events that can be analyzed.
Referring to fig. 2, fig. 2 is a schematic structural diagram of a computer device, which may also be referred to as a multimedia data storage device, according to an embodiment of the present application, where the computer device may be the backend device shown in fig. 1. The computer device may include one or more processors 201, a communication bus 202, memory 203, internal or external storage 205, and one or more communication interfaces 204.
The processor 201 may be a general-purpose Central Processing Unit (CPU), a Network Processor (NP), a microprocessor, or may be one or more integrated circuits such as an application-specific integrated circuit (ASIC), a Programmable Logic Device (PLD), or a combination thereof, for implementing the aspects of the present disclosure. The PLD may be a Complex Programmable Logic Device (CPLD), a field-programmable gate array (FPGA), a General Array Logic (GAL), or any combination thereof. The processor may execute the multimedia data storage method in the method embodiment of the present invention by executing the program instructions.
A communication bus 202 is used to transfer information between the above components. The communication bus 202 may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The memory 203 may be a read-only memory (ROM), and the memory 203 may be separate and connected to the processor 201 through the communication bus 202. Memory 203 may also be integrated with processor 201. Memory 203 may provide program instructions for processor 201 to execute. When the memory 203 and the processor 201 are separate, the memory 203 may also be a Random Access Memory (RAM), an electrically erasable programmable read-only memory (EEPROM), an optical disk (including a compact disk read-only memory (CD-ROM), a compact disk, a laser disk, a digital versatile disk, a blu-ray disk, etc.), a magnetic disk storage medium or other magnetic storage devices, a Solid State Disk (SSD), or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto.
The memory 205 is, for example, an optical disk (including a compact disk read-only memory (CD-ROM), compact disk, laser disk, digital versatile disk, blu-ray disk, etc.), a magnetic disk storage medium or other magnetic storage device, a Solid State Disk (SSD), or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory 205 is used for storing multimedia data segments. The communication interface 204 uses any transceiver or the like for communicating with other devices or communication networks. The communication interface 204 includes a wired communication interface, and may also include a wireless communication interface. The wired communication interface may be an ethernet interface, for example. The ethernet interface may be an optical interface, an electrical interface, or a combination thereof. The wireless communication interface may be a Wireless Local Area Network (WLAN) interface, a cellular network communication interface, or a combination thereof. The communication interface 204 may be used to provide a connection to the outside to acquire multimedia data.
In some embodiments, the computer device may include a plurality of processors 201, each of which may be a single-core processor or a multi-core processor. A processor herein may refer to one or more devices, circuits, and/or processing cores for processing data (e.g., computer program instructions).
In particular implementations, the computer device may also include an output device 206 and an input device 207, as one embodiment. The output device 206 is in communication with the processor 201 and may display information in a variety of ways. For example, the output device 206 may be a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display device, a Cathode Ray Tube (CRT) display device, a projector (projector), or the like. The input device 207 is in communication with the processor 201 and may receive user input in a variety of ways. For example, the input device 207 may be a mouse, a keyboard, a touch screen device, a sensing device, or the like.
In some embodiments, the memory 203 is used for storing program code 210 for executing the present application, and the processor 201 can execute the program code 210 stored in the memory 203. The program code may include one or more software modules, and the computer device may implement the multimedia data storage method provided in the embodiment of fig. 3 below through the processor 201 and the program code 210 in the memory 203.
Fig. 3 is a flowchart of a multimedia data storage method according to an embodiment of the present application, and is described as an example of applying the method to the backend device shown in fig. 1 and implementing the method in an online manner. Referring to fig. 3, the method includes the following steps.
Step 301: multimedia data is acquired.
In the embodiment of the present application, the front-end device may collect multimedia data, such as video (e.g., h.264 streaming media file, h.265 streaming media file, and rm, rmvb, mpeg1-4, mov, mtv, dat, wmv, avi, 3gp, amv, dmv, and flv, etc., format files), image sequence (e.g., bmp, gif, jpeg, pdf, and png), sound file (e.g., wav format file), and send the collected multimedia data to the back-end device.
For example, if the multimedia data collected by the front-end device is a video, the front-end device may collect video in real time to form a video stream and send the video stream to the back-end device. If the multimedia data collected by the front-end device is an image sequence, the front-end device may collect images in real time according to a certain collection period, for example 0.5 s, 1 s or 3 s, form the collected multi-frame images into an image sequence, and send the image sequence to the back-end device.
Step 302: one or more multimedia data segments are determined from the multimedia data, the multimedia data segments record media events, the media events correspond to event information, and the event information comprises event grades of the corresponding media events.
In the embodiment of the application, a semantic analysis model is deployed in the backend device, and the semantic analysis model can perform semantic analysis on the multimedia data to obtain one or more media events, and determine one or more multimedia data segments from the multimedia data according to the one or more media events, wherein the semantic analysis model can be any machine learning model.
It should be noted that each determined multimedia data segment is an independent multimedia file (its file format may be consistent with that of the multimedia data acquired from the front-end device). Alternatively, the multimedia data may first be divided into one or more multimedia data segments, and each obtained multimedia data segment may then be extracted to form an independent multimedia file. The following embodiments are described with a multimedia data segment as an independent multimedia file.
In this embodiment of the application, the backend device may perform semantic analysis segment by segment through the semantic analysis model according to the sequence of the acquisition time of each data frame included in the multimedia data to obtain one or more media events, where each media event corresponds to event information. The segment-by-segment may refer to frame-by-frame, or every preset number of data frames.
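Segment-by-segment analysis in collection-time order can be pictured as feeding the model a preset number of data frames at a time, as in the hedged sketch below (setting frames_per_segment to 1 corresponds to frame-by-frame analysis); the window size and the model interface are assumptions:

def analyze_segment_by_segment(data_frames, semantic_model, frames_per_segment=25):
    """Feed data frames to the model in collection-time order, a preset number at a time."""
    media_events = []
    for start in range(0, len(data_frames), frames_per_segment):
        window = data_frames[start:start + frames_per_segment]
        # The model is assumed to return zero or more media events per window,
        # each with its event information (level, start/end time, description).
        media_events.extend(semantic_model.analyze(window))
    return media_events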
It should be noted that, in addition to being determined through the semantic analysis model, the multimedia data segments may also be determined from the multimedia data through manual analysis or through data amount detection. For example, when the video picture changes frequently, it may be considered that a media event has occurred. The frequency of picture changes can be detected by measuring the amount of video frame data in different time periods; specifically, over the same duration, the data amount of a video segment whose pictures change frequently is obviously larger than that of a video segment whose pictures rarely change. Different video segments can therefore be divided into different event levels according to their average data amount per second, for example: media events corresponding to video segments with more than 2 MB of data per second are assigned the first level; media events corresponding to video segments with 1 MB to 2 MB of data per second are assigned the second level; and media events corresponding to video segments with less than 1 MB of data per second are assigned the third level.
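The data-amount heuristic can be written down directly; the 2 MB and 1 MB per-second thresholds are the ones given above, and the remaining names are illustrative:

MB = 1024 * 1024

def level_from_data_rate(segment_size_bytes, duration_seconds):
    """Classify a video segment by its average data amount per second."""
    rate = segment_size_bytes / duration_seconds
    if rate > 2 * MB:        # pictures change frequently: first level
        return 1
    if rate >= 1 * MB:       # 1 MB/s to 2 MB/s: second level
        return 2
    return 3                 # below 1 MB/s: third level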
As can be seen from the foregoing, in the embodiment of the present application, the multimedia data may be a video, a sequence of images, and the like. Thus, the following describes an implementation of determining one or more multimedia data segments from multimedia data according to the one or more media events, respectively, taking the multimedia data as a video and an image sequence, respectively, as an example.
(1) The multimedia data being video
In the case that the multimedia data is a video, the event information further includes an event start time and an event end time, and the back-end device may determine, according to the event start time and the event end time of a first media event, a video segment from the video as the multimedia data segment corresponding to the first media event, where the first media event is one of the one or more media events.
In the embodiment of the application, event lists corresponding to the event levels are stored in the back-end device, and each event list includes a plurality of event keywords. By performing semantic analysis on the video through the semantic analysis model, the back-end device can also obtain the event description information corresponding to each media event; that is, the event information may also include event description information. The back-end device may match the event description information of the first media event against the event keywords included in each event list, and if an event keyword included in one event list matches the event description information, the event level corresponding to the matched event list may be determined as the event level of the first media event. Then, the back-end device may intercept, from the video, the video segment from the event start time to the event end time corresponding to the first media event as the multimedia data segment corresponding to the first media event.
It should be noted that, since the backend device performs semantic analysis on the video through the semantic analysis model, one or more media events can be obtained, and other media events except the first media event can determine the multimedia data segment corresponding to the corresponding media event according to a processing procedure similar to the first media event, which is not described herein again.
(2) The multimedia data being a sequence of images
In the case that the multimedia data is an image sequence, the image sequence includes multiple frames of images whose collection times increase sequentially, and the event information further includes an event start time and an event end time. The back-end device may determine, according to the event start time and the event end time of a first media event, an image sub-sequence from the image sequence as the multimedia data segment corresponding to the first media event, where the first media event is also one of the one or more media events.
In the embodiment of the application, by performing semantic analysis on the image sequence through the semantic analysis model, the back-end device can also obtain the event description information corresponding to each media event. The back-end device may match the event description information of the first media event against the event keywords included in each event list, and if an event keyword included in one event list matches the event description information, the event level corresponding to the matched event list may be determined as the event level of the first media event. Then, the back-end device may select, from the image sequence, the multiple frames of images between the event start time and the event end time corresponding to the first media event as an image sub-sequence, and use the image sub-sequence as the multimedia data segment corresponding to the first media event.
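Selecting the image sub-sequence amounts to keeping the frames whose collection time falls inside the event window; a minimal sketch follows (the frame attribute name collection_time is an assumption):

def image_subsequence(image_sequence, event_start_time, event_end_time):
    """Keep the frames collected between the event start time and the event end time."""
    return [frame for frame in image_sequence
            if event_start_time <= frame.collection_time <= event_end_time]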
It should be noted that, since the backend device performs semantic analysis on the image sequence through the semantic analysis model, one or more media events can be obtained, and other media events except the first media event can determine the multimedia data segment corresponding to the corresponding media event according to a processing procedure similar to the first media event, which is not described herein again.
Optionally, for the above two cases, when intercepting the multimedia data segment from the multimedia data according to the event start time and event end time of the first media event, the back-end device may make the start time of the intercepted multimedia data segment earlier than the event start time of the first media event and the end time of the intercepted multimedia data segment later than the event end time of the first media event, so that all the multimedia data of a complete event can be saved as far as possible and more complete multimedia data can be viewed during subsequent case investigation.
Alternatively, in the case that the multimedia data is a video, the back-end device may determine the start time and end time of the multimedia data segment to be intercepted according to the event level of the first media event and the event start time and event end time of the first media event. In the case that the multimedia data is an image sequence, the back-end device may determine, according to the event level of the first media event and its event start time and event end time, the start time and end time of the multimedia data segment to be intercepted, or the start image frame and end image frame included in the multimedia data segment to be intercepted.
For example, assuming that the multimedia data is a video, for a media event whose event level is level one, the back-end device may determine that the start time of the multimedia data segment to be intercepted is 10 minutes earlier than the event start time of the corresponding media event, and that the end time of the segment is 10 minutes later than the event end time of the corresponding media event. For a media event whose event level is level two, the back-end device may determine that the start time of the segment to be intercepted is 5 minutes earlier than the event start time, and that its end time is 5 minutes later than the event end time. For a media event whose event level is level three, the back-end device may determine that the start time of the segment to be intercepted is the same as the event start time, and that its end time is the same as the event end time.
For another example, assuming that the multimedia data is an image sequence in which the collection time interval between two adjacent frames is the same (that is, the collection period, assumed here to be 5 s), for a media event whose event level is level one, the back-end device may determine that the start time of the multimedia data segment to be intercepted is 3 minutes earlier than the event start time of the corresponding media event, and that its end time is 3 minutes later than the event end time. For a media event whose event level is level two, the back-end device may determine that the start time of the segment to be intercepted is the same as the event start time, and that its end time is the same as the event end time.
For another example, assuming that the multimedia data is an image sequence, for a media event whose event level is level one, the back-end device may determine that the start image frame of the multimedia data segment to be intercepted is the image three frames before the image frame corresponding to the event start time of the corresponding media event, and that the end image frame of the segment is the image three frames after the image frame corresponding to the event end time. For a media event whose event level is level two, the back-end device may determine that the start image frame of the segment to be intercepted is the image frame corresponding to the event start time, and that the end image frame is the image frame corresponding to the event end time.
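The level-dependent padding in these examples could be captured by a small lookup of extra time per event level, as sketched below; the per-level values simply restate the video example and are not mandated by the method:

from datetime import timedelta

# Illustrative padding per event level, mirroring the video example above.
PADDING_BY_LEVEL = {1: timedelta(minutes=10), 2: timedelta(minutes=5), 3: timedelta(0)}

def intercept_window(event_level, event_start_time, event_end_time):
    """Widen the window to be intercepted according to the event level."""
    pad = PADDING_BY_LEVEL.get(event_level, timedelta(0))
    return event_start_time - pad, event_end_time + pad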
It should be noted that, in the case that the back-end device determines a plurality of multimedia data segments from the multimedia data, the plurality of multimedia data segments may correspond to a plurality of event levels. For example, different multimedia data segments may correspond to different event levels, or some of the multimedia data segments may correspond to the same event level.
Step 303: storage policies corresponding to the event levels are selected according to the event levels of the media events corresponding to the one or more multimedia data segments, where the storage policies corresponding to different event levels are different.
In the embodiment of the application, the back-end device is provided with the storage policy corresponding to each event level. After obtaining one or more multimedia data segments and determining the event level of the media event corresponding to each multimedia data segment, the back-end device can select the storage policy corresponding to the event level according to the correspondence between event levels and storage policies.
In an embodiment of the present application, the storage policy may include a combination of one or more of the following policies: a storage time policy, a compression rate policy, a deletion policy and a redundancy policy. The storage time policy is used to indicate the storage duration of the multimedia data segment, the compression rate policy is used to indicate the compression rate of the stored multimedia data segment, the deletion policy is used to indicate the deletion condition of the stored multimedia data segment, and the redundancy policy is used to indicate the redundancy level of the stored multimedia data segment.
The storage time policy corresponding to each event level may be: a multimedia data segment whose event level is level one is stored permanently; a multimedia data segment whose event level is level two is stored for X days; a multimedia data segment whose event level is level three is stored for Y days; and a multimedia data segment whose event level is level four is stored for Z days, where X, Y and Z are integers greater than 1, X is greater than Y, and Y is greater than Z.
The compression rate policy corresponding to each event level may be: the compression rate of a multimedia data segment whose event level is level one is 0 (uncompressed), the compression rate of a level-two segment is x%, the compression rate of a level-three segment is y%, and the compression rate of a level-four segment is z%, where x, y and z are all greater than 0, x is less than y, and y is less than z.
The deletion policy corresponding to each event level may be: a multimedia data segment whose event level is level one is not automatically deleted; a level-two segment is automatically deleted after being stored for I days; a level-three segment is automatically deleted after being stored for J days; and a level-four segment is automatically deleted after being stored for K days, where I, J and K are integers greater than 1, I is greater than J, and J is greater than K.
The redundancy policy corresponding to each event level may be: a multimedia data segment whose event level is level one is stored as a mirror (two identical copies of the multimedia data segment are kept); a level-two segment is stored using 3+2 redundancy (the data segment is divided into data blocks, and every 3 data blocks are protected with 2 parity blocks); and a level-three segment is stored using 3+1 redundancy (the data segment is divided into data blocks, and every 3 data blocks are protected with 1 parity block).
For example, assuming that the storage policy includes a storage time policy, a compression rate policy and a deletion policy, the storage policy corresponding to each event level may be: a multimedia data segment whose event level is level one is stored permanently, is not automatically deleted, and is not compressed; a level-two segment is stored for 90 days, compressed at a compression rate of 30% after 30 days of storage, compressed at 60% after 60 days of storage, and automatically deleted after 90 days of storage; a level-three segment is stored for 60 days, compressed at 60% after 30 days of storage, compressed at 80% after 45 days of storage, and deleted after 60 days of storage; and a level-four segment is stored for 30 days, compressed at 60% after 15 days of storage, and deleted after 30 days of storage. These policies can be selected and combined arbitrarily according to the requirements of the actual product. For example: (1) only the storage time policy may be set, without any other policy; or (2) different storage time policies may be set for different event levels while the same compression rate policy is used for all of them.
The storage policies introduced above are only examples. In actual application, the storage policies corresponding to different event levels may be set reasonably according to the requirements of various application scenarios, and the event levels may also be divided according to actual conditions. For example, the event levels may be divided into one to ten levels, with each level corresponding to one event list and one storage policy.
Step 304: the one or more multimedia data segments are stored using the selected storage policy.
In this embodiment, after selecting the storage policy corresponding to each event level, the backend device may store each of the one or more multimedia data segments using the storage policy selected for it.
For example, the backend device may periodically obtain the storage policy corresponding to each event level and perform archival processing on the stored multimedia data, for example, compressing, adding redundancy to, or deleting a multimedia data segment after its storage time expires.
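The periodic archiving described here could, under the same assumed configuration format as the earlier sketch, look roughly like the following; the segment attributes and hooks (stored_at, compression, compress, delete) are hypothetical and only serve to illustrate the policy application.

```python
from datetime import datetime

def archive_segments(segments, policies, now=None):
    """Periodic archiving pass: apply the policy of each segment's event level.

    `segments` is assumed to be an iterable of records exposing event_level,
    stored_at, compression, compress() and delete(); these names are
    illustrative only.
    """
    now = now or datetime.now()
    for seg in segments:
        policy = policies[seg.event_level]
        age_days = (now - seg.stored_at).days
        retention = policy["retention_days"]
        # Delete the segment once its retention period has expired.
        if policy["auto_delete"] and retention is not None and age_days >= retention:
            seg.delete()
            continue
        # Otherwise apply the strongest compression step that is already due.
        due_rates = [rate for due_day, rate in policy["compression_schedule"]
                     if age_days >= due_day]
        if due_rates and seg.compression < max(due_rates):
            seg.compress(max(due_rates))
```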
According to the embodiment of the application, semantic analysis is used to assign an event level to the multimedia data, where the event level represents the importance of the media event. This enables reasonable and effective management of multimedia data of different importance: important multimedia data can be automatically stored for a long time, while multimedia data of lower importance can be quickly aged, archived and deleted, reducing storage cost.
It should be noted that the semantic analysis model introduced above is obtained by training in advance on multimedia samples. The training process may be performed on the backend device or on another computer device; for ease of description, the following assumes it is performed on the backend device.
In this embodiment of the present application, the backend device may obtain a plurality of sample multimedia data segments and the event information of the media event corresponding to each sample multimedia data segment, input the plurality of sample multimedia data segments into an initial semantic analysis model, and train the initial semantic analysis model so that its output is the event information of the media event corresponding to each sample multimedia data segment, thereby obtaining the semantic analysis model.
Illustratively, the back-end device may obtain a plurality of sample multimedia data segments, where the event information of the media event corresponding to each sample multimedia data segment is the event description information of the corresponding media event, and train the initial semantic analysis model in a supervised learning manner to obtain the semantic analysis model.
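As one possible way to realize this supervised training, the sketch below uses PyTorch with pre-extracted feature tensors and event-class labels; the application does not prescribe any particular framework, feature extraction, or label encoding, so all of these are assumptions.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

def train_semantic_model(model: nn.Module, features: torch.Tensor,
                         labels: torch.Tensor, epochs: int = 10) -> nn.Module:
    """Supervised training sketch: `features` are pre-extracted from the sample
    multimedia data segments, `labels` encode the annotated event information
    (here, an event-class index). Framework and encoding are assumptions."""
    loader = DataLoader(TensorDataset(features, labels), batch_size=32, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)   # compare predicted event info with labels
            loss.backward()
            optimizer.step()
    return model
```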
Fig. 4 is a flowchart of another multimedia data storage method according to an embodiment of the present application. Referring to fig. 4, assuming that the multimedia data is a video, the front-end device transmits the acquired video to the back-end device in the form of a video stream. The back-end device performs semantic analysis on the video through the semantic analysis model to determine video segments, determines the event level of each video segment according to the configured event list and event levels, and stores each video segment according to the corresponding storage policy. The back-end device may store the multimedia data in an online analysis mode or an offline analysis mode: when the data volume is small, the back-end device may use the online analysis mode, and when the data volume is large, the offline analysis mode.
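A minimal sketch of the mode selection, assuming an incoming data-rate threshold; the application only states that small data volumes suit online analysis and large volumes offline analysis, so the rate measure and threshold value here are illustrative.

```python
def choose_analysis_mode(incoming_mbps: float, threshold_mbps: float = 100.0) -> str:
    """Pick the analysis mode from the incoming data rate.

    The rate measure and threshold are assumptions; the application only states
    that small data volumes suit online analysis and large volumes offline analysis.
    """
    return "online" if incoming_mbps <= threshold_mbps else "offline"
```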
Fig. 5 is a flowchart of another multimedia data storage method provided in an embodiment of the present application. Referring to fig. 5, taking a video monitoring system as an example, the video monitoring system includes a front-end device and a back-end device, the back-end device includes a semantic analysis module, a data storage module, a data archiving module and a configuration module, and the back-end device stores the multimedia data in the online analysis mode. The front-end device collects multimedia data and sends it to the semantic analysis module in real time. The semantic analysis module performs semantic analysis on the multimedia data, determines media events and records the start time and end time of each media event; it obtains the event levels and event lists from the configuration module periodically or when triggered by other operations, determines the event level corresponding to each media event, organizes the event information corresponding to each media event into structured data, and sends the multimedia data and the structured data to the data storage module. The semantic analysis module may also send a data reception response to the front-end device after receiving the multimedia data. The data storage module persistently stores the multimedia data and stores, in structured form, the start and end time and the corresponding event level of each media event. The data archiving module may determine each multimedia data segment according to the start and end time and the event level of the corresponding media event, obtain the storage policy corresponding to each event level from the configuration module, and compress, retain or delete each multimedia data segment according to the corresponding storage policy.
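The structured data handed from the semantic analysis module to the data storage module in the online mode could be modeled as in the sketch below; the record layout and the analyzer, configuration and storage interfaces are assumptions, not interfaces defined by the application.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class MediaEventRecord:
    """Structured data for one media event (illustrative layout)."""
    camera_id: str
    event_name: str      # e.g. "vehicle collision"
    event_level: int     # 1 (most important) .. 4
    start_time: datetime
    end_time: datetime

def handle_stream_chunk(chunk, analyzer, config, storage):
    """Online mode: analyze a chunk, classify detected events, persist both the
    media data and the structured records. All interfaces here are hypothetical."""
    for event in analyzer.detect_events(chunk):        # semantic analysis
        level = config.level_for(event.name)           # event list -> event level
        record = MediaEventRecord(chunk.camera_id, event.name, level,
                                  event.start_time, event.end_time)
        storage.persist(chunk, record)                 # persistent + structured storage
```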
Fig. 6 is a flowchart of another multimedia data storage method provided in an embodiment of the present application. Referring to fig. 6, still taking the video surveillance system as an example, the back-end device stores the multimedia data in the offline analysis mode. The front-end device collects multimedia data and sends it to the semantic analysis module in real time; the semantic analysis module directly forwards the multimedia data to the data storage module, the data storage module temporarily stores the multimedia data and sends a data reception response to the semantic analysis module, and the semantic analysis module may also send a data reception response to the front-end device. The semantic analysis module may later obtain the multimedia data from the data storage module and perform semantic analysis on it; the processing of the subsequent modules is as described for fig. 5 and is not repeated here.
In the embodiment of the application, corresponding event levels, event lists and storage policies can be formulated for the application scenarios of various industries. In practical application, a graphical user interface (GUI) can be provided for the user to define the event levels, the event list corresponding to each level, and the storage policy corresponding to each level. In addition, typical scene templates can be provided for the user's reference, each template offering event levels, event lists and storage policies set according to experience for the user to select.
The present solution is further described in an application scenario of the transportation industry.
For the traffic industry, users are mainly concerned with handling accident videos. Such videos are of high importance, that is, high value: they can be used for accident adjudication and provide a historical record for cases, and therefore need to be retained long term. Still videos with no person or vehicle movement, such as those captured in the early morning hours, are of low importance and need to be quickly aged and deleted to reduce storage cost and free storage space for high-value videos.
Illustratively, a traffic scene template may be as follows: the event list with the first-level event level includes keywords of accident-related events, such as vehicle collision and vehicle scraping, and the storage policy is long-term storage with no automatic deletion and no compression; the event list with the second-level event level includes normal traffic with person movement, normal traffic with vehicle movement, and the like, and the storage policy is a storage time of 90 days, compression at a compression rate of 30% after 30 days of storage, compression at a compression rate of 60% after 60 days of storage, and automatic deletion after 90 days of storage; the event list with the third-level event level includes still video, and the storage policy is a storage time of 30 days, compression at a compression rate of 60% after 15 days of storage, and automatic deletion after 30 days of storage.
The user can also modify the configuration. For example, the event list with the first-level event level may additionally include running a red light, not yielding to pedestrians, and the like, and the storage duration, compression policy, and so on may also be modified.
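Such a traffic scene template, together with a user modification of the level-one event list, could be represented as in the following sketch; the keyword strings and field names are illustrative only and reuse the assumed policy format from the earlier sketch.

```python
# Illustrative traffic scene template; keyword strings and field names are assumptions.
TRAFFIC_TEMPLATE = {
    1: {"events": ["vehicle collision", "vehicle scraping"],
        "policy": {"retention_days": None, "compression_schedule": [], "auto_delete": False}},
    2: {"events": ["normal traffic with person movement", "normal traffic with vehicle movement"],
        "policy": {"retention_days": 90,
                   "compression_schedule": [(30, 0.30), (60, 0.60)],
                   "auto_delete": True}},
    3: {"events": ["still video"],
        "policy": {"retention_days": 30,
                   "compression_schedule": [(15, 0.60)],
                   "auto_delete": True}},
}

# A user may extend the level-one event list, as described above.
TRAFFIC_TEMPLATE[1]["events"] += ["running a red light", "not yielding to pedestrians"]
```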
Accident description: assume that at 13:50 on 2019/09/11 a traffic scraping accident happens at a certain crossroad, where a turning vehicle carelessly touches a vehicle driving straight ahead; from 13:52 to 14:30 the two parties negotiate compensation on their own, reach an agreement and leave, and traffic recovers.
Taking the multimedia data storage method shown in fig. 5 as an example, the front-end device collects and transmits the video stream to the semantic analysis module of the back-end device in real time. The semantic analysis module performs semantic analysis on the video stream and identifies that two vehicles scrape each other at 13:50 and that the event ends at 14:30; according to the configuration information obtained from the configuration module, including the event levels and event lists, it automatically identifies that the event level of the 2019/09/11 13:47-14:33 video segment (3 minutes before and after the start and end of the event) is level one, forms the semantic analysis result into structured data, and sends the video and the structured data to the data storage module. The data storage module persistently stores the video data and stores the structured data. The video archiving module automatically cleans up historical data every 15 minutes, and the retention period of still video is 30 days. On 2019/10/11, during cleanup, the structured data is queried and it is found that the event level of the 2019/09/11 13:47-14:33 video segment is level one, so that segment is stored long term without automatic deletion or compression, and the still video before 13:47 is deleted. In this way, if either party to the accident later goes back on the negotiated settlement, the traffic administrator can view the relevant video segment to help handle the accident accurately.
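The cleanup step in this walkthrough amounts to querying the structured data before deleting expired still video, roughly as in the sketch below; the index and storage interfaces are hypothetical and only mirror the behavior described above.

```python
from datetime import datetime, timedelta

def clean_still_video(index, storage, now: datetime, still_retention_days: int = 30):
    """Periodic cleanup as in the walkthrough above: still video past its retention
    period is deleted, but spans covered by a structured record with a higher event
    level are kept under that level's policy. `segments_older_than` and
    `events_between` are hypothetical interfaces."""
    cutoff = now - timedelta(days=still_retention_days)
    for segment in storage.segments_older_than(cutoff):
        overlapping = index.events_between(segment.start_time, segment.end_time)
        if any(event.event_level == 1 for event in overlapping):
            continue  # level-one segments are stored long term and never auto-deleted
        storage.delete(segment)
```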
The present solution is next described in an application scenario of a campus.
For a campus, especially a key warehouse, the handling of fire events and of outside personnel is of particular concern. Fire videos are of very high importance and are high-value data that need to be stored with priority, while still videos are of lower importance and are low-value data.
Illustratively, a campus scene template may be as follows: the event list with the first-level event level includes keywords of fire-related events, such as fire, smoke and the like, and the storage policy is long-term storage with no automatic deletion and no compression; the event list with the second-level event level includes intrusion by outside personnel, intrusion by outside vehicles, and the like, and the storage policy is a storage time of 30 days, compression at a compression rate of 30% after 7 days of storage, compression at a compression rate of 60% after 15 days of storage, and automatic deletion after 30 days of storage; the event list with the third-level event level includes movement of regular (white-listed) personnel, movement of regular vehicles, and the like, and the storage policy is a storage time of 15 days, compression at a compression rate of 60% after 10 days of storage, and automatic deletion after 15 days of storage; the event list with the fourth-level event level includes no person movement, no vehicle movement, and the like, and the storage policy is a storage time of 7 days, compression at a compression rate of 60% after 4 days of storage, and automatic deletion after 7 days of storage.
The user can also modify the configuration. For example, the event list with the first-level event level may additionally include the collapse of stacked goods, and the storage duration, compression policy, and so on may also be modified.
Event description: the campus warehouse takes a routine inventory of goods every 10 days. At 09:00 on 2019/09/11, an outside stranger A enters the campus, wanders in the campus for 10 minutes, and then enters a monitoring blind spot; at 09:50 A appears again, and at 10:00 A leaves the campus. On 2019/09/20, during a routine check, the campus warehouse manager finds that goods have been stolen; by checking the videos with importance levels 2 and 3, person A is identified as a suspect, and the police are notified. After the police review the video, person A is caught, cannot deny the theft, and compensates for the stolen goods.
Still taking the multimedia data storage method shown in fig. 5 as an example, the front-end device collects and transmits the video stream to the semantic analysis module of the back-end device in real time. Through semantic analysis with white-list comparison, the semantic analysis module identifies that a stranger enters the campus at 09:00 and moves within the video monitoring range for 10 minutes, and automatically identifies that the event level of the 2019/09/11 08:57-09:13 video segment (3 minutes before and after the start and end of the event) is level two, forms the semantic analysis result into structured data, and sends the video and the structured data to the data storage module. The data storage module persistently stores the video data and stores the structured data. The semantic analysis module then identifies that person A appears again at 09:50 and leaves the campus after 10:00, and automatically identifies that the event level of the 2019/09/11 09:47-10:03 video segment (3 minutes before and after the start and end of the event) is level two, forms the semantic analysis result into structured data, and sends the video and the structured data to the data storage module. The data storage module persistently stores the video data and stores the structured data.
The video archiving module automatically cleans up historical data every 15 minutes, and the retention period of still video is 7 days. At 09:00 on 2019/09/18, during cleanup, the structured data is queried and it is found that the 2019/09/11 08:57-09:13 video segment has event level two and is to be retained for 30 days, so the still video before 08:47 is deleted and the 08:57-09:13 video segment is kept. At 10:00 on 2019/09/18, during cleanup, the structured data is queried and it is found that the 2019/09/11 09:47-10:03 video segment has event level two and is to be retained for 30 days, so the still video before 09:47 is deleted and the 09:47-10:03 video segment is kept. In this way, when the campus warehouse manager makes a routine check on 2019/09/20 and finds that goods have been stolen, the videos with importance level 2 from 2019/09/11 can be checked, person A is identified as a suspect, and the police are notified.
The above describes the multimedia data storage method for the case where the multimedia data is a video or an image sequence. If the multimedia data is individual pictures with no temporal or event relationship between them, the semantic analysis model in this solution can instead be trained on a large number of pictures and the type or importance level corresponding to each picture. The back-end device can determine the type, importance level and the like of each picture through semantic analysis, and select the corresponding storage policy to store each picture according to the correspondence between the type or importance level and the storage policy.
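For the picture case, the method reduces to a per-image classification followed by a policy lookup, as in this minimal sketch; the classifier and storage interfaces are assumptions for illustration.

```python
def store_picture(picture, classifier, policies, storage):
    """Classify one picture and store it under the matching policy.
    The classifier and storage interfaces are assumptions for illustration."""
    level = classifier.importance_level(picture)   # e.g. 1..4
    storage.save(picture, policy=policies[level])
```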
In summary, in the embodiment of the present application, different event levels correspond to different storage policies. One or more multimedia data segments may be determined from the obtained multimedia data, where each multimedia data segment records a media event, the media event corresponds to event information, and the event information includes the event level of the corresponding media event. According to the event level of the media event corresponding to each multimedia data segment, the storage policy corresponding to that event level may be selected to store the one or more multimedia data segments, thereby improving storage flexibility.
Fig. 7 is a schematic structural diagram of a multimedia data storage apparatus provided in an embodiment of the present application, where the multimedia data storage apparatus may be implemented by software, hardware, or a combination of the two as part or all of a computer device, which may also be referred to as a multimedia data storage device, and the computer device may be a backend device shown in fig. 1. Referring to fig. 7, the apparatus includes: a first obtaining module 701, a determining module 702, a selecting module 703 and a storing module 704.
A first obtaining module 701, configured to obtain multimedia data;
a determining module 702, configured to determine one or more multimedia data segments from the multimedia data, where a media event is recorded in a multimedia data segment, and the media event corresponds to event information, where the event information includes an event rating of the corresponding media event;
a selecting module 703, configured to select, according to the event level of the media event corresponding to the one or more multimedia data segments, a storage policy corresponding to the event level;
a storage module 704 for storing the one or more multimedia data segments using the selected storage policy.
Optionally, the determining module 702 includes:
the semantic analysis unit is used for carrying out semantic analysis on the multimedia data through a semantic analysis model to obtain one or more media events;
a determining unit for determining the one or more multimedia data segments from the multimedia data according to the one or more media events.
Optionally, the multimedia data is a video, and the event information further includes an event start time and an event end time;
the determination unit includes:
the first determining subunit is configured to determine, according to the event start time and the event end time of the first media event, a video segment from the video as a multimedia data segment corresponding to the first media event, where the first media event is one of the one or more media events.
Optionally, the multimedia data is an image sequence, the image sequence includes multiple frames of images, the acquisition times corresponding to the multiple frames of images are sequentially increased, and the event information further includes an event start time and an event end time;
the determination unit includes:
and the second determining subunit is used for determining an image subsequence from the image sequence as a multimedia data segment corresponding to the first media event according to the event starting time and the event ending time of the first media event, wherein the first media event is one of the one or more media events.
Optionally, the storage policy comprises a combination of one or more of the following policies:
a storage time policy for indicating a storage time length of the multimedia data segment;
a compression rate policy for indicating a compression rate of the stored multimedia data segment;
a redundancy policy for indicating a redundancy level of the stored multimedia data segments;
a deletion policy for indicating a deletion condition of the stored multimedia data segment.
Optionally, the apparatus 700 further comprises:
the second acquisition module is used for acquiring a plurality of sample multimedia data segments and event information of a media event corresponding to each sample multimedia data segment in the plurality of sample multimedia data segments;
and the training module is used for respectively inputting the plurality of sample multimedia data segments into an initial semantic analysis model and training the initial semantic analysis model so as to enable the output of the initial semantic analysis model to be the event information of the media event corresponding to the corresponding sample multimedia data segment in the plurality of sample multimedia data segments, thereby obtaining the semantic analysis model.
Optionally, the number of the multimedia data segments determined from the multimedia data is multiple, and the multiple multimedia data segments correspond to multiple event levels.
In the embodiment of the application, different event levels correspond to different storage policies. One or more multimedia data segments may be determined from the obtained multimedia data, where each multimedia data segment records a media event, the media event corresponds to event information, and the event information includes the event level of the corresponding media event. According to the event level of the media event corresponding to each multimedia data segment, the storage policy corresponding to that event level may be selected to store the one or more multimedia data segments, thereby improving storage flexibility.
It should be noted that: in the multimedia data storage device provided in the above embodiment, when storing multimedia data, only the division of the above functional modules is used for illustration, and in practical applications, the above function distribution may be completed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules to complete all or part of the above described functions. In addition, the multimedia data storage device provided by the above embodiment and the multimedia data storage method embodiment belong to the same concept, and specific implementation processes thereof are detailed in the method embodiment and are not described herein again.
In the above embodiments, the implementation may be wholly or partly realized by software, hardware, firmware, or any combination thereof. When implemented in software, it may be implemented wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or another programmable device. The computer instructions may be stored on a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wirelessly (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., digital versatile disc (DVD)), or a semiconductor medium (e.g., solid state disk (SSD)), among others. It is noted that the computer-readable storage medium referred to herein may be a non-volatile storage medium, in other words, a non-transitory storage medium.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only exemplary of the present application and should not be taken as limiting the present application, and any modifications, equivalents, improvements and the like that are made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (8)

1. A method for storing multimedia data, the method comprising:
acquiring multimedia data;
performing semantic analysis on the multimedia data through a semantic analysis model to obtain a first media event and a second media event; each media event corresponds to event information, and the event information comprises an event grade and an event starting time of the media event;
determining a start time of a first multimedia data segment to be intercepted according to the event level of the first media event and the event start time of the first media event, wherein the start time of the first multimedia data segment is earlier than the event start time of the first media event by a first duration threshold;
determining the starting time of a second multimedia data segment to be intercepted according to the event grade of the second media event and the event starting time of the second media event, wherein the starting time of the second multimedia data segment is earlier than the event starting time of the second media event by a second duration threshold; the event rating of the first media event is different from the event rating of the second media event, and the first duration threshold is different from the second duration threshold;
determining the first multimedia data segment and the second multimedia data segment from the multimedia data according to the starting time of the first multimedia data segment and the starting time of the second multimedia data segment respectively;
respectively selecting a first storage strategy and a second storage strategy corresponding to the event grades according to the event grades of the first media event and the second media event, wherein the first storage strategy is different from the second storage strategy;
storing the first multimedia data segment and the second multimedia data segment using the first storage policy and the second storage policy, respectively.
2. The method of claim 1, wherein the storage policy comprises a combination of one or more of the following policies:
a storage time policy for indicating a storage time length of the multimedia data segment;
a compression rate policy for indicating a compression rate of the stored multimedia data segment;
a redundancy policy for indicating a redundancy level of the stored multimedia data segments;
a deletion policy for indicating a deletion condition of the stored multimedia data segment.
3. The method of claim 1, wherein prior to the semantically analyzing the multimedia data by the semantic analysis model, the method further comprises:
acquiring a plurality of sample multimedia data segments and event information of a media event corresponding to each sample multimedia data segment in the plurality of sample multimedia data segments;
and respectively inputting the plurality of sample multimedia data segments into an initial semantic analysis model, and training the initial semantic analysis model so as to enable the output of the initial semantic analysis model to be the event information of the media event corresponding to the corresponding sample multimedia data segment in the plurality of sample multimedia data segments, thereby obtaining the semantic analysis model.
4. A multimedia data storage apparatus, the apparatus comprising:
the first acquisition module is used for acquiring multimedia data;
the determining module is used for performing semantic analysis on the multimedia data through a semantic analysis model to obtain a first media event and a second media event; each media event corresponds to event information, and the event information comprises an event grade and an event starting time of the media event; determining a start time of a first multimedia data segment to be intercepted according to the event level of the first media event and the event start time of the first media event, wherein the start time of the first multimedia data segment is earlier than the event start time of the first media event by a first duration threshold; determining the starting time of a second multimedia data segment to be intercepted according to the event grade of the second media event and the event starting time of the second media event, wherein the starting time of the second multimedia data segment is earlier than the event starting time of the second media event by a second duration threshold; the event level of the first media event is different from the event level of the second media event, and the first duration threshold is different from the second duration threshold; determining the first multimedia data segment and the second multimedia data segment from the multimedia data according to the starting time of the first multimedia data segment and the starting time of the second multimedia data segment respectively;
a selection module, configured to select a first storage policy and a second storage policy corresponding to the event ratings of the first media event and the second media event, respectively, according to the event ratings of the first media event and the second media event, where the first storage policy is different from the second storage policy;
and the storage module is used for storing the first multimedia data segment and the second multimedia data segment by using the selected first storage strategy and the selected second storage strategy respectively.
5. The apparatus of claim 4, wherein the storage policy comprises a combination of one or more of:
a storage time policy for indicating a storage time length of the multimedia data segment;
a compression rate policy for indicating a compression rate of the stored multimedia data segment;
a redundancy policy for indicating a redundancy level of the stored multimedia data segments;
a deletion policy for indicating a deletion condition of the stored multimedia data segment.
6. The apparatus of claim 4, wherein the apparatus further comprises:
the second acquisition module is used for acquiring a plurality of sample multimedia data segments and event information of a media event corresponding to each sample multimedia data segment in the plurality of sample multimedia data segments;
and the training module is used for respectively inputting the plurality of sample multimedia data segments into an initial semantic analysis model and training the initial semantic analysis model so as to enable the output of the initial semantic analysis model to be the event information of the media event corresponding to the corresponding sample multimedia data segment in the plurality of sample multimedia data segments, thereby obtaining the semantic analysis model.
7. A multimedia data storage device, the multimedia data storage device comprising:
a processor for executing program code to perform the method of any one of claims 1-3;
a memory for storing the first multimedia data segment and the second multimedia data segment.
8. A computer-readable storage medium, characterized in that a computer program is stored in the storage medium, which computer program, when being executed by a processor, carries out the steps of the method according to any one of claims 1-3.
CN202080000516.XA 2020-03-20 2020-03-20 Multimedia data storage method, device, equipment, storage medium and program product Active CN113711619B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/080329 WO2021184333A1 (en) 2020-03-20 2020-03-20 Multimedia data storage method, apparatus, device, storage medium, and program product

Publications (2)

Publication Number Publication Date
CN113711619A CN113711619A (en) 2021-11-26
CN113711619B true CN113711619B (en) 2022-12-06

Family

ID=77769795

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202080000516.XA Active CN113711619B (en) 2020-03-20 2020-03-20 Multimedia data storage method, device, equipment, storage medium and program product

Country Status (2)

Country Link
CN (1) CN113711619B (en)
WO (1) WO2021184333A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114566058B (en) * 2022-04-30 2022-08-30 山东通维信息工程有限公司 Management method and system for operation and maintenance monitoring data of highway
CN117350518B (en) * 2023-12-04 2024-03-29 深圳智慧园区信息技术有限公司 Intelligent park emergency resource scheduling method and system

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7496272B2 (en) * 2003-03-14 2009-02-24 Pelco, Inc. Rule-based digital video recorder
CN101207799A (en) * 2007-11-22 2008-06-25 深圳市同洲电子股份有限公司 System and method for storing program and program ordering system
CN101382937B (en) * 2008-07-01 2011-03-30 深圳先进技术研究院 Multimedia resource processing method based on speech recognition and on-line teaching system thereof
EP2487609A1 (en) * 2011-02-07 2012-08-15 Alcatel Lucent A cache manager for segmented multimedia and corresponding method for cache management
CN102298604A (en) * 2011-05-27 2011-12-28 中国科学院自动化研究所 Video event detection method based on multi-media analysis
US10649655B2 (en) * 2016-09-30 2020-05-12 Western Digital Technologies, Inc. Data storage system with multimedia assets
CN106843767A (en) * 2017-01-25 2017-06-13 上海摩软通讯技术有限公司 The memory space method for cleaning and mobile terminal of a kind of terminal
CN109962944B (en) * 2017-12-22 2020-09-25 杭州海康威视系统技术有限公司 Media data storage method, storage device and electronic equipment

Also Published As

Publication number Publication date
WO2021184333A1 (en) 2021-09-23
CN113711619A (en) 2021-11-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant