CN112969068A - Monitoring video data storage and playing method and device - Google Patents

Publication number: CN112969068A (application CN202110542891.3A)
Authority: China (CN)
Prior art keywords: sampling, sound, picture, video, pointed
Legal status: Granted
Application number: CN202110542891.3A
Other languages: Chinese (zh)
Other versions: CN112969068B
Inventor: 雷雪萍
Current Assignee: Sichuan Shangtou Information Technology Co Ltd
Original Assignee: Sichuan Shangtou Information Technology Co Ltd
Priority application: CN202110542891.3A
Publications: CN112969068A (application), CN112969068B (grant)
Legal status: Active

Classifications

    • H04N19/132: Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking (under H04N19/00 digital video coding, H04N19/10 adaptive coding, H04N19/102 adaptive coding characterised by the element, parameter or selection affected or controlled)
    • H04N5/04: Synchronising (under H04N5/00 Details of television systems)
    • H04N5/76: Television signal recording (under H04N5/00 Details of television systems)
    • H04N7/18: Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast (under H04N7/00 Television systems)

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The invention provides a method and device for storing and playing surveillance video. Sound is extracted from a video segment and sampled at a first frequency; pictures are extracted from the same segment and sampled at a second frequency, where the first frequency is N times the second. Every N sound samples correspond in order to one picture frame and carry a pointer to that picture. The method compares whether the picture pointed to by the (i+N)-th sound sample differs from the picture pointed to by the i-th sound sample; if it has changed, traversal continues; if not, the (i+N)-th sound sample's pointer is redirected to the picture pointed to by the i-th sound sample. The sound samples and the pictures still referenced by pointers are then stored, and the steps are repeated for the remaining video. The invention lets surveillance video retain its sound while being compressed at high definition, and avoids audio desynchronization and stuttering.

Description

Monitoring video data storage and playing method and device
Technical Field
The invention relates to the field of data storage, and in particular to a method and device for storing and playing surveillance video data.
Background
Surveillance video has distinctive characteristics: many channels, very large data volume, and long stretches of unchanging footage. Current surveillance video storage has at least the following problems:
1. The volume of surveillance video data is huge. To allow long-term storage, existing approaches usually store it at low bit rate and low resolution, but video stored this way is not clear enough when retrieved, and details are lost when the picture is enlarged.
2. To address this, one current storage method runs motion detection on the video: if a moving object appears between two consecutive frames, the video is stored; otherwise it is discarded. This has two drawbacks. First, because only motion video is stored, the stored data is fragmented; managers must open the video segment by segment, playback is not continuous, and operation is inconvenient. Second, because segments without motion are discarded, the background sound during motionless periods is also discarded, and background sound is often important: for example, two people in the video may talk without moving, and if only moving pictures are stored the conversation is lost.
3. Some existing standards compress video using motion detection. Taking H.265 as an example, video can be compressed substantially through intra-frame prediction, inter-frame prediction, transform, quantization, deblocking filtering, entropy coding and so on. The serious problem with such standards is their enormous computational cost. They can cope with high-definition movie playback and single-channel video processing, but surveillance video is characterized by a very large number of channels: even a small campus typically has tens or hundreds of them. Encoding all these channels with H.265 simultaneously puts huge pressure on the server, hardware costs rise rapidly, and server resources are easily exhausted, affecting the normal operation of other services. Moreover, because processing is slow, video overflow easily occurs in a surveillance scenario: if one minute of footage takes two minutes to process while the video stream keeps arriving, subsequent video cannot be processed in time and storage overflows.
4. At present, most video files store audio and picture information synchronized in time, using a single time axis to align audio with video frames. Motion-detection compression algorithms generally need to reference preceding or following frames, and the time-synchronization process is complex, so when many channels are processed simultaneously or server performance is low, synchronization anomalies easily arise: sound that is delayed, ahead of the picture, or out of sync altogether. Many surveillance products on the market that use H.265 compression exhibit this audio desynchronization problem.
Disclosure of Invention
To solve the problems in the background art, and targeting the characteristics of surveillance video, this application improves on the prior art and provides a method and device for storing and playing surveillance video data.
According to an aspect of the present invention, there is provided a surveillance video data storage method comprising the steps of: caching surveillance video for a period of time to obtain a first video segment; extracting the sound of the surveillance video from the first video segment and sampling it at a first frequency to obtain sound samples; extracting the pictures of the surveillance video from the first video segment and sampling them at a second frequency to obtain picture samples, where the first frequency is N times the second frequency and N is a positive integer; initializing the first sound sample to correspond to the first picture sample, with every N sound samples corresponding in order to one picture frame and carrying a pointer to the corresponding picture; traversing the sound samples, and on reaching the (i+N)-th sound sample, comparing whether the picture pointed to by the (i+N)-th sound sample differs from the picture pointed to by the i-th sound sample; if it has changed, continuing the traversal; if not, redirecting the (i+N)-th sound sample's pointer to the picture pointed to by the i-th sound sample; storing the sound samples and the pictures pointed to by the sound-sample pointers; and after the currently cached surveillance video has been processed, reading subsequent surveillance video into the cache and repeating the steps.
According to one aspect of the invention, each picture sample is divided into X×Y grid cells, where X and Y are positive integers greater than 1, and every N sound samples carry X×Y pointers, each pointing to one grid cell. The traversal step then becomes: on reaching the (i+N)-th sound sample, compare, for each of its X×Y pointers, whether the cell it points to differs from the cell pointed to by the corresponding pointer of the i-th sound sample; where a cell has changed, continue the traversal; where it has not, redirect the corresponding pointer of the (i+N)-th sound sample to the cell pointed to by the corresponding pointer of the i-th sound sample.
According to an aspect of the present invention, if no change has been detected K consecutive times, the picture pointed to by the (i+N)-th sound sample is additionally compared with the picture pointed to by the (i-K×N)-th sound sample; if a change is found, the (i+N)-th sound sample's pointer is left unchanged.
According to an aspect of the present invention, there is provided a method for playing surveillance video data, used to play video stored by the above surveillance video data storage method. It comprises the steps of: reading surveillance video stored by the storage method; reading the corresponding picture samples with the sound samples as the reference and back-deriving the video time axis from the sound information; and playing the video according to the time axis, the sound samples and the picture samples.
According to one aspect of the present invention, there is provided a surveillance video data storage device comprising: a cache module for caching surveillance video for a period of time to obtain a first video segment; a first sampling module for extracting the sound of the surveillance video from the first video segment and sampling it at a first frequency to obtain sound samples; a second sampling module for extracting the pictures of the surveillance video from the first video segment and sampling them at a second frequency to obtain picture samples, where the first frequency is N times the second frequency and N is a positive integer; an initialization module for initializing the first sound sample to correspond to the first picture sample, with every N sound samples corresponding in order to one picture frame and carrying a pointer to the corresponding picture; a traversal module for traversing the sound samples and, on reaching the (i+N)-th sound sample, comparing whether the picture pointed to by the (i+N)-th sound sample differs from the picture pointed to by the i-th sound sample, continuing the traversal if it has changed and redirecting the (i+N)-th sound sample's pointer to the picture pointed to by the i-th sound sample if it has not; a storage module for storing the sound samples and the pictures pointed to by the sound-sample pointers; and a loop module for reading subsequent surveillance video into the cache after the currently cached video has been processed, and repeating the steps.
According to an aspect of the invention, the surveillance video data storage device further comprises a partitioning module for dividing each picture sample into X×Y grid cells, where X and Y are positive integers greater than 1, every N sound samples carrying X×Y pointers, each pointing to one grid cell. The traversal module is then configured, on reaching the (i+N)-th sound sample, to compare, for each of its X×Y pointers, whether the cell it points to differs from the cell pointed to by the corresponding pointer of the i-th sound sample; where a cell has changed, the traversal continues; where it has not, the corresponding pointer of the (i+N)-th sound sample is redirected to the cell pointed to by the corresponding pointer of the i-th sound sample.
According to an aspect of the present invention, the surveillance video data storage device further includes a detection module configured, when no change has been detected K consecutive times, to compare whether the picture pointed to by the (i+N)-th sound sample differs from the picture pointed to by the (i-K×N)-th sound sample; if a change is found, the (i+N)-th sound sample's pointer is left unchanged.
According to an aspect of the present invention, there is further provided a device for playing surveillance video data, configured to play video stored by the surveillance video data storage method described in this application, and comprising: a reading module for reading surveillance video stored by the storage method; a computing module for reading the corresponding picture samples with the sound samples as the reference and back-deriving the video time axis from the sound information; and a playing module for playing the video according to the time axis, the sound samples and the picture samples.
In the technical scheme provided by the invention, deleting unchanged picture frames greatly reduces the volume of surveillance video while keeping motion footage sharp; using the sound sampling signal as the control axis preserves all background sound; simple comparison between preceding and following frames keeps the algorithm's complexity low; and the pointer-based correspondence between sound samples and picture samples eliminates audio desynchronization and stuttering.
Drawings
A more complete understanding of exemplary embodiments of the present invention may be had by reference to the following drawings:
fig. 1 shows the correspondence of sound sample frames and picture sample frames;
FIG. 2 illustrates pointing a sound sample frame's pointer to an unchanged picture;
fig. 3 shows the situation after deleting duplicate picture frames;
fig. 4 shows a picture frame divided into grid cells;
fig. 5 shows deletion of repeated grid cells;
figure 6 shows the restoration of video from stored data.
Detailed Description
Exemplary embodiments of the present invention will now be described with reference to the accompanying drawings; however, the invention may be embodied in many different forms and is not limited to the embodiments described herein, which are provided for a full and thorough disclosure of the invention and to fully convey its scope to those skilled in the art. The terminology used in the exemplary embodiments illustrated in the accompanying drawings is not intended to limit the invention. In the drawings, the same units/elements are denoted by the same reference numerals.
Unless otherwise defined, terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Further, it will be understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense.
In one implementation, the present application provides a surveillance video data storage method.
Step S100, caching the surveillance video for a period of time.
The surveillance video described herein may be generated by CCTV systems used in schools, shopping malls and the like, or by monitoring systems for roads, the environment and so on; the invention places no particular limit on the system that captures the video.
This step runs on a server, with video supplied by surveillance cameras. Since surveillance video usually arrives as a stream, for convenience of processing the stream can be stored section by section in time, e.g. minute by minute or in ten-minute chunks; the invention does not limit the specific duration, which can be set according to actual requirements. After a period of video is read, it is buffered to form a small video file segment.
Step S110, extracting the sound of the monitoring video, and performing first frequency sampling on the sound to obtain a sampling sound.
The sound of surveillance video is usually synchronized with the video as an audio track, which is extracted when processing the buffered video segment of S100. The sound is sampled at a first frequency to facilitate subsequent processing. In some audio tracks the sound is already sampled audio; this can be used directly, in which case sampling at the first frequency simply means adopting the existing samples, or the audio can be resampled to obtain sound samples at a target frequency. For the first frequency, a conventional frequency used in audio coding may be adopted, such as 10 kHz, meaning 10,000 samples per second.
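The resampling option mentioned above can be sketched as follows. This is an illustrative Python sketch, not part of the patent; a naive linear-interpolation resampler stands in for whatever resampling a real implementation would use, and all names are hypothetical.

```python
import numpy as np

def resample_audio(samples, src_rate, dst_rate):
    """Map an existing audio track onto a new time grid at dst_rate.
    Linear interpolation is used here purely for illustration; a real
    system would use a proper polyphase/band-limited resampler."""
    duration = len(samples) / src_rate          # track length in seconds
    n_out = int(round(duration * dst_rate))     # samples at the target rate
    src_t = np.arange(len(samples)) / src_rate  # source sample times
    dst_t = np.arange(n_out) / dst_rate         # target sample times
    return np.interp(dst_t, src_t, samples)
```

For example, resampling a 1-second, 100 Hz ramp down to 10 Hz yields 10 samples taken at 0.1-second spacing.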
Step S120, extracting the pictures of the surveillance video and sampling them at a second frequency to obtain picture samples. Video pictures usually already have a fixed frame rate, such as 30 or 60 frames per second. When the original frame rate meets the requirement, sampling at the second frequency can simply mean using the original frames directly; if the frame rate is inconvenient to process, the pictures can be resampled to obtain new picture samples.
Step S130, initializing: the first frequency is N times the second frequency, where N is a positive integer; every N sound samples correspond to one picture frame, and every N sound samples carry a pointer to the corresponding picture. The frame rate of sound is generally much higher than that of pictures. For convenience of description, suppose sound is sampled at 100 Hz and pictures at 10 Hz, so N = 10. As shown in fig. 1, the 1st sound sample points to the 1st picture frame and the 11th sound sample points to the 2nd picture frame: pointer space is allocated in the 1st and 11th sound frames, pointing to the 1st and 2nd picture frames respectively (fig. 1 is only illustrative and the counts are not drawn strictly). Once this correspondence is initialized, subsequent processing can begin.
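The initialization above can be sketched as follows. This is a hypothetical Python illustration, not part of the patent's claims: pointers are modeled as a list where every N-th sound sample holds a picture index and the intermediate samples hold None.

```python
N = 10  # sound sampling rate is N times the picture sampling rate

def init_pointers(num_sound_samples, num_pictures, n=N):
    """Give every n-th sound sample a pointer to its picture frame;
    intermediate sound samples carry no pointer (None)."""
    pointers = [None] * num_sound_samples
    for k in range(num_pictures):
        idx = k * n
        if idx < num_sound_samples:
            pointers[idx] = k  # sound sample idx points to picture k
    return pointers

# 50 sound samples at 100 Hz pair with 5 pictures at 10 Hz (N = 10)
ptrs = init_pointers(50, 5)
```

After initialization, sound sample 0 points to picture 0, sound sample 10 to picture 1, and so on, mirroring fig. 1.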
Most current video is encoded and played using timestamps: during encoding, pictures and audio are analyzed and encoded against a time axis, and during playback the timestamped pictures and sound are sent separately to the graphics card and sound card, each processing its stream against that axis. When the graphics or audio processor is busy, accumulated errors easily arise and cannot repair themselves. For example, if the sound falls 1 second behind in the 1st minute of a video and another second behind in the 5th minute, all subsequent playback carries a 2-second audio delay that can only be cured by restarting playback.
Here the sound samples serve as the control axis. Video is pushed one segment at a time; during playback the time axis is back-derived from the sound sampling information and displayed on the playback interface. Suppose each segment is 5 minutes: the first segment plays from 00:00 to 04:59, the second segment starts at 05:00, and so on, with the second segment's sound and pictures aligned afresh. The time axis during playback can be determined from the sampling frequency: with 10 Hz pictures, when the first segment plays its 100th picture the time is 00:10, and when the second segment plays its 200th picture the time is 05:20. Because synchronization is re-established at each segment boundary, any audio delay from the first segment does not carry into the second, eliminating accumulated error. For example, if the video for minutes 1 to 5 is pushed first and carries a 1-second audio delay, the second segment pushed for minutes 6 to 10 displays a time axis back-derived from the correspondence of the sound axis and picture axis, so pictures and sound do not drift apart. The segment durations above are only examples; if segments are made smaller, say 10 seconds each, the user will hardly notice any audio anomaly during playback, and those skilled in the art can choose the parameters as needed.
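The per-segment back-derivation of the time axis can be sketched as follows. This is an illustrative Python sketch under the assumptions of the worked example (not part of the patent): each segment re-anchors the axis at its start time, so a sample's timestamp depends only on its own segment.

```python
def timestamp_of(segment_index, sample_index, sample_freq, segment_seconds):
    """Back-derive the playback timestamp (in seconds) of a sample:
    segment start plus the sample's offset at its sampling frequency.
    Because each segment re-anchors the axis, drift accumulated in one
    segment cannot leak into the next."""
    return segment_index * segment_seconds + sample_index / sample_freq
```

With 5-minute segments and 10 Hz pictures, the 100th picture of the first segment lands at 10 seconds (00:10) and the 200th picture of the second segment at 320 seconds (05:20), matching the example in the text.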
In this step, using the sound sampling signal as the control axis means, on the one hand, that the stored video retains complete audio information, solving the problem that storing only moving pictures loses audio. On the other hand, with the sampled audio as the control axis, synchronization between audio and picture is strictly controlled, so playback exhibits no desynchronization; and since the sound axis is complete, no audio stuttering occurs either.
Step S140, traversing the sound samples; on reaching the (i+N)-th sound sample, comparing whether the picture pointed to by the (i+N)-th sound sample differs from the picture pointed to by the i-th sound sample; if it has changed, continuing the traversal; if not, redirecting the (i+N)-th sound sample's pointer to the picture pointed to by the i-th sound sample.
It should be noted that "change" here does not mean any slight variation. In the field of image processing, whether an image has changed is usually measured against a change threshold: for example, if less than 5% of the picture differs, the image is judged unchanged.
To judge quickly which pictures can be discarded, a sound-to-picture pointer is introduced; when a picture has not changed, the sound's pointer is simply redirected to the earlier picture.
As shown in fig. 2, on reaching the 11th sound sample, i.e. the sound corresponding to the 2nd picture frame, the 2nd picture is found to differ from the 1st, so its pointer is left unchanged. The 3rd to 5th pictures show no change, so their corresponding pointers are all redirected to the 2nd picture; the 6th picture shows a change, so its pointer is left unchanged, and so on until the video and sound are fully processed. As shown in fig. 3, after processing, the pictures of frames 3 to 5 and 7 to 8 can be discarded, and the high-definition pictures of frames 1, 2 and 6 are kept.
Of course, since the intermediate sound samples carry no pointer information, the traversal described here can jump directly from one pointered sample to the (i+N)-th, without processing each sample individually.
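The traversal and pointer redirection of step S140 can be sketched as follows. This is an illustrative Python sketch, not the patent's implementation: pictures are NumPy arrays, "change" is a simple fraction-of-differing-pixels threshold (5%, matching the example above), and all names are hypothetical.

```python
import numpy as np

CHANGE_THRESHOLD = 0.05  # fraction of pixels that must differ to count as "changed"

def frame_changed(a, b, threshold=CHANGE_THRESHOLD):
    """Simple differencing: has more than `threshold` of the picture changed?"""
    return np.mean(a != b) > threshold

def deduplicate(pictures, pointers, n):
    """Walk the sound pointers in steps of n; when the newly pointed picture
    does not differ from the previously pointed one, redirect the pointer
    back so the duplicate frame can be dropped at storage time."""
    for i in range(0, len(pointers) - n, n):
        cur, nxt = pointers[i], pointers[i + n]
        if cur is None or nxt is None:
            continue
        if not frame_changed(pictures[cur], pictures[nxt]):
            pointers[i + n] = pointers[i]  # point back to the earlier picture
    return pointers
```

With three pictures where only the third differs, the second pointer is redirected to the first picture while the third keeps its own, mirroring figs. 2 and 3 (n = 1 here just to keep the example small).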
In a more preferred embodiment, to further improve compression, on reaching the (i+N)-th sound sample one compares, for each of its X×Y pointers, whether the cell it points to differs from the cell pointed to by the corresponding pointer of the i-th sound sample, where X and Y are positive integers greater than 1.
As shown in fig. 4, the picture is divided into 2×2 cells. Between the first and second picture frames only the upper-right cell changes, so in the next sound sample's pointers the three pointers for the unchanged cells are redirected to the corresponding cells of the previous picture, while the pointer for the changed upper-right cell is left pointing at the new cell.
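The per-cell variant can be sketched as follows. This is an illustrative Python sketch, not part of the patent: a picture is split into X×Y cells, and each cell's pointer is redirected independently; the `changed` predicate is supplied by the caller, and all names are hypothetical.

```python
import numpy as np

def split_grid(picture, x, y):
    """Split a picture into x*y equal cells, row-major (assumes the
    dimensions are divisible by x and y)."""
    h, w = picture.shape[:2]
    return [picture[r * h // y:(r + 1) * h // y, c * w // x:(c + 1) * w // x]
            for r in range(y) for c in range(x)]

def dedup_cells(prev_cells, cur_cells, prev_ptrs, cur_ptrs, changed):
    """Per-cell pointer redirection: unchanged cells point back to the
    previous frame's cell; changed cells keep their new pointer."""
    for k, (a, b) in enumerate(zip(prev_cells, cur_cells)):
        if not changed(a, b):
            cur_ptrs[k] = prev_ptrs[k]
    return cur_ptrs
```

With a 4×4 picture split 2×2 where only the upper-right cell changes, three pointers are redirected to the previous frame's cells and the upper-right pointer stays on the new cell, as in figs. 4 and 5.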
In this step, unchanged pictures are not stored, greatly saving storage space; meanwhile, the stored picture frames need no low-resolution encoding, so more detail remains visible when the video is enlarged.
During playback, because the audio acts as the control axis and every N sound samples have a corresponding picture, playback is continuous, solving the discontinuity that arises when only moving pictures are stored.
During storage, because only simple picture-change comparison is used, the algorithm is simple; it loses some compression ratio compared with H.26x, but processes several times faster, avoiding video overflow and excessive server load.
In a preferred scheme, on reaching the (i+N)-th sound sample and comparing whether the picture it points to differs from the picture pointed to by the i-th sound sample, if no change has been detected K consecutive times, the picture pointed to by the (i+N)-th sound sample is additionally compared with the picture pointed to by the (i-K×N)-th sound sample. If a change is found, the (i+N)-th sound sample's pointer is left unchanged; if not, it is redirected to the picture pointed to by the i-th sound sample.
Current motion detection mainly uses differencing or optical flow, both of which are insensitive to gradual changes in light and shadow. Conventional practice in the field does not require two pictures to be absolutely identical to count as "unchanged"; instead a change-rate threshold is applied, e.g. a picture is considered changed only if more than 1% of it differs. If detection runs at a fixed interval, say once every five minutes, or at fixed frame spacings, accumulated errors may therefore build up.
For example, a picture at 9 a.m. and one at 10 a.m. may agree on the main objects yet differ in their shadows. If nothing moves between 9:00 and 9:59, the change from each minute to the previous one may never exceed 1%: the small shadow shifts go undetected by the motion-detection algorithm. When a person is then detected entering the frame at 10:00, under the scheme above the whole of 9:00 to 9:59 would display a single picture, and at 10:00 not only does the person appear but the shadows jump abruptly, e.g. a tree's shadow suddenly shortens. This looks jarringly incoherent and may make viewers suspect the video is fake, casting doubt on the authenticity of the surveillance footage.
To solve this problem, if no change has been detected K consecutive times, the picture pointed to by the (i+N)-th sound sample is compared with the picture pointed to by the (i-K×N)-th sound sample. For example, from 9:00 to 9:10 the picture is checked every minute, and each minute-to-minute comparison registers no change because the shading shifts are slight; but comparing 9:00 directly with 9:10 reveals a clearer shading change that crosses the "change" threshold. In that case the 9:10 picture is stored rather than reusing the 9:00 picture, so the shading change appears in the video.
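The long-range check against accumulated drift can be sketched as follows. This is an illustrative Python sketch, not part of the patent: after K consecutive "no change" verdicts, the candidate picture is also compared against the picture K steps back, and the new picture is kept if slow drift has crossed the threshold. The names and the scalar stand-ins for pictures are hypothetical.

```python
def check_accumulated(pictures, ptr_new, ptr_base, changed, k_unchanged, K):
    """After K consecutive 'no change' verdicts against the immediately
    preceding frame, also compare against the frame K steps back; if slow
    drift (e.g. moving shadows) has crossed the threshold, keep the new
    picture instead of redirecting its pointer."""
    if k_unchanged >= K and changed(pictures[ptr_new], pictures[ptr_base]):
        return ptr_new   # keep pointing at the fresh picture
    return ptr_base      # otherwise reuse the old picture
```

With pictures modeled as slowly drifting scalars, each adjacent pair stays under the threshold, but the frame ten steps back differs enough that the new picture is retained.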
Step S150, storing the sound samples and the pictures pointed to by the sound-sample pointers. After the pointers have been modified, any picture no longer pointed to by a pointer is a duplicate; the sound samples can again serve as the control axis, traversing and storing in order the sound samples, the pictures they point to, and the pointer details. Taking fig. 3 as an example, the pictures of frames 1, 2 and 6 are stored; taking fig. 5 as an example, the upper-right cells of frames 1 and 2 are stored.
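The compaction at storage time can be sketched as follows. This is an illustrative Python sketch, not part of the patent: only pictures still referenced by some pointer are kept, and the pointers are remapped onto the compacted index space. All names are hypothetical.

```python
def store(sound_samples, pictures, pointers):
    """Keep only picture frames that some pointer still references,
    and remap the pointers onto the compacted picture list."""
    referenced = sorted({p for p in pointers if p is not None})
    remap = {old: new for new, old in enumerate(referenced)}
    kept = [pictures[i] for i in referenced]
    new_ptrs = [remap[p] if p is not None else None for p in pointers]
    return sound_samples, kept, new_ptrs
```

If only pictures 0 and 2 of three are still referenced, picture 1 is dropped and the surviving pointers are renumbered, as in the transition from fig. 2 to fig. 3.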
Step S160: after the currently cached surveillance video has been processed, continue reading the subsequent surveillance video into the cache and repeat the above steps.
Because the surveillance video arrives as a video stream, newly captured footage keeps becoming available after each period has been processed, so the above steps simply need to be repeated.
Based on the surveillance video stored by the above method, the invention also discloses a surveillance video data playing method.
In another implementation, the present application provides a method for playing a surveillance video.
Step S200: read the surveillance video stored by the above method.
In this method the videos are processed and then stored on the server; on receiving a user's viewing request, the server reads the video information for the time the user specifies. For example, if the user specifies 10:00 on 1 January 2020, the video for the corresponding time period is read segment by segment. When this runs on the server, the data is read directly; when it runs on the client, the server transmits the relevant data to the client.
Step S210: read the corresponding sampling pictures with the sampling sound as the reference, restore the video, and back-derive the video time axis from the sound information.
This step must work with the video the user specified. For example, if the user specifies 10:00 on 1 January 2020, the server reads the video for the corresponding time period and restores it from the sound sampling information, as shown in fig. 6; the playback start time of 10:00 is back-derived, and the time axis is then synthesized according to the sampling rule for subsequent playback. It should be noted that this step may be performed either at the server or at the client.
And step S220, playing the video according to the time axis, the sampling sound and the sampling picture.
This step is carried out at the client: once the time axis is obtained, the sound and pictures are pushed to the sound card and the graphics card respectively, after which video playback can proceed.
In another embodiment, the present application provides a surveillance video data storage device, comprising:
The cache module is used for caching the surveillance video of a time period.
The surveillance video described here may be generated by a CCTV system used in schools, shopping malls and the like, or by monitoring systems used for roads, the environment and so on; the invention places no particular limit on the specific system that captures the video.
This module runs on a server, and the video is provided by a surveillance camera. Since surveillance video usually arrives as a video stream, for convenience of processing the stream can be stored segment by segment according to time, for example in one-minute or ten-minute segments. After a period of video has been read, it is buffered to form a small video file segment.
The first sampling module is used for extracting the sound of the monitoring video and carrying out first frequency sampling on the sound to obtain sampling sound.
The sound of the surveillance video is usually synchronized with the pictures in the form of an audio track, which is extracted when the buffered video segment of S100 is processed. The sound is sampled at the first frequency to facilitate subsequent processing. In addition, the sound in some audio tracks is already sampled audio; in that case sampling at the first frequency can simply mean using the existing samples directly, or the audio can be resampled to obtain sampled sound at a target frequency. The first frequency may be a conventional frequency used in audio coding, such as 10 kHz, meaning 10,000 samples per second.
The second sampling module is used for extracting the pictures of the surveillance video and sampling them at the second frequency to obtain sampling pictures. The pictures of a surveillance video usually have a certain frame rate, such as 30 or 60 frames per second. When the original frame rate already meets the requirement, sampling the pictures at the second frequency can simply mean using the original frames directly; if the frame rate is inconvenient to process, the pictures can be resampled to obtain new sampling pictures.
The initialization module is used for initializing the first frequency to be N times the second frequency, where N is a positive integer; every N sampling sounds correspond to one frame of sampling picture, and every N-th sampling sound is provided with a pointer to the corresponding sampling picture. Generally the sampling rate of sound is much greater than that of the pictures, so for convenience of processing it is set to N times the picture rate. For ease of description, suppose the sound is sampled at 100 Hz and the video at 10 Hz: as shown in fig. 1, the 1st sampling sound points to the 1st picture frame and the 11th sampling sound points to the 2nd picture frame, so N is 10. Pointer space is opened up in both the 1st and the 11th sound samples, pointing to the 1st and 2nd picture frames respectively (fig. 1 is only illustrative and does not strictly depict the numeric relationship). Once the correspondence has been initialized, subsequent processing can begin.
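A minimal sketch of this initialization, assuming the 100 Hz / 10 Hz rates of the example above (all names and the list layout are illustrative):

```python
SOUND_RATE = 100    # first frequency: sound samples per second (example value)
PICTURE_RATE = 10   # second frequency: picture frames per second (example value)
N = SOUND_RATE // PICTURE_RATE  # sound samples per picture frame, here 10

def init_pointers(num_sounds):
    """Open pointer space only at every N-th sound sample; sample i points to
    picture frame i // N, and intermediate samples carry no pointer."""
    pointers = [None] * num_sounds
    for i in range(0, num_sounds, N):
        pointers[i] = i // N
    return pointers

p = init_pointers(30)  # samples 0, 10, 20 point to frames 0, 1, 2
```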
Most current video is encoded and played using timestamps: during encoding, the pictures and audio are analysed and encoded along a time axis, and during playback the timestamped pictures and sound are sent to the graphics card and the sound card respectively, each processing its stream according to that time axis. When the graphics-card or sound-card processor is busy, accumulated errors easily arise and cannot repair themselves. For example, if the sound falls 1 second behind in the 1st minute of a video and another 1 second behind in the 5th minute, all subsequent playback carries a 2-second sound delay that cannot be repaired automatically and can only be cured by restarting playback.
By sampling the sound and using it as the control axis, a segment of video is pushed each time; during playback the time axis is back-derived from the sound sampling information and displayed on the playing interface. Suppose each segment is 5 minutes long: the first segment covers 00:00 to 04:59, the second segment starts at 05:00, and so on, with the sound and pictures of each segment kept in correspondence. The time axis during playback can be determined from the sampling frequency. Taking 10 Hz pictures as an example, when the first segment plays its 100th picture the time is 00:10, and when the second segment plays its 200th picture the time is 05:20. Because synchronization is re-established at each segment boundary, a sound delay in the first segment is not carried into the second, which eliminates accumulated error. For example, the video of minutes 1 to 5 is pushed first; even if it contains a 1-second sound delay, the second segment pushed for minutes 6 to 10 displays a time axis back-derived from the correspondence of the sound axis and the picture axis, so pictures and sound do not drift apart. The segment lengths above are only examples: if the segments are made smaller, for example 10 seconds, the user hardly perceives any sound abnormality, and those skilled in the art can choose the specific parameters as needed.
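The timeline back-derivation in this example reduces to one formula: segment start time plus sample index divided by the sampling rate. A sketch, using the example's 100 Hz sound rate (the constant and function names are illustrative):

```python
SOUND_RATE = 100  # Hz, the example's first frequency

def back_derive_time(segment_start_s, sample_index, rate=SOUND_RATE):
    """Position on the time axis, in seconds, of a given sound sample.
    Derived purely from the sample count, not from stored timestamps,
    so a delay in an earlier segment cannot accumulate into this one."""
    return segment_start_s + sample_index / rate

# First segment (starts at 00:00): sound sample 1000 -> 10 s, i.e. 00:10.
# Second segment (starts at 05:00 = 300 s): sample 2000 -> 320 s, i.e. 05:20.
t1 = back_derive_time(0, 1000)
t2 = back_derive_time(300, 2000)
```

The two results match the 00:10 and 05:20 timeline positions worked out in the paragraph above.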
In this module, because the audio sampling signal serves as the control axis, the stored video on the one hand retains complete audio information, solving the problem that storing only moving pictures cannot preserve the complete audio. On the other hand, with the sampled audio as the control axis the synchronization between audio and pictures is strictly controlled, so no audio/video desynchronization occurs during playback, and because the sound axis is complete no audio stuttering occurs either.
The traversal module is used for traversing the sampling sounds; when the (i+N)-th sampling sound is reached, it compares whether the sampling picture pointed to by the (i+N)-th sampling sound differs from the picture pointed to by the i-th sampling sound. If there is a change, traversal continues; if there is no change, the (i+N)-th sampling sound is pointed to the sampling picture pointed to by the i-th sampling sound.
It should be noted that "change" here does not mean any arbitrarily slight change. In the field of image processing, whether an image has changed is usually measured against a change threshold: for example, if more than 5% of the image differs it is judged changed, and otherwise it is judged unchanged.
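A thresholded change test of the kind described can be sketched as follows. This is a simplification: real systems would compare pixel values with a tolerance rather than exact equality, and 5% is only the example threshold from the text.

```python
def frame_changed(frame_a, frame_b, threshold=0.05):
    """True when the fraction of differing pixels exceeds the threshold."""
    assert len(frame_a) == len(frame_b), "frames must have equal size"
    differing = sum(1 for a, b in zip(frame_a, frame_b) if a != b)
    return differing / len(frame_a) > threshold
```

With a 5% threshold, a 100-pixel frame in which 10 pixels differ counts as changed, while one with 5 differing pixels does not.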
To judge quickly and conveniently which pictures can be discarded, a sound-picture pointer is introduced; when a picture has not changed, the pointer of the corresponding sound is simply redirected to the earlier picture.
As shown in fig. 2, when the traversal reaches the 11th sampling sound, i.e. the sound corresponding to the 2nd picture frame, and finds that the 2nd picture has changed relative to the 1st, the pointer is left unchanged. The 3rd to 5th pictures have not changed, so their pointers are all modified to point to the 2nd picture; the 6th picture has changed, so its pointer is left unchanged, and so on until the video and sound have been processed. As shown in fig. 3, after processing, the pictures of frames 3 to 5 and 7 to 8 can be discarded, while the high-definition pictures of frames 1, 2 and 6 are kept.
Of course, since the intermediate sound samples carry no pointer information, the traversal described here can jump directly from one pointer-bearing sample to the (i+N)-th, without processing every sample.
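The walkthrough of figs. 2 and 3 can be condensed into a sketch in which `pointer[k]` records which stored picture is displayed for frame k. Frames are stand-in strings here, and `changed` is whatever change test is in use; this is an illustrative model, not the patent's implementation.

```python
def deduplicate(frames, changed):
    """Redirect each frame's pointer to the earlier picture when nothing has
    changed; a frame keeps its own pointer only when it differs from the
    picture currently being shown."""
    pointer = list(range(len(frames)))
    for k in range(1, len(frames)):
        ref = pointer[k - 1]              # picture shown for frame k-1
        if not changed(frames[ref], frames[k]):
            pointer[k] = ref              # repeat: this frame is discardable
    return pointer

# As in fig. 2, only frames 2 and 6 differ from the picture before them:
ptr = deduplicate(["f1", "f2", "f2", "f2", "f2", "f6", "f6", "f6"],
                  lambda a, b: a != b)
kept = sorted(set(ptr))  # frames 1, 2 and 6 survive (0-based: 0, 1, 5)
```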
In a more preferred embodiment, to further increase the compression effect, the sampling picture is divided into X×Y blocks and each pointer-bearing sampling sound carries X×Y pointers; when the traversal reaches the (i+N)-th sampling sound, the block pointed to by each of the (i+N)-th sampling sound's X×Y pointers is compared with the block pointed to by the corresponding pointer of the i-th sampling sound, where X and Y are positive integers greater than 1.
As shown in fig. 4, the picture is divided into 2×2 blocks. In the change from the first picture frame to the second, only the upper-right block changes; among the next set of pointers, the three pointers for the unchanged blocks point to the corresponding blocks of the previous picture, while the pointer for the changed block is modified to point to the new upper-right block.
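A sketch of the per-block variant, with each frame flattened to a list of blocks (4 for a 2×2 grid) and one pointer per block; the names and tuple layout are illustrative assumptions:

```python
def deduplicate_cells(frames, changed, cells=4):
    """pointer[k][c] = (frame, cell) whose stored block is displayed for
    block c of frame k; unchanged blocks reuse the earlier block, changed
    blocks keep their own."""
    pointer = [[(0, c) for c in range(cells)]]   # frame 0 points to itself
    for k in range(1, len(frames)):
        row = []
        for c in range(cells):
            pf, pc = pointer[k - 1][c]           # block currently shown
            if changed(frames[pf][pc], frames[k][c]):
                row.append((k, c))               # changed: store this block
            else:
                row.append((pf, pc))             # unchanged: reuse
        pointer.append(row)
    return pointer

# Only the upper-right block (index 1) changes between the two frames, as in
# fig. 4: three pointers reuse frame 0, one points to the new block.
grid_ptr = deduplicate_cells([["a", "b", "c", "d"], ["a", "B", "c", "d"]],
                             lambda a, b: a != b)
```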
In this step, pictures that have not changed are not stored, which greatly saves storage space; at the same time the stored picture frames need no low-resolution re-encoding, so more detail remains visible when the video is enlarged.
During playback, because the audio serves as the control axis and every N sampling sounds have a corresponding picture, playback is continuous, solving the technical problem that playback is disjointed when only the moving pictures are stored.
During storage, the algorithm is simple because only a plain picture-change comparison is used. Compared with H.26x it loses some compression ratio, but it processes several times faster, avoiding problems such as video overflow and excessive server load.
In a preferred scheme, when the traversal reaches the (i+N)-th sampling sound and compares whether the picture pointed to by the (i+N)-th sampling sound differs from the picture pointed to by the i-th sampling sound, if K consecutive "no change" results have been detected, the picture pointed to by the (i+N)-th sampling sound is additionally compared with the picture pointed to by the (i-K×N)-th sampling sound. If there is a change, the (i+N)-th sampling sound keeps its own pointer; if there is no change, the (i+N)-th sampling sound is pointed to the sampling picture pointed to by the i-th sampling sound.
Current motion detection mainly uses the frame-difference method or the optical-flow method, and both are insensitive to light-and-shadow changes. Moreover, conventional practice in this field does not require a picture to be literally identical before judging it unchanged; instead, change is judged against a rate-of-change threshold, for example treating any change above 1% as a change. If detection runs at a fixed interval, such as once every five minutes, or at fixed frame intervals, accumulated error may therefore arise.
For example, a picture taken at 9:00 and one taken at 10:00 may agree on the main objects yet differ in their shadows. When there is no moving object between 9:00 and 9:59, the change from each minute to the next may stay below 1%: there are only small shadow shifts, which the motion detection algorithm cannot register. If a person is then detected entering the picture at 10:00 and the same per-minute processing is still applied, the stored pictures from 9:00 to 9:59 may all be identical, yet at 10:00 not only does the person appear but the shadows also jump abruptly (for example, the shadow of a tree suddenly becomes shorter). The playback therefore feels very incoherent, and the viewer may suspect the video is fake, casting doubt on the authenticity of the surveillance footage.
To solve the above problem, if K consecutive "no change" results are detected, the sampling picture pointed to by the (i+N)-th sampling sound is additionally compared with the picture pointed to by the (i-K×N)-th sampling sound to determine whether there is a change. For example, suppose the picture is checked for change every minute from 9:00 to 9:10, and every minute-to-minute comparison reads "no change" because the shadows shift only slightly. Comparing 9:00 directly with 9:10, however, reveals a more obvious shadow change that reaches the "change" threshold; in that case the picture at 9:10 is kept rather than pointed back to 9:00, so the shadow change appears in the video.
The storage module is used for storing the sampling sounds and the sampling pictures pointed to by the sampling-sound pointers. After the pointers have been modified, any picture no longer pointed to by a pointer is a repeated picture. The sampling sound can again serve as the control axis: the sampling sounds, the sampling pictures they point to and the pointer information are traversed and stored in order. Taking fig. 3 as an example, the pictures of frames 1, 2 and 6 are stored; taking fig. 5 as an example, the upper-right-corner blocks of frames 1 and 2 are stored.
The loop module is used for continuing to read the subsequent surveillance video into the cache after the currently cached surveillance video has been processed, and repeating the above steps.
Because the surveillance video arrives as a video stream, newly captured footage keeps becoming available after each period has been processed, so the above steps simply need to be repeated.
Based on the surveillance video stored by the method of the first embodiment, the invention also discloses a surveillance video data playing device.
The reading module is used for reading the surveillance video stored by the above video storage method.
In these modules the videos are processed and then stored on the server; on receiving a user's viewing request, the server reads the video information for the time the user specifies. For example, if the user specifies 10:00 on 1 January 2020, the video for the corresponding time period is read segment by segment. When this runs on the server, the data is read directly; when it runs on the client, the server transmits the relevant data to the client.
The computing module is used for reading the corresponding sampling pictures with the sampling sound as the reference and back-deriving the video time axis from the sound information.
This module must work with the video the user specified. For example, if the user specifies 10:00 on 1 January 2020, the server reads the video for the corresponding time period and then back-derives the playback start time of 10:00 from the sound sampling information; the time axis is then synthesized according to the sampling rule for subsequent playback. It should be noted that this step may be performed either at the server or at the client.
And the playing module is used for playing the video according to the time axis, the sampling sound and the sampling picture.
This module operates at the client: once the time axis is obtained, the sound and pictures are pushed to the sound card and the graphics card respectively, after which video playback can proceed.
In this application, the term "plurality" means two or more unless explicitly defined otherwise. The terms "mounted," "connected," "fixed," and the like are to be construed broadly, and for example, "connected" may be a fixed connection, a removable connection, or an integral connection; "coupled" may be direct or indirect through an intermediary. The specific meaning of the above terms in the present application can be understood by those of ordinary skill in the art as appropriate.
In the description herein, the description of the terms "one embodiment," "some embodiments," "specific embodiments," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (8)

1. A surveillance video data storage method, comprising the steps of:
caching a monitoring video of a time period to obtain a first video segment;
extracting the sound of the monitoring video from the first video segment, and performing first frequency sampling on the sound to obtain a sampling sound;
extracting a picture of the monitoring video from the first video segment, and performing second frequency sampling on the picture to obtain a sampling picture;
wherein the first frequency is N times the second frequency, where N is a positive integer;
initializing a first sampling sound corresponding to a first sampling picture, wherein every N sampling sounds sequentially correspond to a frame of sampling picture, and each N sampling sounds are provided with a pointer which is used for pointing to the corresponding sampling picture;
traversing the sampling sounds, and when the (i+N)-th sampling sound is traversed, comparing whether the sampling picture pointed to by the (i+N)-th sampling sound and the picture pointed to by the i-th sampling sound have changed; if so, continuing to traverse; if there is no change, pointing the (i+N)-th sampling sound to the sampling picture pointed to by the i-th sampling sound;
storing the sampling sound and the sampling picture pointed by the sampling sound pointer;
and after the current cached monitoring video is processed, continuing to read the subsequent monitoring video into the cache, and repeating the steps.
2. The surveillance video data storage method according to claim 1, characterized by:
dividing the sample picture into X×Y grid pictures, wherein X and Y are positive integers greater than 1;
every N sampling sounds have X×Y pointers, and each pointer points to one sampling picture;
the step of traversing the sampling sounds and, when the (i+N)-th sampling sound is traversed, comparing whether the sampling picture pointed to by the (i+N)-th sampling sound and the picture pointed to by the i-th sampling sound have changed is specifically: when the (i+N)-th sampling sound is traversed, comparing whether the sampling picture pointed to by each of the X×Y pointers of the (i+N)-th sampling sound and the picture pointed to by the corresponding pointer of the i-th sampling sound have changed; if so, continuing to traverse; and if there is no change, pointing the corresponding unchanged pointer among the X×Y pointers of the (i+N)-th sampling sound to the sampling picture of the corresponding pointer of the i-th sampling sound.
3. The surveillance video data storage method according to claim 1 or 2, characterized in that: if K consecutive "no change" results are detected, whether the sampling picture pointed to by the (i+N)-th sampling sound and the picture pointed to by the (i-K×N)-th sampling sound have changed is compared; if there is a change, the pointing of the (i+N)-th sampling sound is left unchanged.
4. A surveillance video data playing method, characterized by being used for playing video stored by the surveillance video data storage method according to claim 1 or 2, and specifically comprising: reading a video stored by the surveillance video data storage method according to claim 1 or 2; reading the corresponding sampling pictures with the sampling sound as a reference, restoring the video, and back-deriving the video time axis from the sound information; and playing the video according to the time axis, the sampling sound and the sampling pictures.
5. A surveillance video data storage device, comprising:
the cache module is used for caching the monitoring video in a time period to obtain a first video segment;
the first sampling module is used for extracting the sound of the monitoring video from the first video segment and carrying out first frequency sampling on the sound to obtain a sampling sound;
the second sampling module is used for extracting the picture of the monitoring video from the first video segment and carrying out second frequency sampling on the picture to obtain a sampling picture; wherein the first frequency is N times the second frequency, where N is a positive integer;
the device comprises an initialization module, a sampling module and a display module, wherein the initialization module is used for initializing a first sampling sound corresponding to a first sampling picture, every N sampling sounds sequentially correspond to a frame of sampling picture, and every N sampling sounds are provided with a pointer which is used for pointing to the corresponding sampling picture;
the traversal module is used for traversing the sampling sounds, and when the (i+N)-th sampling sound is traversed, comparing whether the sampling picture pointed to by the (i+N)-th sampling sound and the picture pointed to by the i-th sampling sound have changed; if so, continuing to traverse; if there is no change, pointing the (i+N)-th sampling sound to the sampling picture pointed to by the i-th sampling sound;
the storage module is used for storing the sampling sound and the sampling picture pointed by the sampling sound pointer;
and the circulating module is used for continuously reading the subsequent monitoring video into the cache after the current cached monitoring video is processed, and repeating the steps.
6. The surveillance video data storage device of claim 5, comprising:
a partitioning module for dividing the sample picture into X×Y grid pictures, wherein X and Y are positive integers greater than 1,
every N sampling sounds have X×Y pointers, and each pointer points to one sampling picture;
the traversal module is configured to traverse the sampling sounds and, when the (i+N)-th sampling sound is traversed, compare whether the sampling picture pointed to by the (i+N)-th sampling sound and the picture pointed to by the i-th sampling sound have changed, specifically: when the (i+N)-th sampling sound is traversed, comparing whether the sampling picture pointed to by each of the X×Y pointers of the (i+N)-th sampling sound and the picture pointed to by the corresponding pointer of the i-th sampling sound have changed; if so, continuing to traverse; and if there is no change, pointing the corresponding unchanged pointer among the X×Y pointers of the (i+N)-th sampling sound to the sampling picture of the corresponding pointer of the i-th sampling sound.
7. The surveillance video data storage device of claim 5 or 6, characterized by further comprising: a detection module for comparing, if K consecutive "no change" results are detected, whether the sampling picture pointed to by the (i+N)-th sampling sound and the picture pointed to by the (i-K×N)-th sampling sound have changed, and, if there is a change, leaving the pointing of the (i+N)-th sampling sound unchanged.
8. A surveillance video data playing device, characterized in that the device is used for playing video stored by the surveillance video data storage method according to claim 1 or 2, and specifically comprises: a reading module for reading the surveillance video stored by the surveillance video data storage method according to claim 1 or 2; a computing module for reading the corresponding sampling pictures with the sampling sound as a reference, restoring the video and back-deriving the video time axis from the sound information; and a playing module for playing the video according to the time axis, the sampling sound and the sampling pictures.
CN202110542891.3A 2021-05-19 2021-05-19 Monitoring video data storage and playing method and device Active CN112969068B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110542891.3A CN112969068B (en) 2021-05-19 2021-05-19 Monitoring video data storage and playing method and device


Publications (2)

Publication Number Publication Date
CN112969068A true CN112969068A (en) 2021-06-15
CN112969068B CN112969068B (en) 2021-08-03

Family

ID=76275631

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110542891.3A Active CN112969068B (en) 2021-05-19 2021-05-19 Monitoring video data storage and playing method and device

Country Status (1)

Country Link
CN (1) CN112969068B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116597657A (en) * 2023-07-17 2023-08-15 四川省商投信息技术有限责任公司 Urban traffic prediction method, device and medium based on artificial intelligence

Citations (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4951155A (en) * 1987-11-23 1990-08-21 Stokes Imaging Services Apparatus and method for synchronizing videodisc images with audio signals
CN1240293A (en) * 1998-06-23 2000-01-05 LG Electronics Inc. Method for producing/recording static image management data in rewriteable memory medium
US20030118319A1 (en) * 2001-12-22 2003-06-26 Um Soung Hyun Method of recording dubbing audio data onto a rewritable recording medium
CN101918958A (en) * 2007-12-05 2010-12-15 生命力有限公司 System and method for compressing video based on detected intraframe motion
US20110008022A1 (en) * 2009-07-13 2011-01-13 Lee Alex Y System and Methods for Recording a Compressed Video and Audio Stream
US20110103468A1 (en) * 2009-11-04 2011-05-05 Qualcomm Incorporated Controlling video encoding using audio information
JP2011182274A (en) * 2010-03-03 2011-09-15 Hitachi Consumer Electronics Co Ltd Video recording and reproducing apparatus
CN102572443A (en) * 2010-09-30 2012-07-11 苹果公司 Techniques for synchronizing audio and video data in an image signal processing system
CN103905843A (en) * 2014-04-23 2014-07-02 无锡天脉聚源传媒科技有限公司 Distributed audio/video processing device and method for continuous frame-I circumvention
US20150235668A1 (en) * 2014-02-20 2015-08-20 Fujitsu Limited Video/audio synchronization apparatus and video/audio synchronization method
CN105074821A (en) * 2013-04-05 2015-11-18 杜比国际公司 Audio encoder and decoder
CN107295284A (en) * 2017-08-03 2017-10-24 浙江大学 A kind of generation of video file being made up of audio and picture and index playing method, device
CN107360386A (en) * 2016-05-09 2017-11-17 杭州登虹科技有限公司 Reduce the method for multi-medium file size
WO2018089096A1 (en) * 2016-11-10 2018-05-17 Google Llc Compressed media with still images selected from a video stream
CN108769572A (en) * 2018-04-26 2018-11-06 国政通科技股份有限公司 Monitor video file generated, device and terminal device
CN109587489A (en) * 2019-01-11 2019-04-05 杭州富阳优信科技有限公司 A kind of method of video compression
CN110944225A (en) * 2019-11-20 2020-03-31 武汉长江通信产业集团股份有限公司 HTML 5-based method and device for synchronizing audio and video with different frame rates
CN111726684A (en) * 2019-03-22 2020-09-29 腾讯科技(深圳)有限公司 Audio and video processing method and device and storage medium

Patent Citations (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4951155A (en) * 1987-11-23 1990-08-21 Stokes Imaging Services Apparatus and method for synchronizing videodisc images with audio signals
CN1240293A (en) * 1998-06-23 2000-01-05 Lg电子株式会社 Method for producing/recording static image management data in rewriteable memory medium
US20030118319A1 (en) * 2001-12-22 2003-06-26 Um Soung Hyun Method of recording dubbing audio data onto a rewritable recording medium
CN101918958A (en) * 2007-12-05 2010-12-15 生命力有限公司 System and method for compressing video based on detected intraframe motion
US20110008022A1 (en) * 2009-07-13 2011-01-13 Lee Alex Y System and Methods for Recording a Compressed Video and Audio Stream
US20110103468A1 (en) * 2009-11-04 2011-05-05 Qualcomm Incorporated Controlling video encoding using audio information
JP2011182274A (en) * 2010-03-03 2011-09-15 Hitachi Consumer Electronics Co Ltd Video recording and reproducing apparatus
CN102572443A (en) * 2010-09-30 2012-07-11 苹果公司 Techniques for synchronizing audio and video data in an image signal processing system
CN105074821A (en) * 2013-04-05 2015-11-18 杜比国际公司 Audio encoder and decoder
US20150235668A1 (en) * 2014-02-20 2015-08-20 Fujitsu Limited Video/audio synchronization apparatus and video/audio synchronization method
CN103905843A (en) * 2014-04-23 2014-07-02 无锡天脉聚源传媒科技有限公司 Distributed audio/video processing device and method for continuous I-frame avoidance
CN107360386A (en) * 2016-05-09 2017-11-17 杭州登虹科技有限公司 Method for reducing multimedia file size
WO2018089096A1 (en) * 2016-11-10 2018-05-17 Google Llc Compressed media with still images selected from a video stream
CN107295284A (en) * 2017-08-03 2017-10-24 浙江大学 Method and device for generating, and index-based playback of, a video file composed of audio and pictures
CN108769572A (en) * 2018-04-26 2018-11-06 国政通科技股份有限公司 Surveillance video file generation method, device and terminal device
CN109587489A (en) * 2019-01-11 2019-04-05 杭州富阳优信科技有限公司 Video compression method
CN111726684A (en) * 2019-03-22 2020-09-29 腾讯科技(深圳)有限公司 Audio and video processing method and device and storage medium
CN110944225A (en) * 2019-11-20 2020-03-31 武汉长江通信产业集团股份有限公司 HTML 5-based method and device for synchronizing audio and video with different frame rates

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Jorge Michel Diaz Rodriguez: "Selection of Key Frames Through the Analysis and Calculation of the Absolute Difference of Histograms", 2018 International Conference on Audio, Language and Image Processing (ICALIP) *
Liu Fang: "Research and Implementation of Audio-Video Synchronization Based on a Timeline Model", China Masters' Theses Full-text Database *
Meng Xiangfei: "Design and Research of a DM6467T-based Audio-Video Synchronized Compression and Transmission Monitoring System", China Masters' Theses Full-text Database *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116597657A (en) * 2023-07-17 2023-08-15 四川省商投信息技术有限责任公司 Urban traffic prediction method, device and medium based on artificial intelligence

Also Published As

Publication number Publication date
CN112969068B (en) 2021-08-03

Similar Documents

Publication Publication Date Title
US20210350828A1 (en) Reference and Non-Reference Video Quality Evaluation
CN100589567C Video data processing method and storage device
KR102010513B1 (en) Method and apparatus for playing back recorded video
KR102146042B1 (en) Method and system for playing back recorded video
US10911817B2 (en) Information processing system
KR20010087553A (en) A hierarchical hybrid shot change detection method for mpeg-compressed video
JPH1066087A (en) Method for detecting change point of scene and method for editing and displaying moving image
DE69932297T2 (en) Information recording system and information recording method
CN112969068B (en) Monitoring video data storage and playing method and device
JPH03216089A (en) Inter-frame prediction coding device and decoding device
US20100254455A1 (en) Image processing apparatus, image processing method, and program
WO2017121020A1 (en) Moving image generating method and device
CN101658039B (en) Dynamic image decoding method, dynamic image decoding device, and electronic apparatus
JP2006340066A (en) Moving image encoder, moving image encoding method and recording and reproducing method
CN115665493A Large-screen splicing device supporting recording and playback, splicer, and playback control method and system
JPH09154097A (en) Video processor
WO2013183978A1 (en) Image processing device and method
CN1418011A (en) Method of replaying digital broadcast program at low speed
JP3177366B2 (en) High-speed image playback system
JP3104776B2 (en) Image reproducing device and image decoding device
KR20190101579A (en) Reconfigurable Video System for Multi-Channel Ultra High Definition Video Processing
KR100485946B1 (en) apparatus and method for motion image information compression processing using transformation of frame rate
JP4315521B2 (en) Video compression encoding method and apparatus, and video compression encoding / decoding system
KR100217751B1 (en) Moving picture compression/decompression apparatus and method
JP2002033997A (en) Time lapse recording and reproducing device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Wang Yourui

Inventor after: Lei Xueping

Inventor after: Zeng Linjun

Inventor before: Lei Xueping

GR01 Patent grant