CN113255438B - Structured video file marking method, system, host and storage medium - Google Patents

Structured video file marking method, system, host and storage medium

Info

Publication number
CN113255438B
CN113255438B (application CN202110390544.3A)
Authority
CN
China
Prior art keywords
mark
association table
list
recorded video
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110390544.3A
Other languages
Chinese (zh)
Other versions
CN113255438A (en)
Inventor
朱波 (Zhu Bo)
林睦权 (Lin Muquan)
贝树 (Bei Shu)
杨宗睿 (Yang Zongrui)
温新昌 (Wen Xinchang)
唐仕元 (Tang Shiyuan)
Current Assignee
Shenzhen Samoon Science & Technology Co ltd
Original Assignee
Shenzhen Samoon Science & Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Samoon Science & Technology Co ltd
Priority to CN202110390544.3A
Publication of CN113255438A
Application granted
Publication of CN113255438B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/40: Scenes; Scene-specific elements in video content
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70: Information retrieval of video data
    • G06F16/71: Indexing; Data structures therefor; Storage structures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70: Information retrieval of video data
    • G06F16/78: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/7867: Retrieval using manually generated information, e.g. tags, keywords, comments, title and artist information, manually generated time, location and usage information, user ratings
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/02: Feature extraction for speech recognition; Selection of recognition unit
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/08: Speech classification or search

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Library & Information Science (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a structured video file marking method, system, host and storage medium. The method comprises: obtaining image information, recording it in real time, and establishing an association table; acquiring a scene mode; calibrating the frame at the corresponding time point in the recorded video as an event node and writing the event node's time point into the association table; selecting a mark from the mark list and writing the selected mark into the association table against the time point of the event node; writing the association table into the recorded video and uploading the recorded video to the background system; acquiring the recorded video entered into the background, reading the association table, associating the information in the table with the file name of the recorded video, and recording it into a management form; classifying each event node into an index page of the management form based on the correspondence between event nodes and marks in the original association table; and establishing a link relation between each event node and the corresponding frame of the video stored in the background. The method and device have the effect of improving the retrieval efficiency of law enforcement recordings.

Description

Structured video file marking method, system, host and storage medium
Technical Field
The present application relates to the field of video file marking, and in particular, to a method, a system, a host and a storage medium for marking a structured video file.
Background
In law enforcement or inspection scenarios, a law enforcement/work recorder is usually used to record information at the scene. To ensure the integrity of the information, everything from the beginning to the end of the law enforcement or inspection activity is usually recorded. This produces massive video data, which is very inconvenient to back up and query later: to find a point of interest, a user has to review the whole file from the beginning, which is inefficient.
Disclosure of Invention
In order to improve the retrieval efficiency of law enforcement recordings, the present application provides a structured video file marking method, system, host and storage medium.
In a first aspect, the present application provides a method for marking a structured video file, which adopts the following technical solution:
a structured video file marking method is used for marking and sorting real-time videos in law enforcement or inspection scenes and comprises a foreground processing flow and a background processing flow;
wherein, the foreground processing flow comprises:
acquiring image information of a current scene, recording the image information in real time, and establishing an association table corresponding to a recorded video, wherein the information of the association table comprises a scene type, an equipment feature code, a time node and a mark;
acquiring a current scene mode, and writing the type of the scene mode and the equipment feature code into an association table of a recorded video, wherein the scene mode is selected manually or automatically;
acquiring a calibration signal, calibrating a frame of a corresponding time point in the recorded video based on the acquired time point of the calibration signal, taking the frame as an event node, and writing the time point of the event node into an association table; wherein, the calibration signal is manually input or automatically generated;
selecting a mark in a mark list based on the image information acquired in real time, and writing the selected mark into an association table corresponding to the time point of the event node; the mark list comprises a plurality of marks corresponding to different event types, and the marks in the mark list are sorted based on the use frequency under the scene mode;
writing the association table into the recorded video based on a preset rule and uploading the recorded video to a background system;
the background processing flow comprises the following steps:
acquiring a recorded video recorded into a background, reading an association table, corresponding information in the association table to a file name of the recorded video, and recording into a management form;
classifying each event node into an index page in a management form based on the corresponding relation between the event node and the mark in the original association table, wherein a plurality of index pages are provided, each corresponding to a different mark;
and establishing a link relation between the event node and a corresponding frame of the background storage video.
By adopting the above technical solution, the foreground processing flow is usually completed by the law enforcement recorder, which obtains images of the current scene in real time and judges the scene. Scene judgment pre-classifies the recorded videos so that the mark list applicable to the corresponding scene can be called up. Selecting an event node captures the corresponding frame in the recorded video, which leaves the user sufficient time to then select the corresponding mark from the mark list. When later retrieving the video file, the user can quickly locate the frame segment corresponding to the target event according to the event node, which effectively improves the efficiency of retrieving video files.
In addition, the background processing flow classifies and summarizes the event types of all recorded videos, which helps the user quickly search for and look up related event types when reviewing video files later.
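As an illustrative sketch only (not part of the claimed method), the background flow of reading each uploaded recording's association table, filling the management form, and classifying event nodes into per-mark index pages with frame links can be expressed as follows; the dict-based data layout and the function name are assumptions made for illustration:

```python
# Illustrative sketch of the background processing flow: record each
# recording's association table in a management form keyed by file name,
# and group event nodes into per-mark index pages that link back to the
# corresponding frames via (file, time) pairs. Layout is an assumption.

def build_index(videos):
    """videos: iterable of {"file": name,
                            "association": [{"time": t, "mark": m}, ...]}"""
    management_form = {}   # file name -> association table contents
    index_pages = {}       # mark -> list of (file, time) frame links
    for video in videos:
        management_form[video["file"]] = video["association"]
        for node in video["association"]:
            # classify the event node into the index page for its mark
            index_pages.setdefault(node["mark"], []).append(
                (video["file"], node["time"]))
    return management_form, index_pages
```

Each entry of an index page keeps enough information (file name and time point) to realize the claimed link relation between an event node and the corresponding frame of the stored video.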
Preferably, the sorting of the marks in the mark list based on the frequency of use in the scene mode includes:
acquiring a mark list and a mark frequency table corresponding to the current scene mode, wherein the mark frequency table comprises the use frequency of each mark in the mark list;
based on the selected mark, increasing the use frequency of the corresponding event type in the mark frequency table;
sorting the marks in the mark list in a descending order based on the use frequency of each mark in the mark frequency table;
temporarily sequencing the marks in the current mark list based on the occurrence frequency of a mark group in the historical recorded video and the selection of a preamble mark, wherein the mark group is a combination formed by any continuous marks in the recorded video;
and after the current mark selection is finished, canceling the temporary ordering of the current mark list.
The law enforcement recorder is generally a push-button device with a small screen. When the mark list is displayed on its screen, the user must repeatedly press keys to move the selection cursor up and down to select the mark to be used. When there are many kinds of marks and several event types occur during a single recording, law enforcement personnel have to press the keys repeatedly to browse the mark list, which takes a long time, degrades the user experience, and shortens the service life of the keys.
By adopting the technical scheme, one scene mode corresponds to one mark list, so that marks of various scene types are prevented from being classified into the same list, and the mark types in the mark list are reduced.
Since some marks are necessary in a certain scene yet used infrequently, the marks can be ordered according to use frequency. Each time a mark is used, its use frequency recorded in the mark frequency table rises, and marks with high use frequency are placed near the front of the mark list, so that the required mark can be found quickly when selecting from the list and the number of key presses is reduced.
Some marks end up near the back of the mark list after the initial sorting because their corresponding event types occur at low frequency in the scene mode, yet they are bound to appear, so selecting them requires many key presses. The marks in the mark list are therefore temporarily re-sorted using the relative relation between a mark and the marks selected before it: once a specific sequence of preceding marks has been selected, the likely next mark is temporarily moved up so that it is easy to select.
Preferably, the temporarily sorting the tags in the tag list based on the frequency of occurrence of the tag group and the selection of the preamble tag in the history-recorded video includes:
reading a mark group list and a mark group frequency table corresponding to the current scene mode, wherein the mark group list comprises a plurality of mark groups, each mark group comprises a preposed mark sequence and a target mark, the preposed mark sequence is formed by one or more continuous marks selected by preambles, and the mark group frequency table comprises the generation frequency of each preposed mark sequence in the mark group list;
based on a mark group formed by the currently selected mark and a preposed mark sequence, increasing the use frequency of the formed mark group in a mark group frequency table, wherein the mark contained in the preposed mark sequence is a null mark or one mark or more than one mark;
and judging whether the use frequency of the mark group in the mark group frequency table exceeds a preset threshold value, and if so, temporarily placing the mark at the top of the mark list, wherein the use frequency of a mark group is its proportion among all mark groups whose preposed mark sequences are identical and of the same length.
By adopting this technical solution, the system temporarily re-sorts the marks in the mark list whenever the list is called up. Each temporary ordering is based on the marks selected so far, so each selection may be preceded by a different temporary reordering. Each scene mode records a corresponding mark group list and mark group frequency table, and the mark group frequency table captures the user's habits. When the use of a certain mark group in the list exceeds the preset threshold, i.e. a certain preposed mark sequence is followed by the same target mark more than a certain proportion of the time, that target mark is placed at the top, so that in this specific situation the user obtains the target mark directly at the top of the mark list without scrolling past marks used at low frequency. After the current selection, the temporary rearrangement is cancelled; at the next selection either another temporary rearrangement is performed, or the mark list keeps its ordering by single-mark use frequency.
Preferably, the preset rule is to write the association table corresponding to the recorded video into the recorded video in a metadata form.
Preferably, the acquiring the current scene mode includes:
acquiring an environmental audio based on a preset duration;
extracting human voice information in the environmental audio, and matching the human voice information with a scene mode based on a built-in dictionary, wherein the scene mode is stored in the built-in dictionary;
and if the matching is successful, the matched scene mode is used as the current scene mode, and if the matching is failed, manual matching is prompted.
By adopting this technical solution, the current scene mode can be acquired from the voice, avoiding the user spending too much time selecting among too many scene modes.
In a second aspect, the present application provides a structured video file marking system, which adopts the following technical solutions:
a structured video file system is used for realizing the structured video file method on a law enforcement recorder, and comprises the following steps:
the recording module is used for recording and acquiring image information of the current scene in real time and establishing an association table corresponding to the recorded video;
the scene selection module is used for inputting a current scene mode and writing the type of the scene mode and the equipment feature code into an association table of the recorded video;
the calibration module is used for inputting a calibration signal, calibrating frames of corresponding time points in the recorded video based on the acquired time points of the calibration signal and using the frames as event nodes, and writing the time points of the event nodes into the association table;
the marking module, which comprises a mark list and is used for selecting a mark in the mark list and writing the selected mark into the association table against the time point corresponding to the event node;
the uploading module is used for writing the association table into the recorded video and uploading the recorded video to the background system;
the integration module is used for acquiring the recorded video recorded into the background, reading the association table, corresponding information in the association table to the file name of the recorded video and recording the information into the management form;
and the index module comprises a plurality of link buttons and is used for classifying each event node into an index page in the management form according to the corresponding relation between the event node and the mark in the original association table, and the link buttons correspond to the event node in the index page and have a link relation with the corresponding frame of the background storage video.
By adopting the above technical solution, the foreground processing flow is usually completed by a law enforcement recorder, which obtains images of the current scene in real time and judges the scene. Scene judgment pre-classifies the recorded videos so that the mark list applicable to the corresponding scene can be called up. Selecting an event node captures the corresponding frame in the recorded video, which leaves the user sufficient time to then select the corresponding mark from the mark list. When later reading the video file, the user can quickly locate the frame corresponding to the target event according to the event node, which effectively improves the efficiency of reading video files.
In addition, the event types of all recorded videos are classified and summarized in the background processing flow, so that the method is beneficial for a user to quickly search and look up the related event types when the user reads the video files in the later period.
In a third aspect, the present application provides a host, which adopts the following technical solution:
a host computer comprises a memory and a processor, wherein the memory is stored with a computer program which can be loaded by the processor and executes the foreground processing flow.
In a fourth aspect, the present application provides a computer-readable storage medium, which adopts the following technical solutions:
a computer readable storage medium storing a computer program that can be loaded by a processor and executes the foreground processing flow.
Drawings
Fig. 1 is a flow chart of foreground processing flow in the embodiment of the present application.
Fig. 2 is a flowchart of a method for sorting tags in a tag list based on usage frequency in the scene mode in this embodiment of the application.
Fig. 3 is a flowchart of step four of a method for sorting tags in a tag list based on usage frequency in the scene mode in this embodiment.
Fig. 4 is a flow chart of a background processing flow in the embodiment of the present application.
Detailed Description
The present application is described in further detail below with reference to figures 1-4.
Currently, in order to preserve evidence or allow later review, law enforcement/work recorders are often used to record site information during law enforcement or inspection. The law enforcement/work recorder is usually a push-button instrument with a small screen that provides recording, photographing, intercom and similar functions; selecting a function requires pressing buttons many times to move a cursor on the small screen. During recording, everything from the start to the end of the law enforcement/inspection activity is generally captured, which produces a huge amount of video data. When the user later reviews the footage or searches for key information, even watching at double speed takes a large amount of time, which is very inefficient.
In order to overcome the above problems, an embodiment of the present application discloses a structured video file marking method. Referring to fig. 1, the method includes a foreground processing flow and a background processing flow, where the foreground processing flow is executed by a law enforcement/work recorder, and in this embodiment, the law enforcement recorder is taken as an example, and the background processing flow is executed by a server.
Referring to fig. 1, the foreground processing flow includes:
s1, obtaining image information of a current scene, recording the image information in real time, and establishing an association table corresponding to a recorded video, wherein the information of the association table comprises a scene type, an equipment feature code, a time node and a mark.
Law enforcement recorders are typically assigned to specific individuals, so the device feature code of each recorder corresponds to the identity of its user; in other words, the device feature codes in all association tables produced by one law enforcement recorder are the same. Law enforcement scenes are diverse, for example traffic drunk-driving checks, general population checks, hotel inspections, tax inspections and so on, and classifying the scene at the start of recording helps to obtain the target information quickly when the videos are reviewed later. Each scene mode corresponds to different event types, and the association table records the marks corresponding to the event types together with the time nodes at which they occur.
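A minimal sketch of such an association table follows, assuming a simple Python data structure; the field names `scene_type`, `device_code` and `events` are illustrative choices, since the application only specifies that the table holds a scene type, a device feature code, time nodes and marks:

```python
from dataclasses import dataclass, field

@dataclass
class AssociationTable:
    scene_type: str = ""    # filled in once the scene mode is acquired
    device_code: str = ""   # feature code identifying the recorder and user
    events: list = field(default_factory=list)  # time nodes and their marks

    def add_event(self, time_point, mark=None):
        # an event node may be calibrated first and given its mark afterwards
        self.events.append({"time": time_point, "mark": mark})

    def set_mark(self, index, mark):
        # write the selected mark against an already-calibrated time point
        self.events[index]["mark"] = mark
```

Separating `add_event` from `set_mark` mirrors the claimed order of operations: the frame is calibrated as an event node at the moment the calibration signal arrives, and the mark is selected from the list afterwards.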
And S2, acquiring a current scene mode, and writing the type of the scene mode and the equipment feature code into an association table of the recorded video, wherein the scene mode is selected manually or automatically.
The scene type can be selected manually, or the law enforcement recorder can read external information and identify and select it automatically. When there are many scene modes, the user would have to press keys many times to select the corresponding one, which is inconvenient; having the system automatically identify the current scene to select the scene mode therefore effectively reduces the operating cost. Automatic identification can be done in various ways, for example through the environment image, through the environment sound, or through a preset selection. This embodiment uses, without limitation, the following method:
the method for acquiring the current scene mode comprises the following sub-steps:
the first substep: acquiring an environment audio based on a preset duration;
the preset time is preset by a user, and the audio recorded by the microphone is read only within a period of time which reaches the preset time. For example, the preset time duration is ten seconds, the system acquires the environmental audio within ten seconds of starting to record the video, and performs feature extraction based on the ten seconds of audio only.
And a second substep: extracting human voice information in the environmental audio, and matching the human voice information with a scene mode based on a built-in dictionary, wherein the scene mode is stored in the built-in dictionary;
and if the matching is successful, the matched scene mode is used as the current scene mode, and if the matching is failed, manual matching is prompted.
Feature extraction is performed on the audio information to obtain the voice information; the ten-second limit avoids taking in excessive irrelevant information, saving computation, while still avoiding missing key information. Briefly, an audio feature is a sequence of frames, each frame being a multi-dimensional vector; the frame sequence carries information such as the spectrum and amplitude of the ambient sound signal. Audio feature extraction generally comprises: analog-to-digital conversion, DC removal, framing, pre-emphasis, windowing, fast Fourier transform, Mel-domain filter bank, taking logarithms, discrete cosine transform to obtain MFCCs, and a differencing operation to obtain the final audio features. The logarithmic energy obtained after framing serves as a parameter of the differencing operation.
The extracted features are compared with the features corresponding to the built-in dictionary. If the degree of similarity is high, the match is judged successful; if it is low, the match is judged failed, a voice prompt is issued, and a manual matching mode is entered, in which the user selects a scene mode on the screen using the keys of the law enforcement recorder.
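The match-or-fall-back control flow can be illustrated with a deliberately simplified sketch. It assumes the speech has already been transcribed to text and matches keywords against an assumed built-in dictionary; the embodiment itself compares acoustic (MFCC) features, so the dictionary contents, keyword approach and threshold below are placeholders, not the claimed technique:

```python
# Simplified scene-mode matching: a hit count against an illustrative
# keyword dictionary stands in for the acoustic-feature similarity of the
# embodiment. Success returns the scene mode; failure returns None, which
# would trigger the voice prompt and manual selection.

SCENE_DICTIONARY = {                       # illustrative contents only
    "drunk-driving check": ["license", "breathalyzer", "blow"],
    "hotel inspection": ["room", "tenant", "check in"],
}

def match_scene(transcript, threshold=2):
    words = transcript.lower()
    best, best_hits = None, 0
    for scene, keywords in SCENE_DICTIONARY.items():
        hits = sum(1 for kw in keywords if kw in words)
        if hits > best_hits:
            best, best_hits = scene, hits
    if best_hits >= threshold:
        return best        # matching succeeded: use as current scene mode
    return None            # matching failed: prompt for manual selection
```

The `threshold` parameter plays the role of the similarity criterion described above: below it, the system refuses to guess and hands control back to the operator.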
S3, acquiring a calibration signal, calibrating a frame of a corresponding time point in the recorded video based on the acquisition time point of the calibration signal, taking the frame as an event node, and writing the time point of the event node into an association table; wherein the calibration signal is manually input or automatically generated.
S4, selecting a mark in a mark list based on the image information acquired in real time, and writing the selected mark into an association table corresponding to the time point of the event node; the mark list comprises a plurality of marks corresponding to different event types, and the marks in the mark list are sorted based on the use frequency under the scene mode;
each scene corresponds to a corresponding event type, and event types between different scenes may have a superposition part and a difference part. For example, for a drunk driving survey, a license plate of a vehicle owner needs to be recorded, an inquiry process about information of the vehicle owner needs to be performed, a process about blowing air to an alcohol measuring instrument by the vehicle owner needs to be performed, and the like. For another example, for a hotel inspection scene, a door-knocking inquiry process, an inquiry process about tenant information, an inspection process about indoor conditions, and the like need to be recorded. Therefore, the time point of occurrence is marked when the scene event occurs, so as to correspond to the frame of the corresponding time point in the video, and the event occurrence time can be accurately positioned at the later stage.
As described above, the law enforcement recorder is a push-button instrument with a small screen. When the mark list is displayed on its screen, keys must be pressed many times to move the selection cursor up and down and thereby select the mark to be used. When there are many kinds of marks and several event types occur during a single recording, law enforcement personnel have to press the keys repeatedly to browse the mark list, which takes a long time, degrades the user experience, and shortens the service life of the keys. Since some marks are necessary in a certain scene yet used infrequently, the marks can be ordered according to use frequency.
Referring to fig. 2, the method for sorting the tags in the tag list based on the frequency of use in the scene mode includes:
the method comprises the following steps: acquiring a mark list and a mark frequency table corresponding to the current scene mode, wherein the mark frequency table comprises the use frequency of each mark in the mark list;
step two: based on the selected mark, increasing the use frequency of the corresponding event type in the mark frequency table;
step three: sorting the marks in the mark list in a descending order based on the use frequency of each mark in the mark frequency table;
step four: temporarily sequencing the marks in the current mark list based on the occurrence frequency of a mark group in the historical recorded video and the selection of a preamble mark, wherein the mark group is a combination formed by any continuous marks in the recorded video;
step five: and after the current mark selection is finished, canceling the temporary ordering of the current mark list.
Each time a mark is used, its use frequency recorded in the mark frequency table rises, and marks with high use frequency are placed near the front of the mark list, so that the required mark can be found quickly when selecting from the list and the number of key presses is reduced.
For example, a traffic drunk-driving scene involves at least three event types: recording the owner's license plate, inquiring of persons, and the owner blowing into the breath alcohol tester. The license plate is recorded only once, the number of blow-test events depends on how many people have been driving the vehicle, and the number of inquiry events depends on how many people are in the vehicle. The mark for inquiring of persons is therefore placed first, the mark for blowing into the breath alcohol tester second, and the mark for recording the owner's license plate third.
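Steps one to three above amount to counting selections and re-sorting by count; a minimal sketch follows, where the function name and the dict-based frequency table are illustrative assumptions:

```python
# Sketch of steps one to three: bump the selected mark's count in the mark
# frequency table, then re-sort the mark list in descending order of use
# frequency. Python's stable sort keeps ties in their previous order.

def select_and_resort(mark_list, freq_table, selected):
    freq_table[selected] = freq_table.get(selected, 0) + 1
    mark_list.sort(key=lambda m: freq_table.get(m, 0), reverse=True)
    return mark_list
```

Replaying the drunk-driving example (one license-plate recording, several inquiries and blow tests) with this function reproduces the ordering described above: inquiry first, blow test second, license plate third.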
Some marks end up near the back of the mark list after the initial sorting because their corresponding event types occur at low frequency in this scene mode, yet they are bound to appear, so selecting them requires many key presses. The marks in the mark list are therefore temporarily re-sorted using the relative relation between a mark and the marks selected before it: once a specific sequence of preceding marks has been selected, the likely next mark is temporarily moved up so that it is easy to select.
Referring to fig. 3, step four, temporarily sorting the marks in the current mark list, includes the following sub-steps:
the first substep: reading a mark group list and a mark group frequency table corresponding to the current scene mode, wherein the mark group list comprises a plurality of mark groups, each mark group comprises a preposed mark sequence and a target mark, the preposed mark sequence is formed by one or more continuous marks selected by preambles, and the mark group frequency table comprises the generation frequency of each preposed mark sequence in the mark group list;
and a second substep: based on a mark group formed by the currently selected mark and a preposed mark sequence, increasing the use frequency of the formed mark group in a mark group frequency table, wherein the mark contained in the preposed mark sequence is a null mark or one mark or more than one mark;
and a third substep: and judging whether the use frequency of the mark group in the mark group frequency table exceeds a preset threshold value, and if so, carrying out top setting on the mark in the mark list, wherein the use frequency of the mark group is the ratio of the mark group in all mark groups with the same mark sequence and the same length of the preposed mark sequence.
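The three sub-steps above amount to an n-gram-style prediction over mark sequences. A minimal sketch, with hypothetical class and method names (the patent gives no concrete implementation):

```python
from collections import defaultdict

class MarkPredictor:
    """Sketch of the prefix-based temporary re-sorting (sub-steps 1-3).

    Names are hypothetical; the patent describes no concrete API.
    A "mark group" is (preceding mark sequence, target mark), treated
    like an n-gram over the marks selected in a recorded video.
    """

    def __init__(self, threshold=0.8, max_prefix_len=3):
        self.threshold = threshold          # preset threshold, e.g. 80%
        self.max_prefix_len = max_prefix_len
        # counts[prefix][mark]: times `mark` followed the sequence `prefix`
        self.counts = defaultdict(lambda: defaultdict(int))

    def record_selection(self, history, mark):
        """Sub-step 2: increase the use frequency of every mark group
        formed by `mark` and a preceding sequence (including the empty one)."""
        for n in range(self.max_prefix_len + 1):
            if n > len(history):
                break
            prefix = tuple(history[-n:]) if n else ()
            self.counts[prefix][mark] += 1

    def pinned_mark(self, history):
        """Sub-step 3: return the mark to pin to the top of the list, or
        None. A group qualifies when its share among all groups with the
        same preceding sequence exceeds the threshold; longer prefixes win."""
        for n in range(self.max_prefix_len, -1, -1):
            if n > len(history):
                continue
            prefix = tuple(history[-n:]) if n else ()
            group = self.counts.get(prefix)
            if not group:
                continue
            total = sum(group.values())
            mark, count = max(group.items(), key=lambda kv: kv[1])
            if count / total > self.threshold:
                return mark
        return None
```

With an 80% threshold, repeatedly starting a stop with the license-plate mark pins it for an empty history, and the pin disappears once other first marks dilute its share, matching the behavior described in the example below.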
Whenever the mark list is called up, the system temporarily re-sorts the marks in it. Each temporary ordering depends on which marks were selected beforehand, so each invocation may yield a different temporary ordering. Each scene mode records its own mark group list and mark group frequency table; the frequency table captures the user's habits. When the use frequency of a mark group in the list exceeds the preset threshold, that is, when a particular preceding mark sequence is followed by the same target mark often enough, the target mark is pinned to the top, so that in that specific situation the user obtains it directly at the top of the mark list without paging past rarely used marks. After the current selection is made, the temporary re-ordering is cancelled; a new temporary re-ordering is computed at the next selection, or else the list reverts to the ordering by single-mark use frequency.
Take the traffic drunk-driving scene mode as an example again, with the preset threshold set to 80%. The traffic police officer, as the operator, first records the owner's license plate, then questions the person in the front passenger seat and has them perform a breath test. Because the mark for recording the license plate has a low overall use frequency, without temporary sorting it sits at the end of the list and requires many presses to select. With temporary re-ordering, the preceding mark sequence is empty when this mark is the target (the system reads the empty-sequence feature), so the mark for recording the license plate is temporarily pinned to the top. If the officer instead performs several consecutive questionings as the first mark, the frequency of the mark group corresponding to questioning rises while that of the group corresponding to recording the license plate falls, until it drops below 80% and is no longer pinned.
When the next mark is selected, the preceding mark sequence is the mark for recording the license plate, so the mark for questioning a person is pinned to the top. Of course, during law enforcement the officer may, depending on the actual situation, go straight to the breath test after photographing the license plate, or question several different people in a row; in those cases the mark can be selected manually, and the required mark is still near the top of the list, which remains convenient for the operator.
S5, writing the association table into the recorded video according to a preset rule and uploading the recorded video to the background system. The preset rule is to write the association table corresponding to the recorded video into the recorded video in the form of metadata.
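A sketch of the data side of step S5: assembling the association table and serializing it for embedding as metadata. All field names are assumptions; the patent only says the table holds the scene type, device feature code, time nodes, and marks, and that it is written into the video as metadata.

```python
import json

def build_association_table(scene_mode, device_code, events):
    """Assemble the association table for a recorded video.
    Field names are assumptions; the patent only says the table holds
    the scene type, device feature code, time nodes, and marks."""
    return {
        "scene_mode": scene_mode,
        "device_code": device_code,
        "events": [{"time": t, "mark": m} for t, m in events],
    }

def serialize_for_metadata(table):
    """Serialize the table to a compact JSON string suitable for a
    container-level metadata field (e.g. written with `ffmpeg -metadata`
    or an MP4 muxer); the exact embedding mechanism is not specified."""
    return json.dumps(table, separators=(",", ":"), sort_keys=True)
```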
The background processing flow classifies and summarizes the event types of all recorded videos, which helps the user quickly search for and review the relevant event types when reading the video files later.
Referring to fig. 4, the background processing flow includes:
T1, acquiring the recorded video entered into the background, reading its association table, associating the information in the association table with the file name of the recorded video, and recording it in a management form;
T2, classifying each event node into an index page of the management form based on the correspondence between event nodes and marks in the original association table, wherein a plurality of index pages are provided, each corresponding to a different mark;
T3, establishing a link relation between each event node and the corresponding frame of the video stored in the background.
In the background processing flow, the marks are classified and summarized into index pages; that is, identical marks are all gathered on the same index page. For example, across three traffic drunk-driving stops, a license-plate recording event occurs once in each. After the three recorded videos are uploaded to the background, the mark corresponding to the license-plate event in each video is stored on the same index page. Within the index page, each event node has a hyperlink relation with the corresponding frame of its recorded video, so a user can press the hyperlink to jump directly to the position in the recorded video corresponding to that event node, watch the copy, and obtain the required information.
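The grouping of event nodes into index pages (steps T2/T3) can be sketched as follows. The management-form schema is an assumption, since the patent does not define one.

```python
from collections import defaultdict

def build_index_pages(management_form):
    """Group event nodes by mark into index pages (steps T2/T3).
    Each row of `management_form` is assumed to be
    (file_name, time_point, mark); the patent gives no schema.
    Each entry carries the link target: (file, frame time)."""
    pages = defaultdict(list)
    for file_name, time_point, mark in management_form:
        pages[mark].append({"file": file_name, "time": time_point})
    return dict(pages)
```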
The application also discloses a structured video file marking system, which adopts the following technical scheme:
A structured video file marking system is used for implementing the above structured video file marking method on a law enforcement recorder, and comprises:
the recording module is used for recording and acquiring image information of the current scene in real time and establishing an association table corresponding to the recorded video;
the scene selection module is used for inputting a current scene mode and writing the type of the scene mode and the equipment feature code into an association table of the recorded video;
the calibration module is used for inputting a calibration signal, calibrating frames of corresponding time points in the recorded video based on the acquired time points of the calibration signal and using the frames as event nodes, and writing the time points of the event nodes into the association table;
the marking module comprises a mark list and is used for selecting a mark in the mark list and writing the selected mark into the association table at the time point corresponding to the event node;
the uploading module is used for writing the association table into the recorded video and uploading the recorded video to the background system;
the integration module is used for acquiring the recorded video recorded into the background, reading the association table, corresponding information in the association table to the file name of the recorded video and recording the information into the management form;
and the index module comprises a plurality of link buttons and is used for classifying each event node into an index page of the management form according to the correspondence between event nodes and marks in the original association table; the link buttons correspond to the event nodes in the index page and have a link relation with the corresponding frames of the video stored in the background.
The application also discloses a host, which comprises a memory and a processor, the memory storing a computer program that can be loaded by the processor to execute the foreground processing flow.
The application also discloses a computer-readable storage medium storing a computer program that can be loaded by a processor to execute the foreground processing flow.
Of course, the storage medium containing the computer-executable instructions provided in the embodiments of the present application is not limited to the method operations described above, and may also perform related operations in the foreground processing flow provided in any embodiment of the present application.
The computer-readable storage media of the embodiments of the present application may take any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a storage medium may be transmitted over any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present application may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Smalltalk, or C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or terminal. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above embodiments are preferred embodiments of the present application, and the protection scope of the present application is not limited by the above embodiments, so: all equivalent changes made according to the structure, shape and principle of the present application shall be covered by the protection scope of the present application.

Claims (6)

1. A structured video file marking method is characterized in that the method is used for marking and sorting real-time videos in law enforcement or inspection scenes and comprises a foreground processing flow and a background processing flow;
wherein, the foreground processing flow comprises:
acquiring image information of a current scene, recording the image information in real time, and establishing an association table corresponding to a recorded video, wherein the information of the association table comprises a scene type, an equipment feature code, a time node and a mark;
acquiring a current scene mode, and writing the type of the scene mode and the equipment feature code into an association table of a recorded video, wherein the scene mode is selected manually or automatically;
acquiring a calibration signal, calibrating a frame of a corresponding time point in the recorded video based on the acquisition time point of the calibration signal, taking the frame as an event node, and writing the time point of the event node into an association table; wherein, the calibration signal is manually input or automatically generated;
selecting a mark in a mark list based on the image information acquired in real time, and writing the selected mark into an association table corresponding to the time point of the event node; the mark list comprises a plurality of marks corresponding to different event types, and the marks in the mark list are sorted based on the use frequency under the scene mode;
the sorting of the marks in the mark list based on the use frequency in the scene mode comprises the following steps:
acquiring a mark list and a mark frequency table corresponding to the current scene mode, wherein the mark frequency table comprises the use frequency of each mark in the mark list;
based on the selected mark, increasing the use frequency of the corresponding event type in the mark frequency table;
sorting the marks in the mark list in a descending order based on the use frequency of each mark in the mark frequency table;
temporarily sorting the marks in the current mark list based on the occurrence frequency of a mark group and the selection of a preamble mark in the historical recorded video, wherein the mark group is a combination formed by any continuous marks in the recorded video;
after the current mark selection is finished, the temporary sorting of the current mark list is cancelled;
the temporarily sorting the marks in the mark list based on the occurrence frequency of the mark group and the selection of the preamble mark in the history recording video comprises:
reading a mark group list and a mark group frequency table corresponding to the current scene mode, wherein the mark group list comprises a plurality of mark groups, each mark group comprising a preceding mark sequence and a target mark, the preceding mark sequence being formed by one or more consecutive previously selected marks, and the mark group frequency table comprising the frequency of occurrence of each preceding mark sequence in the mark group list;
increasing, in the mark group frequency table, the use frequency of the mark group formed by the currently selected mark and its preceding mark sequence, wherein the preceding mark sequence contains no mark, one mark, or more than one mark;
judging whether the use frequency of a mark group in the mark group frequency table exceeds a preset threshold, and if so, pinning the corresponding mark to the top of the mark list, wherein the use frequency of a mark group is its proportion among all mark groups sharing the same preceding mark sequence of the same length;
writing the association table into the recorded video based on a preset rule and uploading the recorded video to a background system;
the background processing flow comprises the following steps:
acquiring a recorded video recorded into a background, reading an association table, corresponding information in the association table to a file name of the recorded video, and recording into a management form;
classifying each event node into an index page of the management form based on the correspondence between event nodes and marks in the original association table, wherein a plurality of index pages are provided, each corresponding to a different mark;
and establishing a link relation between the event node and a corresponding frame of the background storage video.
2. The method of claim 1, wherein the predetermined rule is writing an association table corresponding to the recorded video in a form of metadata.
3. The method of claim 1, wherein the obtaining the current scene mode comprises:
acquiring an environmental audio based on a preset duration;
extracting human voice information in the environmental audio, and matching the human voice information with a scene mode based on a built-in dictionary, wherein the scene mode is stored in the built-in dictionary;
and if the matching is successful, the matched scene mode is used as the current scene mode, and if the matching is failed, manual matching is prompted.
4. A structured video file marking system for implementing the structured video file marking method of any one of claims 1 to 3 on a law enforcement recorder, comprising:
the recording module is used for recording and acquiring image information of the current scene in real time and establishing an association table corresponding to the recorded video;
the scene selection module is used for inputting a current scene mode and writing the type of the scene mode and the equipment feature code into an association table of the recorded video;
the calibration module is used for inputting a calibration signal, calibrating frames of corresponding time points in the recorded video based on the acquired time points of the calibration signal and using the frames as event nodes, and writing the time points of the event nodes into the association table;
the marking module comprises a mark list and is used for selecting a mark in the mark list and writing the selected mark into the association table at the time point corresponding to the event node;
the uploading module is used for writing the association table into the recorded video and uploading the recorded video to the background system;
the integration module is used for acquiring the recorded video recorded into the background, reading the association table, corresponding information in the association table to the file name of the recorded video and recording the information into the management form;
and the index module comprises a plurality of link buttons and is used for classifying each event node into an index page of the management form according to the correspondence between event nodes and marks in the original association table; the link buttons correspond to the event nodes in the index page and have a link relation with the corresponding frames of the video stored in the background.
5. A host comprising a memory and a processor, the memory having stored thereon a computer program that can be loaded by the processor and that executes the method for structured video file marking according to any of claims 1 to 3.
6. A computer-readable storage medium, characterized in that a computer program is stored which can be loaded by a processor and which performs the method for structured video file marking according to any of claims 1 to 3.
CN202110390544.3A 2021-04-12 2021-04-12 Structured video file marking method, system, host and storage medium Active CN113255438B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110390544.3A CN113255438B (en) 2021-04-12 2021-04-12 Structured video file marking method, system, host and storage medium


Publications (2)

Publication Number Publication Date
CN113255438A CN113255438A (en) 2021-08-13
CN113255438B true CN113255438B (en) 2023-03-31

Family

ID=77220751

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110390544.3A Active CN113255438B (en) 2021-04-12 2021-04-12 Structured video file marking method, system, host and storage medium

Country Status (1)

Country Link
CN (1) CN113255438B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114374813A (en) * 2021-12-13 2022-04-19 青岛海信移动通信技术股份有限公司 Multimedia resource management method, recorder and server
CN115587216B (en) * 2022-12-13 2023-08-22 广州电力工程监理有限公司 Calibration software management method, system and medium for supervision witness recorder

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050286863A1 (en) * 2004-06-23 2005-12-29 Howarth Rolf M Reliable capture of digital video images for automated indexing, archiving and editing
US8107541B2 (en) * 2006-11-07 2012-01-31 Mitsubishi Electric Research Laboratories, Inc. Method and system for video segmentation
CN103577412B (en) * 2012-07-20 2017-02-08 永泰软件有限公司 High-definition video based traffic incident frame tagging method
CN103702053B (en) * 2014-01-16 2017-05-10 深圳英飞拓科技股份有限公司 Video storage and search method and system as well as monitoring system
CN106649443B (en) * 2016-09-18 2020-06-02 江苏智通交通科技有限公司 Video data archival management method and system for law enforcement recorder
CN107608727A (en) * 2017-08-31 2018-01-19 努比亚技术有限公司 A kind of display methods of application program, mobile terminal and storage medium
CN108831456B (en) * 2018-05-25 2022-04-15 深圳警翼智能科技股份有限公司 Method, device and system for marking video through voice recognition

Also Published As

Publication number Publication date
CN113255438A (en) 2021-08-13


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant