CN110830793A - Video transmission quality time domain detection method based on deep learning frequency scale identification - Google Patents
- Publication number
- CN110830793A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N17/00—Diagnosis, testing or measuring for television systems or their details
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N17/00—Diagnosis, testing or measuring for television systems or their details
- H04N17/004—Diagnosis, testing or measuring for television systems or their details for digital television systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30168—Image quality inspection
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- General Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Quality & Reliability (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a video transmission quality time domain detection method based on deep learning frequency scale identification, comprising the following steps: producing a test video for measuring the time domain indexes of video transmission, and marking the sequence number and the check number of each video frame at a specific position in the video as a video label, called the frequency scale for short; training an SSD target detection network that takes a video frame as input and detects each target in the frequency scale together with its target box; extracting the sequence number and the check number from the detected targets and target boxes, where the sequence number identifies the video frame and the check number verifies whether the frequency scale was recognized correctly; and, in each detection, simultaneously capturing video frames at the sending end and the receiving end of the video transmission, inputting each into the SSD target detection network, extracting their respective frequency scales, determining whether the picture is frozen, and calculating the picture freezing time and the picture delay.
Description
Technical Field
The invention relates to the field of target detection, in particular to a video transmission quality time domain detection method based on deep learning frequency scale identification.
Background
During video transmission, factors such as network conditions, channel quality and buffering can cause picture freezing and picture delay at the receiving end. Picture freezing degrades the viewing experience, and in scenarios such as real-time video calls, picture delay must be kept as low as possible, so time domain detection of picture freezing and picture delay in video transmission is very important. Most existing video transmission quality detection methods evaluate transmission quality based on image quality, while research on time domain detection focuses either on the relationship between packet loss, frame loss and image distortion, or on judging picture freezing from the temporal context of the images. The former cannot fully reflect the picture freezing and picture delay behavior of video transmission in the time domain, and the latter has difficulty calculating the picture freezing time and the picture delay time. A time domain detection method that can evaluate video transmission quality efficiently, accurately and intelligently is therefore of practical significance.
Disclosure of Invention
In order to solve the above technical problems, an object of the present invention is to provide a video transmission quality time domain detection method based on deep learning frequency scale identification.
The purpose of the invention is realized by the following technical scheme:
A video transmission quality time domain detection method based on deep learning frequency scale identification comprises the following steps:
A. producing a video for measuring the time domain indexes of video transmission, and marking the sequence number $N_s$ and the check number $N_c$ of each video frame at a specific position in the video as a video label, called the frequency scale for short;
B. training an SSD target detection network, and using a video frame as the input of the SSD target detection network to detect each target in the frequency scale and its target box, indexed by j = 1, 2, 3, ..., n, where n is the total number of detected targets;
C. extracting $N_s$ and $N_c$ from the detected targets and target boxes, where $N_s$ is used to locate the video frame and $N_c$ is used to verify whether the frequency scale was recognized correctly;
D. in each detection, simultaneously capturing video frames at the sending end and the receiving end of the video transmission, inputting each into the SSD target detection network, extracting their respective frequency scales, determining whether the picture is frozen, and calculating the picture freezing time $T_f$ and the picture delay $T_d$.
Compared with the prior art, the invention has the beneficial effects that:
The method provided by the invention can evaluate video transmission quality efficiently, accurately and intelligently.
Drawings
Fig. 1 is a flowchart of a video transmission quality time domain detection method based on deep learning frequency scale identification.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the following embodiments and accompanying drawings.
As shown in fig. 1, a time domain detection method for video transmission quality based on deep learning frequency scale identification includes the following steps:
Step 10 specifically includes: the characters in the frequency scale are arranged horizontally from left to right, and the frequency scale comprises a sequence number $N_s$ and a check number $N_c$, where: the sequence number of the i-th frame has the character "S" as its start identifier, $N_{s,i} = i$, and the character "C" as its end identifier; the check number of the i-th frame uses the end identifier "C" of the sequence number as its start identifier, $N_{c,i}$ is the sum of the digits of i plus the number of digits of i, and the character "E" is its end identifier; the sequence number and the check number are arranged contiguously on the same row, with the sequence number before the check number.
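The frequency scale layout described above can be illustrated with a short Python sketch. The function names `check_number` and `make_frequency_scale` are hypothetical and chosen only for this example; the sketch produces the text of the label and says nothing about how it is rendered onto the video frame.

```python
def check_number(i: int) -> int:
    """Check number of frame i: sum of its decimal digits plus its digit count."""
    digits = str(i)
    return sum(int(d) for d in digits) + len(digits)


def make_frequency_scale(i: int) -> str:
    """Text of the label for frame i: 'S' + sequence number + 'C' + check number + 'E'."""
    if i < 0:
        raise ValueError("frame index must be non-negative")
    return f"S{i}C{check_number(i)}E"


# Frame 125: digit sum 1 + 2 + 5 = 8, three digits, so the check number is 11.
print(make_frequency_scale(125))  # S125C11E
```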
Step 20 specifically includes: the data set of the SSD target detection network consists of a subset of the video frame images, and when the data set is created, each element of the frequency scale in the image and its region are annotated. The element classes are the characters "S", "C", "E", "0", "1", "2", "3", "4", "5", "6", "7", "8" and "9"; together with the background class, the network has 14 target detection classes in total. Because video transmission may introduce distortions such as noise, color shift, blur and contrast change, data augmentation is performed during training by randomly adding distortions of different types and degrees to the images. During training and detection, a video frame is input into the SSD target detection network to obtain candidate results; non-maximum suppression is applied to the candidate results to eliminate repeated detections of the same target, and the remaining results are sorted in ascending order of the horizontal coordinate of the target box center to obtain the target prediction results and the target box prediction results for that video frame.
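The post-processing in this step (non-maximum suppression followed by a left-to-right sort) might look like the following minimal sketch. Several details are assumptions made for illustration and are not stated in the text: boxes are given as corner coordinates, suppression is greedy and class-agnostic, and the IoU threshold is 0.5. The `Detection` type and `nms_and_sort` function are hypothetical names.

```python
from typing import List, NamedTuple


class Detection(NamedTuple):
    label: str    # one of "S", "C", "E", "0".."9"
    score: float  # confidence from the detector
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def iou(a: Detection, b: Detection) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a.x_max, b.x_max) - max(a.x_min, b.x_min))
    ih = max(0.0, min(a.y_max, b.y_max) - max(a.y_min, b.y_min))
    inter = iw * ih
    area_a = (a.x_max - a.x_min) * (a.y_max - a.y_min)
    area_b = (b.x_max - b.x_min) * (b.y_max - b.y_min)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms_and_sort(candidates: List[Detection], iou_threshold: float = 0.5) -> List[Detection]:
    """Greedy class-agnostic NMS (assumed), then sort by box-center x, ascending."""
    kept: List[Detection] = []
    for det in sorted(candidates, key=lambda d: d.score, reverse=True):
        # Keep a candidate only if it does not overlap a higher-scoring kept box.
        if all(iou(det, k) < iou_threshold for k in kept):
            kept.append(det)
    return sorted(kept, key=lambda d: (d.x_min + d.x_max) / 2)
```

Sorting by the box center rather than the left edge keeps the reading order stable when neighbouring character boxes differ slightly in width.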
Step 30 specifically includes: computing the center coordinates of each target box;
screening out the targets of classes "S", "C" and "E". If there is exactly one target of each of the classes "S", "C" and "E", let the s-th target be the character "S", the c-th target be the character "C", and the e-th target be the character "E". If s < c, the targets corresponding to the digits of $N_s$ satisfy $s < j < c$,
ordered by increasing j.
Likewise, the targets corresponding to the digits of $N_c$ satisfy $c < j < e$, ordered by increasing j.
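As a rough illustration of this extraction step, the sketch below reads the two numbers from the class labels of the detected targets once they are sorted left to right. It assumes that the digits of the sequence number lie strictly between the "S" and "C" targets and the digits of the check number strictly between "C" and "E", and that labels are limited to the 13 character classes; `extract_numbers` is a hypothetical name.

```python
from typing import List, Optional, Tuple


def extract_numbers(labels: List[str]) -> Optional[Tuple[int, int]]:
    """Return (sequence number, check number), or None if the layout is invalid.

    `labels` holds the class labels of the detected targets, already sorted
    by the horizontal coordinate of the box center.
    """
    # Each of the three marker characters must be detected exactly once.
    if any(labels.count(marker) != 1 for marker in "SCE"):
        return None
    s, c, e = labels.index("S"), labels.index("C"), labels.index("E")
    if not (s < c < e):
        return None
    seq_digits = labels[s + 1:c]  # targets with s < j < c
    chk_digits = labels[c + 1:e]  # targets with c < j < e
    if not seq_digits or not chk_digits:
        return None
    return int("".join(seq_digits)), int("".join(chk_digits))


print(extract_numbers(list("S125C11E")))  # (125, 11)
```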
Step 30 specifically includes: suppose the digits of $N_s$ are $n_1, n_2, \ldots, n_k$, where k is the number of digits of $N_s$. The condition for the frequency scale identification to pass verification is: $N_c = \sum_{m=1}^{k} n_m + k$
The frequency scale identification fails verification if any one of the following conditions is met:
If the frequency scale identification passes verification, the subsequent detection continues; if it fails verification, detection of that video frame is abandoned.
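A minimal sketch of the check, based on the definition of the check number given for step 10 (digit sum of the sequence number plus its digit count). The function name `passes_check` is hypothetical, and the sketch covers only this arithmetic test, not any other failure conditions.

```python
def passes_check(seq_number: int, check_number: int) -> bool:
    """True if check_number equals the digit sum of seq_number plus its digit count."""
    digits = str(seq_number)
    return check_number == sum(int(d) for d in digits) + len(digits)


print(passes_check(125, 11))  # True: 1 + 2 + 5 + 3 digits = 11
print(passes_check(135, 11))  # False: 1 + 3 + 5 + 3 digits = 12
```

Because the check depends only on the digit sum and the digit count, it catches a single misread digit in the sequence number but not a reordering of digits: 125 and 152 share the check number 11.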
Step 40 specifically includes: assuming the picture freezing time threshold is θ and the detection frame rate is $f_1$, picture freezing occurs when the following condition is met:
The picture freezing time $T_f$ is:
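One plausible reading of the freeze test is sketched below. It is an assumption, not the formula from the text: a freeze is reported when the receiving end shows the same sequence number in consecutive detections, and the time between the first and last of those detections, (m − 1) / f1 for a run of m samples, reaches the threshold θ. The function name `find_freezes` is hypothetical.

```python
from typing import List, Tuple


def find_freezes(seq_numbers: List[int], f1: float, theta: float) -> List[Tuple[int, float]]:
    """Return (frozen sequence number, duration in seconds) for each long run.

    seq_numbers: receiver-side sequence numbers sampled at detection rate f1 (Hz).
    Assumed rule: a run of m identical samples lasts (m - 1) / f1 seconds and
    counts as a freeze when that duration is at least theta.
    """
    if f1 <= 0 or theta <= 0:
        raise ValueError("f1 and theta must be positive")
    freezes: List[Tuple[int, float]] = []
    run_start = 0
    for idx in range(1, len(seq_numbers) + 1):
        run_ended = idx == len(seq_numbers) or seq_numbers[idx] != seq_numbers[run_start]
        if run_ended:
            duration = (idx - 1 - run_start) / f1
            if duration >= theta:
                freezes.append((seq_numbers[run_start], duration))
            run_start = idx
    return freezes
```

For example, with `f1 = 10` and `theta = 0.25`, the samples `[1, 2, 2, 2, 2, 3]` contain one run of four identical values spanning 0.3 s, which this rule reports as a freeze on frame 2.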
Step 40 specifically includes: video frames of the sending end and the receiving end of the video transmission are captured simultaneously and each input into the SSD target detection network to identify the sending-end video frame sequence number and the receiving-end video frame sequence number. Assuming the video frame rate is $f_2$, the picture delay $T_d$ is:
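A minimal sketch of the delay calculation, under the assumption that the delay is the difference between the sequence numbers captured at the same instant at the two ends, divided by the video frame rate. This formula is an illustrative reading, not taken from the text, and `picture_delay` is a hypothetical name.

```python
def picture_delay(sender_seq: int, receiver_seq: int, f2: float) -> float:
    """Delay in seconds, assuming T_d = (sender_seq - receiver_seq) / f2."""
    if f2 <= 0:
        raise ValueError("video frame rate f2 must be positive")
    return (sender_seq - receiver_seq) / f2


# Sender shows frame 130 while the receiver still shows frame 125, at 25 fps.
delay = picture_delay(130, 125, 25.0)
print(f"{delay:.3f} s")  # 0.200 s
```

In practice both sequence numbers would first have to pass the check-number verification, otherwise a single misread digit could produce a wildly wrong delay.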
Although embodiments of the present invention have been described above, the descriptions are provided only to aid understanding of the invention and are not intended to limit it. Those skilled in the art will understand that various changes in form and detail may be made without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (7)
1. A video transmission quality time domain detection method based on deep learning frequency scale identification is characterized by comprising the following steps:
A. producing a video for measuring the time domain indexes of video transmission, and marking the sequence number $N_s$ and the check number $N_c$ of each video frame at a specific position in the video as a video label, called the frequency scale for short;
B. training an SSD target detection network, and using a video frame as the input of the SSD target detection network to detect each target in the frequency scale and its target box, where n is the total number of detected targets;
C. extracting $N_s$ and $N_c$ from the detected targets and target boxes, where $N_s$ is used to locate the video frame and $N_c$ is used to verify whether the frequency scale was recognized correctly;
D. in each detection, simultaneously capturing video frames at the sending end and the receiving end of the video transmission, inputting each into the SSD target detection network, extracting their respective frequency scales, determining whether the picture is frozen, and calculating the picture freezing time $T_f$ and the picture delay $T_d$.
2. The video transmission quality time domain detection method based on deep learning frequency scale identification as claimed in claim 1, wherein in step A, the characters in the frequency scale are arranged horizontally from left to right, and the frequency scale comprises a sequence number $N_s$ and a check number $N_c$, where: the sequence number of the i-th frame has the character "S" as its start identifier, $N_{s,i} = i$, and the character "C" as its end identifier; the check number of the i-th frame uses the end identifier "C" of the sequence number as its start identifier, $N_{c,i}$ is the sum of the digits of i plus the number of digits of i, and the character "E" is its end identifier; the sequence number and the check number are arranged contiguously on the same row, with the sequence number before the check number.
3. The video transmission quality time domain detection method based on deep learning frequency scale identification as claimed in claim 1, wherein in step B, the data set of the SSD target detection network consists of a subset of the video frame images, and when the data set is created, each element of the frequency scale in the image and its region are annotated; the element classes are the characters "S", "C", "E", "0", "1", "2", "3", "4", "5", "6", "7", "8" and "9", and together with the background class the network has 14 target detection classes; during training and detection, a video frame is input into the SSD target detection network to obtain candidate results, non-maximum suppression is applied to the candidate results to eliminate repeated detections of the same target, and the remaining results are sorted in ascending order of the horizontal coordinate of the target box center to obtain the target prediction results and the target box prediction results for that video frame.
4. The video transmission quality time domain detection method based on deep learning frequency scale identification as claimed in claim 1, wherein in step C, the center coordinates of each target box are computed;
the targets of classes "S", "C" and "E" are screened out, and if there is exactly one target of each of the classes "S", "C" and "E", the s-th target is recorded as the character "S", the c-th target as the character "C", and the e-th target as the character "E"; if s < c, the targets corresponding to the digits of $N_s$ satisfy $s < j < c$,
ordered by increasing j;
likewise, the targets corresponding to the digits of $N_c$ satisfy $c < j < e$, ordered by increasing j.
5. The video transmission quality time domain detection method based on deep learning frequency scale identification as claimed in claim 1, wherein in step C, the digits of $N_s$ are $n_1, n_2, \ldots, n_k$, where k is the number of digits of $N_s$, and the condition for the frequency scale identification to pass verification is: $N_c = \sum_{m=1}^{k} n_m + k$
The frequency scale identification fails verification if any one of the following conditions is met:
If the frequency scale identification passes verification, the subsequent detection continues; if it fails verification, detection of that video frame is abandoned.
6. The video transmission quality time domain detection method based on deep learning frequency scale identification as claimed in claim 1, wherein in step D, assuming the picture freezing time threshold is θ and the detection frame rate is $f_1$, picture freezing occurs when the following condition is met:
The picture freezing time $T_f$ is:
7. The video transmission quality time domain detection method based on deep learning frequency scale identification as claimed in claim 1, wherein in step D, video frames of the sending end and the receiving end of the video transmission are captured simultaneously and each input into the SSD target detection network to identify the sending-end video frame sequence number and the receiving-end video frame sequence number; assuming the video frame rate is $f_2$, the picture delay $T_d$ is:
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911104622.8A CN110830793B (en) | 2019-11-13 | 2019-11-13 | Video transmission quality time domain detection method based on deep learning frequency scale identification |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110830793A | 2020-02-21 |
CN110830793B | 2021-09-03 |
Family
ID=69554480
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911104622.8A Active CN110830793B (en) | 2019-11-13 | 2019-11-13 | Video transmission quality time domain detection method based on deep learning frequency scale identification |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110830793B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN110830793B (en) | 2021-09-03 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||