WO2021154861A1 - Key frame extraction for underwater telemetry and anomaly detection - Google Patents

Key frame extraction for underwater telemetry and anomaly detection

Info

Publication number
WO2021154861A1
Authority
WO
WIPO (PCT)
Prior art keywords
key frame
key frames
key
frame extraction
frames
Prior art date
Application number
PCT/US2021/015300
Other languages
French (fr)
Inventor
Nader SALMAN
Victor AMBLARD
Original Assignee
Schlumberger Technology Corporation
Schlumberger Canada Limited
Services Petroliers Schlumberger
Schlumberger Technology B.V.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Schlumberger Technology Corporation, Schlumberger Canada Limited, Services Petroliers Schlumberger, Schlumberger Technology B.V.
Publication of WO2021154861A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85Assembly of content; Generation of multimedia applications
    • H04N21/854Content authoring
    • H04N21/8549Creating video summaries, e.g. movie trailer
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/46Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21Server components or server architectures
    • H04N21/218Source of audio or video content, e.g. local disk arrays
    • H04N21/21805Source of audio or video content, e.g. local disk arrays enabling multiple viewpoints, e.g. using a plurality of cameras
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21Server components or server architectures
    • H04N21/218Source of audio or video content, e.g. local disk arrays
    • H04N21/2187Live feed
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/23418Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/251Learning process for intelligent management, e.g. learning user preferences for recommending movies
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/472End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47202End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for requesting content on demand, e.g. video on demand


Abstract

Embodiments of the present disclosure are directed towards a system including a key frame extraction module configured to receive a video feed and generate one or more key frames. Embodiments may include a transmission module configured to receive the one or more key frames and transmit the one or more key frames to a monitoring module and a monitoring module configured to receive the one or more key frames and perform post-processing on the one or more key frames.

Description

KEY FRAME EXTRACTION FOR UNDERWATER TELEMETRY AND ANOMALY DETECTION
CROSS-REFERENCE TO RELATED APPLICATIONS
The present application claims priority to United States Provisional Patent Application 62/966,129 filed January 27, 2020, the entirety of which is incorporated by reference.
FIELD
[0001] This application relates to systems and methods for using key frame extraction for underwater telemetry and anomaly detection.
BACKGROUND
[0002] With the rapid growth of multimedia information, large amounts of videos have now become available online, opening up new prospects for deep learning experiments. However, raw video footage can include blurred or redundant frames and is therefore not directly usable. Video summarization techniques aim at providing a condensed summary of video footage, keeping meaningful information from the footage while reducing the overall data size.
[0003] These techniques are mostly based on global features, local features, or neural networks. In underwater environments, brightness and contrast variations can be very limited, but due to the nature of the different assets, global features, and more specifically color histograms, effectively capture relevant information about the objects currently in the video frame.
BRIEF DESCRIPTION OF THE DRAWINGS
[0004] The present invention is illustrated by way of example, and not limitation, in the figures of the accompanying drawings, in which like references indicate similar elements and in which:
[0005] FIG. 1 illustrates a block diagram of a system for a key frame extraction process in accordance with embodiments of the present disclosure;
[0006] FIG. 2 illustrates a flowchart showing operations consistent with embodiments of the present disclosure;
[0007] FIG. 3 depicts an example showing a general framework overview consistent with embodiments of the present disclosure;
[0008] FIG. 4 depicts a description of the key frame extraction module consistent with embodiments of the present disclosure;
[0009] FIG. 5 depicts a description of the key frame extraction module consistent with embodiments of the present disclosure;
[0010] FIG. 6 depicts an example of a decision unit consistent with embodiments of the present disclosure;
[0011] FIG. 7 shows a flowchart consistent with embodiments of the present disclosure; and
[0012] FIG. 8 shows transmission and monitoring modules consistent with embodiments of the present disclosure.
DETAILED DESCRIPTION
[0013] Embodiments of the present disclosure are directed towards a system and method for real-time key frame extraction from a live video feed based on the video frames' global features. Key frames are extracted in real time and on the fly, aided by an anti-blur component that prevents the selection of key frames with a large amount of motion blur. These key frames can then be transmitted through a low-bandwidth transmission channel in lieu of the video feed while preserving useful information.
[0014] The discussion below is directed to certain implementations and/or embodiments. It is to be understood that the discussion below may be used for the purpose of enabling a person with ordinary skill in the art to make and use any subject matter defined now or later by the patent “claims” found in any issued patent herein.
[0015] It is specifically intended that the claimed combinations of features not be limited to the implementations and illustrations contained herein but include modified forms of those implementations including portions of the implementations and combinations of elements of different implementations as come within the scope of the following claims. It should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions may be made to achieve the developers' specific goals, such as compliance with system-related and business related constraints, which may vary from one implementation to another. Moreover, it should be appreciated that such a development effort might be complex and time consuming, but would nevertheless be a routine undertaking of design, fabrication, and manufacture for those of ordinary skill having the benefit of this disclosure. Nothing in this application is considered critical or essential to the claimed invention unless explicitly indicated as being "critical" or "essential."
[0016] It will also be understood that, although the terms first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms may be used to distinguish one element from another. For example, a first object or step could be termed a second object or step, and, similarly, a second object or step could be termed a first object or step, without departing from the scope of the disclosure. The first object or step, and the second object or step, are both objects or steps, respectively, but they are not to be considered a same object or step.
[0017] Referring to FIG. 1, there is shown a key frame extraction process 10 that may reside on and/or be associated with an autonomous underwater vehicle (“AUV”), an unmanned aerial vehicle (“UAV”), or any other computing device. Aspects of process 10 may be (wholly or partly) executed by server computer 12, which may be connected to network 14 (e.g., the Internet or a local area network). Examples of server computer 12 may include, but are not limited to: a personal computer, a server computer, a series of server computers, a mini computer, and a mainframe computer. Server computer 12 may be a web server (or a series of servers) running a network operating system, examples of which may include but are not limited to: Microsoft® Windows® Server; Novell® NetWare®; or Red Hat® Linux®, for example. (Microsoft and Windows are registered trademarks of Microsoft Corporation in the United States, other countries, or both; Novell and NetWare are registered trademarks of Novell Corporation in the United States, other countries, or both; Red Hat is a registered trademark of Red Hat Corporation in the United States, other countries, or both; and Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.) Additionally / alternatively, key frame extraction process 10 may reside on and be executed, in whole or in part, by an AUV, a UAV, or a client electronic device, such as a personal computer, notebook computer, personal digital assistant, or the like.
[0018] The instruction sets and subroutines of key frame extraction process 10, which may include one or more software modules, and which may be stored on storage device 16 coupled to server computer 12, may be executed by one or more processors (not shown) and one or more memory modules (not shown) incorporated into server computer 12. Storage device 16 may include but is not limited to: a hard disk drive; a solid state drive, a tape drive; an optical drive; a RAID array; a random access memory (RAM); and a read-only memory (ROM). Storage device 16 may include various types of files and file types.
[0019] Server computer 12 may execute a web server application, examples of which may include but are not limited to: Microsoft IIS, Novell Webserver™, or Apache® Webserver, that allows for HTTP (i.e., HyperText Transfer Protocol) access to server computer 12 via network 14 (Webserver is a trademark of Novell Corporation in the United States, other countries, or both; and Apache is a registered trademark of Apache Software Foundation in the United States, other countries, or both). Network 14 may be connected to one or more secondary networks (e.g., network 18), examples of which may include but are not limited to: a local area network; a wide area network; or an intranet, for example.
[0020] Key frame extraction process 10 may be a standalone application, or may be an applet / application / script that may interact with and/or be executed within application 20. In addition / as an alternative to being a server-side process, key frame extraction process 10 may be a client-side process (not shown) that may reside on a client electronic device (described below) and may interact with a client application (e.g., one or more of client applications 22, 24, 26, 28).
Further, key frame extraction process 10 may be a hybrid server-side / client-side process that may interact with application 20 and a client application (e.g., one or more of client applications 22, 24, 26, 28). As such, key frame extraction process 10 may reside, in whole, or in part, on server computer 12 and/or one or more client electronic devices.
[0021] The instruction sets and subroutines of application 20, which may be stored on storage device 16 coupled to server computer 12, may be executed by one or more processors (not shown) and one or more memory modules (not shown) incorporated into server computer 12.
[0022] The instruction sets and subroutines of client applications 22, 24, 26, 28, which may be stored on storage devices 30, 32, 34, 36 (respectively) coupled to client electronic devices 38, 40, 42, 44 (respectively), may be executed by one or more processors (not shown) and one or more memory modules (not shown) incorporated into client electronic devices 38, 40, 42, 44 (respectively). Storage devices 30, 32, 34, 36 may include but are not limited to: hard disk drives; solid state drives; tape drives; optical drives; RAID arrays; random access memories (RAM); read-only memories (ROM); compact flash (CF) storage devices; secure digital (SD) storage devices; and memory stick storage devices. Examples of client electronic devices 38, 40, 42, 44 may include, but are not limited to, personal computer 38, laptop computer 40, mobile computing device 42 (such as a smart phone, netbook, or the like), and notebook computer 44, for example. Using client applications 22, 24, 26, 28, users 46, 48, 50, 52 may access key frame extraction process 10.
[0023] Users 46, 48, 50, 52 may access key frame extraction process 10 and/or other applications associated with server computer 12 directly through the device on which the client application (e.g., client applications 22, 24, 26, 28) is executed, namely client electronic devices 38, 40, 42, 44, for example. Users 46, 48, 50, 52 may access process 10 and/or other applications directly through network 14 or through secondary network 18. Further, server computer 12 (i.e., the computer that executes these applications) may be connected to network 14 through secondary network 18, as illustrated with phantom link line 54.
[0024] The various client electronic devices may be directly or indirectly coupled to network 14 (or network 18). For example, personal computer 38 is shown directly coupled to network 14 via a hardwired network connection. Further, notebook computer 44 is shown directly coupled to network 18 via a hardwired network connection. Laptop computer 40 is shown wirelessly coupled to network 14 via wireless communication channel 66 established between laptop computer 40 and wireless access point (i.e., WAP) 68, which is shown directly coupled to network 14. WAP 68 may be, for example, an IEEE 802.11a, 802.11b, 802.11g, Wi-Fi, and/or Bluetooth device that is capable of establishing wireless communication channel 66 between laptop computer 40 and WAP 68. Mobile computing device 42 is shown wirelessly coupled to network 14 via wireless communication channel 70 established between mobile computing device 42 and cellular network / bridge 72, which is shown directly coupled to network 14.
[0025] As is known in the art, all of the IEEE 802.11x specifications may use Ethernet protocol and carrier sense multiple access with collision avoidance (i.e., CSMA/CA) for path sharing. The various 802.11x specifications may use phase-shift keying (i.e., PSK) modulation or complementary code keying (i.e., CCK) modulation, for example. As is known in the art, Bluetooth is a telecommunications industry specification that allows, e.g., mobile phones, computers, and personal digital assistants to be interconnected using a short-range wireless connection.
[0026] Client electronic devices 38, 40, 42, 44 may each execute an operating system, examples of which may include but are not limited to Microsoft Windows, Microsoft Windows CE®, Red Hat Linux, or other suitable operating system. (Windows CE is a registered trademark of Microsoft Corporation in the United States, other countries, or both.).
[0027] In some embodiments, key frame extraction process 10 may generate an output that may be delivered to one or more onsite tools such as reservoir tool 74, which may be configured to perform one or more reservoir operations. Reservoir tool 74 may include, but is not limited to, those available from the Assignee of the present disclosure. In some embodiments, reservoir tool 74 may include one or more processors configured to receive an output from key frame extraction process 10 and alter the operations of reservoir tool 74.
[0028] Referring now to FIG. 2, a flowchart 200 consistent with embodiments of key frame extraction process 10 is provided. Flowchart 200 includes a system including a key frame extraction module 202 configured to receive a video feed and generate one or more key frames. Embodiments may include a transmission module 204 configured to receive the one or more key frames and transmit the one or more key frames to a monitoring module and a monitoring module 206 configured to receive the one or more key frames and perform post-processing on the one or more key frames.
[0029] Embodiments of the present disclosure may be used with AUVs, UAVs, and numerous other use cases. Embodiments included herein may perform real-time key frame extraction from a live video feed based on the video frames' global features. Key frames are extracted in real time and on the fly, aided by an anti-blur component that prevents the selection of key frames with a large amount of motion blur. These key frames can then be transmitted through a low-bandwidth transmission channel in lieu of the video feed while preserving useful information.
[0030] In some embodiments, the present disclosure may perform real-time key frame extraction from a live video stream based on global features and extract key frames on the fly. Embodiments included herein may efficiently remove redundant and blurred information from a video stream, may function well in low-feature environments with poor semantic content, may detect anomalies within repetitive video frames, and may be used along with an active learning framework to minimize the sampling of redundant images, thus reducing overall labeling costs.
[0031] In the specific case of inspection of subsea asset integrity with an autonomous underwater vehicle (AUV), embodiments included herein may acquire a live video stream from the underwater vehicle that needs to be transmitted to a surface vehicle over a limited bandwidth. Instead of periodically picking video frames and compressing them, our invention carefully and dynamically selects images that provide useful information about the inspection mission and that will be transmitted to a surface vehicle.
[0032] In some embodiments, for pipe-tracking missions, videos are continuous shots that can usually last several hours with very little semantic content and information. In those videos, some frames might contain pipe anomalies or marine debris that must be identified and reported. Extracting only useful information from the video feed allows the operator on a surface boat looking for anomalies to focus on several informative key frames rather than a long video feed.
[0033] In some embodiments, in the specific case of Active Learning, the present disclosure may constitute the first module of an end-to-end Active Learning framework starting from raw videos from which key frames are extracted. Active learning can then be performed directly on these key frames, resulting in a labeling time gain.
[0034] As discussed above, with the rapid growth of multimedia information, large amounts of videos have now become available online, opening up new prospects for deep learning experiments. However, raw video footage can include blurred or redundant frames and is therefore not directly usable. Video summarization techniques aim at providing a condensed summary of video footage, keeping meaningful information from the footage while reducing the overall data size. These techniques are mostly based on global features, local features, or neural networks. In underwater environments, brightness and contrast variations can be very limited, but due to the nature of the different assets, global features, and more specifically color histograms, effectively capture relevant information about the objects currently in the video frame. In addition, using global features as a base for key frame extraction enables fast video summarization. Where most existing video summarization techniques use intensive computational resources, embodiments included herein may extract key frames on the fly, allowing further use of these key frames including data transmission through a low-bandwidth transmission channel and real-time anomaly inspection.
[0035] In some embodiments, key frame extraction consists of extracting, from a video, key frames that summarize that video, using now known techniques or future known techniques. Different methods exist; some of them are based on neural networks, using now known techniques or future known techniques. The results from global features-based approaches and local features-based approaches can be compared using now known techniques or future known techniques.
[0036] Most video summarization techniques need to perform intensive computations that do not allow them to perform real-time video summarization using now known techniques or future known techniques.
[0037] Embodiments of the present disclosure may be configured to take a video stream or video files (hereafter denoted as a video feed) as an input and extract key frames on the fly based on global features. These key frames can then be compressed and transmitted to another device instead of transmitting the entire video stream.
[0038] In some embodiments, the framework, described in FIG. 3, includes three modules. The first module is a key frame extraction module; the second module is a transmission module that transmits key frames yielded by the previous module to a monitoring module; the third module is a monitoring module that receives the key frames sent through the transmission channel and performs post-processing on them and, if necessary, additional steps of automated monitoring.
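For illustration, a minimal Python sketch of how the data flow between these three modules could be wired together, assuming OpenCV is available for JPEG encoding and decoding; the in-memory queue standing in for the low-bandwidth channel and all function names are placeholders, not the disclosed implementation.

    # Sketch of the FIG. 3 data flow: key frame extraction -> transmission -> monitoring.
    import queue
    import cv2
    import numpy as np

    channel = queue.Queue()  # stand-in for the low-bandwidth transmission channel

    def transmit(key_frame, jpeg_quality=70):
        """Transmission module: compress the key frame and send it through the channel."""
        ok, buf = cv2.imencode(".jpg", key_frame, [int(cv2.IMWRITE_JPEG_QUALITY), jpeg_quality])
        if ok:
            channel.put(buf.tobytes())

    def monitor():
        """Monitoring module: restore received key frames and hand them to post-processing."""
        while not channel.empty():
            data = channel.get()
            frame = cv2.imdecode(np.frombuffer(data, dtype=np.uint8), cv2.IMREAD_COLOR)
            print("received key frame of shape", frame.shape)  # anomaly detection or storage would go here

    # The key frame extraction module (detailed in the following paragraphs) would feed transmit();
    # two synthetic frames stand in for its output in this example.
    for kf in (np.zeros((480, 640, 3), np.uint8), np.full((480, 640, 3), 128, np.uint8)):
        transmit(kf)
    monitor()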
[0039] In some embodiments, the key frame extraction module is made up of three components. The first component is a sampling component that extracts frames from a video feed at a predefined frame rate. The second component is a selection component that decides whether the current frame should be kept as a key frame or not based on a similarity score between features of the current frame and previous frames. The third component is an update component that updates parameters of the algorithm including thresholds.
[0040] The three components are discussed in further detail below. In some embodiments, a sampling component may be included. Here, a video file or a live video stream, hereafter denoted as the video feed, is processed through this component at a predefined frame rate and yields frames in real time. We refer to the last frame extracted by the sampling component as the current frame. FIG. 4 illustrates the workflow of this component.
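A minimal sketch of such a sampling component, assuming the video feed can be opened with OpenCV's cv2.VideoCapture; the target sampling rate and the file name are placeholders.

    # Sketch of a sampling component: emit frames from the video feed at a predefined rate.
    import cv2

    def sample_frames(video_source, target_fps=2.0):
        """Yield frames from video_source at roughly target_fps frames per second."""
        cap = cv2.VideoCapture(video_source)
        native_fps = cap.get(cv2.CAP_PROP_FPS) or 30.0  # fall back if the feed does not report a rate
        step = max(1, int(round(native_fps / target_fps)))
        index = 0
        while True:
            ret, frame = cap.read()
            if not ret:            # end of file or interrupted stream
                break
            if index % step == 0:
                yield frame        # each yielded frame becomes the "current frame"
            index += 1
        cap.release()

    # Example usage with a placeholder file name; each sampled frame would be passed to the
    # selection component described next.
    for current_frame in sample_frames("inspection_video.mp4"):
        pass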
[0041] In some embodiments, a selection component may be included. The selection component is illustrated in FIG. 4. First, global features are computed on the current frame (e.g., RGB color histograms, HSV color histograms, average, maximum or minimum value of a color channel, intensity, etc.). These features are then compared to global features from the previous frame (previous features) and from the previous key frame according to a comparison metric (e.g., Chi-square or Bhattacharyya distance; see, e.g., H. Tong et al., Blur detection for digital images using wavelet transform, IEEE International Conference on Multimedia and Expo, 2004) that reflects their similarity. We hereafter denote the result as the current similarity score.
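By way of example only, the global features and similarity score could be computed as sketched below with OpenCV; the choice of a hue/saturation histogram, the bin counts, and the mapping of the Bhattacharyya distance to a similarity value are assumptions made for illustration.

    # Sketch of global-feature computation and comparison for the selection component.
    import cv2
    import numpy as np

    def global_features(frame):
        """Normalized 2-D hue/saturation histogram used as the frame's global feature."""
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        hist = cv2.calcHist([hsv], [0, 1], None, [50, 60], [0, 180, 0, 256])
        cv2.normalize(hist, hist, 0, 1, cv2.NORM_MINMAX)
        return hist

    def similarity_score(features_a, features_b):
        """Bhattacharyya distance (0 = identical) mapped to a similarity in [0, 1]; higher = more alike."""
        distance = cv2.compareHist(features_a, features_b, cv2.HISTCMP_BHATTACHARYYA)
        return 1.0 - float(distance)

    # The current similarity score could, for instance, be taken against both references:
    #   current_score = max(similarity_score(f_current, f_previous),
    #                       similarity_score(f_current, f_previous_key_frame))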
[0042] On top of that, an anti-blur component is added in the selection component to prevent blurred frames from being selected as key frames. This component can, for example, use a Laplacian calculation (see, e.g., R. Bansal et al., Blur image detection using Laplacian operator and OpenCV, International Conference on System Modeling & Advancement in Research Trends (SMART), 2016) as a base metric to determine the current blurriness score of an image and compare it to a threshold that can be dynamically updated to take into account the properties of the current video feed.
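A minimal sketch of such an anti-blur check using the variance of the Laplacian; inverting that variance into a blurriness score that is low for sharp frames, and the default threshold, are illustrative assumptions.

    # Sketch of an anti-blur component: sharp frames have strong edges and thus a high
    # Laplacian variance; the score below is inverted so that lower means sharper.
    import cv2

    def blurriness_score(frame):
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        laplacian_variance = cv2.Laplacian(gray, cv2.CV_64F).var()
        return 1.0 / (1.0 + laplacian_variance)  # near 0 for sharp frames, closer to 1 for blurred ones

    def is_blurred(frame, blur_threshold=0.01):
        """Reject a frame as blurred when its blurriness score reaches the threshold
        (0.01 corresponds to a Laplacian variance of about 99, a placeholder value)."""
        return blurriness_score(frame) >= blur_threshold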
[0043] Based on the current blurriness score, the current similarity score, the blurriness threshold, and the similarity threshold, a decision unit decides whether the current frame should be kept as a key frame or rejected. This decision unit may, for example, keep the current frame as a key frame if its current blurriness score is below the blurriness threshold and its current similarity score is below a certain threshold (meaning that the current frame is not blurred and not similar to the previous frame), or if the previous frame was supposed to be selected as a key frame but, due to its blurriness, was not. FIG. 6 provides an illustration of such a decision unit.
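One possible reading of that decision rule is sketched below; requiring the current frame to be sharp even in the carry-over case (where the previous frame was novel but blurred) is an interpretation, and the score conventions follow the earlier sketches.

    # Sketch of a decision unit combining blurriness and similarity, with carry-over of a
    # previously novel-but-blurred frame.
    def decide(blurriness, similarity, blur_threshold, similarity_threshold,
               previous_rejected_for_blur):
        """Return (keep_as_key_frame, rejected_only_for_blur) for the current frame."""
        is_sharp = blurriness < blur_threshold
        is_novel = similarity < similarity_threshold  # low similarity = frame differs from its references
        if is_sharp and (is_novel or previous_rejected_for_blur):
            return True, False
        # remember frames that were novel enough but blurred, so the next sharp frame can be kept
        return False, (is_novel and not is_sharp)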
[0044] In some embodiments, an update component may be included. Here, the threshold is updated based on statistics of the previous and current similarity scores. It may, for example, be updated based on a weighted sum of the mean and the standard deviation of the up-to-date similarity scores. FIG. 5 illustrates the workflow of this component. The general algorithm is described in FIG. 7.
[0045] In some embodiments, the transmission module is described in FIG. 8. Each time a key frame is extracted by the key frame extraction module, it is sent to the transmission module, which may compress the image and send it through a transmission channel. The monitoring module is also described in FIG. 8. It receives a key frame sent through the transmission channel and performs, if necessary, post-processing steps to retrieve the original key frame that was extracted by the key frame extraction module. The key frame can then be used for monitoring purposes or stored on a hard drive. Monitoring purposes include performing real-time anomaly detection.
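A minimal sketch of such an update rule follows; the specific weights, their signs, and the initial threshold are placeholders chosen so that the threshold tracks the running statistics of the similarity scores, and are not taken from the disclosure.

    # Sketch of an update component: the similarity threshold is recomputed from the mean and
    # standard deviation of the similarity scores observed so far.
    import statistics

    class ThresholdUpdater:
        def __init__(self, initial_threshold=0.8, mean_weight=1.0, std_weight=-0.5):
            self.scores = []
            self.threshold = initial_threshold
            self.mean_weight = mean_weight   # weight on the running mean
            self.std_weight = std_weight     # negative weight tightens the threshold below the mean

        def update(self, similarity):
            """Record the latest similarity score and return the refreshed threshold."""
            self.scores.append(similarity)
            if len(self.scores) >= 2:
                self.threshold = (self.mean_weight * statistics.mean(self.scores)
                                  + self.std_weight * statistics.pstdev(self.scores))
            return self.threshold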
[0046] It is specifically intended that the claimed combinations of features not be limited to the implementations and illustrations contained herein but include modified forms of those implementations including portions of the implementations and combinations of elements of different implementations as come within the scope of the following claims. It should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions may be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another. Moreover, it should be appreciated that such a development effort might be complex and time consuming, but would nevertheless be a routine undertaking of design, fabrication, and manufacture for those of ordinary skill having the benefit of this disclosure. Nothing in this application is considered critical or essential to the claimed invention unless explicitly indicated as being "critical" or "essential."
[0047] Figure 9 depicts another embodiment of a framework for using key frames. In the system 900, the transmission module can be in communication with at least one of the monitoring module or an analysis module. In one or more embodiments, the transmission module can be in communication with both the analysis module and the monitoring module at the same time and can send extracted key frames simultaneously to the monitoring module and to the analysis module. The monitoring module can be the same as or substantially similar to the monitoring modules described herein. The analysis module can include storage, an annotation module, and an active learning module. The key frames can be communicated to each at the same time, or put in storage for later use in analysis or annotation. In one or more embodiments, a user can choose to send the key frames to one or more of the active learning, the annotation, and the storage. In annotation, the extracted key frames can be used as inputs into reports or documentation. In the active learning, the key frames can be used for the learning; an example of active learning is found in International Patent Application Publication WO 2021/007514, entitled “Active Learning for Inspection Tool,” published on January 14, 2021, which is incorporated herein in its entirety. In one or more embodiments, the key frames can be live streamed to the active learning. In another embodiment, the key frames can be stored in storage and later loaded into the active learning as inputs.
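For illustration only, a received key frame could be fanned out to the monitoring and analysis paths roughly as follows; the directory name, the file naming, and the routing flags are hypothetical.

    # Sketch of the FIG. 9 fan-out: a key frame may go to the monitoring module and, in parallel,
    # to the analysis module (storage for later annotation, or a live active-learning pipeline).
    import os
    import cv2

    def route_key_frame(frame, index, to_monitoring=True, to_analysis=True, storage_dir="keyframes"):
        if to_monitoring:
            pass  # e.g., display to an operator or run real-time anomaly detection
        if to_analysis:
            os.makedirs(storage_dir, exist_ok=True)
            cv2.imwrite(os.path.join(storage_dir, "keyframe_%06d.png" % index), frame)
            # alternatively, the frame could be streamed directly into the active-learning module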
[0048] It will also be understood that, although the terms first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms may be used to distinguish one element from another. For example, a first object or step could be termed a second object or step, and, similarly, a second object or step could be termed a first object or step, without departing from the scope of the disclosure. The first object or step, and the second object or step, are both objects or steps, respectively, but they are not to be considered a same object or step.
[0049] The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems and methods and according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
[0050] The terminology used herein is for the purpose of describing particular embodiments and is not intended to be limiting of the disclosure. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
[0051] The corresponding structures, materials, acts, and equivalents of means or step plus function elements in the claims below are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed.
The description of the present disclosure has been presented for purposes of illustration and description but is not intended to be exhaustive or limited to the disclosure in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the disclosure. The embodiment was chosen and described in order to best explain the principles of the disclosure and the practical application, and to enable others of ordinary skill in the art to understand the disclosure for various embodiments with various modifications as are suited to the particular use contemplated.
[0052] Although a few example embodiments have been described in detail above, those skilled in the art will readily appreciate that many modifications are possible in the example embodiments without materially departing from the scope of the present disclosure described herein. Accordingly, such modifications are intended to be included within the scope of this disclosure as defined in the following claims. In the claims, means-plus-function clauses are intended to cover the structures described herein as performing the recited function, including not only structural equivalents but also equivalent structures. Thus, although a nail and a screw may not be structural equivalents in that a nail employs a cylindrical surface to secure wooden parts together, whereas a screw employs a helical surface, in the environment of fastening wooden parts, a nail and a screw may be equivalent structures. It is the express intention of the applicant not to invoke 35 U.S.C. § 112, paragraph 6 for any limitations of any of the claims herein, except for those in which the claim expressly uses the words "means for" together with an associated function.
[0053] Some of the methods and processes described above can be performed by a processor. The term "processor" should not be construed to limit the embodiments disclosed herein to any particular device type or system. The processor may include a computer system. The computer system may also include a computer processor (e.g., a microprocessor, microcontroller, digital signal processor, or general-purpose computer) for executing any of the methods and processes described above.
[0054] The computer system may further include a memory such as a semiconductor memory device (e.g., a RAM, ROM, PROM, EEPROM, or Flash-Programmable RAM), a magnetic memory device (e.g., a diskette or fixed disk), an optical memory device (e.g., a CD-ROM), a PC card (e.g., PCMCIA card), or other memory device.
[0055] Some of the methods and processes described above can be implemented as computer program logic for use with the computer processor. The computer program logic may be embodied in various forms, including a source code form or a computer executable form. Source code may include a series of computer program instructions in a variety of programming languages (e.g., object code, assembly language, or a high-level language such as C, C++, or JAVA). Such computer instructions can be stored in a non-transitory computer readable medium (e.g., memory) and executed by the computer processor. The computer instructions may be distributed in any form as a removable storage medium with accompanying printed or electronic documentation (e.g., shrink-wrapped software), preloaded with a computer system (e.g., on system ROM or fixed disk), or distributed from a server or electronic bulletin board over a communication system (e.g., the Internet or World Wide Web).
[0056] Alternatively or additionally, the processor may include discrete electronic components coupled to a printed circuit board, integrated circuitry (e.g., an Application Specific Integrated Circuit (ASIC)), and/or programmable logic devices (e.g., a Field Programmable Gate Array (FPGA)). Any of the methods and processes described above can be implemented using such logic devices.
[0057] Having thus described the disclosure of the present application in detail and by reference to embodiments thereof, it will be apparent that modifications and variations are possible without departing from the scope of the disclosure defined in the appended claims.

Claims

1. A system comprising: a key frame extraction module configured to receive a video feed and generate one or more key frames; a transmission module configured to receive the one or more key frames and transmit the one or more key frames to a monitoring module; and a monitoring module configured to receive the one or more key frames and perform post-processing on the one or more key frames.
2. The system of claim 1, wherein the monitoring module performs automated monitoring.
3. The system of claim 1, wherein key frame extraction is performed in real time.
4. The system of claim 1, wherein the key frame extraction module is configured to remove redundant or blurred images.
5. The system of claim 1, wherein the monitoring module is configured to detect one or more anomalies within a plurality of repetitive video frames.
6. The system of claim 1, further comprising: an active learning module configured to perform active learning on the one or more key frames and minimize sampling of redundant images.
7. The system of claim 1, wherein the system is associated with an automated underwater vehicle.
PCT/US2021/015300 2020-01-27 2021-01-27 Key frame extraction for underwater telemetry and anomaly detection WO2021154861A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202062966129P 2020-01-27 2020-01-27
US62/966,129 2020-01-27

Publications (1)

Publication Number Publication Date
WO2021154861A1 true WO2021154861A1 (en) 2021-08-05

Family

ID=77079937

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2021/015300 WO2021154861A1 (en) 2020-01-27 2021-01-27 Key frame extraction for underwater telemetry and anomaly detection

Country Status (1)

Country Link
WO (1) WO2021154861A1 (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050228849A1 (en) * 2004-03-24 2005-10-13 Tong Zhang Intelligent key-frame extraction from a video
US20090225169A1 (en) * 2006-06-29 2009-09-10 Jin Wang Method and system of key frame extraction
CN110096945A (en) * 2019-02-28 2019-08-06 中国地质大学(武汉) Indoor Video key frame of video real time extracting method based on machine learning
KR102054153B1 (en) * 2019-07-11 2019-12-12 가온플랫폼 주식회사 Artificial intelligence automatic identification system by fusion of deep run based submarine sonar data and periscope image data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
EJAZ NAVEED, TAYYAB BIN TARIQ, SUNG WOOK BAIK: "ADAPTIVE KEY FRAME EXTRACTION FOR VIDEO SUMMARIZATION USING AN AGGREGATION MECHANISM", JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, vol. 23, no. 7, 6 July 2012 (2012-07-06), pages 1031 - 1040, XP055832525, DOI: 10.1016/j.jvcir.2012.06.013 *

Similar Documents

Publication Publication Date Title
US8463025B2 (en) Distributed artificial intelligence services on a cell phone
US20220027634A1 (en) Video processing method, electronic device and storage medium
EP2806374B1 (en) Method and system for automatic selection of one or more image processing algorithm
US11468680B2 (en) Shuffle, attend, and adapt: video domain adaptation by clip order prediction and clip attention alignment
CN108629284A (en) The method and device of Real- time Face Tracking and human face posture selection based on embedded vision system
US10726335B2 (en) Generating compressed representation neural networks having high degree of accuracy
US11593596B2 (en) Object prediction method and apparatus, and storage medium
WO2009154861A9 (en) Annotating images
JP7286013B2 (en) Video content recognition method, apparatus, program and computer device
CN109063581A (en) Enhanced Face datection and face tracking method and system for limited resources embedded vision system
US10445586B2 (en) Deep learning on image frames to generate a summary
CN112836676A (en) Abnormal behavior detection method and device, electronic equipment and storage medium
CN110298296B (en) Face recognition method applied to edge computing equipment
US11798254B2 (en) Bandwidth limited context based adaptive acquisition of video frames and events for user defined tasks
CN112949456B (en) Video feature extraction model training and video feature extraction method and device
Du et al. Classifying cutting volume at shale shakers in real-time via video streaming using deep-learning techniques
US11532158B2 (en) Methods and systems for customized image and video analysis
WO2021154861A1 (en) Key frame extraction for underwater telemetry and anomaly detection
US10462490B2 (en) Efficient video data representation and content based video retrieval framework
WO2023005760A1 (en) Systems and methods for performing computer vision task using sequence of frames
US11816181B2 (en) Blur classification and blur map estimation
McBride et al. Design and construction of a hybrid edge-cloud smart surveillance system with object detection
CN111639599B (en) Object image mining method, device, equipment and storage medium
Wang et al. Object recognition offloading in augmented reality assisted UAV-UGV systems
Benbarrad et al. Impact of standard image compression on the performance of image classification with deep learning

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21747978

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21747978

Country of ref document: EP

Kind code of ref document: A1