WO2022116545A1 - Interaction method and apparatus based on multi-feature recognition, and computer device - Google Patents


Info

Publication number
WO2022116545A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
video stream
key frame
stream data
dimensional model
Prior art date
Application number
PCT/CN2021/106342
Other languages
French (fr)
Chinese (zh)
Inventor
侯战胜
彭林
王鹤
徐敏
于海
王刚
鲍兴川
朱亮
何志敏
宋金根
孙世军
Original Assignee
全球能源互联网研究院有限公司
国家电网有限公司
国网浙江省电力有限公司
国网山东省电力公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 全球能源互联网研究院有限公司, 国家电网有限公司, 国网浙江省电力有限公司, 国网山东省电力公司
Publication of WO2022116545A1 publication Critical patent/WO2022116545A1/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 19/00: Manipulating 3D models or images for computer graphics
    • G06T 19/006: Mixed reality
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/011: Arrangements for interaction with the human body, e.g. for user immersion in virtual reality

Definitions

  • the present application relates to the fields of streaming media and information communication, and in particular, to an interaction method, apparatus and computer equipment based on multi-feature identification.
  • Audio and video streaming call technology has mainly evolved through four stages: local analog-signal audio and video systems, personal computer (PC)-based multimedia remote assistance systems, remote audio and video collaboration systems based on web servers, and audio and video collaboration systems based on mobile terminals.
  • At present, audio and video streaming call technology is in the stage of the mobile-terminal-based audio and video collaboration system (audio and video calls).
  • The embodiments of the present application provide an interaction method, apparatus, and computer device based on multi-feature recognition, so as to at least solve the problem that remote maintenance operations in the related art cannot be guided accurately and effectively.
  • An embodiment of the present application provides an interaction method based on multi-feature recognition, the method comprising: acquiring target video stream data of a target device; calling a three-dimensional model of the target device according to the target video stream data; sending the data of the three-dimensional model to a remote device; and receiving the change increment information of the three-dimensional model fed back by the remote device.
  • the method further includes: displaying the change of the three-dimensional model according to the change increment information; and controlling the target device according to the change of the three-dimensional model.
  • The acquiring of the target video stream data of the target device includes: collecting and sending initial video stream data of the target device in a target area; receiving an initial key frame sent by the remote device; determining a target key frame according to the initial key frame; and acquiring the target video stream data of the target device according to the target key frame.
  • Determining the target key frame according to the initial key frame includes: extracting color feature information, texture feature information and motion feature information from the initial video key frames; fusing the color feature information, texture feature information and motion feature information to calculate the similarity of each initial video key frame; determining candidate video key frames according to the similarity of each initial video key frame; and determining the target key frame according to a preset adaptive algorithm.
  • The obtaining of the target video stream data of the target device according to the target key frame includes: obtaining first video stream data; identifying the first feature point in the first video stream data and the second feature point in the target key frame according to a preset optical flow method; when the similarity between the first feature point and the second feature point is greater than a preset similarity threshold, determining that the first video stream data matches the target key frame; and when the first video stream data matches the target key frame, determining the first video stream data as the target video stream data of the target device.
  • The method further includes: determining a first center position of the target key frame according to the first feature point and a preset relative distance; determining a second center position of the first video stream data according to the second feature point and the preset relative distance; and tracking according to the first center position and the second center position to obtain the target video stream data of the target device.
  • An embodiment of the present application further provides an interaction method based on multi-feature recognition, the method comprising: receiving data of a three-dimensional model sent by a field device; generating a three-dimensional model according to the data of the three-dimensional model; determining change increment information according to the three-dimensional model and a preset database; and feeding the change increment information back to the field device.
  • Before receiving the data of the three-dimensional model sent by the field device, the method further includes: receiving initial video stream data of the target area sent by the field device; determining a problem area according to the initial video stream data and generating an initial key frame according to the problem area; and sending the initial key frame to the field device.
  • An embodiment of the present application provides an interaction apparatus based on multi-feature recognition, including: a target video stream data acquisition module configured to acquire target video stream data of a target device; a calling module configured to call the three-dimensional model of the target device according to the target video stream data; a data sending module configured to send the data of the three-dimensional model to the remote device; and a change increment information receiving module configured to receive the change increment information of the three-dimensional model fed back by the remote device.
  • An embodiment of the present application provides an interaction apparatus based on multi-feature recognition, including: a data receiving module configured to receive data of a three-dimensional model sent by a field device; a three-dimensional model generating module configured to generate a three-dimensional model according to the data of the three-dimensional model; a determining module configured to determine change increment information according to the three-dimensional model and a preset database; and a data sending module configured to feed the change increment information back to the field device.
  • An embodiment of the present application provides a computer device, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to cause the at least one processor to perform the steps of the interaction method based on multi-feature recognition described in the first aspect or any implementation of the first aspect, or the steps of the interaction method based on multi-feature recognition described in the second aspect or any implementation of the second aspect.
  • An embodiment of the present application provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the interaction method based on multi-feature recognition described in the first aspect or any implementation of the first aspect, or the steps of the interaction method based on multi-feature recognition described in the second aspect or any implementation of the second aspect, are implemented.
  • An interaction method, apparatus and computer device based on multi-feature recognition are provided by the embodiments of the present application, wherein the method comprises: acquiring target video stream data of a target device; calling the three-dimensional model of the target device according to the target video stream data; sending the data of the three-dimensional model to the remote device; receiving the change increment information of the three-dimensional model fed back by the remote device; and controlling the target device according to the change increment information.
  • In this way, the field device can obtain accurate guidance information, realizing remote virtual-reality-fusion interaction between the augmented reality method and the 3D model of the power equipment.
  • An interaction method, apparatus, and computer device based on multi-feature recognition are provided by the embodiments of the present application, wherein the method comprises: receiving data of a three-dimensional model sent by a field device; generating a three-dimensional model according to the data of the three-dimensional model; determining the change increment information according to the three-dimensional model and the preset database; and feeding the change increment information back to the field device.
  • The remote device can mark the target device from the first perspective and thereby accurately guide the field operators to perform operations, which is efficient and accurate, and realizes remote virtual-reality-fusion interaction between the augmented reality method and the 3D model of the power equipment.
  • An interaction method based on multi-feature recognition provided by the embodiments of the present application combines collaborative labeling between the remote device and the field device with recognition matching: the target position is determined by the relative distances of the feature points, and the features of the target device can be continuously detected, so that the real-time location of the target device is continuously obtained and accurate tracking of the target device is achieved. That is, real-time detection of the multi-feature-point information of the target device is combined with matching against the temporarily stored video key frames to realize monitoring, recognition, matching and tracking of the target device.
  • FIG. 1 is a schematic structural diagram of communication between a field device and a remote device in an interaction method based on multi-feature identification in an embodiment of the application;
  • FIG. 2 is a flowchart of a specific example of a field device end in an interaction method based on multi-feature identification in an embodiment of the present application;
  • FIG. 3 is a flowchart of a specific example of acquiring target video stream data in an interaction method based on multi-feature identification in an embodiment of the present application;
  • FIG. 4 is a flowchart of a specific example of a remote device end in an interaction method based on multi-feature identification in an embodiment of the present application;
  • FIG. 5 is a flowchart of another specific example of a remote device end in an interaction method based on multi-feature identification in an embodiment of the present application;
  • FIG. 6 is a structural block diagram of a specific embodiment of an interaction method based on multi-feature identification in an embodiment of the present application;
  • FIG. 7 is a schematic structural diagram of an example of an interaction device based on multi-feature identification in an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of an example of an interaction device based on multi-feature identification in an embodiment of the present application.
  • FIG. 9 is a diagram of a specific example of a computer device in an embodiment of the present application.
  • Collaboration is the development trend of modern society.
  • Traditional face-to-face collaboration methods (such as video conferencing) have huge limitations in both time and space, and are far from meeting people's requirements for collaboration.
  • remote collaboration refers to the process of helping geographically dispersed organizations and individuals to complete collaboration with the support of computer and communication technologies.
  • In order to support effective collaboration, the platform must be able to support live streaming media such as real-time video and audio, as well as other multimedia information such as graphic annotations, static images and text, and must be able to comprehensively process this multimedia information.
  • The embodiments of the present application provide an interaction method, apparatus and computer device based on multi-feature recognition, which aim to guide field operations efficiently and accurately through real-time audio and video interaction between the field device and the remote device.
  • the field device communicates with the remote device through a wireless channel.
  • The field device can be provided with image acquisition devices such as wearable terminal equipment, camera equipment or a control ball, as well as wireless communication devices; the remote device can be provided with a wireless communication module for communicating with other devices.
  • The remote device may be, for example, a remote expert device or another remote-end device.
  • the field device can send the collected live video stream data to the remote expert device, and the remote expert device can receive the video stream data through the wireless communication module and send back corresponding feedback information.
  • the embodiment of the present application provides an interaction method based on multi-feature identification, which is specifically applied to the field device side. As shown in FIG. 2 , the method includes:
  • Step S11: Acquire target video stream data of the target device.
  • the target device may be any device set in an actual application scenario.
  • When applied to a power grid transmission scenario, for example, the target device may be an electronic device such as an oil temperature gauge or an electronic switch.
  • Video stream data is a data form in which a series of continuous images is stored and recorded. The continuous images record specific events over one or more continuous periods of time; when they are played sequentially at a sufficiently high frame rate, a continuous picture is displayed, that is, the video stream data.
  • the target video stream data may be the video stream data obtained according to the problem area marked by the remote device, or the video stream data corresponding to the problem area obtained again.
  • For example, after a remote device (for example, a technical-support expert device or another remote-end device) marks the problem area, the field device acquires the video stream data of the problem area; this video stream data is the target video stream data.
  • The field device may acquire video stream data through a wearable terminal device, a camera device, or a control ball.
  • Step S12: Call the three-dimensional model of the target device according to the target video stream data.
  • The three-dimensional model may refer to a three-dimensional stereoscopic model, which is used to represent structural feature information and the like of the target device.
  • For example, when the target device is determined to be an oil temperature gauge according to the target video stream data, first determine the device model of the oil temperature gauge, such as xxx-1, then call the three-dimensional model whose device model is xxx-1 from the preset 3D model data, and display the three-dimensional model of the oil temperature gauge on the field device side.
  • Step S13: Send the data of the three-dimensional model to the remote device.
  • The field device communicates with the remote device through wireless channels to transmit data; the field device sends the data of the generated three-dimensional model to the remote device (for example, a remote expert device).
  • Step S14: Receive the change increment information of the three-dimensional model fed back by the remote device.
  • the three-dimensional model of the target device is displayed on the remote device based on the data of the three-dimensional model.
  • The change increment information of the 3D model may be a record of the operations performed on the 3D model of the target device when an expert or professional on the remote device side, observing the target device from the first perspective, works to solve a problem with the target device. For example, when experts on the remote device side confirm that there is a problem with the oil temperature gauge, they perform the corresponding operation on the 3D model on the remote device, for example, moving the oil temperature gauge 0.6 cm to the left. In this case, the change increment information is "move the oil temperature gauge 0.6 cm to the left", and this change increment information is transmitted from the remote device to the field device through the wireless channel.
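  • As an illustrative sketch only, the change increment information in the oil-temperature-gauge example above could be packaged as a small message like the following; the field names (device_id, operation, axis, delta_cm) and the JSON encoding are assumptions made for illustration, not the patent's actual wire format.

```python
import json

def make_change_increment(device_id, operation, axis, delta_cm):
    """Package one model edit (e.g. 'move the oil temperature gauge
    0.6 cm to the left') as an incremental update, so the whole 3D
    model need not be retransmitted over the wireless channel."""
    return json.dumps({
        "device_id": device_id,   # hypothetical identifier
        "operation": operation,   # e.g. "translate"
        "axis": axis,             # e.g. "x"
        "delta_cm": delta_cm,     # signed displacement in centimeters
    })

# "Move the oil temperature gauge 0.6 cm to the left" as an increment:
msg = make_change_increment("oil-temp-gauge-xxx-1", "translate", "x", -0.6)
```

The field device would apply such a message directly to its local copy of the 3D model, which is why the increment is small compared to the model data itself.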
  • An interaction method based on multi-feature identification includes: acquiring target video stream data of a target device; calling a three-dimensional model of the target device according to the target video stream data; sending the data of the three-dimensional model to a remote device; Receive the change increment information of the 3D model fed back by the remote device; display the change of the 3D model according to the change increment information.
  • the method further includes: displaying the change of the three-dimensional model according to the change increment information; and controlling the target device according to the change of the three-dimensional model.
  • The field device may directly display the change increment information on the 3D model, that is, adjust the 3D model according to the received change increment information. For example, when the received change increment information is "move the oil temperature gauge 0.6 cm to the left", the oil temperature gauge in the 3D model on the field device is directly moved 0.6 cm to the left.
  • The operation and maintenance personnel can then control the target device according to the change of the 3D model; for example, after the 3D model changes, they can operate the oil temperature gauge in the actual equipment so that it moves 0.6 cm to the left.
  • the target video stream data of the target device is acquired, including:
  • Step S21: Collect and send the initial video stream data of the target device in the target area.
  • The target area can be any area in the actual application scene.
  • The initial video stream data can be the video stream data initially collected on the field device side; the collected initial video stream data is transmitted in real time to the remote device, that is, the remote expert device, the remote-end device, etc.
  • Step S22: Receive the initial key frame sent by the remote device.
  • The initial key frame may be a frame in which the remote device, from the first perspective, draws on and marks the problem area in the initial video stream data, for example by text annotation or image annotation; the marked frame is the initial key frame. The field device can receive the initial key frame fed back by the remote device.
  • Step S23: Determine the target key frame according to the initial key frame.
  • The target key frame may be the frame, extracted after the field device optimizes and aggregates the initial key frames, that contains all the feature information of the initial key frames.
  • The field device extracts the key information in the initial key frames and removes redundant information through methods such as image saliency detection, candidate key frame extraction and adaptive hierarchical clustering; the frame carrying the key information is taken as the target key frame, and the target key frame is stored in a structured manner.
  • Step S24: Acquire target video stream data of the target device according to the target key frame.
  • the target video stream data of the target device is determined according to the target key frame, which may be matched with the re-collected video stream data according to the target key frame; when the re-collected video stream data matches the target key frame When matching, it can be determined that the re-collected video stream data is the target video stream data.
  • Further, the text annotations, image annotations, 3D model annotations, etc. made by the remote device on the target key frame can be displayed on the re-collected video stream data.
  • An interaction method based on multi-feature recognition provided by the embodiments of the present application, combined with video-stream key frame technology, extracts the key information in the video and eliminates redundant information. By applying methods such as image saliency detection, candidate key frame extraction and adaptive hierarchical clustering, and storing the resulting target key frames in a structured manner, target key frames and target video stream data can be determined efficiently and accurately, minimizing resource consumption while maximizing key-information storage.
  • The execution process of determining the target key frame in the above step S23 includes: extracting color feature information, texture feature information and motion feature information from the initial video key frames; fusing the color feature information, texture feature information and motion feature information to calculate the similarity of each initial video key frame; determining candidate video key frames according to the similarity of each initial video key frame; and determining the target key frame according to a preset adaptive algorithm.
  • Color feature information is the most prominent feature in an image; it is a pixel-based feature, and different electrical devices display different colors.
  • the process of extracting color feature information may include: extracting color feature information in an initial video key frame, and describing the foregoing color feature information with a histogram.
  • the field device can generate a color histogram according to different color feature information of each power device.
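  • A minimal sketch of the color-histogram description above, assuming a NumPy image array and an illustrative bin count of 16 per channel (the patent does not specify the bin count):

```python
import numpy as np

def color_histogram(frame, bins=16):
    """frame: H x W x 3 uint8 image. Returns normalized per-channel
    histograms concatenated into one color feature vector, so frames
    of different sizes remain comparable."""
    hists = []
    for c in range(3):
        h, _ = np.histogram(frame[..., c], bins=bins, range=(0, 256))
        hists.append(h / h.sum())  # normalize counts to fractions
    return np.concatenate(hists)

# Toy all-black frame: every pixel falls into the first bin of each channel.
frame = np.zeros((4, 4, 3), dtype=np.uint8)
feat = color_histogram(frame)
```

Each power device would then be described by such a vector, and devices with different colors yield clearly different histograms.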
  • the texture feature may represent the global feature of the image, and describe the surface properties of the scene corresponding to the image or the image area.
  • The process of extracting texture feature information may include: performing statistical calculation over multiple regions, each containing multiple pixels, to obtain texture feature information; that is, dividing the image into multiple regions, obtaining texture feature information such as the number of distinct regions, pixel positions and pixel value sets, and thereby determining the texture feature information of the initial video key frame.
  • The process of extracting motion feature information may include: first extracting the saliency image of the initial video key frame, specifically through the saliency detection algorithm SDSP, which determines the salient target in the initial video key frame based on CIE L*a*b* color features, the contrast principle and the core rules of saliency calculation, while preserving most of the information of the original image. (CIE L*a*b* is a three-dimensional color space based on human color perception and the color space most widely used by the International Commission on Illumination (CIE, Commission Internationale de l'Éclairage); its dimension L* represents brightness, a* the red-green axis, and b* the blue-yellow axis.)
  • the motion estimation of the saliency image is performed to generate the motion feature information of the saliency image.
  • An interaction method based on multi-feature recognition provided by the embodiments of the present application combines the SDSP algorithm, that is, three kinds of prior knowledge: human vision always detects salient objects in a scene, which can be simulated by a log-Gabor filter; human vision tends to focus on the center of an image, which is modeled with a Gaussian map; and warm colors attract more visual attention than cool colors.
  • the algorithm can exclude the influence of color, complex texture and changing background, and obtain saliency images quickly and accurately.
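  • One of the three priors above, the center bias, can be sketched as a Gaussian attention map peaking at the image center. The sigma value below is an illustrative assumption; the full SDSP additionally combines log-Gabor filtering and a warm-color prior, which are not reproduced here.

```python
import numpy as np

def center_prior(h, w, sigma=0.25):
    """Gaussian center-bias map: 1.0 at the image center, falling off
    toward the edges, modeling the tendency of human vision to focus
    on the center of an image."""
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    # Squared distance from center, normalized by image size.
    d2 = ((ys - cy) / h) ** 2 + ((xs - cx) / w) ** 2
    return np.exp(-d2 / (2 * sigma ** 2))

prior = center_prior(9, 9)  # small map for illustration
```

In SDSP-style pipelines, such a map is multiplied element-wise with the other saliency cues to down-weight responses near image borders.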
  • The color feature information, texture feature information and motion feature information are fused, and the similarity of each initial video key frame is calculated separately; the similarity between initial video key frames may serve as a measure of the value of each initial video key frame.
  • the color feature information, texture feature information and motion feature information are normalized to generate a fusion feature vector, and according to the above-mentioned fusion feature vector, the Euclidean distance between two adjacent initial video key frames is calculated, and then according to Euclidean distance determines the similarity between two adjacent initial video keyframes. Among them, the smaller the Euclidean distance, the higher the similarity between two adjacent frames.
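  • The normalize, fuse and distance steps above can be sketched as follows. The 1/(1+d) mapping from Euclidean distance to a similarity score is an illustrative choice; the text only specifies that a smaller Euclidean distance means higher similarity.

```python
import numpy as np

def fuse(color, texture, motion):
    """Normalize each feature vector to unit length, then concatenate
    into one fusion feature vector, so no single feature dominates."""
    parts = []
    for v in (color, texture, motion):
        v = np.asarray(v, dtype=float)
        n = np.linalg.norm(v)
        parts.append(v / n if n > 0 else v)
    return np.concatenate(parts)

def similarity(feat_a, feat_b):
    """Smaller Euclidean distance -> higher similarity (max 1.0)."""
    d = np.linalg.norm(feat_a - feat_b)
    return 1.0 / (1.0 + d)

# Two identical adjacent frames fuse to identical vectors (similarity 1.0).
a = fuse([1, 0], [0, 1], [1, 1])
b = fuse([1, 0], [0, 1], [1, 1])
```

Adjacent key frames whose similarity falls below a chosen level then become candidates for representing distinct content.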
  • candidate video key frames are determined according to the similarity of each initial video key frame; and target key frames are determined according to a preset adaptive algorithm.
  • The clustering threshold is determined, and the mutual information (Mutual Information, MI) between the saliency images of the initial video key frames is determined; mutual information characterizes the correlation between two variables.
  • The process of determining the target key frame by the adaptive hierarchical clustering algorithm may be: determine the candidate video key frames, that is, the candidate key frame sequence, from the initial video key frames; calculate the mutual information between the saliency images of each pair of adjacent candidate key frames to obtain a mutual information sequence; calculate the joint probability according to the normalized overlapping area and histogram of each pair of adjacent images, and determine the clustering threshold according to the joint probability; sort the mutual information sequence in descending order of mutual information value; then, following the original time order of the candidate key frames, regard the first frame as the first cluster, and whenever the mutual information value between two successive frames is less than or equal to the threshold, generate a new cluster; otherwise, assign the subsequent frame to the current cluster. The target key frames are thereby determined: the clusters are ordered, and the frames within each cluster are also ordered according to the relevance of the original video content.
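  • The clustering rule above (the first frame starts the first cluster; a mutual-information value at or below the threshold starts a new cluster, otherwise the frame joins the current cluster) can be sketched as follows. The MI values here are toy numbers, not computed from real saliency images.

```python
def cluster_by_mutual_info(frames, mi_between, threshold):
    """frames: candidate key frames in original time order.
    mi_between[i]: mutual information between frames[i] and frames[i+1].
    Returns clusters (lists of frames) preserving the time order."""
    clusters = [[frames[0]]]           # first frame opens the first cluster
    for i in range(1, len(frames)):
        if mi_between[i - 1] <= threshold:
            clusters.append([frames[i]])     # low MI: content change, new cluster
        else:
            clusters[-1].append(frames[i])   # high MI: stays in current cluster
    return clusters

# Toy example: a content change between f1 and f2 (MI 0.2 <= 0.5).
groups = cluster_by_mutual_info(
    ["f0", "f1", "f2", "f3"],
    mi_between=[0.9, 0.2, 0.8],
    threshold=0.5,
)
```

One representative frame per cluster would then serve as a target key frame, keeping the timing of the original video.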
  • An interaction method based on multi-feature recognition provided by the embodiments of the present application, combined with the rotational invariance of texture feature information, has strong resistance to noise, thereby distinguishing the object information contained in the image at the micro level.
  • The SDSP algorithm can be applied to the original video sequence. Through the saliency information of eye attention, it can quantitatively describe the data information contained in each video frame and thus obtain candidate key frames with less redundancy. Using clustering to adaptively determine the threshold better solves the problems of inaccurately selected initial boundary points and unstable clustering results. The final clusters obtained after adaptive hierarchical clustering are arranged in the chronological order of the original video content, so the extracted key frames maintain the timing of the original input video.
  • In step S24, the execution process of acquiring the target video stream data of the target device according to the target key frame includes: acquiring first video stream data; identifying the first feature point in the first video stream data and the second feature point in the target key frame according to a preset optical flow method; when the similarity between the first feature point and the second feature point is greater than a preset similarity threshold, determining that the first video stream data matches the target key frame; and when the first video stream data matches the target key frame, determining the first video stream data as the target video stream data of the target device.
  • the first video stream data may be video stream data collected when the wearable terminal device, the camera device or the control ball moves to the target area again.
  • the forward optical flow method or the backward optical flow method can be used to determine the first feature point in the first video stream data and the second feature point in the target key frame.
  • The first feature point may be multiple feature points containing multiple features in the first video stream data, for example, target feature points among the pixels; the target key frame is then identified in a similar manner, and multiple target feature points in the target key frame are extracted.
  • The degree of similarity between the first feature point in the first video stream data and the second feature point in the target key frame in terms of position, quantity, etc. is calculated and compared with a preset similarity threshold; when the calculated similarity is greater than the preset similarity threshold, it is determined that the first video stream data and the target key frame are successfully matched.
  • When the first video stream data is successfully matched with the target key frame data, it means that the field device has, through the wearable terminal, re-captured the video stream data containing the target device marked by the remote device as problematic, that is, the target video stream data.
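  • A sketch of the match test between the first feature points (live video) and the second feature points (stored target key frame). The (x, y) point representation, the pixel tolerance and the 0.8 threshold are illustrative assumptions; extracting the points via the forward/backward optical flow method is out of scope here.

```python
def match_score(live_points, keyframe_points, tol=2.0):
    """Fraction of key-frame feature points that have a live feature
    point within tol pixels (a simple position-based similarity)."""
    hits = 0
    for kx, ky in keyframe_points:
        if any((lx - kx) ** 2 + (ly - ky) ** 2 <= tol ** 2
               for lx, ly in live_points):
            hits += 1
    return hits / len(keyframe_points)

def is_target_stream(live_points, keyframe_points, threshold=0.8):
    """Declare a match when similarity exceeds the preset threshold."""
    return match_score(live_points, keyframe_points) > threshold

kf = [(10, 10), (20, 20), (30, 30)]     # points from the target key frame
live = [(11, 10), (20, 21), (30, 30)]   # points re-detected in live video
```

When `is_target_stream` holds, the live stream is taken as the target video stream data and the remote annotations can be overlaid on it.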
  • the embodiment of the present application provides an interaction method based on multi-feature recognition.
  • the video stream of the field device can be collected through a wearable terminal, a camera device, or a control ball, and the video stream can then be read.
  • the remote expert draws annotations on the collaborative target in the video stream collected by the field operators from a first-person perspective; target feature points are identified through the forward/backward optical flow method, the features and feature descriptors of the current frame are calculated, and the key frame temporary storage set is queried so that the feature points in the current frame are matched against those in the key frame temporary storage set.
  • if the remote collaborative annotation target is successfully recognized and matched, the system prepares for the interaction of augmented reality information overlay; otherwise, the key frame is incrementally updated into the key frame temporary storage set. That is to say, when the first video stream data is successfully matched with the target key frame, a 3D model of the power equipment can be generated based on the augmented reality service platform and the augmented reality method, and text and image annotations can be superimposed on the 3D model. The positional relationships, angles, operation behaviors, and model feedback results among the cooperating personnel, the power equipment, and the model are transmitted as incremental change information, which each distributed terminal encodes and decodes individually, realizing tracking interaction between field devices with multi-feature recognition and remote experts.
  • in an embodiment, the method further includes: determining the first center position of the target key frame according to the first feature point and the preset relative distance; determining the second center position of the first video stream data according to the second feature point and the preset relative distance; and tracking and acquiring the target video stream data of the target device according to the first center position and the second center position.
  • the preset relative distance is the distance between a target feature point and the center position. Since the relative distance between a feature point and the center position of the same image remains unchanged under scaling and rotation, the first center position of the target key frame is determined according to the first feature point and the preset relative distance, and the second center position of the first video stream data is determined according to the second feature point; the target video stream data of the target device then continues to be acquired according to the detected first center position and second center position, achieving continuous tracking of the target device.
  • the embodiment of the present application provides an interaction method based on multi-feature recognition, which combines cluster voting on the center to determine the center position with the relative distance of each feature point to determine the position of the target device. Since the distance of each feature point relative to the center position is determined under the scaling and rotation ratio, real-time tracking of the target's position can be realized by continuously detecting the target's features. By detecting the multi-feature-point information of the target in real time and matching it against the structured key frame temporary storage set of the video stream, monitoring, recognition, matching, and tracking of the target are realized.
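A minimal sketch of the center-voting idea described above. Scale and rotation compensation is omitted here, and the per-axis median vote is an illustrative stand-in for the cluster voting on the center; the stored per-point offsets play the role of the preset relative distances.

```python
from statistics import median

def estimate_center(points, offsets):
    """Estimate the object's center by letting each feature point vote.

    `offsets[i]` is the stored displacement from feature point i to the
    object's center, recorded when the key frame was captured. In a new
    frame, each re-detected point casts a vote `point + offset`; the
    per-axis median of the votes is robust to a few mismatched points.
    """
    votes_x = [px + ox for (px, _), (ox, _) in zip(points, offsets)]
    votes_y = [py + oy for (_, py), (_, oy) in zip(points, offsets)]
    return median(votes_x), median(votes_y)


# Key frame: center at (50, 50), three feature points around it
key_pts = [(40, 40), (60, 45), (55, 60)]
offsets = [(50 - x, 50 - y) for (x, y) in key_pts]

# New frame: the whole object translated by (+7, +3)
new_pts = [(x + 7, y + 3) for (x, y) in key_pts]
print(estimate_center(new_pts, offsets))  # (57, 53)
```

Because every surviving feature point votes independently, the target can still be localized when some points are lost between frames, which is what makes continuous tracking possible.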
  • the embodiment of the present application provides an interaction method based on multi-feature recognition, which is specifically applied to a remote device, as shown in FIG. 4, including:
  • Step S31 Receive the data of the three-dimensional model sent by the field device.
  • the remote device receives the data of the three-dimensional model sent by the field device.
  • Step S32 Generate a three-dimensional model according to the data of the three-dimensional model.
  • the remote device constructs the three-dimensional model according to the received data of the three-dimensional model.
  • Step S33 Determine the change increment information according to the three-dimensional model and the preset database.
  • the remote device adjusts the problem area in the three-dimensional model according to the preset database. For example, when the remote device determines that there is a problem with the oil temperature gauge in the three-dimensional model, it adjusts the oil temperature gauge according to the preset database, for example, by moving the oil temperature gauge 10 cm or 0.6 cm to the left; this adjustment information is the change increment information.
  • Step S34 Feed back the change increment information to the field device.
  • the adjustment information "move the oil temperature gauge 0.6 cm to the left" is transmitted to the field device as the change increment information.
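As a sketch of what the change increment information might look like when transmitted, the following shows a small structured delta message. All field names and the JSON encoding are assumptions for illustration; the embodiment does not prescribe a wire format, only that the incremental change (not the whole model) is sent.

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class ChangeIncrement:
    """One incremental adjustment to a part of the 3D model.

    Only the delta is transmitted, not the whole model, so distributed
    terminals can encode/decode and apply the change independently.
    """
    part_id: str          # e.g. the oil temperature gauge
    dx_cm: float          # displacement along x, in centimetres
    dy_cm: float
    note: str = ""

# "Move the oil temperature gauge 0.6 cm to the left"
inc = ChangeIncrement(part_id="oil_temperature_gauge", dx_cm=-0.6, dy_cm=0.0,
                      note="align gauge per preset database")
payload = json.dumps(asdict(inc))            # what the remote device sends
restored = ChangeIncrement(**json.loads(payload))  # what the field device applies
print(restored.dx_cm)  # -0.6
```

The field device can then replay this delta on its local copy of the model (step S13/S34 in the surrounding text) instead of re-downloading the full model.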
  • an interaction method based on multi-feature recognition includes: receiving data of a three-dimensional model sent by a field device; generating the three-dimensional model according to the data of the three-dimensional model; determining change increment information according to the three-dimensional model and a preset database; and feeding back the change increment information to the field device.
  • in this way, the remote device can mark the target device from a first-person perspective and then accurately guide the on-site operators to perform the operation, efficiently and accurately, realizing remote virtual-real fusion interaction between the augmented reality method and the three-dimensional model of the power equipment.
  • in an embodiment, before the data of the three-dimensional model sent by the field device is received in step S31, the method further includes:
  • Step S301 Receive initial video stream data of the target area sent by the field device.
  • Step S302 Determine a problem area according to the initial video stream data, and generate an initial key frame according to the problem area.
  • the problem area may be an area where experts or technicians on the remote device side consider that some devices or wiring methods have problems.
  • the remote expert may mark the problem area in the form of text or in the form of images, and then generate initial key frames.
  • Step S303 Send the initial key frame to the field device.
  • after the remote expert marks the initial video stream data sent by the field device, the remote device generates an initial key frame and then sends the generated initial key frame to the field device.
  • the embodiment of the present application provides an interaction method based on multi-feature recognition, in which the remote expert draws annotations on the video stream data collected by field operators from a first-person perspective and initial key video stream segments are then generated, so that the key frames of the video stream can be collected and stored efficiently and accurately.
  • video frames are the most basic components of video streams.
  • the video frames with the most abundant information are extracted, the main content of these frames is extracted, and that content is converted into high-level semantic information for structured storage.
  • the information contained in the video stream is divided into low-level feature information, key image frame information and high-level semantic information.
  • the low-level feature information refers to the global features, local features, and structural features extracted from the image.
  • the global features are basic image features such as shape, color, and texture; the local features are the feature point set extracted from the video image for feature matching; the structural features reflect the geometric and spatio-temporal relationships between image features.
  • key image frame information refers to key frames extracted according to the low-level features of the image and the target information: by fusing multiple kinds of low-level feature information, the information difference between frames or the information richness of each video frame is represented, and representative video frames are then screened out.
  • high-level semantic information refers to the logical semantic description and feature expression of the targets and content contained in the video.
  • a targeted model is trained to extract target semantics, scene semantics, image semantics, and so on; the extracted semantic information is then synthesized, and text sentences are generated to logically describe the events reflected in the video, which is convenient for users to intuitively understand, store, and retrieve.
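The screening of representative video frames described above can be sketched as follows. The scalar (color, texture, motion) feature scores, the fusion weights, and the mean-difference adaptive threshold are illustrative assumptions standing in for the fused low-level features and the preset adaptive algorithm of the embodiment.

```python
def select_key_frames(frames, weights=(0.5, 0.3, 0.2), factor=1.0):
    """Pick representative frames from fused low-level feature scores.

    `frames` is a list of (color, texture, motion) feature scores per
    frame (assumed precomputed). Each frame's fused score is a weighted
    sum; a frame becomes a key frame when its fused score differs from
    the last key frame's by more than an adaptive threshold (the mean
    inter-frame difference scaled by `factor`).
    """
    fused = [sum(w * f for w, f in zip(weights, feats)) for feats in frames]
    diffs = [abs(b - a) for a, b in zip(fused, fused[1:])]
    threshold = factor * (sum(diffs) / len(diffs)) if diffs else 0.0

    keys = [0]                      # always keep the first frame
    for i in range(1, len(fused)):
        if abs(fused[i] - fused[keys[-1]]) > threshold:
            keys.append(i)
    return keys


# Four near-identical frames, then an abrupt scene change
frames = [(0.2, 0.1, 0.0), (0.21, 0.1, 0.0), (0.2, 0.11, 0.0),
          (0.9, 0.8, 0.7)]
print(select_key_frames(frames))  # [0, 3]
```

Near-duplicate frames fall below the adaptive threshold and are skipped, while the scene change is kept, which is the "representative video frames are screened out" behavior described above.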
  • the mobile intelligent terminal and the background server communicate through a wireless network.
  • the background server can register the information of multiple power devices, and the power devices can be pre-associated with text annotations, 3D models, etc.
  • the background server pre-classifies and stores the text annotations and 3D models, pre-determines the rendering parameters of the 3D models, and performs lightweight processing on the 3D models in advance.
  • the mobile intelligent terminal can download the text annotations and 3D models of multiple power devices from the background server. After the download is complete, the 3D model is rendered on the mobile intelligent terminal, the virtual scene of the 3D model is fused with the actual scene of the power equipment, and the 3D model superimposed with text annotations is then displayed while the corresponding power equipment is continuously tracked.
  • the embodiments of the present application also provide an interaction device based on multi-feature identification, which is applied to field devices.
  • the device includes:
  • the target video stream data acquisition module 41 is configured to acquire target video stream data of the target device;
  • the calling module 42 is configured to call the three-dimensional model of the target device according to the target video stream data;
  • the data sending module 43 is configured to send the data of the three-dimensional model to the remote device;
  • the change increment information receiving module 44 is configured to receive the change increment information of the three-dimensional model fed back by the remote device.
  • in this way, the on-site device can obtain accurate guidance information, realizing remote virtual-real fusion interaction between the augmented reality method and the three-dimensional model of the power equipment.
  • the apparatus further includes:
  • a display module configured to display the change of the three-dimensional model according to the change increment information received by the change increment information receiving module 44;
  • the control module is configured to control the target device according to the change of the three-dimensional model.
  • the target video stream data acquisition module 41 is configured to collect and send the initial video stream data of the target device in the target area; receive the initial key frame sent by the remote device; determine a target key frame according to the initial key frame; and acquire the target video stream data of the target device according to the target key frame.
  • the target video stream data acquisition module 41 is configured to extract color feature information, texture feature information, and motion feature information from the initial video key frames; fuse the color feature information, texture feature information, and motion feature information to calculate the similarity of each initial video key frame; determine candidate video key frames according to the similarity of each initial video key frame; and determine the target key frame according to a preset adaptive algorithm.
  • the target video stream data acquisition module 41 is configured to acquire first video stream data; identify a first feature point in the first video stream data and a second feature point in the target key frame according to a preset optical flow method; when the similarity between the first feature point and the second feature point is greater than a preset similarity threshold, determine that the first video stream data matches the target key frame; and when the first video stream data matches the target key frame, determine the first video stream as the target video stream data of the target device.
  • in an embodiment, the apparatus further includes a tracking acquisition module configured to determine a first center position of the target key frame according to the first feature point and a preset relative distance; determine a second center position of the first video stream data according to the second feature point and the preset relative distance; and track and acquire the target video stream data of the target device according to the first center position and the second center position.
  • the embodiment of the present application also provides an interaction device based on multi-feature identification, which is applied to a remote device.
  • the device includes:
  • the data receiving module 51 is configured to receive the data of the three-dimensional model sent by the field device;
  • the three-dimensional model generation module 52 is configured to generate the three-dimensional model according to the data of the three-dimensional model;
  • the determination module 53 is configured to determine the change increment information according to the three-dimensional model and the preset database;
  • the data sending module 54 is configured to feed back the change increment information to the field device.
  • in this way, the remote device can mark the target device from a first-person perspective, thereby guiding the field operators to perform operations accurately and efficiently, and realizing remote virtual-real fusion interaction between the augmented reality method and the three-dimensional model of the power equipment.
  • the apparatus further includes an initial key frame generation module
  • the data receiving module 51 is further configured to receive the initial video stream data of the target area sent by the field device;
  • the initial key frame generation module is configured to determine a problem area according to the initial video stream data, and generate an initial key frame according to the problem area;
  • the data sending module 54 is further configured to send the initial key frame to the field device.
  • when the interaction apparatus based on multi-feature recognition provided in the above embodiment performs interaction based on multi-feature recognition, the division of the above program modules is only used as an example for illustration.
  • in practical applications, the above processing can be allocated to different program modules as required, that is, the internal structure of the apparatus can be divided into different program modules to complete all or part of the processing described above.
  • the interaction device based on multi-feature identification provided in the above embodiment and the embodiment of the interaction method based on multi-feature identification belong to the same concept, and the specific implementation process is detailed in the method embodiment, which will not be repeated here.
  • the computer device may include a processor 61 and a memory 62, where the processor 61 and the memory 62 may be connected through a bus 60 or in other ways; the connection through the bus 60 is taken as an example here.
  • the processor 61 may be a central processing unit (Central Processing Unit, CPU).
  • the processor 61 may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component or other chip, or a combination of the above types of chips.
  • the memory 62 can be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions/modules corresponding to the interaction method based on multi-feature recognition in the embodiments of the present application.
  • the processor 61 executes various functional applications and data processing by running the non-transitory software programs, instructions, and modules stored in the memory 62, that is, implements the interaction method based on multi-feature recognition in the above method embodiments.
  • the memory 62 may include a storage program area and a storage data area, wherein the storage program area may store an operating system and an application program required by at least one function; the storage data area may store data created by the processor 61 and the like. Additionally, memory 62 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, memory 62 may optionally include memory located remotely from processor 61, which may be connected to processor 61 via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof. The one or more modules are stored in the memory 62, and when executed by the processor 61, execute the multi-feature identification-based interaction method in this embodiment of the present application.
  • Embodiments of the present application also provide a non-transitory computer-readable storage medium, where the non-transitory computer-readable storage medium stores computer instructions, and the computer instructions are used to cause a computer to execute the interaction method based on multi-feature recognition described in any of the foregoing embodiments.
  • the storage medium can be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), a flash memory, a hard disk drive (HDD), a solid-state drive (SSD), or the like; the storage medium may also include a combination of the above types of memories.
  • the disclosed apparatus and method may be implemented in other manners.
  • the apparatus embodiments described above are only illustrative.
  • the division of the units is only a logical function division, and there may be other division manners in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the coupling, direct coupling, or communication connection between the components shown or discussed may be implemented through some interfaces, and the indirect coupling or communication connection between devices or units may be in electrical, mechanical, or other forms.
  • the units described above as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments.
  • each functional unit in the embodiments of the present application may be integrated into one processing unit, each unit may exist separately, or two or more units may be integrated into one unit; the integrated unit may be implemented either in the form of hardware or in the form of hardware plus software functional units.
  • the technical solutions of the embodiments of the present application, in essence, or the parts that contribute to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the methods described in the embodiments of the present application.
  • the aforementioned storage medium includes: a removable storage device, a ROM, a RAM, a magnetic disk, an optical disc, or other media that can store program code.


Abstract

Disclosed are an interaction method and apparatus based on multi-feature recognition, and a computer device. The method comprises: acquiring target video stream data of a target device; calling a three-dimensional model of the target device according to the target video stream data; sending data of the three-dimensional model to a remote device; receiving incremental change information of the three-dimensional model that is fed back by the remote device; and displaying changes of the three-dimensional model according to the incremental change information.

Description

Interaction method and apparatus based on multi-feature recognition, and computer device
CROSS-REFERENCE TO RELATED APPLICATIONS

This application is based on, and claims priority to, the Chinese patent application with application number 202011416912.9 filed on December 4, 2020, the entire content of which is incorporated herein by reference.
TECHNICAL FIELD

The present application relates to the fields of streaming media and information communication, and in particular to an interaction method, apparatus, and computer device based on multi-feature recognition.
BACKGROUND

With the development of the Internet, information communication, and the 5th Generation Mobile Communication Technology (5G), audio and video streaming call technology has mainly gone through four stages: local analog-signal audio/video systems, personal computer (PC)-based multimedia remote assistance systems, web-server-based remote audio/video collaboration systems, and mobile-terminal-based audio/video collaboration systems. At present, the technology is at the stage of mobile-terminal-based audio/video collaboration systems (audio/video calls).

With the development of science and technology, higher requirements have been put forward for collaboration methods, and the collaboration offered by remote video calls in the related art still has considerable limitations. Specifically, as public service enterprises related to energy security and the national economy and people's livelihood, power companies operate job sites for power grid transmission, transformation, and distribution inspection, maintenance, and emergency repair. Power equipment comes in many types with complex operations; new problems constantly emerge and are difficult to identify and handle, requiring cross-team and cross-work-area collaboration as well as remote support from technical experts or equipment manufacturers. In the related art, such remote support is mostly provided through ordinary video calls. Since the viewing angle of the video collected by the remote video call equipment is relatively limited, it is often impossible to guide on-site operations in real time from the same viewing angle, so that remote repair and maintenance work is not very accurate or effective.
SUMMARY

The embodiments of the present application provide an interaction method, apparatus, and computer device based on multi-feature recognition, so as to at least solve the problem in the related art that remote repair and maintenance work is not very accurate or effective.
In a first aspect, an embodiment of the present application provides an interaction method based on multi-feature recognition, the method including: acquiring target video stream data of a target device; calling a three-dimensional model of the target device according to the target video stream data; sending data of the three-dimensional model to a remote device; and receiving change increment information of the three-dimensional model fed back by the remote device.

With reference to the first aspect, in a first implementation of the first aspect, the method further includes: displaying changes of the three-dimensional model according to the change increment information; and controlling the target device according to the changes of the three-dimensional model.

With reference to the first aspect, in a second implementation of the first aspect, acquiring the target video stream data of the target device includes: collecting and sending initial video stream data of the target device in a target area; receiving an initial key frame sent by the remote device; determining a target key frame according to the initial key frame; and acquiring the target video stream data of the target device according to the target key frame.

With reference to the second implementation of the first aspect, in a third implementation of the first aspect, determining the target key frame according to the initial key frame includes: extracting color feature information, texture feature information, and motion feature information from initial video key frames; fusing the color, texture, and motion feature information to calculate the similarity of each initial video key frame; determining candidate video key frames according to the similarity of each initial video key frame; and determining the target key frame according to a preset adaptive algorithm.

With reference to the third implementation of the first aspect, in a fourth implementation of the first aspect, acquiring the target video stream data of the target device according to the target key frame includes: acquiring first video stream data; identifying a first feature point in the first video stream data and a second feature point in the target key frame according to a preset optical flow method; when the similarity between the first feature point and the second feature point is greater than a preset similarity threshold, determining that the first video stream data matches the target key frame; and when the first video stream data matches the target key frame, determining the first video stream as the target video stream data of the target device.

With reference to the first aspect, in a fifth implementation of the first aspect, the method further includes: determining a first center position of the target key frame according to the first feature point and a preset relative distance; determining a second center position of the first video stream data according to the second feature point and the preset relative distance; and tracking and acquiring the target video stream data of the target device according to the first center position and the second center position.
In a second aspect, an embodiment of the present application further provides an interaction method based on multi-feature recognition, the method including: receiving data of a three-dimensional model sent by a field device; generating the three-dimensional model according to the data of the three-dimensional model; determining change increment information according to the three-dimensional model and a preset database; and feeding back the change increment information to the field device.

With reference to the second aspect, in a first implementation of the second aspect, before receiving the data of the three-dimensional model sent by the field device, the method further includes: receiving initial video stream data of a target area sent by the field device; determining a problem area according to the initial video stream data, and generating an initial key frame according to the problem area; and sending the initial key frame to the field device.
According to a third aspect, an embodiment of the present application provides an interaction apparatus based on multi-feature recognition, including: a target video stream data acquisition module, configured to acquire target video stream data of a target device; a calling module, configured to call a three-dimensional model of the target device according to the target video stream data; a data sending module, configured to send data of the three-dimensional model to a remote device; and a change increment information receiving module, configured to receive change increment information of the three-dimensional model fed back by the remote device.

According to a fourth aspect, an embodiment of the present application provides an interaction apparatus based on multi-feature recognition, including: a data receiving module, configured to receive data of a three-dimensional model sent by a field device; a three-dimensional model generation module, configured to generate the three-dimensional model according to the data of the three-dimensional model; a determination module, configured to determine change increment information according to the three-dimensional model and a preset database; and a data sending module, configured to feed back the change increment information to the field device.
根据第五方面,本申请实施例提供了一种计算机设备,包括:包括:至少一个处理器;以及与所述至少一个处理器通信连接的存储器;其中,所述存储器存储有可被所述一个处理器执行的指令,所述指令被所述至少一个处理器执行,以使所述至少一个处理器执行第一方面或者第一方面的任意一种实施方式中所述的基于多特征识别的交互方法的步骤,或第二方面或者第二方面的任意一种实施方式中所述的基于多特征识别的交互方法的步骤。According to a fifth aspect, an embodiment of the present application provides a computer device, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores data that can be used by the one Instructions executed by a processor, the instructions being executed by the at least one processor to cause the at least one processor to perform the multi-feature identification-based interaction described in the first aspect or any one of the embodiments of the first aspect The steps of the method, or the steps of the interaction method based on multi-feature recognition described in the second aspect or any one of the implementation manners of the second aspect.
根据第六方面,本申请实施例提供了一种计算机可读存储介质,其上存储有计算机程序,所述计算机程序被处理器执行时实现第一方面或者第一方面的任意一种实施方式中所述的基于多特征识别的交互方法的步骤,或第二方面或者第二方面的任意一种实施方式中所述的基于多特征识别的交互方法的步骤。According to a sixth aspect, an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, implements the first aspect or any one of the implementations of the first aspect The steps of the interaction method based on multi-feature recognition, or the steps of the interaction method based on multi-feature recognition described in the second aspect or any implementation manner of the second aspect.
The technical solutions of the embodiments of the present application have the following advantages:

1. In the interaction method, apparatus, and computer device based on multi-feature recognition provided by the embodiments of the present application, the method comprises: acquiring target video stream data of a target device; calling a three-dimensional model of the target device according to the target video stream data; sending data of the three-dimensional model to a remote device; receiving change increment information of the three-dimensional model fed back by the remote device; and controlling the target device according to the change increment information. By combining the three-dimensional model generated from the target video stream data with the change increment information fed back by the remote device, the field device obtains accurate guidance information, realizing remote virtual-real fused interaction between augmented reality and three-dimensional models of power equipment.

2. In the interaction method, apparatus, and computer device based on multi-feature recognition provided by the embodiments of the present application, the method comprises: receiving data of a three-dimensional model sent by a field device; generating the three-dimensional model according to the data of the three-dimensional model; determining change increment information according to the three-dimensional model and a preset database; and feeding the change increment information back to the field device. With the generated three-dimensional model, the remote device can annotate the target device from the first-person perspective, thereby guiding on-site workers precisely, efficiently, and accurately, realizing remote virtual-real fused interaction between augmented reality and three-dimensional models of power equipment.

3. The interaction method based on multi-feature recognition provided by the embodiments of the present application combines collaborative annotation and recognition matching between the remote device and the field device, and determines the target position from the relative distances between feature points, so that the features of the target device can be detected continuously and its real-time position obtained continuously, achieving accurate tracking of the target device. In other words, multi-feature-point information of the target device detected in real time is matched against temporarily stored video key frames, realizing monitoring, recognition, matching, and tracking of the target device.
Brief Description of the Drawings

To describe the technical solutions in the specific embodiments of the present application or in the related art more clearly, the accompanying drawings required for describing the specific embodiments or the related art are briefly introduced below. Obviously, the accompanying drawings in the following description show some embodiments of the present application, and a person of ordinary skill in the art may still derive other drawings from these drawings without creative efforts.
Fig. 1 is a schematic structural diagram of communication between a field device and a remote device in an interaction method based on multi-feature recognition according to an embodiment of the present application;

Fig. 2 is a flowchart of a specific example, at the field device side, of an interaction method based on multi-feature recognition according to an embodiment of the present application;

Fig. 3 is a flowchart of a specific example of acquiring target video stream data in an interaction method based on multi-feature recognition according to an embodiment of the present application;

Fig. 4 is a flowchart of a specific example, at the remote device side, of an interaction method based on multi-feature recognition according to an embodiment of the present application;

Fig. 5 is a flowchart of another specific example, at the remote device side, of an interaction method based on multi-feature recognition according to an embodiment of the present application;

Fig. 6 is a structural block diagram of a specific embodiment of an interaction method based on multi-feature recognition according to an embodiment of the present application;

Fig. 7 is a schematic structural diagram of an example of an interaction apparatus based on multi-feature recognition according to an embodiment of the present application;

Fig. 8 is a schematic structural diagram of an example of an interaction apparatus based on multi-feature recognition according to an embodiment of the present application;

Fig. 9 is a diagram of a specific example of a computer device according to an embodiment of the present application.
Detailed Description of the Embodiments

The technical solutions of the present application will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some rather than all of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative efforts shall fall within the protection scope of the present application. The technical features involved in the different embodiments of the present application described below may be combined with one another as long as they do not conflict.
Collaboration is a trend in the development of modern society. Traditional face-to-face collaboration (such as video conferencing) has great limitations in both time and space, and falls far short of people's requirements for collaboration. Specifically, remote collaboration refers to the process of helping geographically dispersed organizations and individuals complete collaborative work with the support of computer and communication technologies. To support effective collaboration, a platform must support streaming media with a sense of presence, such as real-time video and audio, as well as other multimedia information such as graphic annotations, static images, and text, together with the integrated processing of this multimedia information. In practical application scenarios, for example power grid transmission, transformation, and distribution inspection, maintenance, and emergency repair sites, power equipment is of many types and complex to operate, new problems keep emerging and are difficult to identify and handle, cross-team and cross-work-area collaboration is required, and remote technical support from technical experts or equipment manufacturers is needed, while current communication methods are inefficient and error-prone, which urgently needs to be solved.

To solve the problems of low communication efficiency and high error probability in the related art, the embodiments of the present application provide an interaction method, apparatus, and computer device based on multi-feature recognition, with the aim of guiding the work of the field device efficiently and accurately through real-time audio and video interaction between the field device and the remote device.

As shown in Fig. 1, the field device communicates with the remote device through a wireless channel. The field device may be provided with an image acquisition apparatus such as a wearable terminal device, a camera device, or a deployable ball camera (布控球), as well as a wireless communication apparatus; the remote device may be provided with a wireless communication module for communicating with other devices. Illustratively, the remote device may be a remote expert device or another remote-end device. Specifically, the field device may send collected live video stream data to the remote expert device, and the remote expert device may receive the video stream data through the wireless communication module and send back corresponding feedback information.
An embodiment of the present application provides an interaction method based on multi-feature recognition, applied specifically at the field device side. As shown in Fig. 2, the method includes:

Step S11: acquiring target video stream data of a target device.

In this embodiment, the target device may be any device present in a practical application scenario; for example, in a power grid transmission scenario, the target device may be an electronic component such as an oil temperature gauge or an electronic switch. Video stream data is a data form that stores and records a series of consecutive images; the consecutive images record specific events over one or more continuous periods of time, and when they are played in sequence at a sufficiently high frame rate, a continuous picture is presented, namely the video stream. The target video stream data may be video stream data acquired according to a problem area annotated by the remote device, or video stream data of the problem area acquired again.

Illustratively, the video stream data of the problem area acquired by the field device is the target video stream data. For example, when the remote device (e.g., a technical support expert's device) annotates the problematic device as an oil temperature gauge, and the camera of the field device moves to point at the oil temperature gauge, video stream data containing the oil temperature gauge can be acquired, namely the target video stream data.

Illustratively, the field device may acquire video stream data through a wearable terminal device, a camera device, or a deployable ball camera.
Step S12: calling a three-dimensional model of the target device according to the target video stream data.

In this embodiment, the three-dimensional model refers to a 3D solid model used to represent structural feature information of the target device. Illustratively, at the field device side, identification information of the target device is determined from the acquired target video stream data containing the target device, the model number of the target device is determined according to the identification information, and the corresponding three-dimensional model is called from a preset three-dimensional model database according to the model number of the target device. For example, when the target device is determined from the target video stream data to be an oil temperature gauge, the device model number of the oil temperature gauge is first determined, e.g., xxx-1; the three-dimensional model of the oil temperature gauge with model number xxx-1 is then called from the preset three-dimensional model data and displayed at the field device side.
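A minimal sketch of the lookup described above, resolving a model number to an entry in the preset three-dimensional model database. The database layout and field names are hypothetical; the embodiment only specifies going from identification information to a model number and then to the preset database.

```python
# Hypothetical preset 3D model database: model number -> model entry.
MODEL_DATABASE = {
    "xxx-1": {"device": "oil temperature gauge", "mesh": "oil_gauge_xxx1.obj"},
    "yyy-2": {"device": "electronic switch", "mesh": "switch_yyy2.obj"},
}

def call_model(identification_info: dict) -> dict:
    """Resolve the device model number from the identification info
    and fetch the corresponding 3D model entry from the preset database."""
    model_number = identification_info["model_number"]
    if model_number not in MODEL_DATABASE:
        raise KeyError(f"no 3D model registered for {model_number!r}")
    return MODEL_DATABASE[model_number]

entry = call_model({"device": "oil temperature gauge", "model_number": "xxx-1"})
print(entry["mesh"])  # prints oil_gauge_xxx1.obj
```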
Step S13: sending data of the three-dimensional model to the remote device.

In this embodiment, the field device communicates with the remote device through a wireless channel to transmit data; the field device sends the data of the generated three-dimensional model to the remote device (or remote expert device).

Step S14: receiving change increment information of the three-dimensional model fed back by the remote device.

In this embodiment, through step S13, the three-dimensional model of the target device is displayed on the remote device based on the data of the three-dimensional model. The change increment information of the three-dimensional model may be a record of the operations that an expert or professional at the remote device side, observing the target device from the first-person perspective, performs on the three-dimensional model of the target device in order to solve a problem of the target device. For example, when the expert or professional at the remote device side confirms that the oil temperature gauge has a problem, a corresponding operation is performed on the oil temperature gauge in the three-dimensional model at the remote device, e.g., moving the oil temperature gauge 0.6 cm to the left; the change increment information is then "move the oil temperature gauge 0.6 cm to the left". This change increment information, "move the oil temperature gauge 0.6 cm to the left", is transmitted from the remote device to the field device through the wireless channel.
The interaction method based on multi-feature recognition provided by this embodiment of the present application includes: acquiring target video stream data of a target device; calling a three-dimensional model of the target device according to the target video stream data; sending data of the three-dimensional model to a remote device; receiving change increment information of the three-dimensional model fed back by the remote device; and displaying the change of the three-dimensional model according to the change increment information. By implementing this embodiment of the present application, combining the three-dimensional model generated from the target video stream data with the received change increment information fed back by the remote device, the field device obtains accurate guidance information, realizing remote virtual-real fused interaction between augmented reality and three-dimensional models of power equipment.
As an optional implementation of the present application, the method further includes: displaying the change of the three-dimensional model according to the change increment information; and controlling the target device according to the change of the three-dimensional model.

In this embodiment, the field device may display the change increment information directly on the three-dimensional model; that is, the three-dimensional model is adjusted according to the received change increment information. For example, when the received change increment information is "move the oil temperature gauge 0.6 cm to the left", the oil temperature gauge in the three-dimensional model at the field device is moved 0.6 cm to the left directly. The operation and maintenance personnel can then control the target device according to the change of the three-dimensional model; for example, following the change of the three-dimensional model, they adjust the oil temperature gauge of the actual equipment so that it moves 0.6 cm to the left.
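A change increment message and its application at the field device might look like the following sketch. The message fields and the encoding of "0.6 cm to the left" as a signed offset are assumptions for illustration; the embodiment does not fix a wire format.

```python
# Hypothetical change-increment message applied to a stored model pose.
def apply_increment(model_pose: dict, increment: dict) -> dict:
    """Apply a translation increment (metres, model coordinates) to the
    pose of the component named in the message."""
    x, y, z = model_pose[increment["target"]]
    dx, dy, dz = increment["translation_m"]
    model_pose[increment["target"]] = (x + dx, y + dy, z + dz)
    return model_pose

pose = {"oil_temperature_gauge": (0.0, 0.0, 0.0)}
# "move the oil temperature gauge 0.6 cm to the left", encoded as -0.006 m in x
msg = {"target": "oil_temperature_gauge", "translation_m": (-0.006, 0.0, 0.0)}
pose = apply_increment(pose, msg)
print(pose["oil_temperature_gauge"])  # (-0.006, 0.0, 0.0)
```

In a full system the updated pose would drive the augmented-reality overlay, and the same increment would serve as the instruction shown to the on-site worker.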
As an optional implementation of the present application, as shown in Fig. 3, the above step S11 of acquiring the target video stream data of the target device includes:

Step S21: collecting and sending initial video stream data of the target device in a target area.

In this embodiment, the target area may be any area in a practical application scenario, and the initial video stream data may be the video stream initially collected by the field device. The video stream data at the field device side may be collected through a wearable terminal device, a camera device, or a deployable ball camera, and the collected initial video stream data is transmitted in real time to the remote device, i.e., a remote expert device or other remote-end device.
Step S22: receiving an initial key frame sent by the remote device.

In this embodiment, the initial key frame may be a video stream segment in which the remote device, from the first-person perspective, has drawn annotations on a problem area in the initial video stream data, for example text annotations and image annotations; the segment carrying the drawn annotations is the initial key frame. Specifically, after sending the initial video stream data to the remote device, the field device can receive the initial key frame fed back by the remote device.
Step S23: determining a target key frame according to the initial key frame.

In this embodiment, the target key frame may be a frame, extracted after the field device optimizes and aggregates the initial key frames, that contains all the feature information of the initial key frames. Illustratively, the field device extracts the key information in the initial key frames, removes redundant information from them, extracts the frames carrying the key information as target key frames through methods such as image saliency detection, candidate key frame extraction, and adaptive hierarchical clustering, and stores the target key frames in a structured manner.
Step S24: acquiring the target video stream data of the target device according to the target key frame.

In this embodiment, determining the target video stream data of the target device according to the target key frame may consist of matching newly collected video stream data against the target key frame. When the newly collected video stream data matches the target key frame, it can be determined that the newly collected video stream data is the target video stream data, and the remote device's text annotations, image annotations, and three-dimensional model annotations on the target key frame can then be displayed on the newly collected video stream data.

The interaction method based on multi-feature recognition provided by this embodiment of the present application uses video stream key frame technology to extract the key information in the video and remove redundant information, and determines the target key frames and stores them in a structured manner through image saliency detection, candidate key frame extraction, adaptive hierarchical clustering, and similar methods, so that the target key frames and the target video stream data can be determined efficiently and accurately, minimizing resource consumption while maximizing the storage of key information.
As an optional implementation of the present application, the execution of the above step S23, determining the target key frame according to the initial key frame, includes: extracting color feature information, texture feature information, and motion feature information from the initial video key frames; fusing the color feature information, texture feature information, and motion feature information and calculating the similarity of each initial video key frame; determining candidate video key frames according to the similarity of each initial video key frame; and determining the target key frames according to a preset adaptive algorithm.

In this embodiment, color feature information is the most salient feature of an image and is a pixel-based feature; different electrical devices display different colors. Illustratively, the process of extracting color feature information may include: extracting the color feature information from the initial video key frames and describing the color feature information with a histogram. In practical applications, the field device can generate a color histogram according to the different color feature information of each piece of power equipment.
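The histogram description above can be sketched as follows in plain NumPy; the bin count (8 per channel) and the per-channel layout are illustrative choices, not specified by the embodiment.

```python
import numpy as np

def color_histogram(frame: np.ndarray, bins: int = 8) -> np.ndarray:
    """Concatenated, normalized per-channel histograms of an HxWx3 RGB frame."""
    hists = [np.histogram(frame[..., c], bins=bins, range=(0, 256))[0]
             for c in range(3)]
    h = np.concatenate(hists).astype(float)
    return h / h.sum()  # normalize so frames of different sizes compare

rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(48, 64, 3), dtype=np.uint8)
h = color_histogram(frame)
print(h.shape)  # (24,)
```

The normalized vector is what later enters the fused feature vector used for frame-to-frame similarity.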
In this embodiment, texture features can represent global features of an image and describe the surface properties of the scene corresponding to the image or image region. Illustratively, the process of extracting texture feature information may include: performing statistical calculation over multiple regions, each containing multiple pixels, to obtain the texture feature information. Specifically, the initial video key frame can be segmented into multiple images through a Markov random field model, obtaining texture feature information such as the number of distinct regions, pixel positions, and pixel value sets, thereby determining the texture feature information of the initial video key frame.
In this embodiment, illustratively, the process of extracting motion feature information may include: first, extracting a saliency image of the initial video key frame, specifically through the saliency detection algorithm SDSP, based on CIE L*a*b* color features (CIE L*a*b* is a three-dimensional color space based on human color perception and the most widely used color space of the Commission Internationale de l'Éclairage (CIE); in its three dimensions, L* represents lightness, a* the red-green axis, and b* the blue-yellow axis), the contrast principle, and the core rules of saliency computation, thereby determining the salient targets in the initial video key frame while preserving most of the information of the original image; and second, performing motion estimation on the saliency image through the pyramid-based Lucas-Kanade optical flow method to generate the motion feature information of the saliency image.
The interaction method based on multi-feature recognition provided by this embodiment of the present application incorporates the SDSP algorithm, i.e., three kinds of prior knowledge: human vision detects salient objects in a scene in a way that can be simulated with log-Gabor filters; human vision tends to concentrate on the center of an image, which is modeled with a Gaussian map; and warm colors attract more visual attention than cool colors. Through mathematical modeling, the algorithm can exclude the influence of color, complex texture, and changing backgrounds, and obtain the saliency image quickly and accurately.
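Setting the SDSP front end aside, the Lucas-Kanade motion-estimation step mentioned above can be sketched as a single-window, single-level least-squares solve. The embodiment uses the pyramid-based variant on saliency images; the pyramid and the saliency computation are omitted here for brevity, so this is a sketch of the core step only.

```python
import numpy as np

def lucas_kanade(prev: np.ndarray, curr: np.ndarray, y: int, x: int,
                 win: int = 7) -> tuple:
    """Estimate the (dx, dy) displacement at (y, x) between two gray frames
    by solving  [sum IxIx, sum IxIy; sum IxIy, sum IyIy] d = -[sum IxIt, sum IyIt]."""
    Iy, Ix = np.gradient(prev.astype(float))          # spatial gradients
    It = curr.astype(float) - prev.astype(float)      # temporal gradient
    sl = (slice(y - win, y + win + 1), slice(x - win, x + win + 1))
    ix, iy, it = Ix[sl].ravel(), Iy[sl].ravel(), It[sl].ravel()
    A = np.array([[ix @ ix, ix @ iy],
                  [ix @ iy, iy @ iy]])
    b = -np.array([ix @ it, iy @ it])
    dx, dy = np.linalg.solve(A, b)
    return float(dx), float(dy)

# Synthetic check: a smooth blob shifted one pixel to the right.
yy, xx = np.mgrid[0:64, 0:64]
blob = lambda cx: np.exp(-((xx - cx) ** 2 + (yy - 32) ** 2) / 50.0)
dx, dy = lucas_kanade(blob(32), blob(33), 32, 32)
```

For the small one-pixel displacement above, the first-order estimate lands close to (1, 0); a real pipeline would run this per feature point and per pyramid level to handle larger motions.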
In this embodiment, the color feature information, texture feature information, and motion feature information are fused, and the similarity of each initial video key frame is calculated; the similarity between initial video key frames may represent the degree to which their contents are the same. Illustratively, the color feature information, texture feature information, and motion feature information are normalized to generate a fused feature vector; according to the fused feature vectors, the Euclidean distance between two adjacent initial video key frames is calculated, and the similarity between the two adjacent initial video key frames is then determined from the Euclidean distance. The smaller the Euclidean distance, the higher the similarity between the two adjacent frames.
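A minimal sketch of this fusion-and-distance step: each frame's three feature vectors are normalized, concatenated, and adjacent frames are compared by Euclidean distance. Min-max normalization and the vector lengths are illustrative assumptions; the embodiment only specifies normalization, fusion, and Euclidean distance.

```python
import numpy as np

def fuse(color: np.ndarray, texture: np.ndarray, motion: np.ndarray) -> np.ndarray:
    """Min-max normalize each feature block, then concatenate into one vector."""
    def norm(v):
        span = v.max() - v.min()
        return (v - v.min()) / span if span > 0 else np.zeros_like(v)
    return np.concatenate([norm(color), norm(texture), norm(motion)])

def adjacent_distances(fused_frames: list) -> list:
    """Euclidean distance between each pair of adjacent fused frame vectors."""
    return [float(np.linalg.norm(a - b))
            for a, b in zip(fused_frames, fused_frames[1:])]

rng = np.random.default_rng(1)
frames = [fuse(rng.random(24), rng.random(8), rng.random(4)) for _ in range(3)]
dists = adjacent_distances(frames)
```

Identical frames yield distance 0 (maximal similarity), so a low distance flags near-duplicate content for the candidate key frame stage.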
In this embodiment, candidate video key frames are determined according to the similarity of each initial video key frame, and the target key frames are determined according to a preset adaptive algorithm. Through an adaptive hierarchical clustering algorithm, the clustering threshold is determined, and the mutual information (MI) between the saliency images of the initial video key frames is determined; mutual information characterizes the correlation between two variables. Illustratively, the process of determining the target key frames through the adaptive hierarchical clustering algorithm may be: determining the candidate video key frames, i.e., the candidate key frame sequence, from the initial video key frames; calculating the mutual information between the saliency images of adjacent candidate key frames to obtain a mutual information sequence; calculating the joint probability from the normalized overlapping regions and histograms of adjacent images, and determining the clustering threshold from the joint probability; sorting the mutual information sequence in descending order of mutual information value and then, following the original temporal order of the candidate key frames, taking the first frame as the first cluster; if the mutual information value between two successive frames is less than or equal to the threshold, a new cluster is created; otherwise, the subsequent frame is assigned to the current cluster. The target key frames are thereby determined, and they form ordered clusters in which the frames of each cluster are also ordered by the relevance of the original video content.
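The mutual-information grouping can be sketched as follows. The joint-histogram MI estimator is standard; the fixed threshold here stands in for the adaptive threshold that the embodiment derives from joint probabilities, and the bin count is an illustrative choice.

```python
import numpy as np

def mutual_information(a: np.ndarray, b: np.ndarray, bins: int = 16) -> float:
    """MI (in nats) between two images, from their joint intensity histogram."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / np.outer(px, py)[nz])))

def cluster_by_mi(frames: list, threshold: float) -> list:
    """Scan frames in original order; open a new cluster when the MI to the
    previous frame drops to or below the threshold."""
    clusters = [[0]]
    for i in range(1, len(frames)):
        if mutual_information(frames[i - 1], frames[i]) <= threshold:
            clusters.append([i])    # low MI: content changed, new cluster
        else:
            clusters[-1].append(i)  # high MI: same content, current cluster
    return clusters

rng = np.random.default_rng(2)
a = rng.random((32, 32))
b = np.clip(a + 0.01 * rng.standard_normal(a.shape), 0, 1)  # near-duplicate
c = rng.random((32, 32))                                    # unrelated frame
groups = cluster_by_mi([a, b, c], threshold=1.0)
print(groups)  # [[0, 1], [2]]
```

The near-duplicate pair stays in one cluster while the unrelated frame opens a new one, mirroring how the clusters preserve the temporal order of the original video.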
The interaction method based on multi-feature recognition provided by this embodiment of the present application exploits the rotational invariance of texture feature information, which gives strong resistance to noise, so that the object information contained in an image can be distinguished at the micro level. By combining image saliency detection, candidate key frame extraction, and the key frames determined by the adaptive clustering algorithm, the SDSP algorithm can be applied to the original video sequence: the saliency detection method combining the three kinds of prior knowledge extracts the salient information in the video that attracts the human eye; the data information contained in the video frames can be described quantitatively, yielding candidate key frames with little redundancy; and determining the threshold adaptively through clustering largely solves the instability of clustering results caused by inaccurate selection of initial boundary points. The final clusters obtained after adaptive hierarchical clustering are arranged in the temporal order of the original video content, so the extracted key frames preserve the temporal order of the original input video.
As an optional implementation of the present application, in step S24, the process of acquiring the target video stream data of the target device according to the target key frame includes: acquiring first video stream data; identifying first feature points in the first video stream data and second feature points in the target key frame according to a preset optical flow method; when the similarity between the first feature points and the second feature points is greater than a preset similarity threshold, determining that the first video stream data matches the target key frame; and when the first video stream data matches the target key frame, determining the first video stream as the target video stream data of the target device.
The first video stream data may be the video stream data collected when the wearable terminal device, the camera device, or the deployable surveillance ball moves to the target area again.
In this embodiment, the first feature points in the first video stream data and the second feature points in the target key frame may be determined by a forward optical flow method or a backward optical flow method. Exemplarily, the first feature points may be a plurality of feature points in the first video stream data covering multiple kinds of features, for example, target feature points among the pixels; the target key frame is then recognized in a similar manner, and a plurality of target feature points are extracted from the target key frame.
In this embodiment, the degree of similarity between the first feature points in the first video stream data and the second feature points in the target key frame is calculated in terms of position, quantity, and so on, and compared with a preset similarity threshold. When the calculated degree of similarity is greater than the preset similarity threshold, it is determined that the first video stream data is successfully matched with the target key frame.
In this embodiment, when the first video stream data is successfully matched with the target key frame data, it means that the field device has, through the wearable terminal, re-captured the video stream data containing the target device annotated as problematic by the remote device; this is the target video stream data.
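The matching step described above can be sketched as follows. This is a minimal illustration, assuming descriptor arrays for the detected feature points and a nearest-neighbor distance test; the descriptor extraction itself (e.g., via the optical flow method) and the concrete similarity measure are not specified by the patent, and all names and thresholds here are hypothetical.

```python
import numpy as np

def match_ratio(desc_frame, desc_keyframe, max_dist=0.7):
    """Fraction of key-frame descriptors that find a close match in the frame.

    desc_* are (N, D) arrays of feature descriptors, one row per feature point.
    """
    matched = 0
    for d in desc_keyframe:
        dists = np.linalg.norm(desc_frame - d, axis=1)  # distance to every frame descriptor
        if dists.min() <= max_dist:
            matched += 1
    return matched / len(desc_keyframe)

def frame_matches_keyframe(desc_frame, desc_keyframe, sim_threshold=0.6):
    """The frame is accepted as target video stream data when the
    similarity (here: the match ratio) exceeds the preset threshold."""
    return match_ratio(desc_frame, desc_keyframe) > sim_threshold
```

When the ratio exceeds the threshold, the current frame would be treated as the re-captured target video stream data; otherwise the search over incoming frames continues.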
The embodiment of the present application provides an interaction method based on multi-feature recognition. When applied to power field and emergency repair scenarios, the video stream of the field equipment can be collected through a wearable terminal, a camera device, or a deployable surveillance ball, and multiple frames are then read from the video stream. The remote expert draws annotations, from a first-person perspective, on the collaboration target in the video stream collected by the field operators; the target feature points are identified by the forward/backward optical flow method; the features and feature descriptors of the current frame are calculated; the temporary key frame storage set is queried; and the feature points in the current frame are matched against the feature points in the temporary key frame storage set. If the match succeeds, the remote collaborative annotation target is successfully recognized and matched, in preparation for the augmented reality information overlay interaction; otherwise, the key frame is incrementally stored into the temporary key frame storage set. That is to say, when the first video stream data is successfully matched with the target key frame, a three-dimensional model can be generated from the power equipment based on the augmented reality service platform and in an augmented reality manner, and text and image annotations can be superimposed on the three-dimensional model. By transmitting incremental change information about the positional relationships, angles, operation behaviors, and model feedback results among the collaborating personnel, the power equipment, and the model in real time, encoding and decoding it separately on the distributed terminals, and using a checking mechanism to ensure that the model and the information change synchronously, the tracking interaction between the field device and the remote expert based on multi-feature recognition is realized.
As an optional implementation of the present application, the method further includes: determining a first center position of the target key frame according to the first feature points and a preset relative distance; determining a second center position of the first video stream data according to the second feature points and the preset relative distance; and tracking and acquiring the target video stream data of the target device according to the first center position and the second center position.
In this embodiment, the preset relative distance is the distance of a target feature point relative to the center position. Since the relative distance between a feature point and the center position of the same image is unchanged under scaling and rotation, the first center position of the target key frame is determined according to the first feature points and the preset relative distance, and the second center position of the first video stream data is determined according to the second feature points; based on the detected first center position and second center position, the target video stream data of the target device is acquired continuously, realizing continuous tracking of the target device.
The embodiment of the present application provides an interaction method based on multi-feature recognition, which determines the center position by cluster voting on the center and, together with the relative distance of each feature point, determines the position of the target device. Since the distance of each feature point relative to the center position is determined under the scaling and rotation ratio, real-time tracking of the object's position can be achieved through continuous detection of the object's features. The detection, recognition, matching, and tracking of the object are realized by detecting the multi-feature-point information of the object in real time and matching it against the temporary storage set of structured video stream key frames.
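A minimal sketch of the center-voting idea: each feature point stores an offset toward the object center learned from the key frame, and the detected points vote for the center in the current frame. Averaging the votes is a simplifying assumption of this sketch; a robust implementation would accumulate votes in a Hough-style grid and take the mode.

```python
import numpy as np

def vote_center(points, offsets):
    """Each feature point casts a vote for the object center by adding its
    stored offset (feature point -> center); the votes are then averaged.

    points  : (N, 2) detected feature point coordinates in the current frame
    offsets : (N, 2) preset relative offsets learned from the key frame
    """
    votes = points + offsets       # each row is one vote for the center
    return votes.mean(axis=0)      # consensus center position
```

Because the stored offsets stay proportionally valid under scaling and rotation (per the claim above, after normalizing for the scale/rotation ratio), re-running the vote on every incoming frame yields a continuously updated object position.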
The embodiment of the present application provides an interaction method based on multi-feature recognition, specifically applied to a remote device, as shown in FIG. 4, which includes:
Step S31: receiving data of a three-dimensional model sent by the field device.

In this embodiment, the remote device receives the data of the three-dimensional model sent by the field device.
Step S32: generating the three-dimensional model according to the data of the three-dimensional model.

In this embodiment, the remote device constructs the three-dimensional model according to the received data of the three-dimensional model.
Step S33: determining change increment information according to the three-dimensional model and a preset database.

In this embodiment, the remote device adjusts, according to the preset database, the region of the three-dimensional model in which a problem exists. For example, when the remote device determines that there is a problem with the oil temperature gauge in the three-dimensional model, it adjusts the oil temperature gauge according to the preset database, for example, by moving the oil temperature gauge 10 cm or 0.6 cm to the left; this adjustment information is the change increment information.
Step S34: feeding back the change increment information to the field device.

In this embodiment, when the change increment information is "moved 0.6 cm to the left", the adjustment information "move the oil temperature gauge 0.6 cm to the left" is transmitted to the field device as the change increment information.
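To make the idea of transmitting only the increment (rather than the whole model) concrete, the following sketch defines a hypothetical delta message for one model component. The field names, units, and serialization are illustrative assumptions; the patent does not specify a wire format.

```python
from dataclasses import dataclass, asdict

@dataclass
class ChangeIncrement:
    """One incremental change to a component of the 3D model.

    component            : name of the model component being adjusted
    dx_cm, dy_cm, dz_cm  : translation of the component, in centimeters
    note                 : human-readable instruction for the field operator
    """
    component: str
    dx_cm: float = 0.0
    dy_cm: float = 0.0
    dz_cm: float = 0.0
    note: str = ""

# The remote device would serialize only the delta, not the whole model:
delta = ChangeIncrement(component="oil_temperature_gauge",
                        dx_cm=-0.6,
                        note="move the oil temperature gauge 0.6 cm to the left")
payload = asdict(delta)   # dict form, e.g. ready for JSON transport
```

On the field device side, applying the payload to the locally held model reproduces the adjustment while keeping the transmitted data small, which matches the incremental-update design described above.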
The interaction method based on multi-feature recognition provided by the embodiment of the present application includes: receiving data of a three-dimensional model sent by a field device; generating the three-dimensional model according to the data of the three-dimensional model; determining change increment information according to the three-dimensional model and a preset database; and feeding the change increment information back to the field device. By implementing the present application, in combination with the generated three-dimensional model, the remote device can annotate the target device from a first-person perspective and thereby precisely guide the field operators in their work, efficiently and accurately, realizing a remote virtual-real fusion interaction between the augmented reality method and the three-dimensional model of the power equipment.
As an optional implementation of the present application, as shown in FIG. 5, before step S31 of receiving the data of the three-dimensional model sent by the field device, the method further includes:
Step S301: receiving initial video stream data of the target area sent by the field device.

Step S302: determining a problem region according to the initial video stream data, and generating an initial key frame according to the problem region.

In this embodiment, the problem region may be a region in which experts or technicians on the remote device side consider that certain equipment or its wiring has a problem. Exemplarily, the remote expert may annotate the problem region in text or image form, and the initial key frame is then generated.

Step S303: sending the initial key frame to the field device.

In this embodiment, after the remote expert annotates the initial video stream data sent by the field device, the remote device generates the initial key frame and then sends it to the field device.
The embodiment of the present application provides an interaction method based on multi-feature recognition in which the remote expert draws annotations, from a first-person perspective, on the video stream data collected by the field operators, and initial key video stream segments are then generated, so that the key frames of the video stream can be collected and stored efficiently and accurately.
The interaction based on multi-feature recognition of the above embodiments is described in detail below with reference to a specific implementation. Specifically, video frames are the most basic components of a video stream; the video frames carrying the richest information are extracted, and the main content of the video frames is converted into high-level semantic information for structured information storage. The information contained in a video stream is divided into low-level feature information, key image frame information, and high-level semantic information. Low-level feature information refers to the extraction of the global, local, and structural features of an image. Global features are the basic features of the image, such as shape, color, and texture; local features yield the feature point set of the video image, which is used for feature matching; structural features reflect the geometric and spatio-temporal relationships among image features. Key image frame information refers to key frame extraction based on the low-level features and target information of the image: multiple kinds of low-level feature information are fused to represent the information difference between frames or the information richness of a video frame, and representative video frames are then selected. High-level semantic information refers to the semantic logical description and feature expression of the targets and content contained in the video. Using deep learning techniques, a targeted model is trained on an appropriate set of images to extract target semantics, scene semantics, image semantics, and so on; the extracted semantic information is synthesized, and text sentences are distilled to logically describe the events reflected in the video, which facilitates intuitive understanding, storage, and retrieval by the user. By performing feature analysis and description, logical expression, and structured storage on the extracted low-level features, key image frames, and high-level semantics, the structured and digital storage of the video stream is realized, providing basic services for video key frame extraction and multi-feature-point recognition and matching.
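The three information layers just described (low-level features, key image frames, high-level semantics) could be stored per frame in a structured record along the following lines. All field names are illustrative assumptions rather than a format defined by the patent.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class FrameRecord:
    """Structured storage of one key frame, mirroring the three layers
    described above (all field names are hypothetical)."""
    frame_index: int
    # low-level features
    color_hist: List[float] = field(default_factory=list)
    texture_desc: List[float] = field(default_factory=list)
    feature_points: List[Tuple[float, float]] = field(default_factory=list)
    # high-level semantics
    scene_label: str = ""
    caption: str = ""

# Example record for one extracted key frame:
record = FrameRecord(frame_index=120,
                     scene_label="substation",
                     caption="operator inspecting the oil temperature gauge")
```

A store of such records is what the later matching step queries: the low-level fields serve feature matching, while the semantic fields support retrieval and logical description.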
As shown in FIG. 6, the mobile intelligent terminal and the background server communicate through a wireless network. Information about multiple pieces of power equipment can be registered on the background server, and the power equipment can be associated in advance with text annotations, three-dimensional models, and the like; the text annotations and three-dimensional models can be classified and stored in advance on the background server, the rendering parameters of the three-dimensional models can be determined in advance, and the three-dimensional models can be lightweighted in advance.
The mobile intelligent terminal can download the text annotations and the three-dimensional models of multiple pieces of power equipment from the background server. After the download is completed, the three-dimensional models are rendered on the mobile intelligent terminal, the virtual scene of the three-dimensional models is fused with the actual scene of the power equipment, and then the three-dimensional models with the superimposed text annotations are displayed and the corresponding power equipment is continuously tracked.
The embodiments of the present application further provide an interaction apparatus based on multi-feature recognition, applied to a field device. As shown in FIG. 7, the apparatus includes:
a target video stream data acquisition module 41, configured to acquire target video stream data of a target device;

a calling module 42, configured to call a three-dimensional model of the target device according to the target video stream data;

a data sending module 43, configured to send data of the three-dimensional model to a remote device; and

a change increment information receiving module 44, configured to receive change increment information of the three-dimensional model fed back by the remote device.
By implementing the embodiments of the present application, in combination with the three-dimensional model generated from the target video stream data and the received change increment information fed back by the remote device, the field device can obtain accurate guidance information, realizing a remote virtual-real fusion interaction between the augmented reality method and the three-dimensional model of the power equipment.
In some optional embodiments of the present application, the apparatus further includes:

a display module, configured to display changes of the three-dimensional model according to the change increment information received by the change increment information receiving module 44; and

a control module, configured to control the target device according to the changes of the three-dimensional model.
In some optional embodiments of the present application, the target video stream data acquisition module 41 is configured to collect and send initial video stream data of the target device in the target area; receive an initial key frame sent by the remote device; determine a target key frame according to the initial key frame; and acquire the target video stream data of the target device according to the target key frame.
In some optional embodiments of the present application, the target video stream data acquisition module 41 is configured to extract color feature information, texture feature information, and motion feature information from the initial video key frames; fuse the color feature information, texture feature information, and motion feature information, and calculate the similarity of each initial video key frame respectively; determine candidate video key frames according to the similarity of each initial video key frame; and determine the target key frame according to a preset adaptive algorithm.
In some optional embodiments of the present application, the target video stream data acquisition module 41 is configured to acquire first video stream data; identify first feature points in the first video stream data and second feature points in the target key frame according to a preset optical flow method; when the similarity between the first feature points and the second feature points is greater than a preset similarity threshold, determine that the first video stream data matches the target key frame; and when the first video stream data matches the target key frame, determine the first video stream as the target video stream data of the target device.
In some optional embodiments of the present application, the apparatus further includes a tracking acquisition module, configured to determine a first center position of the target key frame according to the first feature points and a preset relative distance; determine a second center position of the first video stream data according to the second feature points and the preset relative distance; and track and acquire the target video stream data of the target device according to the first center position and the second center position.
The embodiment of the present application further provides an interaction apparatus based on multi-feature recognition, applied to a remote device. As shown in FIG. 8, the apparatus includes:
a data receiving module 51, configured to receive data of a three-dimensional model sent by a field device;

a three-dimensional model generation module 52, configured to generate the three-dimensional model according to the data of the three-dimensional model;

a determination module 53, configured to determine change increment information according to the three-dimensional model and a preset database; and

a data sending module 54, configured to feed the change increment information back to the field device.
By implementing the embodiments of the present application, in combination with the generated three-dimensional model, the remote device can annotate the target device from a first-person perspective and thereby precisely guide the field operators in their work, efficiently and accurately, realizing a remote virtual-real fusion interaction between the augmented reality method and the three-dimensional model of the power equipment.
In some optional embodiments of the present application, the apparatus further includes an initial key frame generation module;

the data receiving module 51 is further configured to receive initial video stream data of the target area sent by the field device;

the initial key frame generation module is configured to determine a problem region according to the initial video stream data and generate an initial key frame according to the problem region; and

the data sending module 54 is further configured to send the initial key frame to the field device.
It should be noted that, when the interaction apparatus based on multi-feature recognition provided by the above embodiment performs interaction based on multi-feature recognition, the division into the above program modules is used only as an example; in practical applications, the above processing may be allocated to different program modules as needed, that is, the internal structure of the apparatus may be divided into different program modules to complete all or part of the processing described above. In addition, the interaction apparatus based on multi-feature recognition provided by the above embodiment and the embodiments of the interaction method based on multi-feature recognition belong to the same concept; for the specific implementation process, refer to the method embodiments, which will not be repeated here.
The embodiment of the present application further provides a computer device. As shown in FIG. 9, the computer device may include a processor 61 and a memory 62, where the processor 61 and the memory 62 may be connected through a bus 60 or in other ways; in FIG. 9, connection through the bus 60 is taken as an example.
The processor 61 may be a central processing unit (CPU). The processor 61 may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or another such chip, or a combination of the above types of chips.
As a non-transitory computer-readable storage medium, the memory 62 may be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions/modules corresponding to the interaction method based on multi-feature recognition in the embodiments of the present application. The processor 61 executes the various functional applications and data processing of the processor by running the non-transitory software programs, instructions, and modules stored in the memory 62, that is, implements the interaction method based on multi-feature recognition in the above method embodiments.
The memory 62 may include a program storage area and a data storage area, where the program storage area may store an operating system and the application programs required by at least one function, and the data storage area may store data created by the processor 61, and the like. In addition, the memory 62 may include a high-speed random access memory, and may further include a non-transitory memory, such as at least one magnetic disk storage device, a flash memory device, or another non-transitory solid-state storage device. In some embodiments, the memory 62 may optionally include memories remotely located relative to the processor 61, and these remote memories may be connected to the processor 61 through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof. The one or more modules are stored in the memory 62 and, when executed by the processor 61, perform the interaction method based on multi-feature recognition in the embodiments of the present application.
The specific details of the above computer device can be understood with reference to the corresponding descriptions and effects in the above embodiments of the present application, and will not be repeated here.
The embodiments of the present application further provide a non-transitory computer-readable storage medium storing computer instructions, the computer instructions being used to cause a computer to execute the interaction method based on multi-feature recognition described in any one of the above embodiments, where the storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), a flash memory, a hard disk drive (HDD), a solid-state drive (SSD), or the like; the storage medium may also include a combination of the above types of memories.
The features disclosed in the several method or apparatus embodiments provided in the present application may be combined arbitrarily, provided there is no conflict, to obtain new method embodiments or apparatus embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative; for example, the division of the units is only a logical functional division, and there may be other divisions in actual implementation, for example: multiple units or components may be combined, or may be integrated into another system, or some features may be ignored or not executed. In addition, the coupling, direct coupling, or communication connections between the components shown or discussed may be indirect coupling or communication connections through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described above as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place or distributed across multiple network units; some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may all be integrated into one processing unit, or each unit may serve as a separate unit, or two or more units may be integrated into one unit; the above integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
The technical solutions of the embodiments of the present application, in essence, or the parts contributing to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a removable storage device, a ROM, a RAM, a magnetic disk, or an optical disc.
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。The above are only specific embodiments of the present application, but the protection scope of the present application is not limited to this. should be covered within the scope of protection of this application. Therefore, the protection scope of the present application should be subject to the protection scope of the claims.

Claims (12)

  1. An interaction method based on multi-feature recognition, the method comprising:
    acquiring target video stream data of a target device;
    invoking a three-dimensional model of the target device according to the target video stream data;
    sending data of the three-dimensional model to a remote device; and
    receiving incremental change information of the three-dimensional model fed back by the remote device.
  2. The method according to claim 1, wherein the method further comprises:
    displaying a change of the three-dimensional model according to the incremental change information; and
    controlling the target device according to the change of the three-dimensional model.
  3. The method according to claim 1, wherein acquiring the target video stream data of the target device comprises:
    collecting and sending initial video stream data of the target device in a target area;
    receiving an initial key frame sent by the remote device;
    determining a target key frame according to the initial key frame; and
    acquiring the target video stream data of the target device according to the target key frame.
  4. The method according to claim 3, wherein determining the target key frame according to the initial key frame comprises:
    extracting color feature information, texture feature information, and motion feature information from initial video key frames;
    fusing the color feature information, the texture feature information, and the motion feature information, and calculating a similarity of each initial video key frame;
    determining candidate video key frames according to the similarity of each initial video key frame; and
    determining the target key frame according to a preset adaptive algorithm.
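Claim 4 names color/texture/motion feature fusion, a per-frame similarity, and a "preset adaptive algorithm" without fixing any of the formulas. The sketch below is one hypothetical reading, not the patented method: the fusion weights, cosine-similarity metric, and mean-based "adaptive" threshold are all illustrative assumptions.

```python
# Hypothetical sketch of claim 4's key-frame selection. Each frame is a
# tuple of three feature vectors (color, texture, motion); the weights,
# the cosine metric, and the mean-similarity threshold are assumptions.

def fuse_features(color, texture, motion, weights=(0.4, 0.3, 0.3)):
    """Concatenate the three feature vectors, scaling each by its weight."""
    w_c, w_t, w_m = weights
    return ([w_c * c for c in color] +
            [w_t * t for t in texture] +
            [w_m * m for m in motion])

def similarity(f1, f2):
    """Cosine similarity between two fused feature vectors."""
    dot = sum(a * b for a, b in zip(f1, f2))
    n1 = sum(a * a for a in f1) ** 0.5
    n2 = sum(b * b for b in f2) ** 0.5
    return dot / (n1 * n2) if n1 and n2 else 0.0

def select_key_frames(frames, initial):
    """Keep frames whose fused similarity to the initial key frame reaches
    an adaptive threshold -- here, simply the mean similarity."""
    sims = [similarity(fuse_features(*f), fuse_features(*initial))
            for f in frames]
    threshold = sum(sims) / len(sims)  # stand-in "preset adaptive algorithm"
    return [i for i, s in enumerate(sims) if s >= threshold]
```

Frames closely resembling the initial key frame score near 1.0 and survive the adaptive cut; dissimilar frames fall below the mean and are dropped.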
  5. The method according to claim 4, wherein acquiring the target video stream data of the target device according to the target key frame comprises:
    acquiring first video stream data;
    identifying first feature points in the first video stream data and second feature points in the target key frame according to a preset optical flow method;
    when a similarity between the first feature points and the second feature points is greater than a preset similarity threshold, determining that the first video stream data matches the target key frame; and
    when the first video stream data matches the target key frame, determining the first video stream as the target video stream data of the target device.
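Claim 5's matching step compares feature points from the live stream against those of the target key frame and accepts the stream once a preset similarity threshold is exceeded. The claim leaves the optical flow method (e.g., pyramidal Lucas-Kanade) and the similarity formula open; the sketch below takes the feature points as given and shows only the threshold logic, with the distance-based similarity and the 0.8 threshold as assumptions.

```python
# Illustrative sketch of claim 5's threshold decision. The feature points
# would come from a preset optical flow method (not implemented here);
# the distance-to-similarity mapping and 0.8 threshold are assumptions.

def point_similarity(p1, p2, scale=100.0):
    """Map the Euclidean distance between two points into (0, 1]."""
    d = ((p1[0] - p2[0]) ** 2 + (p1[1] - p2[1]) ** 2) ** 0.5
    return 1.0 / (1.0 + d / scale)

def matches_key_frame(stream_pts, key_pts, threshold=0.8):
    """The stream matches when the mean pairwise point similarity
    exceeds the preset similarity threshold."""
    sims = [point_similarity(a, b) for a, b in zip(stream_pts, key_pts)]
    return sum(sims) / len(sims) > threshold
```

A stream whose tracked points coincide with the key frame's points scores 1.0 and matches; points that have drifted far apart push the mean similarity under the threshold and the stream is rejected.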
  6. The method according to claim 5, wherein the method further comprises:
    determining a first center position of the target key frame according to the first feature points and a preset relative distance;
    determining a second center position of the first video stream data according to the second feature points and the preset relative distance; and
    tracking and acquiring the target video stream data of the target device according to the first center position and the second center position.
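One way to read claim 6 is that each feature point, offset by its preset relative distance, votes for a frame center, and the tracker follows the displacement between the key frame's center and the live stream's center. The averaging rule and the (dx, dy) offset format below are assumptions for illustration, not the claimed computation.

```python
# Hypothetical sketch of claim 6's center-position tracking. Offsets
# stand in for the "preset relative distance"; the averaging rule is
# an assumption.

def center_position(points, relative_offsets):
    """Average of (feature point + preset relative offset) over all points."""
    xs = [p[0] + o[0] for p, o in zip(points, relative_offsets)]
    ys = [p[1] + o[1] for p, o in zip(points, relative_offsets)]
    n = len(points)
    return (sum(xs) / n, sum(ys) / n)

def tracking_displacement(first_center, second_center):
    """Offset between the two centers, used to follow the target
    from the key frame into the live stream."""
    return (second_center[0] - first_center[0],
            second_center[1] - first_center[1])
```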
  7. An interaction method based on multi-feature recognition, the method comprising:
    receiving data of a three-dimensional model sent by a field device;
    generating the three-dimensional model according to the data of the three-dimensional model;
    determining incremental change information according to the three-dimensional model and a preset database; and
    feeding back the incremental change information to the field device.
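Claim 7 does not define the structure of the "incremental change information"; one plausible shape is a diff between the received model's attributes and the reference record in the preset database, so that only changed fields travel back to the field device. The flat-dictionary model representation below is an illustrative assumption.

```python
# One possible shape for claim 7's incremental change information: a
# diff between received 3D-model attributes and the preset database's
# reference record. The flat-dict representation is an assumption.

def change_increment(received_model, reference_model):
    """Return only the attributes whose values differ from the reference."""
    delta = {}
    for key, value in received_model.items():
        if reference_model.get(key) != value:
            delta[key] = value
    return delta
```

Sending only the delta keeps the feedback payload small, which matters when the remote expert and field device exchange model updates over a constrained link.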
  8. The method according to claim 7, wherein before receiving the data of the three-dimensional model sent by the field device, the method further comprises:
    receiving initial video stream data of a target area sent by the field device;
    determining a problem area according to the initial video stream data, and generating an initial key frame according to the problem area; and
    sending the initial key frame to the field device.
  9. An interaction apparatus based on multi-feature recognition, comprising:
    a target video stream data acquisition module, configured to acquire target video stream data of a target device;
    an invoking module, configured to invoke a three-dimensional model of the target device according to the target video stream data;
    a data sending module, configured to send data of the three-dimensional model to a remote device; and
    an incremental change information receiving module, configured to receive incremental change information of the three-dimensional model fed back by the remote device.
  10. An interaction apparatus based on multi-feature recognition, comprising:
    a data receiving module, configured to receive data of a three-dimensional model sent by a field device;
    a three-dimensional model generation module, configured to generate the three-dimensional model according to the data of the three-dimensional model;
    a determination module, configured to determine incremental change information according to the three-dimensional model and a preset database; and
    a data sending module, configured to feed back the incremental change information to the field device.
  11. A computer device, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to cause the at least one processor to perform the steps of the interaction method based on multi-feature recognition according to any one of claims 1-6; or the instructions are executed by the at least one processor to cause the at least one processor to perform the steps of the interaction method based on multi-feature recognition according to claim 7 or 8.
  12. A computer-readable storage medium having a computer program stored thereon, wherein when the computer program is executed by a processor, the steps of the interaction method based on multi-feature recognition according to any one of claims 1-6 are implemented; or when the computer program is executed by a processor, the steps of the interaction method based on multi-feature recognition according to claim 7 or 8 are implemented.
PCT/CN2021/106342 2020-12-04 2021-07-14 Interaction method and apparatus based on multi-feature recognition, and computer device WO2022116545A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011416912.9 2020-12-04
CN202011416912.9A CN112509148A (en) 2020-12-04 2020-12-04 Interaction method and device based on multi-feature recognition and computer equipment

Publications (1)

Publication Number Publication Date
WO2022116545A1 true WO2022116545A1 (en) 2022-06-09

Family

ID=74970301

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/106342 WO2022116545A1 (en) 2020-12-04 2021-07-14 Interaction method and apparatus based on multi-feature recognition, and computer device

Country Status (2)

Country Link
CN (1) CN112509148A (en)
WO (1) WO2022116545A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112509148A (en) * 2020-12-04 2021-03-16 全球能源互联网研究院有限公司 Interaction method and device based on multi-feature recognition and computer equipment
CN113595867A (en) * 2021-06-22 2021-11-02 青岛海尔科技有限公司 Equipment operation method and device based on remote interaction

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105049875A (en) * 2015-07-24 2015-11-11 上海上大海润信息系统有限公司 Accurate key frame extraction method based on mixed features and sudden change detection
CN105739704A (en) * 2016-02-02 2016-07-06 上海尚镜信息科技有限公司 Remote guidance method and system based on augmented reality
CN105759960A (en) * 2016-02-02 2016-07-13 上海尚镜信息科技有限公司 Augmented reality remote guidance method and system in combination with 3D camera
CN106339094A (en) * 2016-09-05 2017-01-18 山东万腾电子科技有限公司 Interactive remote expert cooperation maintenance system and method based on augmented reality technology
CN110047150A (en) * 2019-04-24 2019-07-23 大唐环境产业集团股份有限公司 It is a kind of based on augmented reality complex device operation operate in bit emulator system
CN110505464A (en) * 2019-08-21 2019-11-26 佳都新太科技股份有限公司 A kind of number twinned system, method and computer equipment
US20200084251A1 (en) * 2018-09-10 2020-03-12 Aveva Software, Llc Visualization and interaction of 3d models via remotely rendered video stream system and method
CN110929560A (en) * 2019-10-11 2020-03-27 杭州电子科技大学 Video semi-automatic target labeling method integrating target detection and tracking
CN111754543A (en) * 2019-03-29 2020-10-09 杭州海康威视数字技术股份有限公司 Image processing method, device and system
CN112509148A (en) * 2020-12-04 2021-03-16 全球能源互联网研究院有限公司 Interaction method and device based on multi-feature recognition and computer equipment

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109947991A (en) * 2017-10-31 2019-06-28 腾讯科技(深圳)有限公司 A kind of extraction method of key frame, device and storage medium
CN107844779B (en) * 2017-11-21 2021-03-23 重庆邮电大学 Video key frame extraction method
CN109118515B (en) * 2018-06-26 2022-04-01 全球能源互联网研究院有限公司 Video tracking method and device for power equipment

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115424353A (en) * 2022-09-07 2022-12-02 杭银消费金融股份有限公司 AI model-based service user feature identification method and system
CN115424353B (en) * 2022-09-07 2023-05-05 杭银消费金融股份有限公司 Service user characteristic identification method and system based on AI model

Also Published As

Publication number Publication date
CN112509148A (en) 2021-03-16

Similar Documents

Publication Publication Date Title
WO2022116545A1 (en) Interaction method and apparatus based on multi-feature recognition, and computer device
Muhammad et al. Cost-effective video summarization using deep CNN with hierarchical weighted fusion for IoT surveillance networks
US12094209B2 (en) Video data processing method and apparatus, device, and medium
Chen et al. What comprises a good talking-head video generation?: A survey and benchmark
US20200012888A1 (en) Image annotating method and electronic device
WO2020228766A1 (en) Target tracking method and system based on real scene modeling and intelligent recognition, and medium
CN107004271B (en) Display method, display apparatus, electronic device, computer program product, and storage medium
US9047376B2 (en) Augmenting video with facial recognition
US8863183B2 (en) Server system for real-time moving image collection, recognition, classification, processing, and delivery
CN113010703B (en) Information recommendation method and device, electronic equipment and storage medium
CN112861575A (en) Pedestrian structuring method, device, equipment and storage medium
WO2016028813A1 (en) Dynamically targeted ad augmentation in video
US20130243307A1 (en) Object identification in images or image sequences
US9606975B2 (en) Apparatus and method for automatically generating visual annotation based on visual language
WO2021225608A1 (en) Fully automated post-production editing for movies, tv shows and multimedia contents
CN104041063B (en) The related information storehouse of video makes and method, platform and the system of video playback
CN107992937B (en) Unstructured data judgment method and device based on deep learning
CN112417947B (en) Method and device for optimizing key point detection model and detecting face key points
US11683453B2 (en) Overlaying metadata on video streams on demand for intelligent video analysis
Liu et al. 3d action recognition using data visualization and convolutional neural networks
Shuai et al. Large scale real-world multi-person tracking
CN113391617A (en) Vehicle remote diagnosis method, storage, device and system based on 5G network
CN117115917A (en) Teacher behavior recognition method, device and medium based on multi-modal feature fusion
CN106778449B (en) Object identification method of dynamic image and interactive film establishment method for automatically capturing target image
KR20220108668A (en) Method for Analyzing Video

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21899586

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21899586

Country of ref document: EP

Kind code of ref document: A1