CN112732956A - Efficient query method based on perception multi-modal big data

Efficient query method based on perception multi-modal big data

Info

Publication number
CN112732956A
CN112732956A
Authority
CN
China
Prior art keywords
data
image
point cloud
query
modal
Prior art date
Legal status
Pending
Application number
CN202011547371.3A
Other languages
Chinese (zh)
Inventor
Li Haitao (李海涛)
Current Assignee
Jiangsu Zhishui Intelligent Technology Co ltd
Original Assignee
Jiangsu Zhishui Intelligent Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Jiangsu Zhishui Intelligent Technology Co ltd
Priority to CN202011547371.3A
Publication of CN112732956A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/53 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Library & Information Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an efficient query method based on perception multi-modal big data, which comprises the following steps. Step one: collecting an image modality data set, filtering the images, and generating point cloud data. Step two: obtaining the feature value of each point in the point cloud data according to the image depth values under the point cloud data in step one. Step three: asynchronously collecting a text modality data set and extracting text features. Step four: establishing a training table of image features and text features using a CAA algorithm. Step five: randomly splitting the data in the training table into a training set database and a test set database, and performing transfer learning on the data in the two databases. Step six: establishing a query model from the transfer learning result and uploading it to the cloud; subsequent query images or text are input into the query model, which judges their features and queries the Internet according to the judged features. The method can assess subsequent data based on previously detected data.

Description

Efficient query method based on perception multi-modal big data
Technical Field
The invention relates to the field of data query, and in particular to an efficient query method based on perception multi-modal big data.
Background
By "modality", English is modal, in colloquial terms is "sense", and multimodal is meant the fusion of multiple senses. The Turing OS robot operating system defines the interaction mode between the robot and the human as multi-mode interaction, namely man-machine interaction is carried out in various modes such as characters, voice, vision, actions and environment, and the interaction mode between the human and the human is fully simulated. The interactive mode accords with the morphological characteristics and the user expectation of robot products, and breaks through the traditional PC type keyboard input and the point-touch interactive mode of the smart phone.
The field of water source management generally relies on automatic control, including video monitoring, water level (depth drop) detection, water quality detection, and the like; whether from video monitoring or depth detection, these processes generate image sets or text sets.
Existing processing of these image sets or text sets relies on human observation or fixed maximum-value alarms, and cannot judge from the existing data whether detection data from other comparable areas is reasonable.
Disclosure of Invention
In order to overcome the defects in the prior art, the efficient query method based on perception multi-modal big data provided by the invention can infer subsequent data from previously detected data.
In order to achieve the above object, the efficient query method based on perception multi-modal big data of the present invention includes the following steps. Step one: collecting an image modality data set, filtering the images, and generating point cloud data;
step two: obtaining the feature value of each point in the point cloud data according to the image depth values under the point cloud data in step one;
step three: asynchronously collecting a text modality data set, and extracting text features using a CountVectorizer;
step four: establishing a training table of image features and text features using a CAA algorithm;
step five: randomly splitting the data in the training table into a training set database and a test set database, and performing transfer learning on the data in the two databases;
step six: establishing a query model from the transfer learning result and uploading it to the cloud; subsequent query images or text are input into the query model, which judges their features and queries the Internet according to the judged features.
Further, in step one, a variance method is used for the filtering, with the following formula:

g = ω₀(u₀ - u)² / (1 - ω₀),

where ω₀ is the proportion of background points in the image, u₀ is the average gray value of those background points, u is the total average gray value of the image, and g is the variance of the image.
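As an illustration, here is a minimal numpy sketch of such a variance-based (Otsu-style) filter, assuming the reconstructed formula above; the function name, the 8-bit threshold sweep, and the choice to zero out background pixels are illustrative, not taken from the patent.

```python
import numpy as np

def variance_filter(gray: np.ndarray) -> np.ndarray:
    """Otsu-style filter: pick the threshold maximising the between-class
    variance g = w0 * (u0 - u)^2 / (1 - w0), then suppress background points."""
    u = gray.mean()                        # total average gray of the image
    best_t, best_g = 0, -1.0
    for t in range(1, 256):
        background = gray[gray < t]
        if background.size in (0, gray.size):
            continue                       # need both classes present
        w0 = background.size / gray.size   # proportion of background points
        u0 = background.mean()             # average gray of background points
        g = w0 * (u0 - u) ** 2 / (1.0 - w0)
        if g > best_g:
            best_t, best_g = t, g
    out = gray.copy()
    out[out < best_t] = 0                  # filter out background noise
    return out
```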
Further, in step two, a threshold method is used to obtain the image depth value, with the following formula:

p(x, y) = p(x, y), for dis ≤ p(x, y) ≤ 2h * dis,

where p(x, y) is the image depth, dis is the image height, and h is the height of the filtered image.
Further, in step two, the image depths are collected to form the point cloud data, and the point cloud data is processed with PCL (the Point Cloud Library) to obtain the feature values of the image depth.
Further, in step three, the feature selection frequency formula of the CountVectorizer is as follows:

IDF = log(m / g),

where IDF is the feature selection probability, m is the number of point cloud data items, and g is the number of feature values.
Further, the ratio of the training set database to the test set database is 1:2.
Advantageous effects: the feature values of the image modality data set and the text modality data set are extracted, a model is built over these feature values and trained using transfer learning, and subsequently measured image sets or text sets are input into the transfer learning model; whether a subsequent image set or text set conforms to the pattern learned by the model determines whether the data of the image set and text set is qualified.
Drawings
The present invention will be further described and illustrated with reference to the following drawings.
FIG. 1 is a flow chart of a preferred embodiment of the present invention.
Detailed Description
The technical solution of the present invention will be more clearly and completely explained by the description of the preferred embodiments of the present invention with reference to the accompanying drawings.
As shown in fig. 1, the efficient query method based on perception multi-modal big data according to the preferred embodiment of the present invention includes the following steps. Step one: an image modality data set is collected, the images are filtered, and point cloud data is generated.
In the water management field, the acquired data generally covers two modalities: an image modality and a text modality. The image modality includes videos, photos, and the like captured by the monitoring system. The text modality includes measured depth values, water quality values, and the like.
Point cloud data refers to a collection of vectors in a three-dimensional coordinate system. Scan data is recorded in the form of points, each containing three-dimensional coordinates; some points may also contain color information (RGB) or intensity information (Intensity).
Step two: and C, obtaining characteristic values of all points under the point cloud data according to the image depth values under the point cloud data in the step one.
For each data point in the point cloud, a least-squares local plane p is fitted through its K nearest neighbouring points, such that the sum of the distances from all these neighbouring points to the plane is minimal.
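A standard way to realise this least-squares plane fit is principal component analysis of each point's neighbourhood: the plane normal is the eigenvector of the neighbourhood covariance with the smallest eigenvalue, which is also how PCL's NormalEstimation works. A sketch with scipy, where k = 10 neighbours is an illustrative choice:

```python
import numpy as np
from scipy.spatial import cKDTree

def local_plane_normals(points: np.ndarray, k: int = 10) -> np.ndarray:
    """For each 3D point, fit a least-squares plane p through its K nearest
    neighbours and return the plane normal as a per-point feature."""
    tree = cKDTree(points)
    normals = np.empty_like(points)
    for i, point in enumerate(points):
        _, idx = tree.query(point, k=k)          # K local proximity points
        nbrs = points[idx]
        cov = np.cov(nbrs - nbrs.mean(axis=0), rowvar=False)
        eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
        normals[i] = eigvecs[:, 0]               # smallest eigenvalue => normal
    return normals
```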
Step three: the text modality data set is asynchronously collected and the text is feature extracted using a CountVectori zer.
The CountVectorizer() function considers only the frequency of occurrence of each word: across all training texts, word order is ignored, and each word that appears is treated as one column of features, forming a vocabulary table. The resulting feature matrix has one row per training text, containing that text's word frequency statistics.
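For example, with scikit-learn, which provides the CountVectorizer named in the patent, two short monitoring texts (invented here for illustration) yield a vocabulary table and a word-frequency matrix:

```python
from sklearn.feature_extraction.text import CountVectorizer

texts = [
    "water level normal quality good",
    "water level high turbidity high",
]
vec = CountVectorizer()             # counts word frequencies, ignores order
X = vec.fit_transform(texts)
print(vec.get_feature_names_out())  # the vocabulary table (one column per word)
print(X.toarray())                  # one row of word counts per training text
```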
Step four: a training table of image features and text features is established using a CAA algorithm.
Computer-aided numerical analysis (CAA) and computer-aided design (CAD) are analysis and design methods that use the computer as the main tool. They form an emerging discipline built on computing technology, applied mathematics, and simulation theory, and have become an important branch of the computer application field. The rapid development of science and technology makes the mathematical models built in scientific research, new product development, and engineering design increasingly complex.
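The patent does not spell out the CAA procedure itself, so the following is only a hypothetical sketch of the end product of step four: a table aligning one image feature vector with one text feature vector per observation. The shapes, column names, and random feature values are invented for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
image_feats = rng.random((5, 4))  # e.g. per-scene point cloud feature values
text_feats = rng.random((5, 3))   # e.g. CountVectorizer word frequencies

# One row per observation: image features alongside text features.
training_table = pd.DataFrame(
    np.hstack([image_feats, text_feats]),
    columns=[f"img_{i}" for i in range(4)] + [f"txt_{j}" for j in range(3)],
)
print(training_table)
```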
Step five: the data in the training table is randomly split into a training set database and a test set database, and transfer learning is performed on the data in the two databases.
Instance-based transfer learning selects instances from the source domain that are useful for training in the target domain: for example, the labeled data instances in the source domain are given effective weights so that the distribution of source domain instances approaches that of the target domain instances, thereby establishing a reliable learning model with high classification accuracy in the target domain.
Because the data distributions of the source domain and the target domain are inconsistent in transfer learning, not all labeled data instances in the source domain are necessarily useful for the target domain.
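The patent names no specific transfer learning algorithm, so the sketch below only illustrates the idea in a simplified form: split the table 1:2 into the training and test set databases (step five), then up-weight the training instances that lie close to the test (target) distribution before fitting a classifier. The labels, the exponential weighting scheme, and the logistic regression model are all assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 7))            # combined image + text feature rows
y = (X[:, 0] + X[:, 4] > 0).astype(int)  # stand-in "data qualified" label

# Random 1:2 split into training set database and test set database.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=1 / 3, random_state=0)

# Instance-based transfer: weight each training row by its closeness to
# the target (test) distribution so that useful source instances dominate.
centre = X_test.mean(axis=0)
dist = np.linalg.norm(X_train - centre, axis=1)
weights = np.exp(-dist / dist.mean())    # nearer to target => larger weight

model = LogisticRegression().fit(X_train, y_train, sample_weight=weights)
print("test accuracy:", model.score(X_test, y_test))
```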
Step six: a query model is established from the transfer learning result and uploaded to the cloud; subsequent query images or text are input into the query model, which judges their features and queries the Internet according to the judged features.
Once the transfer learning result is uploaded, different data can conveniently be distinguished from multiple places and at multiple times.
The above detailed description merely describes preferred embodiments of the present invention and does not limit the scope of the invention. It should be understood that those skilled in the art can make various changes, substitutions, and alterations without departing from the spirit and scope of the invention as defined by the appended claims and their equivalents. The scope of the invention is defined by the claims.

Claims (6)

1. An efficient query method based on perception multi-modal big data, comprising the following steps:
the method comprises the following steps: collecting an image modal data set, filtering an image and generating point cloud data;
step two: according to the image depth values under the point cloud data in the step one, obtaining characteristic values of all points under the point cloud data;
step three: asynchronously acquiring a text modal data set, and performing feature extraction on a text by using a CountVectorizer;
step four: establishing a training table of image characteristics-text characteristics by using a CAA algorithm;
step five: randomly classifying data in the training table to generate a training set database and a test set database, and performing transfer learning on the data in the training set database and the test set database;
step six: and establishing a query model according to the transfer learning and uploading the query model to the cloud, inputting the subsequent query graph or character into the query model, judging the characteristics of the subsequent graph or character by the query model, and querying on the Internet according to the judged characteristics.
2. The efficient query method based on perceptual multi-modal big data as claimed in claim 1, wherein the filtering in the first step is performed by a variance method, and the formula is as follows:
g = ω₀(u₀ - u)² / (1 - ω₀),
where ω₀ is the proportion of background points in the image, u₀ is the average gray value of those background points, u is the total average gray value of the image, and g is the variance of the image.
3. The efficient query method based on perceptual multi-modal big data as claimed in claim 2, wherein the image depth value is obtained by a thresholding method in the second step, and the formula is as follows:
p(x, y) = p(x, y), for dis ≤ p(x, y) ≤ 2h * dis,
where p(x, y) is the image depth, dis is the image height, and h is the height of the filtered image.
4. The efficient query method based on the perceptual multi-modal big data as claimed in claim 3, wherein in the second step, image depth is collected to form point cloud data, and PCL is adopted to process the point cloud data to obtain a feature value of the image depth.
5. The efficient query method based on perceptual multi-modal big data as claimed in claim 3, wherein in step three, the feature selection frequency formula of the CountVectorizer is as follows:
IDF = log(m / g),
where IDF is the feature selection probability, m is the number of point cloud data items, and g is the number of feature values.
6. The efficient query method based on perceptual multi-modal big data as claimed in claim 3, wherein the ratio of the training set database to the test set database is 1:2.
CN202011547371.3A, filed 2020-12-24 (priority date 2020-12-24): Efficient query method based on perception multi-modal big data. Status: pending. Publication: CN112732956A.

Priority Applications (1)

Application Number: CN202011547371.3A (published as CN112732956A). Priority date: 2020-12-24. Filing date: 2020-12-24. Title: Efficient query method based on perception multi-modal big data.

Applications Claiming Priority (1)

Application Number: CN202011547371.3A (published as CN112732956A). Priority date: 2020-12-24. Filing date: 2020-12-24. Title: Efficient query method based on perception multi-modal big data.

Publications (1)

Publication Number: CN112732956A. Publication date: 2021-04-30.

Family

ID=75605556

Family Applications (1)

Application Number: CN202011547371.3A (pending, published as CN112732956A). Priority date: 2020-12-24. Filing date: 2020-12-24.

Country Status (1)

CN: CN112732956A

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101719140A * 2009-12-23 2010-06-02 Sun Yat-sen University Figure retrieving method
US20120189211A1 * 2011-01-21 2012-07-26 Jiebo Luo Rapid image search in a large database
CN107679580A * 2017-10-21 2018-02-09 Guilin University of Electronic Technology A kind of isomery shift image feeling polarities analysis method based on the potential association of multi-modal depth
US20180181592A1 * 2016-12-27 2018-06-28 Adobe Systems Incorporated Multi-modal image ranking using neural networks
CN108334574A * 2018-01-23 2018-07-27 Nanjing University of Posts and Telecommunications A kind of cross-module state search method decomposed based on Harmonious Matrix
CN108595636A * 2018-04-25 2018-09-28 Fudan University The image search method of cartographical sketching based on depth cross-module state correlation study
CN108701220A * 2016-02-05 2018-10-23 Sony Corporation System and method for handling multi-modality images
CN109033245A * 2018-07-05 2018-12-18 Tsinghua University A kind of mobile robot visual-radar image cross-module state search method
CN110569387A * 2019-08-20 2019-12-13 Tsinghua University Radar-image cross-modal retrieval method based on depth hash algorithm
CN110647904A * 2019-08-01 2020-01-03 Institute of Information Engineering, Chinese Academy of Sciences Cross-modal retrieval method and system based on unmarked data migration
US10614366B1 * 2006-01-31 2020-04-07 The Research Foundation for the State University of New York System and method for multimedia ranking and multi-modal image retrieval using probabilistic semantic models and expectation-maximization (EM) learning
CN111190981A * 2019-12-25 2020-05-22 Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences Method and device for constructing three-dimensional semantic map, electronic equipment and storage medium
US20200388071A1 * 2019-06-06 2020-12-10 Qualcomm Technologies, Inc. Model retrieval for objects in images using field descriptors

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
KAIMIN WEI; ZHIBO ZHOU, "Adversarial Attentive Multi-Modal Embedding Learning for Image-Text Matching", IEEE *
LI XIAOYU et al., "Image Retrieval Algorithm Based on Transfer Learning" (基于迁移学习的图像检索算法), Computer Science (《计算机科学》), vol. 46, no. 1

Similar Documents

Publication Publication Date Title
JPH06243297A (en) Method and equipment for automatic handwritten character recognition using static and dynamic parameter
Rabbani et al. Hand drawn optical circuit recognition
Qiu et al. 3d-aware scene change captioning from multiview images
Dering et al. An unsupervised machine learning approach to assessing designer performance during physical prototyping
CN114863125A (en) Intelligent scoring method and system for calligraphy/fine art works
Zhang Application of artificial intelligence recognition technology in digital image processing
Pradhan et al. A hand gesture recognition using feature extraction
CN108428234B (en) Interactive segmentation performance optimization method based on image segmentation result evaluation
Lahiani et al. Real Time Static Hand Gesture Recognition System for Mobile Devices.
Gang et al. Coresets for PCB character recognition based on deep learning
CN109886164B (en) Abnormal gesture recognition and processing method
CN112732956A (en) Efficient query method based on perception multi-modal big data
CN116486177A (en) Underwater target identification and classification method based on deep learning
CN113420839B (en) Semi-automatic labeling method and segmentation positioning system for stacking planar target objects
CN113158878B (en) Heterogeneous migration fault diagnosis method, system and model based on subspace
CN114595786A (en) Attention fine-grained classification method based on weak supervision position positioning
Li et al. Using scribble gestures to enhance editing behaviors of sketch recognition systems
CN114840680A (en) Entity relationship joint extraction method, device, storage medium and terminal
CN104680123A (en) Object identification device, object identification method and program
Munggaran et al. Handwritten pattern recognition using Kohonen neural network based on pixel character
Shanmugam et al. Newton algorithm based DELM for enhancing offline tamil handwritten character recognition
Roj et al. Classification of CAD-Models Based on Graph Structures and Machine Learning
CN111582400A (en) Deep learning-based garment image classification model establishing method
Cheng et al. Research on recognition method of interface elements based on machine learning
Chen et al. A new semantic-based tool detection method for robots

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination