CN112732956A - Efficient query method based on perception multi-mode big data - Google Patents
Efficient query method based on perception multi-mode big data
- Publication number
- CN112732956A CN112732956A CN202011547371.3A CN202011547371A CN112732956A CN 112732956 A CN112732956 A CN 112732956A CN 202011547371 A CN202011547371 A CN 202011547371A CN 112732956 A CN112732956 A CN 112732956A
- Authority
- CN
- China
- Prior art keywords
- data
- image
- point cloud
- query
- modal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F16/583—Information retrieval of still image data; retrieval characterised by using metadata automatically derived from the content
- G06F16/53—Information retrieval of still image data; querying
- G06F18/214—Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06T5/70—Image enhancement or restoration; denoising, smoothing
- G06T7/50—Image analysis; depth or shape recovery
Abstract
The invention discloses an efficient query method based on perception of multi-modal big data, comprising the following steps. Step one: collect an image-modality data set, filter the images, and generate point cloud data. Step two: obtain feature values for all points in the point cloud data according to the image depth values from step one. Step three: asynchronously collect a text-modality data set and extract features from the text. Step four: establish an image-feature/text-feature training table using the CAA algorithm. Step five: randomly classify the data in the training table into a training set database and a test set database, and perform transfer learning on the data in both. Step six: establish a query model from the transfer learning and upload it to the cloud; subsequent query images or text are input into the query model, which judges their features and queries the Internet according to the judged features. The method can extrapolate subsequent data from the detected data.
Description
Technical Field
The invention relates to the field of data query, in particular to an efficient query method based on perception multi-mode big data.
Background
A "modality" (English: modal) is, colloquially, a "sense"; multi-modality means the fusion of multiple senses. The Turing OS robot operating system defines human-robot interaction as multi-modal interaction: man-machine interaction carried out through text, voice, vision, action, environment, and other modes, fully simulating the way humans interact with each other. This interaction mode matches the morphological characteristics and user expectations of robot products, and breaks away from traditional PC keyboard input and the touch-based interaction of smartphones.
In the field of water-source management, automatic control is generally adopted, including video monitoring, water-level (depth drop) detection, water-quality detection, and the like. Whether from video monitoring or depth detection, the result is an image set or a text set.
Existing processing of these image sets or text sets relies on human observation or fixed maximum-value alarms; it cannot judge, from the existing data, whether detection data from other comparable areas is reasonable.
Disclosure of Invention
To overcome these defects in the prior art, the efficient query method based on perception of multi-modal big data provided by the invention can extrapolate subsequent data from detection data.
To achieve the above object, the efficient query method based on perception of multi-modal big data of the present invention includes the following steps. Step one: collect an image-modality data set, filter the images, and generate point cloud data. Step two: obtain feature values for all points in the point cloud data according to the image depth values from step one. Step three: asynchronously collect a text-modality data set, and extract features from the text using a CountVectorizer. Step four: establish an image-feature/text-feature training table using the CAA algorithm. Step five: randomly classify the data in the training table to generate a training set database and a test set database, and perform transfer learning on the data in both. Step six: establish a query model from the transfer learning and upload it to the cloud; subsequent query images or text are input into the query model, which judges their features and queries the Internet according to the judged features.
Further, in step one, a variance method is adopted for filtering, with the formula:

g = ω0(u0 − u)²,

wherein ω0 is the proportion of background points in the image, u0 is the average gray level of the background points, u is the overall average gray level of the image, and g is the variance of the image.
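The patent gives no implementation of the variance method; as an illustrative sketch (names are mine, not from the patent), the background-variance term with the symbols defined above can be computed as:

```python
def background_variance(gray, threshold):
    """Variance term g = w0 * (u0 - u)**2 for a candidate threshold:
    w0 is the proportion of background points (gray level below threshold),
    u0 their mean gray level, and u the global mean gray level."""
    background = [v for v in gray if v < threshold]
    if not background:
        return 0.0
    w0 = len(background) / len(gray)
    u0 = sum(background) / len(background)
    u = sum(gray) / len(gray)
    return w0 * (u0 - u) ** 2
```

Sweeping the threshold and keeping the one that maximizes this variance is the classic Otsu-style approach to separating background from foreground before generating the point cloud.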
Further, in step two, a threshold method is adopted to obtain the image depth value, with the formula:

p(x, y) = p(x, y), for dis ≤ p(x, y) ≤ dis·2h,

where p(x, y) is the image depth, dis is the image height, and h is the height of the filtered image.
Further, in step two, image depths are collected to form point cloud data, and PCL (the Point Cloud Library) is adopted to process the point cloud data to obtain the feature values of the image depth.
Further, in step three, the feature-selection frequency formula of the CountVectorizer is:

IDF = log(m / g),

where IDF is the feature-selection probability, m is the number of point cloud data, and g is the number of feature values.
Further, the ratio of the training set database to the test set database is 1:2.
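A random 1:2 split as claimed can be sketched as follows (the function name and seeding are my choices, for reproducibility only):

```python
import random

def split_one_to_two(samples, seed=0):
    """Randomly partition samples into a training set and a test set at
    the claimed 1:2 ratio: one third training, two thirds test."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = len(shuffled) // 3
    return shuffled[:cut], shuffled[cut:]
```

Shuffling before the cut is what makes the classification "random" in the sense of step five, rather than a deterministic head/tail split.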
Beneficial effects: feature values of the image-modality data set and the text-modality data set are retrieved, a model over these feature values is established and trained using transfer learning, subsequently measured image or text sets are input into the transfer-learning model, and whether they conform to the model is judged, thereby judging whether the data of the image set and text set is qualified.
Drawings
The present invention will be further described and illustrated with reference to the following drawings.
FIG. 1 is a flow chart of a preferred embodiment of the present invention.
Detailed Description
The technical solution of the present invention will be more clearly and completely explained by the description of the preferred embodiments of the present invention with reference to the accompanying drawings.
As shown in fig. 1, the efficient query method based on perceptual multi-modal big data according to the preferred embodiment of the present invention includes the following steps: an image modality data set is collected, an image is filtered, and point cloud data is generated.
In the water management field, the acquired modality data generally includes two modalities, one is an image modality and one is a text modality. The image modality includes videos, photos and the like taken by the monitoring system. The text modality includes measured depth values, water quality values, and the like.
Point cloud data (point cloud data) refers to a collection of vectors in a three-dimensional coordinate system. The scan data is recorded in the form of dots, each dot containing three-dimensional coordinates, some of which may contain color information (RGB) or Intensity information (Intensity).
Step two: and C, obtaining characteristic values of all points under the point cloud data according to the image depth values under the point cloud data in the step one.
For each data point in the point cloud, a least-squares local plane p is fitted through its K nearest neighbors such that the sum of the distances from all neighbors of the data point to this plane is minimal.
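This least-squares plane fit is the standard PCA construction: the plane passes through the neighbors' centroid, and its normal is the covariance eigenvector with the smallest eigenvalue. A sketch (not the patent's or PCL's actual code) using numpy:

```python
import numpy as np

def fit_local_plane(neighbors):
    """Fit the least-squares plane through a set of K neighbor points.
    Returns (centroid, normal): the plane passes through the centroid,
    and the normal is the eigenvector of the covariance matrix with the
    smallest eigenvalue (the direction of least spread)."""
    pts = np.asarray(neighbors, dtype=float)
    centroid = pts.mean(axis=0)
    cov = np.cov((pts - centroid).T)      # 3x3 covariance of the neighborhood
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh sorts eigenvalues ascending
    normal = eigvecs[:, 0]                # smallest eigenvalue -> plane normal
    return centroid, normal
```

PCL's normal-estimation module follows the same idea for each point's K-neighborhood, yielding per-point geometric feature values.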
Step three: the text modality data set is asynchronously collected and the text is feature extracted using a CountVectori zer.
The CountVectorzer () function only considers the frequency of occurrence of each word; then, a feature matrix is formed, and each line represents a word frequency statistical result of the training text. The idea is that according to all training texts, the appearance sequence is not considered, and each appeared vocabulary in the training texts is only considered as a column of characteristics to form a vocabulary table.
Step four: and establishing a training table of image features-text features by using a CAA algorithm.
Computer-aided numerical analysis (CAA), like computer-aided design (CAD), is an analysis and design method that uses the computer as its main tool. It is an emerging discipline built on computing technology, applied mathematics, and simulation theory, and has become an important branch of computer applications. The rapid development of science and technology makes the mathematical models built in scientific research, new-product development, and engineering design increasingly complex.
Step five: and randomly classifying the data in the training table to generate a training set database and a test set database, and performing transfer learning on the data in the training set database and the test set database.
Instance-based transfer learning selects instances from the source domain that are useful for training in the target domain, for example by assigning effective weights to labeled source-domain instances so that the distribution of source instances approaches that of the target instances, thereby establishing a reliable learning model with high classification accuracy in the target domain.
Because the data distributions of the source domain and the target domain in transfer learning are inconsistent, not all labeled data instances in the source domain are necessarily useful for the target domain.
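The instance-reweighting idea can be sketched with a single TrAdaBoost-style update step (the patent does not name a specific algorithm; the function below is an illustrative assumption):

```python
def reweight_source_instances(weights, errors, beta=0.5):
    """One reweighting step for instance-based transfer: source instances
    that were misclassified on target-like data (errors[i] == 1) are
    down-weighted by factor beta, so the effective source distribution
    drifts toward the target distribution. Weights are renormalized."""
    new_w = [w * (beta if e else 1.0) for w, e in zip(weights, errors)]
    total = sum(new_w)
    return [w / total for w in new_w]
```

Iterating this update concentrates weight on source instances that behave like target instances, which is exactly why not all source instances end up mattering.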
Step six: establish a query model from the transfer learning and upload it to the cloud; subsequent query images or text are input into the query model, which judges their features and queries the Internet according to the judged features.
Once the transfer-learning result is uploaded, different data can conveniently be distinguished from multiple places and at multiple times.
The above detailed description merely describes preferred embodiments of the present invention and does not limit the scope of the invention. It should be understood that those skilled in the art can make various changes, substitutions, and alterations without departing from the spirit and scope of the invention. The scope of the invention is defined by the claims and their equivalents.
Claims (6)
1. An efficient query method based on perception of multi-modal big data, comprising the following steps:
step one: collecting an image-modality data set, filtering the images, and generating point cloud data;
step two: obtaining feature values of all points in the point cloud data according to the image depth values from step one;
step three: asynchronously collecting a text-modality data set, and extracting features from the text using a CountVectorizer;
step four: establishing an image-feature/text-feature training table using the CAA algorithm;
step five: randomly classifying the data in the training table to generate a training set database and a test set database, and performing transfer learning on the data in both;
step six: establishing a query model from the transfer learning and uploading it to the cloud, inputting subsequent query images or text into the query model, the query model judging their features and querying the Internet according to the judged features.
2. The efficient query method based on perceptual multi-modal big data of claim 1, wherein the filtering in step one is performed by a variance method with the formula:

g = ω0(u0 − u)²,

wherein ω0 is the proportion of background points in the image, u0 is the average gray level of the background points, u is the overall average gray level of the image, and g is the variance of the image.
3. The efficient query method based on perceptual multi-modal big data of claim 2, wherein the image depth value in step two is obtained by a threshold method with the formula:

p(x, y) = p(x, y), for dis ≤ p(x, y) ≤ dis·2h,

where p(x, y) is the image depth, dis is the image height, and h is the height of the filtered image.
4. The efficient query method based on perceptual multi-modal big data of claim 3, wherein in step two image depths are collected to form point cloud data, and PCL is adopted to process the point cloud data to obtain the feature values of the image depth.
5. The efficient query method based on perceptual multi-modal big data of claim 3, wherein in step three the feature-selection frequency formula of the CountVectorizer is:

IDF = log(m / g),

where IDF is the feature-selection probability, m is the number of point cloud data, and g is the number of feature values.
6. The efficient query method based on perceptual multi-modal big data of claim 3, wherein the ratio of the training set database to the test set database is 1:2.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011547371.3A CN112732956A (en) | 2020-12-24 | 2020-12-24 | Efficient query method based on perception multi-mode big data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112732956A true CN112732956A (en) | 2021-04-30 |
Family
ID=75605556
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011547371.3A Pending CN112732956A (en) | 2020-12-24 | 2020-12-24 | Efficient query method based on perception multi-mode big data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112732956A (en) |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101719140A (en) * | 2009-12-23 | 2010-06-02 | 中山大学 | Figure retrieving method |
US20120189211A1 (en) * | 2011-01-21 | 2012-07-26 | Jiebo Luo | Rapid image search in a large database |
CN107679580A (en) * | 2017-10-21 | 2018-02-09 | 桂林电子科技大学 | A kind of isomery shift image feeling polarities analysis method based on the potential association of multi-modal depth |
US20180181592A1 (en) * | 2016-12-27 | 2018-06-28 | Adobe Systems Incorporate | Multi-modal image ranking using neural networks |
CN108334574A (en) * | 2018-01-23 | 2018-07-27 | 南京邮电大学 | A kind of cross-module state search method decomposed based on Harmonious Matrix |
CN108595636A (en) * | 2018-04-25 | 2018-09-28 | 复旦大学 | The image search method of cartographical sketching based on depth cross-module state correlation study |
CN108701220A (en) * | 2016-02-05 | 2018-10-23 | 索尼公司 | System and method for handling multi-modality images |
CN109033245A (en) * | 2018-07-05 | 2018-12-18 | 清华大学 | A kind of mobile robot visual-radar image cross-module state search method |
CN110569387A (en) * | 2019-08-20 | 2019-12-13 | 清华大学 | radar-image cross-modal retrieval method based on depth hash algorithm |
CN110647904A (en) * | 2019-08-01 | 2020-01-03 | 中国科学院信息工程研究所 | Cross-modal retrieval method and system based on unmarked data migration |
US10614366B1 (en) * | 2006-01-31 | 2020-04-07 | The Research Foundation for the State University o | System and method for multimedia ranking and multi-modal image retrieval using probabilistic semantic models and expectation-maximization (EM) learning |
CN111190981A (en) * | 2019-12-25 | 2020-05-22 | 中国科学院上海微系统与信息技术研究所 | Method and device for constructing three-dimensional semantic map, electronic equipment and storage medium |
US20200388071A1 (en) * | 2019-06-06 | 2020-12-10 | Qualcomm Technologies, Inc. | Model retrieval for objects in images using field descriptors |
Non-Patent Citations (2)
Title |
---|
KAIMIN WEI; ZHIBO ZHOU: "Adversarial Attentive Multi-Modal Embedding Learning for Image-Text Matching", IEEE * |
李晓雨 (Li Xiaoyu) et al.: "Image retrieval algorithm based on transfer learning", 《计算机科学》 (Computer Science), vol. 46, no. 1 * |
Legal Events

Date | Code | Title | Description
---|---|---|---
 | PB01 | Publication | 
 | SE01 | Entry into force of request for substantive examination | 