CN102147815B - Method and system for searching images - Google Patents
- Publication number
- CN102147815B CN102147815B CN 201110100485 CN201110100485A CN102147815B CN 102147815 B CN102147815 B CN 102147815B CN 201110100485 CN201110100485 CN 201110100485 CN 201110100485 A CN201110100485 A CN 201110100485A CN 102147815 B CN102147815 B CN 102147815B
- Authority
- CN
- China
- Prior art keywords
- picture
- visual
- word dictionary
- visual word
- target
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Character Discrimination (AREA)
Abstract
The invention provides a method and system for searching images. The method comprises: a client receives query content, which comprises a target image to be searched, or the target image together with its related information; the client obtains the visual words of the target image, selects at least one target visual word dictionary corresponding to the query content from more than one visual word dictionary according to preset rules, and obtains the target visual words of those visual words according to the target visual word dictionary; and the target visual words are encoded and then transmitted to a server to obtain result images matching the query content and/or related information of those results. The method improves image search speed by reducing the data uploaded by the client, shortens the user's waiting time, and improves the search accuracy of the search system.
Description
Technical Field
The invention relates to the technical field of picture identification and search, in particular to a picture search method and a picture search system.
Background
With the rapid development of wireless networks and the continuous enhancement of mobile device capabilities, users frequently query picture information with mobile devices. The earliest approach was to describe picture content with text and then retrieve/search according to that text. However, text cannot accurately describe picture content, and the results of text-based picture search are often not the information the user needs, so text-based search cannot satisfy users.
Another approach, content-based image search, uses an image as the query to find similar images and avoids the inaccurate textual descriptions that text-based image search suffers from. However, content-based picture search transmits the image directly to a server, producing a large data transmission volume. In a wireless network environment with limited and unstable bandwidth in particular, a picture search often requires a long query response time.
The industry therefore describes a picture with visual descriptors, converting the picture into a one-dimensional vector of data so that the data vector, rather than the picture itself, is transmitted to the server. Describing the picture with visual descriptors shortens the query response time, but, limited by the quality of current mobile networks, the upload speed still cannot meet users' actual needs. In view of this, how to provide a picture retrieval method that preserves retrieval performance and efficiency while reducing the bandwidth required for retrieval is a technical problem that currently needs to be solved.
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides a picture searching method and a picture searching system which, without reducing search performance, improve picture retrieval speed, shorten the user's waiting time and improve the search accuracy of the search system by reducing the amount of data uploaded by the client.
The picture searching method provided by the invention comprises the following steps:
the client receives query content, wherein the query content comprises a target picture to be queried or the target picture to be queried and related information;
the client acquires visual words of a target picture, selects at least one target visual word dictionary corresponding to the query content from more than one visual word dictionary of the client according to a preset rule, and acquires the target visual words of the visual words according to the target visual word dictionary;
and coding the target visual words and then sending the coded target visual words to a server so as to obtain result pictures matched with the query contents and/or relevant information of the result pictures.
According to another aspect of the present invention, the present invention also provides an image searching method, which includes:
the server receives the encoded target visual words and decodes the target visual words;
the server searches an index table corresponding to a visual word dictionary in the server on the basis of the target visual word to obtain a result picture and/or related information of the result picture, and sends the result picture and/or related information of the result picture to the client;
the visual word dictionary is a visual word dictionary established by clustering the visual features of all the pictures in the server-side picture database.
According to another aspect of the present invention, the present invention also provides an image search system, which includes:
the receiving module, used for the client to receive query content comprising a target picture to be queried, or the target picture to be queried and its related information;
the target visual word acquisition module is used for acquiring the visual words of the target picture by the client, selecting at least one target visual word dictionary corresponding to the query content from more than one visual word dictionary of the client according to a preset rule, and acquiring the target visual words of the visual words according to the target visual word dictionary;
the target visual word sending module is used for coding the target visual word and sending the coded target visual word to the server,
the receiving and searching module is used for receiving and decoding the coded target visual words by the server side, and searching the index table corresponding to the visual word dictionary of all pictures in the database based on the target visual words to obtain the result pictures and/or the related information of the result pictures;
and the sending module, used for the server to send the result picture and/or the related information of the result picture to the client.
The picture searching method and system provided by the invention compress the target picture at the client into target visual words that retain visual content description capability and transmit them to the server, thereby realizing low-bit data transmission between the client and the server, shortening the user's waiting time when querying a target picture, reducing the server's response time, and further improving the query efficiency of the picture searching method.
Furthermore, the searching method can also improve the accuracy of the searching result. The method can be popularized and applied to retrieval/search of various pictures, and can acquire the extension information of the result picture, so that the method is wide in application range, applicable to various fields and convenient for a user to retrieve various information.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a flowchart illustrating steps of an embodiment of a method for searching pictures according to the present invention;
FIG. 2 is a flow chart of the steps for screening a valid visual dictionary in the present invention;
FIG. 3 is a flowchart illustrating steps of an embodiment of a method for searching pictures according to the present invention;
fig. 4 is a schematic structural diagram of an embodiment of the image search system in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention mainly provides a picture searching method which uses visual word dictionary libraries preset at the client to obtain, for a target picture, target visual words with a small transmission data volume, encodes the target visual words and sends them to the server, and obtains a result picture and/or related extended information from the server. The method effectively reduces the number of visual words describing the target picture and the amount of data transmitted to the server, achieves low-bit data transmission between the client and the server, effectively alleviates the long data transmission times caused by current bandwidth limitations, reduces the server's response time, and thus better saves the user's waiting time.
The following terms are used in the description below:
visual word: obtained by discretely partitioning the visual feature space; each partition is one visual word. Visual words describe picture content in terms of picture features, the most basic data features;
visual word dictionary: the set of visual words of all, or a selected part of, the pictures in a picture database.
Referring to fig. 1, fig. 1 is a flowchart illustrating the steps of an embodiment of the picture searching method of the present invention, the steps comprising:
Step 101: the client receives query content, the query content comprising a target picture to be queried, or the target picture to be queried and its related information;
Step 102: the client acquires visual words of the target picture, selects at least one target visual word dictionary corresponding to the query content from more than one visual word dictionary of the client according to a preset rule, and acquires the target visual words of those visual words according to the target visual word dictionary;
the generation mode of the visual words of the target picture can be that more than one visual feature of the target picture is obtained, and the features are converted into the visual words in the original visual word dictionary according to the mapping rules of the visual features and the visual words. Preferably, the original visual word dictionary of the client is the same as that of the server, and the original visual word dictionary of the client can be preset in the client in advance and can be updated from the server in real time.
The original visual word dictionary may be generated by acquiring more than one visual feature of the pictures in the server-side database and clustering those visual features into a number of classes. This and the subsequent clustering steps may use K-means clustering, hierarchical clustering, spectral clustering, etc.; for spectral clustering see Ng A., Jordan M., and Weiss Y., "On spectral clustering: Analysis and an algorithm", NIPS, 2001, pp. 849-856. The class center of each class represents that class and is called a visual word, i.e. each class is one visual word, and the set of visual words of the whole database forms the original visual word dictionary.
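By way of illustration only, the clustering step described above might look roughly like the following sketch, which uses scikit-learn's KMeans; the dictionary size n_words and the helper names are assumptions for the example, not part of the patent:

```python
import numpy as np
from sklearn.cluster import KMeans

def build_visual_word_dictionary(features: np.ndarray, n_words: int = 1000) -> np.ndarray:
    """Cluster local visual features (one row per descriptor) into n_words classes.

    Each class center is one visual word; the set of centers is the original
    visual word dictionary for the picture database.
    """
    kmeans = KMeans(n_clusters=n_words, n_init=10, random_state=0)
    kmeans.fit(features)
    return kmeans.cluster_centers_                 # shape: (n_words, feature_dim)

def quantize_to_visual_words(descriptors: np.ndarray, dictionary: np.ndarray) -> np.ndarray:
    """Map each descriptor to the index of its nearest visual word (class center)."""
    # Squared Euclidean distance from every descriptor to every visual word
    d2 = ((descriptors[:, None, :] - dictionary[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)
```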
Specifically, in this embodiment, visual features of the target picture such as a color histogram, a texture map, a scale-invariant descriptor, a gradient location-orientation histogram, or a histogram of oriented gradients may be extracted;
then, according to the mapping rules between visual features and visual words, the color histogram, texture map, scale-invariant feature transform descriptor (SIFT), gradient location-orientation histogram (GLOH) or histogram of oriented gradients (HOG) of the target picture is converted into the corresponding visual words of the original visual word dictionary of the server.
Sub-step 1021: according to the type of the query content, search the one or more visual word dictionary libraries preset in the client for the visual word dictionary library and the prediction loss function matching that type. That is, mapping rules between query content types and visual word dictionary libraries are preset. For example, if the query content is a picture plus text information describing the picture, the visual word dictionary library is the one corresponding to text information.
Specifically, the one or more visual word dictionary libraries preset at the client are obtained by the client from the server in advance, and the client updates them regularly. Alternatively, when new pictures are added at the server, the client can be prompted to update its local visual word dictionary libraries. Steps P1 to P3 below describe in detail how the server builds the visual word dictionary libraries.
Sub-step 1022: calculate, with the prediction loss function, the prediction loss value of each visual word dictionary in the visual word dictionary library for the visual words of the target picture, and take the one or more visual word dictionaries whose loss values fall within a threshold range.
The prediction loss function is used to calculate the prediction loss value of each visual word dictionary in the visual word dictionary library for the visual words of the target picture; the prediction loss value can be calculated in any one of the following first to third calculation methods.
The first calculation method: the cosine distance between the visual words of the target picture and the class center of the picture class where the target visual word dictionary is located; or
The second calculation method: the cosine distance between the visual words of the target picture and the class center of the picture class in which the target visual word dictionary is positioned, and the weighted sum of the Euclidean distance between the related information and the similar information of the picture class in which the visual word dictionary is positioned;
the third calculation method: the visual similarity distance of the target picture and the picture class where the visual word dictionary of the target visual word dictionary is located, and the product of the Euclidean distance of the related information and the same kind information of the picture class where the visual word dictionary is located.
For example, the prediction loss function f_prediction(q_i, C_j) takes the form:

f_prediction(q_i, C_j) = α·Vd_ij + β·Rd_ij

where f_prediction(q_i, C_j) denotes the prediction loss value of the target picture q_i with respect to the picture class C_j of a visual word dictionary; Vd_ij is the cosine distance between the visual word vector of target picture i and the class center of picture class C_j of the target visual word dictionary; and Rd_ij is the Euclidean distance between the related information R_i of picture i in the query content and the same-kind information value R_j of picture class C_j of the target visual word dictionary. α and β are real numbers that can be set empirically or according to requirements.
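By way of illustration, the weighted-sum prediction loss above (the second calculation method) can be sketched in a few lines; this is only a rough example, with the cosine distance taken as one minus the cosine similarity and the related information assumed to be numeric vectors, neither of which is spelled out in the patent text:

```python
import numpy as np

def prediction_loss(v_i, c_j, r_i, r_j, alpha: float = 0.5, beta: float = 0.5) -> float:
    """Second calculation method: f_prediction(q_i, C_j) = alpha * Vd_ij + beta * Rd_ij."""
    v_i, c_j = np.asarray(v_i, dtype=float), np.asarray(c_j, dtype=float)
    # Vd_ij: cosine distance between the target picture's visual word vector
    # and the class center of picture class C_j (taken here as 1 - cosine similarity)
    vd = 1.0 - float(np.dot(v_i, c_j)) / (np.linalg.norm(v_i) * np.linalg.norm(c_j) + 1e-12)
    # Rd_ij: Euclidean distance between the query's related information and
    # the same-kind information value of picture class C_j
    rd = float(np.linalg.norm(np.asarray(r_i, dtype=float) - np.asarray(r_j, dtype=float)))
    return alpha * vd + beta * rd
```

Dictionaries whose loss values fall within the preset threshold range would then be retained as target visual word dictionaries, as in sub-step 1022.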
In addition, the types of query content in sub-step 1021 may include: a target picture alone; a target picture and text; a target picture and a signal detected by a sensor; and a target picture and an object label in the picture identified by object identification software. The signal detected by the sensor may include geographical location information detected by a Global Positioning System (GPS) device, the barcode of a book or commodity scanned by a barcode scanner, electronic tag (RFID) information read by an electronic tag reader, and so on. The object labels recognized by object identification software may include human faces recognized by face recognition software, text recognized by optical character recognition (OCR) software, and so on.
For example, when the query content type is a target picture alone, the visual word dictionary library is a visual-similarity dictionary library established according to picture similarity.
When the query content type is a target picture plus a signal detected by a sensor, for example a landmark picture, the detected signal may be the geographical location information corresponding to a building or a natural landscape in the landmark picture. In this case the visual word dictionary library is the one corresponding to geographical location information.
When the query content type is a target picture plus an object label identified by object identification software, for example a book picture, the identified label may be the publisher logo or the name of the book. In this case the visual word dictionary library is the one corresponding to publisher logos or names.
If the query content is a picture of a commodity, the object identification software may identify the trademark of the commodity, or a barcode scanner may scan the barcode of the corresponding physical commodity; the visual word dictionary library is then the one corresponding to trademarks or barcodes.
If the query content is a guide sign picture in a museum exhibition room, the object identification software identifies the barcode or electronic tag in the guide sign, and the visual word dictionary library is the one corresponding to barcodes or electronic tags. In this step, the picture set is divided into a number of classes so that the coupling of the visual words within each divided picture set is maximized, which reduces the dimensionality of each visual word dictionary.
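Read this way, the preset rules of sub-step 1021 amount to a lookup from query content type to dictionary library. The sketch below is purely illustrative; the type names and library identifiers are assumptions, not values defined by the patent:

```python
# Hypothetical mapping from query content type to the visual word dictionary
# library the client should search (sub-step 1021); names are illustrative.
DICTIONARY_LIBRARY_RULES = {
    "picture_only":        "visual_similarity_library",
    "picture_and_text":    "text_information_library",
    "picture_and_gps":     "geographic_location_library",
    "picture_and_barcode": "barcode_or_rfid_library",
    "picture_and_label":   "object_label_library",    # e.g. publisher logo, trademark
}

def select_dictionary_library(query_type: str) -> str:
    """Return the dictionary library matching the query content type."""
    try:
        return DICTIONARY_LIBRARY_RULES[query_type]
    except KeyError:
        raise ValueError(f"no preset rule for query content type: {query_type}")
```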
Step 103: encode the target visual words and send them to the server, so as to obtain and display a result picture matching the query content and/or related information of the result picture.
For sub-step 1021 above, when the one or more visual word dictionary libraries preset at the client are obtained from the server in advance, the server establishes the one or more visual word dictionary libraries through the following steps:
first step P1: and dividing the pictures in the server database into picture sets of various types by adopting a picture set dividing mode.
In the first step P1, all pictures may be divided into multiple picture sets using the visual similarity between pictures; alternatively, all pictures may be divided into multiple picture sets using picture-related information such as shooting date, text labels or electronic labels; or, of course, all pictures may be divided into sets using both the visual similarity between pictures and the picture-related information such as shooting date, text labels and electronic labels.
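As a small illustration of the metadata-based variant of step P1, pictures could be grouped into sets by a related-information key such as the text label or shooting date; the record layout below is an assumption:

```python
from collections import defaultdict

def divide_picture_sets_by_metadata(pictures, key: str = "text_label") -> dict:
    """Divide database pictures into picture sets using related information
    (e.g. shooting date, text label, electronic label)."""
    picture_sets = defaultdict(list)
    for picture in pictures:                  # picture: dict with metadata fields
        picture_sets[picture.get(key, "unlabelled")].append(picture["id"])
    return dict(picture_sets)
```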
Second step P2: establish a visual word dictionary corresponding to each picture set and analyze the visual word dictionary corresponding to each picture set. Specifically, the visual word dictionary here can be the original visual word dictionary of the picture set, established by clustering the visual features of the picture set; or it can be obtained by first establishing the picture set's visual word dictionary by clustering its visual features, then determining, based on the screening rule for valid visual word dictionaries, a valid visual word dictionary that represents the original visual word dictionary, and using the valid visual word dictionary as the visual word dictionary, which further reduces the dictionary's dimensionality (its dimension in the N-axis coordinate system).
Third step P3 (a first means of obtaining a visual word dictionary library): if the visual word dictionaries satisfy the visual word dictionary library establishment conditions, the set of visual word dictionaries corresponding to the picture sets of each type forms a visual word dictionary library.
Wherein the visual word dictionary library establishment conditions may be: the number of visual words in the visual word dictionary of each divided picture set is less than or equal to the total number of visual words in the visual word dictionary of the server database; and, counting the probability distribution of the visual words of each divided picture set and calculating its entropy, the information entropy of that probability distribution is less than a set threshold.
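The second establishment condition, an entropy bound on the visual word distribution of a divided picture set, could be checked roughly as follows (the threshold value is whatever the implementer sets; this sketch is not taken from the patent):

```python
import numpy as np

def satisfies_entropy_condition(word_counts, threshold: float) -> bool:
    """Check the second establishment condition: the information entropy of a
    picture set's visual word distribution must be below the set threshold."""
    counts = np.asarray(word_counts, dtype=float)
    p = counts / counts.sum()                  # probability distribution of visual words
    p = p[p > 0]                               # ignore visual words that never occur
    entropy = float(-(p * np.log2(p)).sum())   # information entropy in bits
    return entropy < threshold
```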
Finally, the server sends the established visual word dictionaries to the client and stores them for subsequent use. When new pictures arrive at the server, the server's visual word dictionaries can be updated and the client's visual word dictionaries updated at the same time.
In contrast to the prior art, the screening rule for the valid visual word dictionary in this embodiment (i.e. the screening rule for the valid visual word dictionary used in the second step P2) may be:
step P41: selecting a certain number of pictures from a certain class of pictures as sample pictures, and converting the characteristics of the sample pictures into visual words in the original visual word dictionary;
step P42: inquiring in a visual word index table of the original visual word dictionary according to the visual words of the sample picture to obtain an original inquiry result;
step P43: combining any visual words belonging to an original visual word dictionary to form a screening visual word dictionary, converting the characteristics of the sample picture into first visual words corresponding to the screening visual word dictionary based on the screening visual word dictionary, and inquiring in a visual word index table of the original visual word dictionary by adopting the first visual words to obtain a first inquiry result corresponding to the screening visual word dictionary;
step P44: analyzing the original query results of all sample pictures and the first query result, and if the first query result is consistent with the original query result, adopting the current screening visual word dictionary as a visual word dictionary; otherwise, selecting a visual word from the original visual word dictionary, adding the visual word to the current screening visual word dictionary, and returning to the step of obtaining the first query result.
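Steps P41 to P44 amount to growing a screening dictionary word by word until it reproduces the original query results. The condensed sketch below illustrates that loop; the helper names query_index and describe_with stand in for the index lookup and the feature-to-word conversion and are assumptions, and the word-selection order is simplified compared with the procedure of FIG. 2:

```python
def screen_valid_dictionary(original_dictionary, sample_pictures, query_index, describe_with):
    """Greedy screening rule (steps P41 to P44): add visual words to the screening
    dictionary until it yields the same query results as the original dictionary."""
    # Original query results for every sample picture (step P42)
    original_results = [query_index(describe_with(p, original_dictionary))
                        for p in sample_pictures]

    screened = []                              # current screening visual word dictionary
    remaining = list(original_dictionary)
    while remaining:
        screened.append(remaining.pop(0))      # add one more visual word (step P44)
        # First query results under the screening dictionary (step P43)
        first_results = [query_index(describe_with(p, screened)) for p in sample_pictures]
        if first_results == original_results:  # consistent with the original results
            return screened                    # adopt the current screening dictionary
    return screened                            # falls back to the full dictionary
```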
It should be noted that the visual word dictionary corresponding to each type of picture set is generated by clustering the visual features of that picture set.
Compared with the prior art, the searching method in this embodiment only needs to transmit dozens of bits of encoded data to the server, achieving fast queries from the client, improving the transmission efficiency when querying a target picture, and shortening the server's query response time.
In particular, the picture search method of this embodiment is mainly applied to picture queries from mobile terminals: the mobile terminal adaptively selects a visual word dictionary suited to the query information and obtains target visual words with visual description capability, which effectively reduces the data volume of the target picture to be queried, realizes low-bit data transmission between the client and the server, shortens the user's waiting time when querying a target picture, reduces the server's response time, and further improves the query efficiency of the picture search method.
Furthermore, the searching method can also improve the accuracy of the retrieval result. The method can be popularized and applied to retrieval/search of various pictures, and the expansion information of the result picture can be acquired, so that the method is wide in application range, can be used in various fields, and is convenient for a user to retrieve various information.
Referring to FIG. 2, FIG. 2 is a flow chart illustrating the specific steps for screening a valid visual word dictionary in the present invention, i.e. the specific calculation steps for screening the valid visual dictionary in the above index construction for distributed picture search:
First step 201: select N_sample pictures from the whole picture database as sample query pictures, query each of them in the visual word index table, and retrieve the top R results. For the i-th query picture, the picture ranked at the j-th position of its query result has a corresponding visual word vector.
Second step 202: calculate the term frequency-inverse document frequency (TF-IDF) of each result picture; a valid visual word dictionary is then screened out as a subset of the original visual word dictionary.
Third step 203: set the iteration count d = 1, the valid visual word dictionary min_V to empty, and the candidate visual word set cadi_V to V (V being the original visual word dictionary, with N_cv elements); initialize the weight w_i of each of the N_sample query pictures; the test subset train_V is empty.
Fourth step 204: if the iteration count d > α, or the total error rate lost_Rank < β, the process ends.
Fifth step 205: otherwise, add each of the N_cv visual words of the candidate set to the test subset in turn, obtaining N_cv test subsets train_V_1, ..., train_V_t = min_V ∪ {wd_t}.
Sixth step 206: using each test subset as a visual word dictionary, convert the local feature vector S_i of each query picture i into a visual word vector according to that dictionary; this gives the visual word vector of picture i corresponding to test subset train_V_k.
Seventh step 207: calculate the total error rate caused by describing each query picture with each test subset. For test subset train_V_k and picture I_i, the total error rate Lost(I_i)_k is calculated through sub-steps M1 to M4, which include calculating the content similarity between the query picture i and each result picture returned when the query picture is described with train_V_k (M2), and calculating the error rate Lost(I_i)_k caused by describing query picture I_i with train_V_k (M3).
Eighth step 208: choose the test subset that minimizes the total error rate lost_Rank and update the valid visual word dictionary and the candidate visual word set: if that test subset is train_V_MIN, then min_V = train_V_MIN and cadi_V = cadi_V − {wd_MIN}.
Ninth step 209: update the weight of each query picture.
Tenth step 210: update the iteration count d = d + 1 and return to the fourth step 204.
Based on the above embodiment, the query steps are described in detail below, taking as an example the case where the query information includes only a picture:
firstly, a client acquires a target picture to be searched.
And secondly, the client acquires more than one characteristic of the target picture and converts the characteristics into visual words.
Specifically, in this embodiment, visual features of the target picture such as a color histogram, a texture map, a scale-invariant descriptor, a gradient location-orientation histogram, or a histogram of oriented gradients may be extracted.
Then, according to the mapping rules between visual features and visual words, the color histogram, texture map, SIFT descriptor, GLOH or HOG of the target picture is converted into the visual words of the client's visual word dictionary.
Thirdly, a target visual word dictionary matching the target picture is searched for in the one or more visual word dictionary libraries of the client. The client's visual word dictionary libraries are downloaded from the server in advance; that is, the client is preset with visual word dictionary libraries corresponding to those of the server.
In particular, when the query content is only a target picture, the client selects the visual-similarity dictionary library established according to picture similarity, calculates the visual similarity distance between the target picture and the picture class of each visual word dictionary in that library, and selects the visual word dictionary with the minimum similarity distance as the dictionary matching the target picture, i.e. the target visual word dictionary. The visual similarity distance here is the cosine distance between the visual words of the target picture and the class center of the picture class of the visual word dictionary.
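For this picture-only case, dictionary selection reduces to a nearest-class-center search under cosine distance. A rough sketch follows, again treating the cosine distance as one minus the cosine similarity, which is an assumption rather than a definition given in the patent:

```python
import numpy as np

def select_target_dictionary(target_word_vector, class_centers) -> int:
    """Pick the visual word dictionary whose picture-class center is closest
    (smallest cosine distance) to the target picture's visual word vector."""
    q = np.asarray(target_word_vector, dtype=float)
    best_index, best_distance = -1, float("inf")
    for j, center in enumerate(class_centers):
        c = np.asarray(center, dtype=float)
        distance = 1.0 - float(np.dot(q, c)) / (np.linalg.norm(q) * np.linalg.norm(c) + 1e-12)
        if distance < best_distance:
            best_index, best_distance = j, distance
    return best_index                          # index of the target visual word dictionary
```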
Fourthly, analyzing the visual words and the target visual word dictionary to obtain target visual words corresponding to the target pictures; specifically, according to the visual word dictionary, visual words of a target picture are screened, and the visual words belonging to the visual word dictionary are selected as the target visual words;
fifthly, compressing the target visual words into data packets according to a Huffman coding method; the method is specifically characterized in that the probability of each target visual word is scanned, a Huffman tree is established, the target words are coded by '0' and '1', the larger the probability is, the fewer the coding bits are, and the visual words and the corresponding codes are stored in a Huffman coding table and sent to a client.
Sixthly, the server decodes the data packet back into the target visual words according to the Huffman coding table, searches the visual word index table of the original visual word dictionary at the server according to the target visual words to obtain more than one result picture corresponding to the target visual words and/or the extended information of the result pictures, and sends the result pictures and/or extended information to the client for display.
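On the server side, the decoded target visual words are looked up in the visual word index table, which behaves like an inverted index from visual word to picture identifiers. The sketch below assumes such a dictionary-of-lists structure and a simple shared-word vote for ranking; both are assumptions for illustration:

```python
from collections import Counter

def search_index_table(target_words, index_table: dict, top_k: int = 10):
    """Look up each target visual word in the inverted index and rank result
    pictures by how many target visual words they share with the query."""
    votes = Counter()
    for word in target_words:
        for picture_id in index_table.get(word, ()):   # pictures containing this word
            votes[picture_id] += 1
    return [picture_id for picture_id, _ in votes.most_common(top_k)]

# Example: index_table = {17: ["img_001", "img_042"], 305: ["img_042"]}
# search_index_table([17, 305], index_table) -> ["img_042", "img_001"]
```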
According to another aspect of the present invention, the present invention further provides a picture searching method, as shown in fig. 3, the steps of which include:
step 301: and the server receives the encoded target visual words and decodes the target visual words.
Step 302: the server searches an index table corresponding to a visual word dictionary in the server based on the target visual word to obtain the result picture and/or the related information of the result picture.
The visual word dictionary is: and the visual word dictionary is established by adopting a clustering mode on the visual features of all or part of pictures in the server side picture database.
Step 303: and sending the result picture and/or the related information of the result picture to the client for displaying.
In this embodiment, fewer target visual words are used to query for result pictures, which improves the efficiency of target picture queries, shortens the user's waiting time while maintaining the original retrieval performance, and thus enables picture queries under low-bandwidth conditions.
According to another aspect of the present invention, the present invention further provides an image search system, as shown in fig. 4, including:
a receiving module 401, in which a client receives a target picture to be queried, or query contents including the target picture to be queried and related information;
a target visual word obtaining module 402, wherein the client obtains a visual word of a target picture, selects at least one target visual word dictionary corresponding to the query content from more than one visual word dictionary of the client according to a preset rule, and obtains the target visual word of the visual word according to the target visual word dictionary;
a target visual word sending module 403, which codes the target visual word and sends it to the server,
a receiving and searching module 404, in which the server receives and decodes the encoded target visual words, and searches the index table corresponding to the visual word dictionary of all pictures in the database based on the target visual words to obtain the result picture and/or the related information of the result picture;
and a sending module 405, where the server sends the result picture and/or the related information of the result picture to the client.
The image query system or the image search system automatically selects the visual word dictionary suitable for the query information type according to the combination type of the query information, converts the image into the visual word according to the visual word dictionary, further compresses the visual word into a data packet of the target visual word with less data volume, and then quickly and accurately acquires the result image of the target image to be retrieved and the related expansion information thereof according to the data packet.
In obtaining the target visual words, the picture searching system effectively divides the database picture set according to the picture division criteria, so that the number of visual word types in each divided picture class is far smaller than in the original database picture set. This effectively reduces the number of visual words describing a picture, converts the target picture into a target visual word data packet of only dozens of bits, reduces the amount of data transmitted to the server, and achieves low-bit transmission between the client and the server, which effectively alleviates the long data transmission times caused by current bandwidth limitations and better saves the user's waiting time. The searching method is suitable for different types of queries and is highly extensible.
The client mentioned in this embodiment may be a mobile terminal, such as a mobile phone, an iPad, a tablet computer, and the like.
Specifically, the client in this embodiment may include:
the receiving module is used for receiving query contents comprising a target picture to be queried or the target picture to be queried and related information;
the target visual word acquisition module, used for acquiring the visual words of the target picture, selecting at least one target visual word dictionary corresponding to the query content from more than one visual word dictionary of the client according to a preset rule, and acquiring the target visual words of those visual words according to the target visual word dictionary;
the target visual word sending module is used for coding the target visual word and sending the coded target visual word to the server,
and the result picture receiving module is used for receiving and displaying the result picture and/or the related information of the result picture which is searched and sent by the server.
The modules shown in the picture search system only schematically illustrate its internal structural relationships; in an actual system, client or other structure, the same module may perform transmission or reception many times, or a given module may be used intermittently. The above embodiment is only illustrative and is not limited to the structural arrangement and connection relationships of fig. 4. In addition, other modules capable of implementing some steps of the picture searching method of the present invention may be added to the picture searching system and the client.
Finally, it should be noted that: the order of each step in the above image searching method may be performed in parallel or in an alternative manner, and the above embodiment is only an illustrative example, and does not limit the execution order of the steps. In addition, the above embodiments are only used to illustrate the technical solution of the present invention and not to limit the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (9)
1. An image searching method, comprising:
the client receives query content, wherein the query content comprises a target picture to be queried or the target picture to be queried and related information;
the method comprises the steps that a client side obtains visual words of a target picture, searches a visual word dictionary library and a prediction loss function which are matched with the type of query content from one or more visual word dictionary libraries preset in the client side in advance according to the type of the query content, calculates the prediction loss value of each visual word dictionary in the visual word dictionary library to the visual words of the target picture by adopting the prediction loss function, obtains one or more visual word dictionaries within a threshold range, and obtains the target visual words of the visual words according to the target visual word dictionary;
and coding the target visual words and then sending the coded target visual words to a server so as to obtain result pictures matched with the query contents and/or relevant information of the result pictures.
2. The picture searching method according to claim 1, wherein the type of the query content includes:
a target picture alone; a target picture and text; a target picture and a signal detected by a sensor; and a target picture and an object label in the picture identified by object identification software;
the signal detected by the sensor comprises geographic position information detected by a global positioning system device, a barcode of a book or a commodity scanned by a barcode scanner, and an electronic tag read by an electronic tag reader;
the object label identified by the object identification software comprises a human face identified by the human face identification software, and characters identified by the character identification system software.
3. The picture searching method according to claim 1, wherein:
one or more visual word dictionary libraries preset in advance by a client are obtained by the client from a server in advance, and the client updates the one or more visual word dictionary libraries at regular time;
the step that the server side establishes one or more visual word dictionary libraries comprises the following steps:
dividing the pictures in the server database into picture sets of various types by adopting a picture set dividing mode, establishing a visual word dictionary corresponding to each picture set, analyzing the visual word dictionary corresponding to each picture, and forming a visual word dictionary library by the set of the visual word dictionaries corresponding to the picture sets of various types if the visual word dictionary meets the establishment condition of the visual word dictionary library;
wherein: the visual word dictionary base is established under the conditions that:
the number of visual words in the visual word dictionary of each divided picture set is less than or equal to the total number of visual words in the visual word dictionary of the server database;
and counting the probability distribution of the visual words of the picture set, and calculating the entropy of the probability distribution of the visual words, wherein the information entropy of the probability distribution is less than a set threshold value.
4. The picture searching method according to claim 3, wherein:
the visual word dictionary is: establishing an original visual word dictionary of the picture by clustering visual features of the picture set; or,
and establishing a visual word dictionary of the picture by adopting a clustering mode for the visual features of the picture set, determining an effective visual word dictionary representing the original visual word dictionary based on the screening rule of the effective visual word dictionary, and taking the effective visual word dictionary as the visual word dictionary.
5. The picture searching method according to claim 3, wherein:
the method for dividing the pictures in the server database into the picture sets of various types by adopting a picture set dividing mode comprises the following steps:
dividing all pictures into a plurality of picture sets by using visual similarity among the pictures; or,
dividing all pictures into a plurality of picture sets by using the information related to the pictures; or
All pictures are divided into a plurality of sets using visual similarity between pictures and information about the pictures.
6. The picture searching method according to claim 5, wherein the information related to the picture comprises a photographing date of the picture, a text tag, and an electronic tag.
7. The picture searching method according to claim 1, wherein:
in the step of calculating the prediction loss value of each visual word dictionary in the visual word dictionary base on the visual word of the target picture by adopting the prediction loss function, the calculation mode of the prediction loss value is as follows:
the cosine distance between the visual words of the target picture and the class center of the picture class where the target visual word dictionary is located; or
The cosine distance between the visual words of the target picture and the class center of the picture class in which the target visual word dictionary is positioned, and the weighted sum of the Euclidean distance between the related information and the similar information of the picture class in which the visual word dictionary is positioned; or
The visual similarity distance of the target picture and the picture class where the visual word dictionary of the target visual word dictionary is located, and the product of the Euclidean distance of the related information and the same kind information of the picture class where the visual word dictionary is located.
8. The picture searching method according to claim 4,
the step of determining a visual word dictionary representative of the original visual word dictionary based on the screening rules of the valid visual word dictionary comprises:
selecting a certain number of pictures from a certain class of pictures as sample pictures, and converting the characteristics of the sample pictures into visual words in the original visual word dictionary;
inquiring in a visual word index table of the original visual word dictionary according to the visual words of the sample picture to obtain an original inquiry result;
combining any visual words belonging to an original visual word dictionary to form a screening visual word dictionary, converting the characteristics of the sample picture into first visual words corresponding to the screening visual word dictionary based on the screening visual word dictionary, and inquiring in a visual word index table of the original visual word dictionary by adopting the first visual words to obtain a first inquiry result corresponding to the screening visual word dictionary;
analyzing the original query results of all sample pictures and the first query result, and if the first query result is consistent with the original query result, adopting the current screening visual word dictionary as a visual word dictionary; otherwise, selecting a visual word from the original visual word dictionary, adding the visual word to the current screening visual word dictionary, and returning to the step of obtaining the first query result.
9. The picture searching method according to claim 1, further comprising:
the server receives the encoded target visual words and decodes the target visual words;
the server searches an index table corresponding to a visual word dictionary in the server on the basis of the target visual word to obtain a result picture and/or related information of the result picture, and sends the result picture and/or related information of the result picture to the client;
the visual word dictionary is: and the visual word dictionary is established by adopting a clustering mode on the visual features of all or part of pictures in the server side picture database.
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN201110100485A (CN102147815B) | 2011-04-21 | 2011-04-21 | Method and system for searching images
Publications (2)

Publication Number | Publication Date
---|---
CN102147815A | 2011-08-10
CN102147815B | 2013-04-17