CN112148902B - Data processing method, device, server and storage medium - Google Patents


Info

Publication number
CN112148902B
Authority
CN
China
Prior art keywords
multimedia data
data
target
hash code
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011153160.1A
Other languages
Chinese (zh)
Other versions
CN112148902A (en)
Inventor
欧子菁
赵瑞辉
林民龙
苏勤亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202011153160.1A
Publication of CN112148902A
Application granted
Publication of CN112148902B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/48 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/43 Querying
    • G06F16/438 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/44 Browsing; Visualisation therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the invention discloses a data processing method, a device, a server and a storage medium. The method comprises: acquiring a sample multimedia data set that comprises at least two sample multimedia data; acquiring feature information of each sample multimedia data in the sample multimedia data set, the feature information comprising data content information or vector distribution feature information; performing association analysis on the feature information of any two sample multimedia data, and determining the data association relationship between any two sample multimedia data in the sample multimedia data set according to the association analysis result; and constructing a target model according to the data association relationship, wherein the target model is used for processing input target multimedia data and generating a target hash code, and the target hash code is used for acquiring multimedia data related to the target multimedia data. In this way a new target model can be constructed, and using it can improve the speed and accuracy of data searching.

Description

Data processing method, device, server and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a data processing method, a data processing device, a server, and a storage medium.
Background
With the continued development of internet technology, a trained model is generally used to support a user's information search (or data search), so that multimedia data satisfying the search conditions can be output to the user based on the user's search information (or query information). A model currently used to support information search is called to perform data analysis on the multimedia data; whether each multimedia data satisfies the search conditions is determined from the model's analysis result, and the multimedia data satisfying the search conditions is output. However, as the amount of multimedia data grows rapidly, the data analysis load on the multimedia data keeps increasing, which reduces both the response speed and the accuracy of information search. How to construct a new model that offers higher accuracy and search speed for information search has therefore become a current research hotspot.
Disclosure of Invention
The embodiment of the invention provides a data processing method, a data processing device, a server and a storage medium, with which a new target model can be constructed; using the new target model can improve the speed and accuracy of data searching.
In one aspect, an embodiment of the present invention provides a data processing method, including:
acquiring a sample multimedia data set, wherein the sample multimedia data set comprises at least two sample multimedia data;
acquiring feature information of each sample multimedia data in the sample multimedia data set, wherein the feature information comprises data content information or vector distribution feature information;
performing association analysis on the feature information of any two sample multimedia data, and determining the data association relationship between any two sample multimedia data in the sample multimedia data set according to the association analysis result; and
constructing a target model according to the data association relationship, wherein the target model is used for processing input target multimedia data and generating a target hash code, and the target hash code is used for acquiring multimedia data related to the target multimedia data.
In still another aspect, an embodiment of the present invention provides a data processing apparatus, including:
an acquisition unit, configured to acquire a sample multimedia data set comprising at least two sample multimedia data;
the acquisition unit being further configured to acquire feature information of each sample multimedia data in the sample multimedia data set, wherein the feature information comprises data content information or vector distribution feature information;
a determining unit, configured to perform association analysis on the feature information of any two sample multimedia data, and to determine the data association relationship between any two sample multimedia data in the sample multimedia data set according to the association analysis result; and
a building unit, configured to construct a target model according to the data association relationship, wherein the target model is used for processing input target multimedia data and generating a target hash code, and the target hash code is used for acquiring multimedia data related to the target multimedia data.
In yet another aspect, an embodiment of the present invention provides a server, including a processor, an input device, an output device, and a memory, where the processor, the input device, the output device, and the memory are connected to each other, and the memory is configured to store a computer program supporting a terminal to execute the above method, where the computer program includes program instructions, and the processor is configured to invoke the program instructions to perform the following steps:
acquiring a sample multimedia data set, wherein the sample multimedia data set comprises at least two sample multimedia data;
acquiring feature information of each sample multimedia data in the sample multimedia data set, wherein the feature information comprises data content information or vector distribution feature information;
performing association analysis on the feature information of any two sample multimedia data, and determining the data association relationship between any two sample multimedia data in the sample multimedia data set according to the association analysis result; and
constructing a target model according to the data association relationship, wherein the target model is used for processing input target multimedia data and generating a target hash code, and the target hash code is used for acquiring multimedia data related to the target multimedia data.
In yet another aspect, an embodiment of the present invention provides a computer readable storage medium having stored therein program instructions for performing the data processing method according to the first aspect when the program instructions are executed by a processor.
In the embodiment of the invention, after acquiring the sample multimedia data set, the server can determine the data association relationship between any two sample multimedia data in the set in either of two ways: by performing association analysis on the vector distribution feature information, that is, according to the vector distribution information of the hash code vectors of the sample multimedia data; or by performing association analysis on the data content of the sample multimedia data. After determining the data association relationship between any two sample multimedia data in the sample multimedia data set, the server can construct a target model according to the data association relationship, so that the target model takes the data association relationship between multimedia data into account. When the trained target model is used to generate the hash code of multimedia data, the generated hash code not only contains the semantic features of the multimedia data itself but also indicates data association information among different multimedia data. This improves the quality of the hash codes generated by the target model, and thereby the search speed and search accuracy when data searching is performed based on the hash codes.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1a is a schematic diagram of a model structure of a target model according to an embodiment of the present invention;
FIG. 1b is a schematic diagram of a model structure of a target model according to an embodiment of the present invention;
FIG. 1c is a schematic diagram of a distribution of association constraints introduced into a target model provided by an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a data processing method according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart of a data processing method according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of data recommendation using a trained object model according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of outputting feedback multimedia data according to an embodiment of the present invention;
FIG. 6 is a schematic block diagram of a data processing apparatus provided by an embodiment of the present invention;
FIG. 7 is a schematic block diagram of a server according to an embodiment of the present invention.
Detailed Description
Artificial Intelligence (AI) is the theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision-making. Artificial intelligence technology is a comprehensive discipline that covers a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning. Natural Language Processing (NLP) is an important direction in the fields of computer science and artificial intelligence. It studies the theories and methods that enable effective communication between humans and computers in natural language, and is a science that integrates linguistics, computer science, and mathematics. Research in this field involves natural language, that is, the language people use every day, so it is closely related to research in linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, robot question answering, knowledge graphs, and so on. The embodiment of the invention provides a data processing method, and a target model constructed based on this data processing method can be used for natural language processing after training is completed.
In one embodiment, in order to train the target model, a technical developer may send a plurality of sample multimedia data to a server through a terminal device. After obtaining the plurality of sample multimedia data, the server may analyze and determine the data association relationship between any two sample multimedia data, and introduce that relationship into the constructed target model. In a specific implementation, to introduce the data association relationship between multimedia data into the target model, the server determines, after the sample multimedia data are determined, the data association relationship between any two sample multimedia data, and constructs the target model based on that relationship.
In one embodiment, the server may determine that the corresponding multimedia data have a data association relationship when the scalar product (dot product) of their hash code vectors is greater than a preset scalar product threshold, and that they do not have a data association relationship when the scalar product is less than or equal to the preset threshold. In other words, because the association between hash code vectors indicates the data association relationship between the multimedia data to which the vectors correspond, the association relationship between multimedia data can be determined from the association relationship of the hash code vectors. If the hash code corresponding to multimedia data can guide both the generation of the data content of the multimedia data and the generation of the data association relationship between different multimedia data, then the hash code contains both the semantic information of the data content and the data association relationship between different multimedia data; the server therefore constructs the target model according to the data association relationship between the multimedia data. Here the association relationship between multimedia data is determined according to the distribution of the hash code vectors, that is, the data association relationship must be generated from the hash code vectors, so a target model constructed in this way can be called an edge-based model. The model structure of the edge-based model may be as shown in FIG. 1a. The edge-based model includes an encoder and a decoder: after sample multimedia data are input into the model, hash code vectors are obtained, and the decoder can generate the data association relationship between the multimedia data based on the hash code vectors. In FIG. 1a, the multimedia data input into the edge-based model are data $x_i$ and data $x_j$, and the data association relationship generated by the model is the output $w_{ij}$.
In one embodiment, when the edge-based construction method is adopted, the data association relationship between multimedia data is generated from the hash code vectors, so that this relationship is introduced into the generated target model. However, a large amount of sample multimedia data contains no interlinking relationship between multimedia data, and a data association relationship obtained by external auxiliary means is often inaccurate. Because the data association relationship referenced when constructing the edge-based model acts as a strong constraint, constructing the target model from an insufficiently accurate data association relationship amplifies noise, so that the quality of the hash codes generated by the trained target model is greatly reduced. Therefore, the data association relationship between any two multimedia data may instead be determined after association analysis of the data content of the multimedia data; that is, the data association relationship is introduced through prior knowledge using a weak-constraint method, in which different constraint conditions are added depending on whether the prior knowledge indicates that a data association relationship exists or does not exist. The target model is thus constructed from prior knowledge about whether a data association relationship exists between the multimedia data, and a target model constructed in this way is a prior-knowledge-based model.
In one embodiment, the model structure of the prior-knowledge-based model may be as shown in FIG. 1b. The prior-knowledge-based model also includes an encoder and a decoder. As shown in FIG. 1b, because the sample multimedia data $x_i$ and $x_j$ contain no interlinking relationship between multimedia data, the model does not output a data association relationship after $x_i$ and $x_j$ are input. To introduce the data association relationship, association constraint information is added to the model, for example association constraint 1 and association constraint 2 shown in FIG. 1b. The distributions satisfied by association constraint 1 and association constraint 2 may be as shown in FIG. 1c, where the upper graph corresponds to association constraint 1 and the lower graph to association constraint 2. After the target model is constructed, it can be trained. When the trained target model is used to generate hash codes of input target multimedia data, the hash codes of multimedia data that have a data association relationship are more similar to each other, and the hash codes of multimedia data that have no data association relationship are less similar.
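As an illustration of what "similarity degree of the hash code" can mean at search time, the following minimal sketch ranks candidate items by Hamming distance to a query code. The text does not name a similarity measure, so the use of Hamming distance over binary strings is an assumption, as are the function names `hamming_distance` and `rank_by_similarity` and the example codes.

```python
from typing import Dict, List


def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length binary codes differ."""
    if len(a) != len(b):
        raise ValueError("hash codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def rank_by_similarity(query: str, candidates: Dict[str, str]) -> List[str]:
    """Return candidate names ordered from most to least similar to `query`.

    Smaller Hamming distance means more similar (an assumed metric);
    ties are broken by name so the ordering is deterministic.
    """
    return sorted(
        candidates,
        key=lambda name: (hamming_distance(query, candidates[name]), name),
    )


# Hypothetical hash codes; the values are made up for illustration.
candidates = {"doc_a": "1010", "doc_b": "0100", "doc_c": "1011"}
ranking = rank_by_similarity("1011", candidates)
print(ranking)
```

Under this assumed metric, items whose codes differ from the query in fewer bit positions are returned first, which matches the stated goal that associated items receive more similar codes.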
When an existing model for generating hash codes of multimedia data is constructed, only the data content of individual multimedia data is considered and the data association relationship between different multimedia data is ignored. As a result, the hash codes generated by current models contain only local information about the semantics of the multimedia data and cannot capture global information about the data association relationship between different multimedia data; that is, the quality of the hash codes generated by current models is low. Specifically, FIG. 2 is a schematic flow chart of a data processing method according to an embodiment of the present invention. As shown in FIG. 2, the method may include:
S201, a sample multimedia data set is acquired, the sample multimedia data set including at least two sample multimedia data.
In an embodiment, the sample multimedia data included in the sample multimedia data set are randomly selected multimedia data, which may be of different types or of the same type. It can be understood that the obtained sample multimedia data set may include only video-type multimedia data, only audio-type multimedia data, or only text-type multimedia data, or may combine multimedia data of different types; for example, the sample multimedia data set may include both video-type and text-type multimedia data. The embodiment of the present invention does not limit the types of sample multimedia data included in the obtained sample multimedia data set.
In one embodiment, after the sample multimedia data set is obtained, the data association relationship between any two sample multimedia data in the set can be determined, and a target model can then be constructed according to that relationship. The application provides a general approach to modeling sample multimedia data and their data association relationship jointly: the data association relationship between multimedia data is introduced into the target model based on association analysis of the data content information or the vector distribution feature information of the sample multimedia data.
The data association relationship between any two sample multimedia data indicates whether the sample multimedia data are related, which includes whether the data content of the sample multimedia data is related. For example, the data content of multimedia data under the same theme is related, whereas the data content of multimedia data under themes that differ too greatly is unrelated; specifically, the data content of multimedia data under the selected theme is related to each other, whereas the data content of multimedia data under the selected theme and that of multimedia data under the movie theme are unrelated. Alternatively, the data content of sample multimedia data that have a reference relationship is related, and the data content of sample multimedia data that have no reference relationship is unrelated; for example, if multimedia data A references multimedia data B, the data content of multimedia data A and multimedia data B is considered related. The reference relationship may be a full reference relationship or a partial reference relationship, and may be a direct reference relationship or an indirect reference relationship, and so on.
S202, feature information of each sample multimedia data in the sample multimedia data set is obtained, wherein the feature information comprises data content information or vector distribution feature information.
S203, carrying out relevance analysis on the characteristic information of any two sample multimedia data, and determining the data relevance relationship between any two sample multimedia data in the sample multimedia data set according to the relevance analysis result.
In step S202 and step S203, after obtaining the sample multimedia data set, the server may determine the data association relationship between any two sample multimedia data in the set using different association analysis methods, depending on which feature information of each sample multimedia data it has acquired. If the feature information acquired by the server is vector distribution feature information, the server can perform association analysis according to the distribution of the hash code vectors of the sample multimedia data, and thereby determine the data association relationship between any two sample multimedia data. If the feature information acquired by the server is data content information, the server can analyze the reference relationship in the data content of the multimedia data, and thereby determine the data association relationship between any two sample multimedia data. That is, the data association relationship between any two multimedia data may be determined either from the distribution of the hash code vectors of the sample multimedia data or from the reference relationship between the data content of the sample multimedia data.
In one embodiment, each sample multimedia data corresponds to one hash code vector. If the feature information of each sample multimedia data obtained by the server is vector distribution feature information, the distribution of the hash code vectors of the sample multimedia data in the multimedia data set belongs to a target distribution, and the data association relationship between any two multimedia data is obtained by the server performing association analysis on the vector distribution feature information of each sample multimedia data. This association analysis includes: determining the scalar product of the hash code vectors corresponding to any two sample multimedia data, and comparing the scalar product with a preset scalar product threshold; and taking the result of comparing the scalar product with the preset scalar product threshold as the association analysis result of the two sample multimedia data. If the scalar product of the hash code vectors of any two multimedia data is greater than the preset scalar product threshold, the server determines that the two hash code vectors are associated, and therefore that the corresponding sample multimedia data also have a data association relationship. If the scalar product is less than or equal to the preset scalar product threshold, the server determines that the two hash code vectors are not associated, and therefore that the corresponding sample multimedia data do not have a data association relationship. The target distribution may be, for example, a Gaussian distribution.
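The threshold comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `scalar_product` and `associated_pairs`, the example vectors, and the threshold value are all hypothetical.

```python
from itertools import combinations
from typing import Dict, Sequence, Tuple


def scalar_product(u: Sequence[float], v: Sequence[float]) -> float:
    """Scalar (dot) product of two equal-length hash code vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


def associated_pairs(
    codes: Sequence[Sequence[float]], threshold: float
) -> Dict[Tuple[int, int], bool]:
    """Map each pair of sample indices (i, j), i < j, to True when the
    scalar product of their hash code vectors is strictly greater than
    `threshold`, and to False otherwise (less than or equal)."""
    return {
        (i, j): scalar_product(codes[i], codes[j]) > threshold
        for i, j in combinations(range(len(codes)), 2)
    }


# Hypothetical hash code vectors for three samples (made-up values).
codes = [
    [1.0, 0.0, 1.0, 1.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
]
pairs = associated_pairs(codes, threshold=1.0)
print(pairs)
```

Note the strict inequality: a scalar product exactly equal to the threshold counts as "no association", as the text specifies.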
In one embodiment, if the feature information of each sample multimedia data acquired by the server is data content information, the data association relationship between any two multimedia data is obtained by the server performing association analysis on the data content information of each sample multimedia data. This association analysis includes: performing reference relationship detection on any two sample multimedia data according to their data content information, and taking the result of the reference relationship detection as the association analysis result of the two sample multimedia data. If the detection result indicates that a reference relationship exists between the data content of the two multimedia data, it is determined that a data association relationship exists between the two sample multimedia data; if the detection result indicates that no reference relationship exists between the data content of the two multimedia data, it is determined that no data association relationship exists between the two sample multimedia data.
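Reference relationship detection can be sketched as below, assuming the references of each item have already been extracted from its data content into a mapping from item ID to the IDs it references. That mapping, the function names `reachable` and `reference_associations`, and the `include_indirect` switch are assumptions made for illustration; the text does not specify how references are extracted or represented.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def reachable(references: Dict[str, Set[str]], start: str) -> Set[str]:
    """IDs reachable from `start` by following one or more reference links."""
    seen: Set[str] = set()
    stack = list(references.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(references.get(node, ()))
    return seen


def reference_associations(
    references: Dict[str, Set[str]], include_indirect: bool = False
) -> Dict[Tuple[str, str], bool]:
    """Map each unordered pair of items to True when a reference
    relationship exists between them in either direction."""
    result: Dict[Tuple[str, str], bool] = {}
    for a, b in combinations(sorted(references), 2):
        if include_indirect:
            linked = b in reachable(references, a) or a in reachable(references, b)
        else:
            linked = b in references[a] or a in references[b]
        result[(a, b)] = linked
    return result


# Hypothetical reference lists: A references B, and B references C.
refs = {"A": {"B"}, "B": {"C"}, "C": set(), "D": set()}
direct = reference_associations(refs)
indirect = reference_associations(refs, include_indirect=True)
print(direct[("A", "C")], indirect[("A", "C")])
```

With direct references only, A and C are unassociated; when indirect references are counted, the chain A to B to C makes them associated.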
S204, constructing a target model according to the data association relation, wherein the target model is used for processing the input target multimedia data and generating a target hash code, and the target hash code is used for acquiring feedback multimedia data related to the target multimedia data.
In one embodiment, once the server has determined the data association relationship between any two sample multimedia data, it may construct the target model based on that relationship. The distribution function satisfied by the target model depends on how the server determined the data association relationship. In one embodiment, the distribution function satisfied by the target model is a joint probability distribution function, which is either a first joint probability distribution function or a second joint probability distribution function. When the data association relationship is determined by association analysis of the vector distribution feature information, the target model satisfies the first joint probability distribution function, which may be, for example, as shown in equation 1.1:
$$p_\theta(x_i, x_j, z_i, z_j, e_{ij}) = p_\theta(x_i \mid z_i)\, p_\theta(x_j \mid z_j)\, p_\theta(w_{ij} \mid z_i, z_j)\, p(z_i)\, p(z_j) \tag{1.1}$$
Here, the first joint probability distribution function is constructed so that the hash code vectors guide the generation of both the content of the multimedia data and the data association relationship. x_i and x_j are any two sample multimedia data, and z_i and z_j are the corresponding hash code vectors. p_θ(x_i | z_i) describes the probability that the multimedia data generated from z_i is the sample multimedia data x_i; p_θ(x_j | z_j) describes the probability that the multimedia data generated from z_j is the sample multimedia data x_j; p_θ(w_ij | z_i, z_j) describes the probability that the data association relationship w_ij between the sample multimedia data x_i and x_j is generated from z_i and z_j; and p(z_i) and p(z_j) are the vector distribution feature information.
In one embodiment, the first joint probability distribution function describes the joint probability distribution of the hash code vectors z_i and z_j and the data association relationship e_ij (also written w_ij) between the multimedia data x_i and the multimedia data x_j. The parameter θ in the joint probability distribution function is a model parameter of the target model; training the target model includes training θ, and training is complete when θ reaches its optimal value. In one embodiment, the prior p(z) (that is, p(z_i) and p(z_j)) obeys a standard Gaussian distribution or a Bernoulli distribution, and p_θ(w_ij | z_i, z_j) obeys a Bernoulli distribution. Once the target model described by the first joint probability distribution function has been trained, the data association relationship is introduced into the generation of hash codes: if the multimedia data input into the trained target model have a data association relationship, the hash codes that the trained target model generates for them are also associated.
In one embodiment, if the data association relationship is determined according to the association analysis, the target model satisfies the second joint probability distribution function, which is constructed from prior knowledge obtained by analysing the data association relationships of the multimedia data. To introduce this prior knowledge, the server may construct the second joint probability distribution function from the observed data (the multimedia data x_i and the multimedia data x_j, together with the data association relationship w_ij between them) and the conditional probability distribution p(z_i, z_j | w_ij). The second joint probability distribution function may be, for example, as shown in formula 1.2:
$$p_\theta(x_i, x_j, z_i, z_j \mid w_{ij}) = p_\theta(x_i \mid z_i)\, p_\theta(x_j \mid z_j)\, p(z_i, z_j \mid w_{ij}) \tag{1.2}$$
Here, p_θ(x_i | z_i) again describes the probability that the multimedia data generated from z_i is the sample multimedia data x_i, and p_θ(x_j | z_j) again describes the probability that the multimedia data generated from z_j is the sample multimedia data x_j. p(z_i, z_j | w_ij) indicates the hash code vectors of two multimedia data given the observed data association relationship: the hash code vectors of two multimedia data that have a data association relationship when w_ij = 1, and the hash code vectors of two multimedia data that have none when w_ij = 0. When the target model satisfies the second joint probability distribution function shown in formula 1.2, the data association relationship may be introduced by adding association constraint conditions; the distribution corresponding to the added association constraint conditions may be, for example, as shown in fig. 1c.
After the target model is constructed, it can be trained with the sample multimedia data contained in the sample multimedia data set to obtain the trained target model. The trained target model can then be used to process multimedia data, so that the hash codes of multimedia data that have a data association relationship are more similar and the hash codes of multimedia data that have none differ. To compare the similarity of two hash codes, their Hamming distance in the low-dimensional Hamming space can be determined. The Hamming distance is the number of bit positions at which the two hash codes differ. Specifically, an exclusive-OR operation can be applied to the two hash codes and the number of 1s in the result counted; that number is the Hamming distance between the two hash codes.
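The exclusive-OR computation of the Hamming distance can be sketched as follows. Representing a hash code as a list of 0/1 bits is an assumption made for this illustration, as are the function name and the example codes.

```python
def hamming_distance(code_a, code_b):
    """Number of bit positions at which two equal-length hash codes differ.

    Each hash code is a sequence of 0/1 integers. XOR of two bits is 1
    exactly when the bits differ, so summing the XOR results counts the 1s.
    """
    if len(code_a) != len(code_b):
        raise ValueError("hash codes must have the same length")
    return sum(a ^ b for a, b in zip(code_a, code_b))


# The two codes differ at the second and third positions.
distance = hamming_distance([1, 0, 1, 1, 0], [1, 1, 0, 1, 0])
print(distance)
```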
In one embodiment, when determining the Hamming distance of different hash codes in the low-dimensional space, semantic hashing may be used to map high-dimensional hash code vectors to the low-dimensional Hamming space while preserving the similarity of the vectors in the original space, so that the Hamming distance between vectors in the new space reflects the similarity of the vectors in the original space. Therefore, after the trained target model is obtained, the hash code of each multimedia data can be determined with the trained target model, and the data association relationship between multimedia data can be determined from their hash codes. Recommendation of multimedia data or similarity search can then be performed according to the data association relationship. Similarity search is also called nearest neighbor search; its purpose is to find, in a large-scale database (namely the feedback multimedia database), the target feedback multimedia data most similar to a user's query request, so that the user can quickly find the required multimedia data and both search speed and search precision are improved.
In the embodiment of the invention, after acquiring the sample multimedia data set, the server can determine the data association relationship between any two sample multimedia data in the set either by performing association analysis on the vector distribution feature information of the hash code vectors of the sample multimedia data, or by performing association analysis on the data content of the sample multimedia data. After the data association relationship is determined, a target model can be constructed according to it, so that the target model takes the data association relationship between multimedia data into account. When the trained target model is used to generate the hash code of multimedia data, the generated hash code therefore not only contains the semantic features of the multimedia data itself but also indicates the data association information among different multimedia data. This improves the quality of the hash codes generated by the target model, and improves the search speed and search accuracy when data search is performed based on the hash codes.
Referring to fig. 3, which is a schematic flow chart of a data processing method according to an embodiment of the invention, the method may include:
S301, a sample multimedia data set is acquired, the sample multimedia data set comprising at least two sample multimedia data.
S302, feature information of each sample multimedia data in the sample multimedia data set is obtained, wherein the feature information comprises data content information or vector distribution feature information.
S303, carrying out relevance analysis on the characteristic information of any two sample multimedia data, and determining the data relevance relationship between any two sample multimedia data in the sample multimedia data set according to the relevance analysis result.
In an embodiment, the specific implementation manners of step S301 to step S303 may be referred to the specific implementation manners of step S201 to step S203 in the above embodiment, which are not described herein again.
S304, constructing a target model according to the data association relation, wherein the target model is used for processing the input target multimedia data and generating a target hash code, and the target hash code is used for acquiring multimedia data related to the target multimedia data.
S305, determining a reward function for training the target model.
In steps S304 and S305, after the data association relationship between different sample multimedia data is determined, a target model may be constructed according to the data association relationship and trained, so that the trained target model can process multimedia data and produce the corresponding hash codes. In one embodiment, the joint probability distribution function satisfied by the constructed target model is the first joint probability distribution function shown in formula 1.1. Each sample multimedia data x in the sample multimedia data set may be represented as a set of its data contents; if the sample multimedia data is a document, x may be represented as a set of words. Specifically, x may be represented as shown in formula 1.3:
$$x = \{w_1, w_2, \ldots, w_{|x|}\} \tag{1.3}$$
Here x is any sample multimedia data and w_i represents the i-th data content in x; if x is a sample document, w_i represents the i-th word in the document. Each data content may further be represented as a |V|-dimensional multi-class label vector (for example, a one-hot vector). |x| represents the number of data contents in the sample multimedia data (for example, the number of words in a sample document), and |V| represents the total number of data contents in the sample multimedia data set (for example, the total number of words included in the sample document set). The term p_θ(x | z) in equation 1.1, which describes the probability that the multimedia data generated from the hash code vector is the corresponding sample multimedia data, can then be decomposed as shown in equation 1.4:
Here p_θ(w_i | z) describes the probability that a data content generated from the hash code vector is exactly the data content w_i of the corresponding sample multimedia data x; for a document, it is the probability that a word generated from the hash code vector is a word of the original sample document. The expression of p_θ(w_i | z) is shown in equation 1.5:
where exp denotes the exponential function, W ∈ R^{d×|V|} is a parameter matrix, d is the dimension of the hash code vector (hidden variable) z, and b_i is a model bias term, so that the parameters of the target model are θ = {W, b_1, ..., b_|V|}. By training on and updating with all of the sample multimedia data in the sample multimedia data set, the server can obtain the joint probability distribution function p_θ(x, z) of the hash code vector and the sample multimedia data. Based on p_θ(x, z), the hash code vector z corresponding to each multimedia data x can be determined through the posterior distribution function p_θ(z | x). In one embodiment, the posterior distribution function p_θ(z | x) indicates the probability of each reference hash code vector that the target model generates for the input target multimedia data; based on p_θ(z | x), the trained target model may take the hash code indicated by the reference hash code vector with the maximum probability as the target hash code of the target multimedia data and output it.
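For orientation, one form of equations 1.4 and 1.5 that is consistent with the symbols defined above (the exponential function, the parameter matrix W, the bias terms b_i and the |V|-dimensional one-hot vectors w_i) is given below. It is an assumption based on those definitions, namely a product over the data contents with a softmax over the |V| possible data contents, and is not a reproduction of the expressions of the embodiment. In the denominator, the index j is assumed to range over all |V| possible data contents.

```latex
% Assumed form of equation 1.4: the data contents are generated independently given z.
p_\theta(x \mid z) = \prod_{i=1}^{|x|} p_\theta(w_i \mid z)

% Assumed form of equation 1.5: softmax over the |V| possible data contents,
% with w_i a |V|-dimensional one-hot vector, W \in \mathbb{R}^{d \times |V|}, z \in \mathbb{R}^{d}.
p_\theta(w_i \mid z) = \frac{\exp\left(z^{\top} W w_i + b_i\right)}{\sum_{j=1}^{|V|} \exp\left(z^{\top} W w_j + b_j\right)}
```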
In one embodiment, since the posterior distribution function p_θ(z | x) is difficult to calculate, the server may approximate it using the variational inference principle. Variational inference is a method for handling the intractable integrals that arise in Bayesian inference and machine learning. When the variational lower bound takes its maximum value, the approximation function closest to the posterior distribution function p_θ(z | x) is obtained. The expression of the variational lower bound may be as shown in equation 1.6:
where q denotes the approximation function introduced to approximate the posterior distribution function p_θ(z | x), the expectation in equation 1.6 is taken under the approximation function, and KL is the relative entropy, which indicates the information difference between two probability distributions. The two KL terms represent, respectively, the information difference between the approximation function for z_i and p(z_i) in the first joint probability distribution function, and the information difference between the approximation function for z_j and p(z_j) in the first joint probability distribution function. In one embodiment, based on the variational idea, the approximation function closest to the posterior distribution function is obtained when the variational lower bound shown in equation 1.6 is maximized; that is, training the target model amounts to maximizing the variational lower bound shown in equation 1.6.
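As an illustration of the relative entropy term, the sketch below evaluates the well-known closed form of the KL divergence between a Gaussian with diagonal covariance and a standard Gaussian. It assumes that the approximation function is such a Gaussian and that the prior p(z) is the standard Gaussian mentioned earlier; the function name and example values are hypothetical.

```python
import math


def kl_diag_gaussian_to_standard(mu, sigma):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ) for per-dimension mu and sigma > 0.

    Closed form: 0.5 * sum(sigma^2 + mu^2 - 1 - ln(sigma^2)).
    """
    if len(mu) != len(sigma):
        raise ValueError("mu and sigma must have the same dimension")
    return 0.5 * sum(s * s + m * m - 1.0 - math.log(s * s) for m, s in zip(mu, sigma))


# The divergence is zero when the approximation equals the prior,
# and grows as the mean moves away from zero.
print(kl_diag_gaussian_to_standard([0.0, 0.0], [1.0, 1.0]))
print(kl_diag_gaussian_to_standard([1.0], [1.0]))
```

Maximizing the variational lower bound trades this penalty off against the expected log-likelihood term.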
In one embodiment, if the data association relationship between multimedia data cannot be determined directly, the data association relationship w_ij may be treated as a hidden variable and inferred according to the variational inference principle. When the joint probability distribution function satisfied by the target model is the first joint probability distribution function shown in equation 1.1 and w_ij cannot be determined directly, the expression of the variational lower bound may be as shown in formula 1.7:
where q again denotes the approximation function introduced to approximate the posterior distribution function p_θ(z | x), the expectation operators in formula 1.7 denote expected values under the corresponding approximation functions, and the summation operator sums over the possible values of w_ij. Whether or not w_ij can be determined directly, when the joint probability distribution function satisfied by the target model is the first joint probability distribution function shown in equation 1.1, the reward function for training the target model is shown in formula 1.8:
In formula 1.8, one term is the sum of the variational lower bounds shown in equation 1.6 and the other is the sum of the variational lower bounds shown in equation 1.7. When the joint probability distribution function satisfied by the target model is the first joint probability distribution function shown in equation 1.1, the reward function used to train the target model is therefore the expression given by formula 1.8.
In one embodiment, if the joint probability distribution function satisfied by the constructed target model is the second joint probability distribution function shown in formula 1.2, the server introduces two different association constraints, by a weak constraint method, so that the hash codes of multimedia data that have a data association relationship are as similar as possible. The introduced association constraints may take the form of the prior distribution shown in formula 2.1:
Here p_1(z_i, z_j) and p_0(z_i, z_j) represent the prior distributions when any two sample multimedia data (x_i, x_j) are strongly correlated and weakly correlated, respectively: when w_ij = 1 the two sample multimedia data are strongly correlated (that is, they have a data association relationship), and when w_ij = 0 they are weakly correlated (that is, they have no data association relationship).
In one embodiment, the expression of the prior distribution mentioned above may be specifically shown in the following equation 2.2:
Here p_1(z_i, z_j) is the prior for hash code vectors that have a data association relationship and p_0(z_i, z_j) is the prior for hash code vectors that have none; d is the vector dimension and λ ∈ (0, 1) is the degree of correlation between corresponding dimensions of z_i and z_j. That is, although the prior distribution keeps the dimensions within z_i and within z_j mutually independent, the correlation between corresponding dimensions of z_i and z_j is determined by the coefficient λ. With this prior distribution, corresponding bits of the hash codes of sample multimedia data that have a data association relationship are positively correlated, which introduces the data association relationship between multimedia data into the model. Similarly, when the server invokes the target model to determine the hash code z for input multimedia data x, z is determined from the posterior distribution function p_θ(z | x), which the server again approximates using the variational inference principle. In one embodiment, when the joint probability distribution function of the target model is the second joint probability distribution function shown in equation 1.2, the variational lower bound introduced to obtain the approximation function of the posterior distribution function may be as shown in equation 2.3:
Here q_0 is the approximation function obtained after introducing the association constraint into the second joint probability distribution function for the case where no data association relationship exists between the sample multimedia data, and q_1 is the approximation function for the case where a data association relationship exists; the two expectation operators in equation 2.3 denote expected values under q_1 and under q_0, respectively. Similarly, the approximation functions q_0 and q_1 closest to the posterior distribution function are obtained when the variational lower bound shown in equation 2.3 is maximized. If the data association relationship between multimedia data cannot be determined directly, w_ij may be treated as a hidden variable and inferred according to the variational inference principle. When the joint probability distribution function satisfied by the target model is the second joint probability distribution function shown in equation 1.2 and w_ij cannot be determined directly, the expression of the variational lower bound may be as shown in equation 2.4:
Here the expectation operator denotes the expected value under the approximation function q, and KL is the relative entropy, which indicates the information difference between two probability distributions; the KL term in equation 2.4 therefore represents the information difference between the approximation function and the distribution p(x_i, x_j, w_ij) in the second joint probability distribution function. Whether or not w_ij can be determined directly, when the joint probability distribution function satisfied by the target model is the second joint probability distribution function shown in equation 1.2, the reward function for training the target model is shown in formula 2.5:
Here one term is the sum of the variational lower bounds shown in equation 2.3 and the other is the sum of the variational lower bounds shown in equation 2.4. When the joint probability distribution function satisfied by the target model is the second joint probability distribution function shown in equation 1.2, the reward function used to train the target model is therefore the expression given by formula 2.5. After the server determines the reward function for training the target model, it may train the target model with the sample multimedia data included in the sample multimedia data set in the direction in which the reward function increases; that is, steps S306 and S307 may then be performed.
S306, training the joint probability distribution function and the probability calculation function satisfied by the target model in the direction in which the reward function increases, using the sample multimedia data included in the sample multimedia data set.
S307, training the target model is completed when the function value of the reward function meets a preset threshold.
In step S306 and step S307, the reward function determined by the server may be a reward function as shown in the above formula 1.8 or a reward function as shown in the above formula 2.5, and when the reward function takes a maximum value, the server may determine that the function value of the reward function meets a preset threshold, and complete training of the target model. In one embodiment, after training of the target model is completed, the trained target model may be invoked to generate a plurality of reference hash code vectors of the input target multimedia data, so that a probability value of each reference hash code vector as a theoretical hash code vector of the target multimedia data may be determined according to a probability calculation function, a target probability value meeting a preset condition is selected from the probability values, further, a target hash code vector corresponding to the target probability value may be determined, and a target hash code of the target multimedia data may be determined according to a hash code indicated by the target hash code vector. 
Specifically, when determining the target hash code of the target multimedia data from the hash code indicated by the target hash code vector, the server may binarize the hash code indicated by the target hash code vector and use the binarized result as the target hash code of the target multimedia data. To binarize the target hash code vector z, the server first determines a processing threshold and then compares each bit of z with it: if the bit is greater than the processing threshold, the corresponding position of the target hash code is set to 1; if the bit is less than or equal to the processing threshold, the corresponding position is set to 0. When determining the processing threshold, the server should follow the principle of maximizing entropy. Through binarization, the target hash code vector z is mapped to a hash code containing only 0s and 1s.
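The binarization step can be sketched as follows. The embodiment states only that the processing threshold should follow the principle of maximizing entropy; using the per-dimension median over a collection of hash code vectors is one common way to do so (each bit is then 1 for about half of the collection) and is an assumption of this sketch, as are the function names and example vectors.

```python
from statistics import median


def median_thresholds(vectors):
    """Per-dimension median over a collection of hash code vectors.

    Assumed choice of processing threshold: a bit that is 1 for about half
    of the data carries the most entropy.
    """
    return [median(column) for column in zip(*vectors)]


def binarize(vector, thresholds):
    """Set a position to 1 if its value exceeds the threshold, otherwise to 0."""
    return [1 if value > t else 0 for value, t in zip(vector, thresholds)]


# Hypothetical two-dimensional hash code vectors for four multimedia data.
vectors = [[0.2, -1.0], [0.8, 0.5], [-0.3, 0.1], [1.5, -0.4]]
thresholds = median_thresholds(vectors)
codes = [binarize(v, thresholds) for v in vectors]
print(codes)
```

In this example each bit position is set in exactly two of the four codes.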
In one embodiment, the probability calculation function is an approximation function of the posterior distribution function, and specifically, when determining the probability calculation function, the server may determine, according to a joint probability distribution function satisfied by the target model, a theoretical probability function (i.e., the posterior distribution function) adopted when obtaining a theoretical hash code vector, where the theoretical probability function is used to indicate a probability that multimedia data input into the target model generates the theoretical hash code vector when the joint probability distribution function is satisfied; an approximation function of the theoretical probability function may thus be determined, and the probability calculation function may be determined from the approximation function.
In one embodiment, if the joint probability distribution function satisfied by the target model is the first joint probability distribution function shown in equation 1.1, the server employs an approximation function q to approximate the true posterior distribution p_θ(z_i, z_j | x_i, x_j, e_ij). Since the semantic information of the multimedia data may include the data association relationship between the multimedia data, the expression of the approximation function may be as shown in equation 2.6:
Here each factor of the approximation function is a Gaussian distribution whose parameters are the outputs of the target model when x is the input. For multimedia data x, the hash code output by calling the target model is therefore obtained from these outputs. Training the target model thus means training the model parameters θ together with the model parameters of the approximation function; the optimal values of both are obtained when the variational lower bound is at its maximum.
In one embodiment, if the joint probability distribution function satisfied by the object model is a second joint probability distribution function as shown in equation 1.2, since the server implements the introduction of the data association relationship between multimedia data by applying an association constraint to the hash codes, an approximation function of the posterior distribution function of the second joint probability distribution function is shown in equation 2.7:
Here the two cases of the approximation function in equation 2.7 are the q_1 and q_0 described above. The Gaussian distributions are parameterized by a mean and a diagonal covariance matrix; γ_ij ∈ R^{d×d} is also a diagonal matrix whose entries lie in [0, 1] and indicate whether z_i and z_j are related. All of these parameters are the outputs of the target model for the input (x_i, x_j). In one embodiment, the introduced prior distribution function explicitly makes the hidden variables (that is, the hash code vectors) of sample multimedia data that have a data association relationship as similar as possible, while imposing no association on sample multimedia data that have none. The data association information between sample multimedia data is thereby introduced into the hash codes, and the noise introduced by constructing data associations through auxiliary means is avoided. In this case, the optimal model parameters θ and the optimal model parameters of the approximation function are obtained when the variational lower bound of log p(x_i, x_j | w_ij) is at its maximum.
In one embodiment, once the optimal model parameters θ and the optimal model parameters of the approximation function have been obtained, training of the target model is complete. To assess the quality of the hash codes generated by the trained target model, the trained target model is compared with the best existing models, as shown in table 1:
Table 1
Model                                    8 bits    16 bits   32 bits   64 bits   128 bits
First code (VDSH) model                  0.433     0.6853    0.7108    0.4410    0.5847
Second code (BMSH) model                 0         0.7062    0.7481    0.7519    0.7450
Edge-based target model                  0.7358    0.7982    0.8364    0.8474    0.8491
Target model based on prior knowledge    0.7546    0.8345    0.8563    0.8665    0.8676
As can be seen from table 1, whether the joint probability distribution function of the target model is the first joint probability distribution function (the edge-based target model) or the second joint probability distribution function (the target model based on prior knowledge), the quality of the hash codes generated by the target model is higher than that of the hash codes generated by the existing models at every code length. A target model constructed from the data association relationships between multimedia data can therefore markedly improve the quality of the hash codes.
After the target model is trained, it can be applied to information search, with the search results ordered and output according to their degree of association with the query information. Specifically, the server can acquire a query request and call the trained target model to determine the query hash code corresponding to the query information in the query request. The server can further acquire a feedback multimedia database, which comprises a plurality of feedback multimedia data and the hash code of each feedback multimedia data, where the hash code of each feedback multimedia data is determined in advance by the server calling the target model. After the server has the query hash code and the hash code of each feedback multimedia data, it can select from the feedback multimedia database the target feedback multimedia data that match the query hash code. In one embodiment, the server first calculates the Hamming distance between the query hash code and the hash code of each feedback multimedia data, and then selects from the feedback multimedia database, as the target feedback multimedia data, the feedback multimedia data whose Hamming distance is less than or equal to a preset distance threshold.
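The selection of target feedback multimedia data can be sketched as follows. Storing each hash code as a Python integer, the identifiers `doc_a` to `doc_d`, the example codes and the distance threshold are all assumptions made for this illustration.

```python
def hamming(a, b):
    """Hamming distance between two hash codes stored as integers."""
    return bin(a ^ b).count("1")


def search(query_code, database, max_distance):
    """Return (item_id, distance) pairs within max_distance, nearest first.

    `database` maps an item identifier to its precomputed hash code.
    Ties are broken by identifier so that the order is deterministic.
    """
    hits = []
    for item_id, code in database.items():
        distance = hamming(query_code, code)
        if distance <= max_distance:
            hits.append((item_id, distance))
    return sorted(hits, key=lambda pair: (pair[1], pair[0]))


# Hypothetical 8-bit hash codes of four feedback multimedia data.
database = {
    "doc_a": 0b10110010,
    "doc_b": 0b10110011,
    "doc_c": 0b01001101,
    "doc_d": 0b10100010,
}
results = search(0b10110010, database, max_distance=2)
print(results)
```

Because only exclusive-OR and bit counting are involved, a comparison of this kind is cheap even for large databases.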
In one embodiment, as shown in fig. 4, the trained target model is applied to the field of health query search. The query information in the query request acquired by the server is "what skin allergy is to be noted", marked 40 in fig. 4. After acquiring the query information, the server performs steps s11 to s14, specifically:
s11, the server converts the query information into a target feature vector. The target feature vector may be, for example, a term frequency-inverse document frequency (TF-IDF) feature vector, as used in information retrieval and data mining;
s12, after the target feature vector is obtained, the query hash code of the query information is obtained through the trained target model. The target model called by the server may be an edge-based target model whose joint probability distribution function is the first joint probability distribution function, or a prior-knowledge-based target model whose joint probability distribution function is the second joint probability distribution function. In addition, the server calls the trained target model in advance to determine the hash code of each feedback multimedia data: the feedback multimedia data x is mapped by a function into a hash code vector z, and the hash code vector z is binarized to obtain the corresponding hash code;
s13, determining the Hamming distance between the query hash code and the hash codes of the existing feedback multimedia data. Specifically, after determining the query hash code and the hash code of each feedback multimedia data, the server may perform an exclusive-or operation on the query hash code and the hash code of each feedback multimedia data and count the differing bits to obtain the Hamming distance between the query hash code and the hash code of each feedback multimedia data. The smaller the Hamming distance, the stronger the correlation between the corresponding hash codes, and the stronger the correlation between the multimedia data corresponding to the hash code and the query information;
s14, outputting the feedback multimedia data associated with the query information. After determining the Hamming distance between the query hash code and the hash code of each feedback multimedia data, the server can display the feedback multimedia data in the user interface in order of increasing Hamming distance. Specifically, when the input query information is "what skin allergy is to be noted", the feedback multimedia data output to the user interface for display may be as shown in fig. 5.
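Steps s11 to s14 may be sketched end to end as follows. This is an illustrative sketch only: the tokenization, the smoothed TF-IDF weighting, and in particular `make_stand_in_encoder`, a fixed random projection used here as a hypothetical placeholder for the trained target model, are assumptions of this example and are not specified by the embodiment. Because the placeholder is untrained, the resulting order carries no semantic meaning; the sketch shows only the data flow.

```python
import math
import random
from collections import Counter


def tfidf_vectors(texts):
    """s11: return one TF-IDF vector per text over a shared, sorted vocabulary."""
    tokenized = [text.lower().split() for text in texts]
    vocab = sorted({token for doc in tokenized for token in doc})
    n_docs = len(tokenized)
    doc_freq = Counter(token for doc in tokenized for token in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        length = max(len(doc), 1)
        vectors.append([
            (counts[token] / length)
            * (math.log((1 + n_docs) / (1 + doc_freq[token])) + 1.0)
            for token in vocab
        ])
    return vectors


def make_stand_in_encoder(dim, n_bits, seed=0):
    """s12: hypothetical stand-in for the trained target model (random projection)."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

    def encode(vector):
        code = 0
        for row in weights:
            z = sum(w * x for w, x in zip(row, vector))  # one entry of vector z
            code = (code << 1) | (1 if z > 0 else 0)     # binarize, pack into an int
        return code

    return encode


def hamming(code_a, code_b):
    """s13: Hamming distance as the number of set bits in the XOR of two codes."""
    return bin(code_a ^ code_b).count("1")


feedback_texts = [
    "skin allergy care and diet advice",
    "treating a sprained ankle at home",
    "common skin allergy triggers to avoid",
]
query_text = "what skin allergy is to be noted"

# Simplification: the query is vectorized together with the feedback texts
# so that all vectors share one vocabulary.
vectors = tfidf_vectors(feedback_texts + [query_text])
encode = make_stand_in_encoder(dim=len(vectors[0]), n_bits=16)
feedback_codes = [encode(v) for v in vectors[:-1]]
query_code = encode(vectors[-1])

# s14: order feedback items by increasing Hamming distance to the query.
ranking = sorted(
    range(len(feedback_texts)),
    key=lambda i: hamming(query_code, feedback_codes[i]),
)
```

Packing each hash code into an integer lets the exclusive-or and bit count of step s13 run as two cheap operations per feedback item, which is the property that makes hash-code search fast.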
In one embodiment, to evaluate the search accuracy and search speed of the target model, the information search results of the target model proposed in the embodiment of the present invention are compared with those of an existing language model; the comparison is shown in Table 2. The existing language model may be, for example, a BERT (Bidirectional Encoder Representations from Transformers) model.
TABLE 2

| Model | Precision | Response speed | Characteristic quantity |
|---|---|---|---|
| BERT model | 0.946 | 200 ms | 11 billion |
| Edge-based target model | 0.9161 | 1.2 ms | 0.05 million |
| Prior-knowledge-based target model | 0.9226 | 1.3 ms | 0.05 million |
As shown in Table 2, the target model has slightly lower search accuracy than the existing model, but its search speed is about 1000 times that of the existing model, which greatly improves the data processing capacity of the server.
In the embodiment of the invention, after the server determines the data association relationship between any two pieces of multimedia data in the sample multimedia data set, it can construct a target model based on that relationship. After the target model is constructed, the server can determine different reward functions according to the joint probability distribution function satisfied by the target model and train the target model accordingly to obtain a trained target model. The hash codes that the server determines for input multimedia data by calling the trained target model are of higher quality. Because the data association relationship between multimedia data is introduced into the hash codes, the hash codes that the trained target model generates for similar multimedia data are themselves more similar, so that the hash codes reflect the relevance between multimedia data.
Based on the above description of the embodiments of the data processing method, the embodiments of the present invention also provide a data processing apparatus, which may be a computer program (including program code) running in the above server. The data processing apparatus may be used to perform the data processing method described in fig. 2 and fig. 3. Referring to fig. 6, the data processing apparatus includes: an acquisition unit 601, a determining unit 602, and a construction unit 603.
An acquisition unit 601, configured to acquire a sample multimedia data set, the sample multimedia data set comprising at least two sample multimedia data;
The acquisition unit 601 is further configured to acquire feature information of each sample multimedia data in the sample multimedia data set, where the feature information includes data content information or vector distribution feature information;
A determining unit 602, configured to perform relevance analysis on the feature information of any two sample multimedia data, and determine a data association relationship between any two sample multimedia data in the sample multimedia data set according to the relevance analysis result;
A construction unit 603, configured to construct a target model according to the data association relationship, where the target model is configured to process input target multimedia data and generate a target hash code, and the target hash code is used to obtain multimedia data related to the target multimedia data.
In one embodiment, one sample multimedia data corresponds to one hash code vector. If the characteristic information is vector distribution characteristic information, and the distribution of the hash code vectors of the sample multimedia data in the multimedia data set belongs to a target distribution, the determining unit 602 is specifically configured to:
Determining the scalar product (dot product) of the hash code vectors corresponding to any two sample multimedia data, and comparing the scalar product with a preset scalar product threshold;
and taking the result of comparing the scalar product with the preset scalar product threshold as the correlation analysis result of the two sample multimedia data.
In one embodiment, the determining unit 602 is specifically configured to:
If the scalar product is larger than the preset scalar product threshold, determining that the data association relationship exists between the two sample multimedia data;
And if the scalar product is smaller than or equal to the preset scalar product threshold, determining that the data association relationship does not exist between the two sample multimedia data.
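By way of example only, the comparison of the product of two hash code vectors against a preset threshold may be sketched as below, with the product computed as an ordinary dot product. The sample vectors, the threshold value of 1.0, and the function names are assumptions introduced for illustration.

```python
from itertools import combinations


def scalar_product(u, v):
    """Dot product of two equal-length hash code vectors."""
    return sum(a * b for a, b in zip(u, v))


def associated_pairs(hash_vectors, threshold):
    """Return index pairs (i, j), i < j, whose product strictly exceeds threshold."""
    return [
        (i, j)
        for (i, u), (j, v) in combinations(enumerate(hash_vectors), 2)
        if scalar_product(u, v) > threshold
    ]


# Hypothetical hash code vectors for three sample multimedia data.
sample_vectors = [
    [0.9, 0.1, 0.8],
    [0.8, 0.2, 0.9],
    [0.1, 0.9, 0.0],
]
pairs = associated_pairs(sample_vectors, threshold=1.0)
print(pairs)
```

A pair whose product is exactly equal to the threshold is treated as not associated, matching the "smaller than or equal to" branch described above.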
In one embodiment, if the characteristic information is data content information, the determining unit 602 is specifically configured to:
detecting the reference relation of any two sample multimedia data according to the data content information of the any two sample multimedia data;
And taking the result of the reference relation detection as a correlation analysis result of the arbitrary two sample multimedia data.
In one embodiment, the determining unit 602 is specifically configured to:
If the reference relation detection result indicates that the reference relation exists in the data content of any two pieces of multimedia data, determining that the data association relation exists between any two pieces of sample multimedia data;
and if the reference relation detection result indicates that the reference relation does not exist in the data content of any two pieces of multimedia data, determining that the data association relation does not exist between any two pieces of sample multimedia data.
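The reference relation detection may be illustrated with the following sketch. It assumes, purely for the purpose of the example, that each sample multimedia data is a record with an `id` field and a `references` list naming the records it cites; the embodiment does not prescribe this data layout.

```python
from itertools import combinations


def has_reference_relation(doc_a, doc_b):
    """True if either record lists the other's id among its references."""
    return doc_b["id"] in doc_a["references"] or doc_a["id"] in doc_b["references"]


def build_associations(documents):
    """Return id pairs for which a data association relationship is determined."""
    return [
        (doc_a["id"], doc_b["id"])
        for doc_a, doc_b in combinations(documents, 2)
        if has_reference_relation(doc_a, doc_b)
    ]


# Hypothetical sample records; field names are assumptions of this sketch.
samples = [
    {"id": "p1", "references": ["p2"]},
    {"id": "p2", "references": []},
    {"id": "p3", "references": []},
]
associations = build_associations(samples)
print(associations)
```

Here the relation is treated as symmetric: a reference in either direction is enough to determine that the two sample multimedia data are associated.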
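In one embodiment, the constructed target model satisfies a joint probability distribution function, which is either a first joint probability distribution function or a second joint probability distribution function;
if the data association relationship is determined according to the association analysis of the vector distribution characteristic information, the target model satisfies the first joint probability distribution function, and if the data association relationship is determined according to the association analysis of the data content information, the target model satisfies the second joint probability distribution function.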
In one embodiment, the apparatus further comprises: training unit 604.
The determining unit 602 is further configured to determine a reward function for training the target model;
A training unit 604, configured to use the sample multimedia data contained in the sample multimedia data set to train the joint probability distribution function and the probability calculation function satisfied by the target model in the direction in which the reward function increases;
the training unit 604 is further configured to complete training of the target model when the function value of the reward function meets a preset threshold.
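The training procedure, in which parameters are updated in the direction that increases the reward function until its value meets a preset threshold, may be illustrated by the toy sketch below. The one-parameter quadratic reward, its gradient, the learning rate, and the threshold are all hypothetical stand-ins chosen so the example is self-contained; the actual reward function depends on the joint probability distribution function satisfied by the target model.

```python
def train_by_reward_ascent(reward, reward_grad, theta, lr=0.1,
                           reward_threshold=-1e-4, max_steps=1000):
    """Gradient ascent on the reward; stop once the reward meets the threshold."""
    for step in range(max_steps):
        if reward(theta) >= reward_threshold:
            return theta, step
        theta = theta + lr * reward_grad(theta)  # move in the reward-increasing direction
    return theta, max_steps


def toy_reward(theta):
    """Toy reward with its maximum (zero) at theta = 3."""
    return -(theta - 3.0) ** 2


def toy_reward_grad(theta):
    """Derivative of toy_reward with respect to theta."""
    return -2.0 * (theta - 3.0)


theta, steps = train_by_reward_ascent(toy_reward, toy_reward_grad, theta=0.0)
```

The `max_steps` guard is an addition of this sketch so that the loop terminates even if the threshold is never reached.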
In one embodiment, the apparatus further comprises: a generating unit 605.
A generating unit 605 for generating a plurality of reference hash code vectors of the input target multimedia data by calling the trained target model;
The determining unit 602 is further configured to determine, according to a probability calculation function, a probability value of each reference hash code vector being a theoretical hash code vector of the target multimedia data, and select a target probability value satisfying a preset condition from the probability values;
The determining unit 602 is further configured to determine a target hash code vector corresponding to the target probability value, and determine a target hash code of the multimedia data input to the target model according to the hash code indicated by the target hash code vector.
In one embodiment, the determining unit 602 is further configured to determine, according to a joint probability distribution function satisfied by the target model, a theoretical probability function adopted when the theoretical hash code vector is obtained, where the theoretical probability function is used to indicate a probability that the target model generates the theoretical hash code vector when the joint probability distribution function is satisfied;
The determining unit 602 is further configured to determine an approximation function of the theoretical probability function, and determine the probability calculation function according to the approximation function.
In one embodiment, the determining unit 602 is specifically configured to:
And carrying out binarization processing on the hash code indicated by the target hash code vector, and taking the hash code subjected to binarization processing as a target hash code of the target multimedia data.
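The selection of a target hash code from several reference hash code vectors may be sketched as follows. The sketch assumes that the "preset condition" is "highest probability value" and that binarization thresholds each component at 0.5; both are assumptions of this example, as are the candidate vectors and the made-up probability values standing in for the probability calculation function.

```python
def select_target_hash_code(candidates, probability_fn, threshold=0.5):
    """Pick the candidate with the highest probability value, then binarize it."""
    target_vector = max(candidates, key=probability_fn)
    return [1 if value > threshold else 0 for value in target_vector]


# Hypothetical reference hash code vectors generated for one input.
candidates = [
    (0.9, 0.2, 0.7, 0.1),
    (0.6, 0.4, 0.5, 0.5),
    (0.1, 0.8, 0.3, 0.9),
]
# Made-up probability values standing in for the probability calculation function.
toy_probabilities = {
    candidates[0]: 0.55,
    candidates[1]: 0.15,
    candidates[2]: 0.30,
}
target_code = select_target_hash_code(candidates, toy_probabilities.get)
print(target_code)
```

Other preset conditions, such as accepting the first candidate whose probability value exceeds a fixed level, would change only the selection line.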
In one embodiment, the apparatus further comprises: a selection unit 606.
The acquisition unit 601 is further configured to acquire a query request, and invoke the trained target model to determine a query hash code corresponding to query information in the query request;
The acquisition unit 601 is further configured to acquire a feedback multimedia database, where the feedback multimedia database includes a plurality of feedback multimedia data and a hash code of each feedback multimedia data;
and a selecting unit 606, configured to select, according to the query hash code and the hash code of each feedback multimedia data, target feedback multimedia data matching the query hash code from the feedback multimedia database.
In one embodiment, the selecting unit 606 is specifically configured to:
calculating the Hamming distance between the query hash code and the hash code of each feedback multimedia data;
and selecting feedback multimedia data with the hamming distance smaller than or equal to a preset distance threshold from the feedback multimedia database as target feedback multimedia data.
In the embodiment of the present invention, after the acquisition unit 601 acquires the sample multimedia data set, the determining unit 602 may determine the data association relationship between any two sample multimedia data in the sample multimedia data set either by performing association analysis on the vector distribution characteristic information of the hash code vectors of the sample multimedia data, or by performing association analysis on the data content of the sample multimedia data. Once the data association relationship has been determined, the construction unit 603 may construct the target model according to it, so that the target model takes into account the data association relationship between multimedia data. As a result, a hash code generated for multimedia data by the trained target model not only contains the semantic features of the multimedia data itself but also indicates the data association information between different multimedia data, which improves the accuracy of the generated hash codes and the quality of hash-code-based search.
Fig. 7 is a schematic block diagram of a server according to an embodiment of the present invention. The server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms. The terminal may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, etc. The server in the present embodiment, as shown in fig. 7, may include: one or more processors 701, one or more input devices 702, one or more output devices 703, and a memory 704. The processor 701, the input device 702, the output device 703, and the memory 704 are connected by a bus 705. The memory 704 is used for storing a computer program comprising program instructions, and the processor 701 is used for executing the program instructions stored in the memory 704.
The memory 704 may include volatile memory, such as random-access memory (RAM); the memory 704 may also include non-volatile memory, such as flash memory or a solid-state drive (SSD); the memory 704 may also include a combination of the above types of memory.
The processor 701 may be a central processing unit (CPU). The processor 701 may further comprise a hardware chip. The hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or the like. The PLD may be a field-programmable gate array (FPGA), generic array logic (GAL), or the like. The processor 701 may also be a combination of the above structures.
In an embodiment of the present invention, the memory 704 is configured to store a computer program, where the computer program includes program instructions, and the processor 701 is configured to execute the program instructions stored in the memory 704, to implement the steps of the corresponding method shown in fig. 2 and 3.
In one embodiment, the processor 701 is configured to call the program instructions for executing:
Acquiring a sample multimedia data set, wherein the sample multimedia data set comprises at least two sample multimedia data;
Acquiring characteristic information of each sample multimedia data in the sample multimedia data set, wherein the characteristic information comprises data content information or vector distribution characteristic information;
carrying out relevance analysis on the characteristic information of any two sample multimedia data, and determining the data relevance relation between any two sample multimedia data in the sample multimedia data set according to the relevance analysis result;
And constructing a target model according to the data association relation, wherein the target model is used for processing the input target multimedia data and generating a target hash code, and the target hash code is used for acquiring multimedia data related to the target multimedia data.
In one embodiment, one sample multimedia data corresponds to one hash code vector, and if the feature information is vector distribution feature information, the distribution of the hash code vector of each sample multimedia data in the multimedia data set belongs to target distribution; the processor 701 is configured to call the program instructions for executing:
Determining the scalar product (dot product) of the hash code vectors corresponding to any two sample multimedia data, and comparing the scalar product with a preset scalar product threshold;
and taking the result of comparing the scalar product with the preset scalar product threshold as the correlation analysis result of the two sample multimedia data.
In one embodiment, the processor 701 is configured to call the program instructions for executing:
If the scalar product is larger than the preset scalar product threshold, determining that the data association relationship exists between the two sample multimedia data;
And if the scalar product is smaller than or equal to the preset scalar product threshold, determining that the data association relationship does not exist between the two sample multimedia data.
In one embodiment, if the characteristic information is data content information, the processor 701 is configured to call the program instructions to execute:
detecting the reference relation of any two sample multimedia data according to the data content information of the any two sample multimedia data;
And taking the result of the reference relation detection as a correlation analysis result of the arbitrary two sample multimedia data.
In one embodiment, the processor 701 is configured to call the program instructions for executing:
If the reference relation detection result indicates that the reference relation exists in the data content of any two pieces of multimedia data, determining that the data association relation exists between any two pieces of sample multimedia data;
and if the reference relation detection result indicates that the reference relation does not exist in the data content of any two pieces of multimedia data, determining that the data association relation does not exist between any two pieces of sample multimedia data.
In one embodiment, the constructed target model satisfies a joint probability distribution function, which is either a first joint probability distribution function or a second joint probability distribution function;
And if the data association relation is determined according to the association analysis of the vector distribution characteristic information, the target model meets the first joint probability distribution function, and if the data association relation is determined according to the association analysis of the data content information, the target model meets the second joint probability distribution function.
In one embodiment, the processor 701 is configured to call the program instructions for executing:
Determining a reward function for training the target model;
Training a joint probability distribution function and a probability calculation function which are met by the target model according to the increasing direction of the reward function by adopting sample multimedia data contained in the sample multimedia data set;
and finishing training the target model when the function value of the reward function meets a preset threshold.
In one embodiment, the processor 701 is configured to call the program instructions for executing:
Calling the trained target model to generate a plurality of reference hash code vectors of the input target multimedia data;
Determining, according to a probability calculation function, the probability value that each reference hash code vector is the theoretical hash code vector of the target multimedia data, and selecting a target probability value meeting a preset condition from the probability values;
and determining a target hash code vector corresponding to the target probability value, and determining the target hash code of the target multimedia data according to the hash code indicated by the target hash code vector.
In one embodiment, the processor 701 is configured to call the program instructions for executing:
Determining a theoretical probability function adopted when the theoretical hash code vector is obtained according to a joint probability distribution function satisfied by the target model, wherein the theoretical probability function is used for indicating the probability that the target model generates the theoretical hash code vector when the joint probability distribution function is satisfied;
An approximation function of the theoretical probability function is determined, and the probability calculation function is determined according to the approximation function.
In one embodiment, the processor 701 is configured to call the program instructions for executing:
And carrying out binarization processing on the hash code indicated by the target hash code vector, and taking the hash code subjected to binarization processing as a target hash code of the target multimedia data.
In one embodiment, the processor 701 is configured to call the program instructions for executing:
Acquiring a query request, and calling a trained target model to determine a query hash code corresponding to query information in the query request;
Acquiring a feedback multimedia database, wherein the feedback multimedia database comprises a plurality of feedback multimedia data and hash codes of each feedback multimedia data;
And selecting target feedback multimedia data matched with the query hash codes from the feedback multimedia database according to the query hash codes and the hash codes of each feedback multimedia data.
In one embodiment, the processor 701 is configured to call the program instructions for executing:
calculating the Hamming distance between the query hash code and the hash code of each feedback multimedia data;
and selecting feedback multimedia data with the hamming distance smaller than or equal to a preset distance threshold from the feedback multimedia database as target feedback multimedia data.
Embodiments of the present invention provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium and executes the computer instructions to cause the computer device to perform the method embodiments described above as shown in fig. 2 or fig. 3. The computer readable storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), or the like.
The foregoing disclosure is merely illustrative of some embodiments of the present invention and it is not to be construed as limiting the scope of the invention, as a person of ordinary skill in the art will appreciate that all or part of the above-described embodiments may be practiced with equivalent variations which fall within the scope of the invention as defined in the appended claims.

Claims (15)

1. A method of data processing, comprising:
Acquiring a sample multimedia data set, wherein the sample multimedia data set comprises at least two sample multimedia data;
Acquiring characteristic information of each sample multimedia data in the sample multimedia data set, wherein the characteristic information comprises data content information or vector distribution characteristic information; the vector distribution characteristic information is used for indicating the distribution condition of hash code vectors corresponding to each sample multimedia data;
Carrying out relevance analysis on the characteristic information of any two sample multimedia data, and determining the data relevance relation between any two sample multimedia data in the sample multimedia data set according to the relevance analysis result; if the characteristic information is the data content information, performing correlation analysis on the characteristic information of any two sample multimedia data includes: detecting the reference relation of any two sample multimedia data according to the data content information of the any two sample multimedia data; taking the result of the reference relation detection as a correlation analysis result of the arbitrary two sample multimedia data; if the characteristic information is the vector distribution characteristic information, performing relevance analysis on the characteristic information of any two sample multimedia data according to the distribution condition of hash code vectors corresponding to the sample multimedia data;
And constructing a target model according to the data association relation, wherein the target model is used for processing the input target multimedia data and generating a target hash code, and the target hash code is used for acquiring multimedia data related to the target multimedia data.
2. The method of claim 1, wherein one sample multimedia data corresponds to one hash code vector, and if the feature information is vector distribution feature information, the distribution of the hash code vector of each sample multimedia data in the multimedia data set belongs to a target distribution; the performing association analysis on the characteristic information of any two sample multimedia data comprises the following steps:
Determining the scalar product (dot product) of the hash code vectors corresponding to any two sample multimedia data, and comparing the scalar product with a preset scalar product threshold;
and taking the result of comparing the scalar product with the preset scalar product threshold as the correlation analysis result of the two sample multimedia data.
3. The method according to claim 2, wherein determining the data association relationship between any two sample multimedia data in the sample multimedia data set according to the association analysis result comprises:
If the scalar product is larger than the preset scalar product threshold, determining that the data association relationship exists between the two sample multimedia data;
And if the scalar product is smaller than or equal to the preset scalar product threshold, determining that the data association relationship does not exist between the two sample multimedia data.
4. The method according to claim 1, wherein determining the data association relationship between any two sample multimedia data in the sample multimedia data set according to the association analysis result comprises:
If the reference relation detection result indicates that the reference relation exists in the data content of any two pieces of multimedia data, determining that the data association relation exists between any two pieces of sample multimedia data;
and if the reference relation detection result indicates that the reference relation does not exist in the data content of any two pieces of multimedia data, determining that the data association relation does not exist between any two pieces of sample multimedia data.
5. The method of claim 1, wherein the constructed target model satisfies a joint probability distribution function, the joint probability distribution function being either a first joint probability distribution function or a second joint probability distribution function;
if the data association relationship is determined according to the association analysis of the vector distribution characteristic information, the target model satisfies the first joint probability distribution function, and if the data association relationship is determined according to the association analysis of the data content information, the target model satisfies the second joint probability distribution function.
6. The method of claim 1, wherein after the constructing the object model according to the data association relationship, the method further comprises:
Determining a reward function for training the target model;
Training a joint probability distribution function and a probability calculation function which are met by the target model according to the increasing direction of the reward function by adopting sample multimedia data contained in the sample multimedia data set;
and finishing training the target model when the function value of the reward function meets a preset threshold.
7. The method of claim 6, wherein the method further comprises:
Calling the trained target model to generate a plurality of reference hash code vectors of the input target multimedia data;
Determining, according to a probability calculation function, the probability value that each reference hash code vector is the theoretical hash code vector of the target multimedia data, and selecting a target probability value meeting a preset condition from the probability values;
and determining a target hash code vector corresponding to the target probability value, and determining the target hash code of the target multimedia data according to the hash code indicated by the target hash code vector.
8. The method of claim 7, wherein the method further comprises:
Determining a theoretical probability function adopted when the theoretical hash code vector is obtained according to a joint probability distribution function satisfied by the target model, wherein the theoretical probability function is used for indicating the probability that the target model generates the theoretical hash code vector when the joint probability distribution function is satisfied;
An approximation function of the theoretical probability function is determined, and the probability calculation function is determined according to the approximation function.
9. The method of claim 7, wherein the determining the target hash code of the target multimedia data from the hash code indicated by the target hash code vector comprises:
And carrying out binarization processing on the hash code indicated by the target hash code vector, and taking the hash code subjected to binarization processing as a target hash code of the target multimedia data.
10. The method of claim 6, wherein the method further comprises:
Acquiring a query request, and calling a trained target model to determine a query hash code corresponding to query information in the query request;
Acquiring a feedback multimedia database, wherein the feedback multimedia database comprises a plurality of feedback multimedia data and hash codes of each feedback multimedia data;
And selecting target feedback multimedia data matched with the query hash codes from the feedback multimedia database according to the query hash codes and the hash codes of each feedback multimedia data.
11. The method of claim 1, wherein selecting the target feedback multimedia data from the feedback multimedia database that matches the query hash code based on the query hash code and the hash code for each feedback multimedia data, comprises:
calculating the Hamming distance between the query hash code and the hash code of each feedback multimedia data;
and selecting feedback multimedia data with the hamming distance smaller than or equal to a preset distance threshold from the feedback multimedia database as target feedback multimedia data.
12. A data processing apparatus, comprising:
an acquisition unit for acquiring a sample multimedia data set comprising at least two sample multimedia data;
The acquisition unit is further configured to acquire feature information of each sample multimedia data in the sample multimedia data set, where the feature information includes data content information or vector distribution feature information; the vector distribution characteristic information is used for indicating the distribution condition of hash code vectors corresponding to each sample multimedia data;
The determining unit is used for carrying out relevance analysis on the characteristic information of any two sample multimedia data and determining the data relevance relation between any two sample multimedia data in the sample multimedia data set according to the relevance analysis result; if the characteristic information is the data content information, performing correlation analysis on the characteristic information of any two sample multimedia data includes: detecting the reference relation of any two sample multimedia data according to the data content information of the any two sample multimedia data; taking the result of the reference relation detection as a correlation analysis result of the arbitrary two sample multimedia data; if the characteristic information is the vector distribution characteristic information, performing relevance analysis on the characteristic information of any two sample multimedia data according to the distribution condition of hash code vectors corresponding to the sample multimedia data;
The building unit is used for building a target model according to the data association relation, wherein the target model is used for processing the input target multimedia data and generating a target hash code, and the target hash code is used for acquiring multimedia data related to the target multimedia data.
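When the feature information is data content information, the determining unit of claim 12 derives the data association relation from reference relation detection. A minimal sketch of that step is given below, assuming each sample carries the set of sample IDs it references and that two samples are associated if either references the other. This representation and the name `build_association` are hypothetical and not specified by the claim.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def build_association(references: Dict[str, Set[str]]) -> Dict[Tuple[str, str], bool]:
    """For every unordered pair of samples, record whether a reference relation exists.

    A pair is treated as associated if either sample references the other
    (an assumption; the claim does not fix the direction of the relation).
    """
    relation: Dict[Tuple[str, str], bool] = {}
    for a, b in combinations(sorted(references), 2):
        relation[(a, b)] = b in references[a] or a in references[b]
    return relation


# Hypothetical sample set: sample ID -> IDs referenced in its content.
sample_refs = {"paper_1": {"paper_2"}, "paper_2": set(), "paper_3": set()}
association = build_association(sample_refs)
print(association)
```

The resulting pairwise relation is the kind of input from which a target model could then be built, as the building unit describes.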
13. A server comprising a processor, an input device, an output device and a memory, the processor, the input device, the output device and the memory being interconnected, wherein the memory is adapted to store a computer program comprising program instructions, the processor being configured to invoke the program instructions to perform the method of any of claims 1-11.
14. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program comprising program instructions which, when executed by a processor, cause the processor to perform the method of any of claims 1-11.
15. A computer program product, characterized in that the computer program product comprises a computer program or computer instructions for executing the method according to any of claims 1-11 by a processor.
CN202011153160.1A 2020-10-23 2020-10-23 Data processing method, device, server and storage medium Active CN112148902B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011153160.1A CN112148902B (en) 2020-10-23 2020-10-23 Data processing method, device, server and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011153160.1A CN112148902B (en) 2020-10-23 2020-10-23 Data processing method, device, server and storage medium

Publications (2)

Publication Number Publication Date
CN112148902A (en) 2020-12-29
CN112148902B (en) 2024-08-06

Family

ID=73954943

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011153160.1A Active CN112148902B (en) 2020-10-23 2020-10-23 Data processing method, device, server and storage medium

Country Status (1)

Country Link
CN (1) CN112148902B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112800253B (en) * 2021-04-09 2021-07-06 腾讯科技(深圳)有限公司 Data clustering method, related device and storage medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110209867A (en) * 2019-06-05 2019-09-06 腾讯科技(深圳)有限公司 Training method, device, equipment and the storage medium of image encrypting algorithm

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102431737B1 (en) * 2017-02-28 2022-08-11 삼성전자주식회사 Method of searching highlight in multimedia data and apparatus therof
RU2673385C9 (en) * 2017-05-26 2018-12-24 Максим Львович Лихвинцев Method of data exchange recording control in information – telecommunication network and identification system of electron mail
US20190332921A1 (en) * 2018-04-13 2019-10-31 Vosai, Inc. Decentralized storage structures and methods for artificial intelligence systems
CN111026887B (en) * 2019-12-09 2023-05-23 武汉科技大学 Cross-media retrieval method and system

Also Published As

Publication number Publication date
CN112148902A (en) 2020-12-29

Similar Documents

Publication Publication Date Title
US11227118B2 (en) Methods, devices, and systems for constructing intelligent knowledge base
CN111539197A (en) Text matching method and device, computer system and readable storage medium
CN113157863A (en) Question and answer data processing method and device, computer equipment and storage medium
CN109376222A (en) Question and answer matching degree calculation method, question and answer automatic matching method and device
CN111783903B (en) Text processing method, text model processing method and device and computer equipment
CN114861889B (en) Deep learning model training method, target object detection method and device
CN110210038B (en) Core entity determining method, system, server and computer readable medium thereof
CN113204618A (en) Information identification method, device and equipment based on semantic enhancement and storage medium
CN114547257B (en) Class matching method and device, computer equipment and storage medium
CN112070550A (en) Keyword determination method, device and equipment based on search platform and storage medium
CN113434639A (en) Audit data processing method and device
CN114880991B (en) Knowledge graph question-answering question-sentence entity linking method, device, equipment and medium
CN114528588A (en) Cross-modal privacy semantic representation method, device, equipment and storage medium
CN117312535B (en) Method, device, equipment and medium for processing problem data based on artificial intelligence
CN113591490B (en) Information processing method and device and electronic equipment
CN109086386B (en) Data processing method, device, computer equipment and storage medium
CN112148902B (en) Data processing method, device, server and storage medium
CN112307738B (en) Method and device for processing text
CN109117471B (en) Word relevancy calculation method and terminal
CN112749557A (en) Text processing model construction method and text processing method
CN114528908B (en) Network request data classification model training method, classification method and storage medium
CN115525781A (en) Multi-mode false information detection method, device and equipment
CN113779370A (en) Address retrieval method and device
CN118093885B (en) Data processing method, device and equipment, medium and product
US11836449B2 (en) Information processing device and information processing method for judging the semantic relationship between words and sentences

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40034949; Country of ref document: HK)
SE01 Entry into force of request for substantive examination
GR01 Patent grant