CN112148902A - Data processing method, device, server and storage medium - Google Patents


Info

Publication number
CN112148902A
Authority
CN
China
Prior art keywords
multimedia data
data
hash code
target
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011153160.1A
Other languages
Chinese (zh)
Inventor
欧子菁
赵瑞辉
林民龙
苏勤亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202011153160.1A
Publication of CN112148902A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/48 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/43 Querying
    • G06F16/438 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/44 Browsing; Visualisation therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the invention disclose a data processing method, a data processing apparatus, a server and a storage medium. The method comprises: obtaining a sample multimedia data set that comprises at least two sample multimedia data; acquiring characteristic information of each sample multimedia data in the set, the characteristic information comprising data content information or vector distribution characteristic information; performing relevance analysis on the characteristic information of any two sample multimedia data, and determining the data association relationship between any two sample multimedia data in the set according to the relevance analysis result; and constructing a target model according to the data association relationship, wherein the target model is used for processing input target multimedia data and generating a target hash code, and the target hash code is used for acquiring multimedia data related to the target multimedia data. In this way a new target model can be constructed, and using it can improve the speed and accuracy of data searching.

Description

Data processing method, device, server and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a data processing method, an apparatus, a server, and a storage medium.
Background
With the continued development of internet technology, a trained model is usually used to support a user's information search (or data search), so that multimedia data meeting the search conditions can be output to the user based on the user's search information (or query information). The models currently used for this purpose perform data analysis on the multimedia data, determine from the analysis result whether each multimedia data meets the search conditions, and output the multimedia data that does. However, as the amount of multimedia data grows rapidly, the pressure of analyzing it keeps increasing, which reduces both the response speed and the accuracy of information search. How to construct a new model that achieves higher accuracy and search speed in information search has therefore become a current research hotspot.
Disclosure of Invention
The embodiment of the invention provides a data processing method, a data processing device, a server and a storage medium, which can be used for constructing a new target model, and the speed and the accuracy of data search can be improved by adopting the new target model.
In one aspect, an embodiment of the present invention provides a data processing method, including:
obtaining a sample multimedia data set, the sample multimedia data set comprising at least two sample multimedia data;
acquiring characteristic information of each sample multimedia data in the sample multimedia data set, wherein the characteristic information comprises data content information or vector distribution characteristic information;
performing relevance analysis on the characteristic information of any two sample multimedia data, and determining the data relevance relationship between any two sample multimedia data in the sample multimedia data set according to the relevance analysis result;
and constructing a target model according to the data association relation, wherein the target model is used for processing input target multimedia data and generating a target hash code, and the target hash code is used for acquiring multimedia data related to the target multimedia data.
In another aspect, an embodiment of the present invention provides a data processing apparatus, including:
an obtaining unit, configured to obtain a sample multimedia data set, where the sample multimedia data set includes at least two sample multimedia data;
the obtaining unit being further configured to acquire characteristic information of each sample multimedia data in the sample multimedia data set, where the characteristic information includes data content information or vector distribution characteristic information;
a determining unit, configured to perform relevance analysis on the characteristic information of any two sample multimedia data and determine the data association relationship between any two sample multimedia data in the sample multimedia data set according to the relevance analysis result;
and a construction unit, configured to construct a target model according to the data association relationship, where the target model is used for processing input target multimedia data and generating a target hash code, and the target hash code is used for acquiring multimedia data related to the target multimedia data.
In another aspect, an embodiment of the present invention provides a server, including a processor, an input device, an output device, and a memory, where the processor, the input device, the output device, and the memory are connected to each other, where the memory is used to store a computer program that supports a terminal to execute the foregoing method, where the computer program includes program instructions, and the processor is configured to call the program instructions to perform the following steps:
obtaining a sample multimedia data set, the sample multimedia data set comprising at least two sample multimedia data;
acquiring characteristic information of each sample multimedia data in the sample multimedia data set, wherein the characteristic information comprises data content information or vector distribution characteristic information;
performing relevance analysis on the characteristic information of any two sample multimedia data, and determining the data relevance relationship between any two sample multimedia data in the sample multimedia data set according to the relevance analysis result;
and constructing a target model according to the data association relation, wherein the target model is used for processing input target multimedia data and generating a target hash code, and the target hash code is used for acquiring multimedia data related to the target multimedia data.
In still another aspect, an embodiment of the present invention provides a computer-readable storage medium storing program instructions which, when executed by a processor, perform the data processing method according to the first aspect.
In the embodiment of the present invention, after obtaining the sample multimedia data set, the server can determine the data association relationship between any two sample multimedia data in the set either by relevance analysis of the vector distribution characteristic information, that is, the distribution of the hash code vectors of the sample multimedia data, or by relevance analysis of the data content of the sample multimedia data. Once the data association relationship between any two sample multimedia data has been determined, the server can construct a target model according to it, so that the target model takes the data association relationship between multimedia data into account. When the trained target model is used to generate the hash code of multimedia data, the generated hash code therefore not only contains the semantic features of the multimedia data but also indicates the data association information between different multimedia data. This improves the quality of the hash codes generated by the target model, and in turn the speed and accuracy of data searches based on those hash codes.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1a is a schematic diagram of a model structure of a target model according to an embodiment of the present invention;
FIG. 1b is a schematic diagram of a model structure of a target model according to an embodiment of the present invention;
FIG. 1c is a diagram illustrating the distributions of the association constraints introduced into a target model according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart diagram of a data processing method provided by an embodiment of the invention;
FIG. 3 is a schematic flow chart diagram of a data processing method according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of data recommendation using a trained target model according to an embodiment of the present invention;
FIG. 5 is a diagram illustrating output feedback multimedia data according to an embodiment of the present invention;
FIG. 6 is a schematic block diagram of a data processing apparatus provided by an embodiment of the present invention;
FIG. 7 is a schematic block diagram of a server according to an embodiment of the present invention.
Detailed Description
Artificial Intelligence (AI) is the theory, methods, techniques and application systems that use a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use that knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making. Artificial intelligence is a comprehensive discipline that covers a wide range of fields, including both hardware-level and software-level technologies. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly includes computer vision, speech processing, natural language processing, and machine learning/deep learning.
Natural Language Processing (NLP) is an important direction in the fields of computer science and artificial intelligence. It studies the theories and methods that enable effective communication between people and computers in natural language, and is a science integrating linguistics, computer science and mathematics. Research in this field involves natural language, that is, the language people use every day, so it is closely related to linguistics. Natural language processing techniques generally include text processing, semantic understanding, machine translation, robot question answering, knowledge graphs, and so on.
The data processing method provided by the embodiment of the invention refers to the data association relationship between multimedia data when constructing the target model, thereby introducing the association relationship between multimedia data into the model. As a result, when the constructed target model has been trained and is used to generate hash codes of multimedia data, the association (or similarity) between the hash codes of multimedia data that have a data association relationship is higher, and the association between the hash codes of multimedia data that have no data association relationship is lower.
In one embodiment, in order to obtain a target model through training, a developer may send a plurality of sample multimedia data to a server through a terminal device. After obtaining the sample multimedia data, the server may analyze and determine the data association relationship between any two sample multimedia data, and construct the target model based on that relationship, thereby introducing the data association relationship between multimedia data into the model. In one embodiment, the server may determine the data association relationship between any two multimedia data according to the distribution of their hash code vectors. Specifically, the server may determine the distribution that the hash code vectors obey and, when the hash code vectors obey a target distribution, compute the dot product (scalar product) of the hash code vectors of any two multimedia data. The larger the dot product, the higher the similarity between the corresponding hash codes and the higher the degree of association between the corresponding multimedia data; the data association relationship is obtained in this way. The target distribution may be, for example, a Gaussian distribution.
In one embodiment, the server may determine that the corresponding multimedia data have a data association relationship when the dot product of the hash code vectors is greater than a preset dot product threshold, and that they do not when the dot product is less than or equal to the threshold. That is, because the association between hash code vectors indicates the data association relationship between the corresponding multimedia data, the association between multimedia data can be determined from the association between their hash code vectors. If the hash code corresponding to multimedia data can guide both the generation of the data content of the multimedia data and the generation of the data association relationship between different multimedia data, then the hash code contains both the semantic information of the data content and the data association relationship between different multimedia data; the server therefore constructs the target model according to the data association relationship between multimedia data. Because the association is determined from the distribution of the hash code vectors, that is, the data association relationship has to be generated from the hash code vectors, a target model constructed with a data association relationship generated in this way may be referred to as an edge-based generation model. The model structure corresponding to the edge-based generation model may be as shown in FIG. 1a, where the edge-based generation model includes an encoder and a decoder. According to the model structure shown in FIG. 1a, after the sample multimedia data are input into the edge-based generation model, hash code vectors are obtained at the encoder, and the decoder generates the data association relationship between the multimedia data based on the hash code vectors. The multimedia data input into the edge-based generation model may be the data x_i and x_j in FIG. 1a, and the data association relationship generated by the model is the model output w_ij shown in FIG. 1a.
In an embodiment, the edge-based generation model described above introduces the data association relationship between multimedia data into the target model by generating that relationship from hash code vectors. However, a large amount of sample multimedia data does not itself contain link relationships between multimedia data, and a data association relationship obtained through external auxiliary means is often not accurate enough. Because the data association relationship referred to when constructing the edge-based generation model acts as a strong constraint, constructing the target model with an inaccurate data association relationship may amplify noise, so that the quality of the hash codes generated by the trained target model is greatly reduced. Therefore, the data association relationship between any two multimedia data may instead be determined by relevance analysis of the data content of the multimedia data. That is, the data association relationship is introduced as prior knowledge, and a weak-constraint method is adopted in which different constraint conditions are added depending on whether the prior knowledge indicates that a data association relationship exists or does not exist. The prior knowledge of whether a data association relationship exists between multimedia data is thus used to construct the target model; a target model constructed in this way is the prior-knowledge-based model.
In one embodiment, the model structure corresponding to the prior-knowledge-based model may be as shown in FIG. 1b. The prior-knowledge-based model also includes an encoder and a decoder. As in the model structure shown in FIG. 1b, after the sample multimedia data x_i and x_j are input into the prior-knowledge-based model, association constraint information is added to the model in order to introduce the data association relationship. The added association constraint information may be, for example, association constraint 1 and association constraint 2 shown in FIG. 1b, and the distributions satisfied by association constraint 1 and association constraint 2 may be as shown in FIG. 1c, where the upper graph in FIG. 1c corresponds to association constraint 1 and the lower graph to association constraint 2. After the target model is constructed, it can be trained. When the trained target model is used to generate hash codes of input target multimedia data, the hash codes of multimedia data that have a data association relationship are more similar to one another, while the hash codes of multimedia data that do not have a data association relationship are less similar.
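The effect described above, where associated multimedia data receive similar hash codes, is what makes hash-based lookup useful. The following is a minimal sketch of such a lookup, assuming binary hash codes and Hamming distance as the similarity measure; the source does not specify either, and the function names and the 8-bit example codes are hypothetical.

```python
from __future__ import annotations

import numpy as np


def hamming_distance(a: np.ndarray, b: np.ndarray) -> int:
    """Number of bit positions in which two binary hash codes differ."""
    return int(np.count_nonzero(a != b))


def retrieve(query: np.ndarray, codes: np.ndarray, k: int) -> list[int]:
    """Indices of the k stored codes closest to the query (assumed Hamming metric)."""
    distances = np.count_nonzero(codes != query, axis=1)
    # A stable sort keeps the original order among equally distant codes.
    return np.argsort(distances, kind="stable")[:k].tolist()


# Hypothetical 8-bit hash codes for three multimedia items.
codes = np.array([
    [1, 0, 1, 1, 0, 0, 1, 0],  # item 0
    [1, 0, 1, 1, 0, 1, 1, 0],  # item 1: differs from item 0 in one bit
    [0, 1, 0, 0, 1, 1, 0, 1],  # item 2: bitwise complement of item 0
])
nearest = retrieve(codes[0], codes, k=2)
```

With these codes, item 1 ranks directly after item 0 itself, while item 2 is as far away as an 8-bit code can be.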
Existing models for generating hash codes of multimedia data refer only to the data content of a single multimedia data when they are constructed and ignore the data association relationship between different multimedia data. The hash codes they generate therefore contain only the local information of the semantics of the multimedia data and cannot capture the global information of the data association relationship between different multimedia data, which results in low-quality hash codes. To improve the quality of the generated hash codes, an embodiment of the present invention provides a data processing method that refers to the data association relationship between multimedia data when constructing a target model, so that the constructed target model takes that relationship into account when generating the hash code of multimedia data. Specifically, please refer to FIG. 2, which is a schematic flow chart of a data processing method according to an embodiment of the present invention. As shown in FIG. 2, the method may include:
S201, obtaining a sample multimedia data set, where the sample multimedia data set includes at least two sample multimedia data.
In one embodiment, the sample multimedia data included in the sample multimedia data set are randomly selected multimedia data, which may be of different types or of the same type. The types of multimedia data include a video type, an audio type, a text type, and so on. The obtained sample multimedia data set may thus include only video-type multimedia data, only audio-type multimedia data, or only text-type multimedia data, or it may combine multimedia data of different types, for example video-type and text-type multimedia data. The embodiment of the present invention does not limit the types of the sample multimedia data included in the obtained sample multimedia data set.
In an embodiment, after the sample multimedia data set is obtained, the data association relationship between any two sample multimedia data in the set may be determined, so that the target model can then be constructed according to it. In addition to the sample multimedia data, the sample multimedia data set includes an association graph G = (V, X, E) that describes the relationships between the sample multimedia data, where V represents the nodes at which the sample multimedia data are located, X represents the data content of the sample multimedia data corresponding to each node, and E represents the data association relationships between the sample multimedia data.
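One way to picture the graph G(V, X, E) is as a small container holding node identifiers, per-node content and a set of association edges. The sketch below is illustrative only: the class name, the use of strings for data content, and the choice to treat edges as undirected are assumptions, not details given in the source.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class AssociationGraph:
    """Illustrative container for G = (V, X, E)."""

    nodes: list[int] = field(default_factory=list)           # V: node ids
    content: dict[int, str] = field(default_factory=dict)    # X: content per node
    edges: set[frozenset[int]] = field(default_factory=set)  # E: assumed undirected

    def add_sample(self, node_id: int, data: str) -> None:
        """Register one sample multimedia data item as a node with its content."""
        self.nodes.append(node_id)
        self.content[node_id] = data

    def associate(self, i: int, j: int) -> None:
        """Record a data association relationship between nodes i and j."""
        self.edges.add(frozenset((i, j)))

    def associated(self, i: int, j: int) -> bool:
        """Whether a data association relationship is recorded between i and j."""
        return frozenset((i, j)) in self.edges


graph = AssociationGraph()
for node_id, text in enumerate(["document A", "document B", "document C"]):
    graph.add_sample(node_id, text)
graph.associate(0, 1)
```

Storing each edge as a `frozenset` makes the lookup symmetric, so `graph.associated(1, 0)` gives the same answer as `graph.associated(0, 1)`.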
The data association relationship between any two sample multimedia data indicates whether the two sample multimedia data are related, for example whether the data contents they include are related. The data contents of multimedia data under the same theme are related, whereas the data contents of multimedia data under different themes whose difference is too large are unrelated. Specifically, the data contents of multimedia data under the correction theme are related to one another but are unrelated to the data contents of multimedia data under the movie theme. Alternatively, the data contents of sample multimedia data that have a reference relationship are related, and the data contents of sample multimedia data that have no reference relationship are unrelated. For example, if multimedia data A references multimedia data B, the data content of multimedia data A and the data content of multimedia data B are considered to be related. The reference relationship may be a full or partial reference relationship, or a direct or indirect reference relationship, and so on.
S202, acquiring characteristic information of each sample multimedia data in the sample multimedia data set, wherein the characteristic information comprises data content information or vector distribution characteristic information.
S203, performing relevance analysis on the characteristic information of any two sample multimedia data, and determining the data relevance relationship between any two sample multimedia data in the sample multimedia data set according to the relevance analysis result.
In steps S202 and S203, after obtaining the sample multimedia data set, the server may determine the data association relationship between any two sample multimedia data in the set based on the characteristic information acquired for each sample multimedia data, using a different relevance analysis method for each kind of characteristic information. If the characteristic information acquired by the server is vector distribution characteristic information, the server may perform relevance analysis according to the distribution of the hash code vectors of the sample multimedia data; if the characteristic information is data content information, the server may analyze the reference relationships in the data content of the multimedia data. In other words, the data association relationship between any two sample multimedia data may be determined either from the distribution of the hash code vectors of the sample multimedia data or from the reference relationships in their data content.
In one embodiment, each sample multimedia data corresponds to one hash code vector. If the characteristic information acquired by the server for each sample multimedia data is vector distribution characteristic information, and the hash code vectors of the sample multimedia data in the set follow a target distribution, the server determines the data association relationship between any two sample multimedia data by relevance analysis of the vector distribution characteristic information. This relevance analysis includes: determining the dot product (scalar product) of the hash code vectors corresponding to any two sample multimedia data, comparing the dot product with a preset dot product threshold, and taking the comparison result as the relevance analysis result for the two sample multimedia data. If the dot product is greater than the preset threshold, the server determines that the two hash code vectors are associated, so that the corresponding sample multimedia data also have a data association relationship. If the dot product is less than or equal to the preset threshold, the server determines that the two hash code vectors are not associated and that the corresponding sample multimedia data have no data association relationship. The target distribution may be, for example, a Gaussian distribution.
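The threshold comparison described above can be sketched with NumPy by computing the product of every pair of hash code vectors at once, here taken to be the ordinary dot product. This is a minimal illustration: the function name, the vector dimension, the threshold value of 0.0 and the random Gaussian vectors are all assumptions chosen for the example, not values from the source.

```python
from __future__ import annotations

import numpy as np


def association_matrix(codes: np.ndarray, threshold: float) -> np.ndarray:
    """Boolean matrix: entry (i, j) is True when the dot product of hash code
    vectors i and j is strictly greater than the threshold."""
    products = codes @ codes.T          # all pairwise dot products
    related = products > threshold      # at or below the threshold: not associated
    np.fill_diagonal(related, False)    # a sample is not paired with itself
    return related


# Hypothetical hash code vectors drawn from a Gaussian distribution,
# the example target distribution named in the text.
rng = np.random.default_rng(0)
codes = rng.standard_normal((4, 16))
related = association_matrix(codes, threshold=0.0)

# A hand-checkable case: the first two vectors point the same way, the third opposite.
toy = association_matrix(np.array([[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]), threshold=0.5)
```

In the toy case the first two vectors have dot product 1, which exceeds 0.5, so they are marked as associated; each has dot product -1 with the third vector, so neither is associated with it.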
In one embodiment, if the feature information acquired by the server for each sample multimedia data is data content information, the server determines the data association relationship by performing relevance analysis on the data content information, which includes: detecting the reference relationship between any two sample multimedia data according to their data content information; and taking the result of the reference relationship detection as the relevance analysis result for the two sample multimedia data. If the detection result indicates that the data contents of the two sample multimedia data have a reference relationship, the server determines that the two sample multimedia data have a data association relationship; if the detection result indicates that the data contents do not have a reference relationship, the server determines that no data association relationship exists between them.
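As an illustration of reference relationship detection, the sketch below assumes that each sample is a document carrying an identifier and the set of identifiers it cites. The field names "id" and "references" and the sample data are hypothetical; the source does not specify how references are stored.

```python
def has_reference_relationship(doc_a, doc_b):
    """True if either document references the other."""
    return doc_b["id"] in doc_a["references"] or doc_a["id"] in doc_b["references"]

def association_pairs(docs):
    """Map every unordered pair of document ids to w_ij (1 = associated, 0 = not)."""
    pairs = {}
    for a in range(len(docs)):
        for b in range(a + 1, len(docs)):
            key = (docs[a]["id"], docs[b]["id"])
            pairs[key] = int(has_reference_relationship(docs[a], docs[b]))
    return pairs

# Hypothetical sample set: d1 cites d2, d3 is isolated.
docs = [
    {"id": "d1", "references": {"d2"}},
    {"id": "d2", "references": set()},
    {"id": "d3", "references": set()},
]
w = association_pairs(docs)
print(w)
```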
S204, constructing a target model according to the data association relationship, wherein the target model is used for processing input target multimedia data and generating a target hash code, and the target hash code is used for acquiring feedback multimedia data related to the target multimedia data.
In one embodiment, after determining the data association relationship between any two sample multimedia data, the server may construct the target model based on that relationship. The distribution function satisfied by the constructed target model differs according to how the data association relationship was determined. In one embodiment, the distribution function satisfied by the target model is a joint probability distribution function, which is either a first joint probability distribution function or a second joint probability distribution function. When the server determines the data association relationship by relevance analysis of the vector distribution feature information, the target model satisfies the first joint probability distribution function, which may be, for example, as shown in formula 1.1:
$p_\theta(x_i, x_j, z_i, z_j, e_{ij}) = p_\theta(x_i \mid z_i)\, p_\theta(x_j \mid z_j)\, p_\theta(w_{ij} \mid z_i, z_j)\, p(z_i)\, p(z_j)$ Formula 1.1
Here the first joint probability distribution function is constructed so that the hash code vectors guide the generation of both the content of the multimedia data and the data association relationship. $x_i$ and $x_j$ are any two sample multimedia data, and $z_i$ and $z_j$ are the hash code vectors corresponding to them. $p_\theta(x_i \mid z_i)$ describes the probability that the multimedia data generated from $z_i$ is the sample multimedia data $x_i$; $p_\theta(x_j \mid z_j)$ describes the probability that the multimedia data generated from $z_j$ is the sample multimedia data $x_j$; $p_\theta(w_{ij} \mid z_i, z_j)$ describes the probability of generating, from $z_i$ and $z_j$, the data association relationship $w_{ij}$ between $x_i$ and $x_j$; and $p(z_i)$ and $p(z_j)$ are the vector distribution feature information, that is, the prior distributions of the hash code vectors.
In one embodiment, the first joint probability distribution function describes the hash code vectors $z_i$ and $z_j$ together with the data association relationship $e_{ij}$ (also written $w_{ij}$) between multimedia data $x_i$ and multimedia data $x_j$. The parameter $\theta$ in the joint probability distribution function is a model parameter of the target model; training the target model includes training $\theta$, and training is complete when $\theta$ reaches its optimal value. In one embodiment, the prior $p(z)$ (that is, $p(z_i)$ and $p(z_j)$) follows a standard Gaussian distribution or a Bernoulli distribution, and $p_\theta(w_{ij} \mid z_i, z_j)$ follows a Bernoulli distribution, that is:
Figure BDA0002739727300000101
After the target model described by the first joint probability distribution function is trained, the data association relationship is incorporated into the generation of hash codes for multimedia data. That is, if multimedia data input to the trained target model have a data association relationship, the hash codes that the trained target model generates for them are also associated.
In one embodiment, the target model satisfies the second joint probability distribution function if the data association relationship is determined by relevance analysis. The second joint probability distribution function is constructed from prior knowledge obtained by relevance analysis of the data association relationship of the multimedia data. To introduce this prior knowledge, the server may construct the second joint probability distribution function from the observed data (the multimedia data $x_i$ and $x_j$, and the observed data association relationship $w_{ij}$ between them) and the conditional probability distribution $p(z_i, z_j \mid w_{ij})$. The second joint probability distribution function may be as shown in formula 1.2:
$p_\theta(x_i, x_j, z_i, z_j \mid w_{ij}) = p_\theta(x_i \mid z_i)\, p_\theta(x_j \mid z_j)\, p(z_i, z_j \mid w_{ij})$ Formula 1.2
Here $p_\theta(x_i \mid z_i)$ again describes the probability that the multimedia data generated from $z_i$ is the sample multimedia data $x_i$, and $p_\theta(x_j \mid z_j)$ again describes the probability that the multimedia data generated from $z_j$ is the sample multimedia data $x_j$. $p(z_i, z_j \mid w_{ij})$ describes the hash code vectors corresponding to the observed data association relationship between the two multimedia data: when $w_{ij} = 1$, the hash code vectors correspond to two multimedia data that have a data association relationship, and when $w_{ij} = 0$, they correspond to two multimedia data that have none. When the target model satisfies the second joint probability distribution function shown in formula 1.2, the data association relationship may be introduced by adding an association constraint; the distribution corresponding to the added association constraint may be, for example, as shown in fig. 1c.
After the target model is constructed, it can be trained with the sample multimedia data in the sample multimedia data set to obtain the trained target model. The trained target model can then be used to process multimedia data so that the hash codes of multimedia data with a data association relationship are more similar, while the hash codes of multimedia data without one differ. The similarity of two hash codes can be determined from their Hamming distance in the low-dimensional Hamming space. The Hamming distance is the number of bit positions at which two hash codes differ. Specifically, the two hash codes can be XORed and the bits equal to 1 in the result counted; the number of 1s is the Hamming distance between the two hash codes.
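The XOR-and-count procedure can be written in a few lines. The sketch assumes the hash codes are stored as Python integers; the function name and the example codes are illustrative only.

```python
def hamming_distance(code_a: int, code_b: int) -> int:
    """Number of bit positions at which two hash codes differ."""
    differing_bits = code_a ^ code_b       # XOR leaves a 1 wherever the codes differ
    return bin(differing_bits).count("1")  # count the 1s in the XOR result

# Hypothetical 4-bit hash codes that differ only in the second-lowest bit.
print(hamming_distance(0b1011, 0b1001))
```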
In one embodiment, to determine the Hamming distances of different hash codes in the low-dimensional space, the high-dimensional hash code vectors may be mapped to the low-dimensional Hamming space by semantic hashing. Because semantic hashing preserves the similarity of the original space vectors, the Hamming distance between vectors in the new space reflects the similarity of the original vectors. Therefore, after the trained target model is obtained, the hash code of each multimedia data can be determined with the trained target model, and the data association relationship between multimedia data can be determined from their hash codes. Based on that relationship, multimedia data can be recommended or a similarity search can be performed. Similarity search, also called nearest neighbor search, aims to find, in a large-scale database (that is, a feedback multimedia database), the target feedback multimedia data most similar to a user's query request, so that the user can quickly find the required multimedia data, which improves both search speed and search precision. Because the data association relationship between multimedia data is taken into account when the target model is constructed, the hash codes generated after training contain that relationship, so the hash codes of similar multimedia data are similar and retrieval precision in similarity search improves.
In the embodiment of the present invention, after obtaining the sample multimedia data set, the server can determine the data association relationship between any two sample multimedia data in the set either by relevance analysis of the vector distribution feature information of their hash code vectors or by relevance analysis of their data content. The server can then construct a target model according to the data association relationship, so that the target model takes the data association relationship between multimedia data into account. As a result, a hash code generated by the trained target model not only contains the semantic features of the multimedia data but also indicates the data association information between different multimedia data. This improves the quality of the hash codes generated by the target model, and with it the speed and accuracy of data searches based on those hash codes.
Referring to fig. 3, which is a schematic flow chart of a data processing method according to an embodiment of the present invention, the method includes:
S301, obtaining a sample multimedia data set, wherein the sample multimedia data set includes at least two sample multimedia data.
S302, obtaining characteristic information of each sample multimedia data in the sample multimedia data set, wherein the characteristic information comprises data content information or vector distribution characteristic information.
S303, performing relevance analysis on the characteristic information of any two sample multimedia data, and determining the data relevance relationship between any two sample multimedia data in the sample multimedia data set according to the relevance analysis result.
In an embodiment, the specific implementation of steps S301 to S303 may refer to the specific implementation of steps S201 to S203 in the above embodiment, and is not described herein again.
S304, constructing a target model according to the data association relation, wherein the target model is used for processing input target multimedia data and generating a target hash code, and the target hash code is used for acquiring multimedia data related to the target multimedia data.
S305, determining a reward function for training the target model.
In steps S304 and S305, after the data association relationship between different sample multimedia data is determined, a target model may be constructed according to the data association relationship and trained, so that the trained target model can be used to recognize multimedia data and obtain the corresponding hash codes. In one embodiment, the joint probability distribution function satisfied by the constructed target model is the first joint probability distribution function shown in formula 1.1. Each sample multimedia data x in the sample multimedia data set can be represented as a set of its data contents; for example, if the sample multimedia data is a document, x can be represented as a set of words. This representation may be as shown in formula 1.3:
$x = \{w_1, w_2, \ldots, w_{|x|}\}$ Formula 1.3
Here x is any one sample multimedia data and $w_i$ denotes the i-th data content in it; if x is a sample document, $w_i$ denotes the i-th word in the document. Each data content can further be represented as a |V|-dimensional multi-class label vector (for example, a one-hot vector). |x| denotes the number of data contents in the sample multimedia data (for example, the number of words in a sample document), and |V| denotes the number of all data contents in the sample multimedia data set (for example, the total number of words included in the sample document set). The term $p_\theta(x \mid z)$ in formula 1.1, which describes the probability that the multimedia data generated from a hash code vector is the corresponding sample multimedia data, can then be decomposed as shown in formula 1.4:
Figure BDA0002739727300000131
where $p_\theta(w_i \mid z)$ describes the probability that a data content generated from the hash code vector is exactly the data content $w_i$ in the corresponding sample multimedia data x; for a document, it describes the probability that a word generated from the hash code vector is the word of the original sample document. $p_\theta(w_i \mid z)$ may be as shown in formula 1.5:
Figure BDA0002739727300000132
where exp denotes the exponential function, $W \in R^{d \times |V|}$ is a parameter matrix, d is the dimension of the hash code vector (hidden variable) z, and $b_i$ is a model bias term, so the parameter of the target model is $\theta = \{W, b_1, \ldots, b_{|V|}\}$. It can be understood that, after training and updating on all sample multimedia data in the sample multimedia data set, the server obtains the joint probability distribution function $p_\theta(x, z)$ of the hash code vectors and the sample multimedia data. Based on $p_\theta(x, z)$, the hash code vector z corresponding to each multimedia data x can be determined through the posterior distribution function $p_\theta(z \mid x)$. In one embodiment, $p_\theta(z \mid x)$ indicates the probability that the target model, when invoked, generates each reference hash code vector for the input target multimedia data. Based on $p_\theta(z \mid x)$, the trained target model may take the hash code indicated by the reference hash code vector with the maximum probability as the target hash code of the target multimedia data and output that target hash code.
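The decomposition of $p_\theta(x \mid z)$ into per-content probabilities can be illustrated with a short NumPy sketch. It assumes a softmax over the vocabulary with logits $z^T W + b$, which is one form consistent with the definitions of exp, W and $b_i$ above; the exact expressions of formulas 1.4 and 1.5 should be taken from the original formula images. All variable names and sizes are illustrative.

```python
import numpy as np

def log_p_words_given_z(z, W, b):
    """Log-probabilities over the vocabulary, assuming softmax(z^T W + b)."""
    logits = z @ W + b              # shape (|V|,)
    logits = logits - logits.max()  # stabilise the exponentials
    return logits - np.log(np.exp(logits).sum())

def log_p_x_given_z(word_ids, z, W, b):
    """log p(x | z) as the sum of log p(w_i | z) over the words of x."""
    log_probs = log_p_words_given_z(z, W, b)
    return float(log_probs[word_ids].sum())

rng = np.random.default_rng(0)
d, vocab_size = 4, 6                      # hypothetical sizes
W = rng.normal(size=(d, vocab_size))      # parameter matrix, d x |V|
b = np.zeros(vocab_size)                  # bias terms b_1 .. b_|V|
z = rng.normal(size=d)                    # hash code vector (hidden variable)
doc = [0, 2, 2, 5]                        # a document as a list of word indices
ll = log_p_x_given_z(doc, z, W, b)
print(ll)
```

Working in log space turns the product over words into a sum, which avoids numerical underflow for long documents.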
In one embodiment, the posterior distribution function $p_\theta(z \mid x)$ is difficult to compute, so the server can approximate it using variational inference, a method for handling the intractable integrals that arise in Bayesian inference and machine learning. The approximation function closest to $p_\theta(z \mid x)$ is obtained when the variational lower bound is maximized. The expression for the variational lower bound may be as shown in formula 1.6:
Figure BDA0002739727300000141
where the posterior distribution function $p_\theta(z \mid x)$ is approximated by the introduced function
Figure BDA0002739727300000142
and
Figure BDA0002739727300000143
denotes the expected value under the approximation function
Figure BDA0002739727300000144
and KL is the relative entropy, which indicates the information difference between two probability distributions. It will be appreciated that
Figure BDA0002739727300000145
represents the information difference between the approximation function
Figure BDA0002739727300000146
and the distribution function $p_\theta(z_i)$ in the first joint probability distribution function, while
Figure BDA0002739727300000147
represents the information difference between the approximation function
Figure BDA0002739727300000148
and the distribution function $p_\theta(z_j)$ in the first joint probability distribution function. In one embodiment, following the variational principle, maximizing the variational lower bound shown in formula 1.6 yields the approximation function closest to the posterior distribution function; that is, training the target model amounts to maximizing the variational lower bound shown in formula 1.6.
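The relative entropy terms in the lower bound have a closed form in one common case. The sketch below assumes that the approximation function is a Gaussian with diagonal covariance and that the prior p(z) is the standard Gaussian mentioned earlier; under those assumptions the KL term is $\frac{1}{2}\sum_k (\mu_k^2 + \sigma_k^2 - 1 - \log \sigma_k^2)$. The function name is illustrative.

```python
import numpy as np

def kl_diag_gaussian_to_standard_normal(mu, sigma):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ) in closed form."""
    var = sigma ** 2
    return 0.5 * float(np.sum(mu ** 2 + var - 1.0 - np.log(var)))

# The divergence is zero when the approximation equals the prior ...
print(kl_diag_gaussian_to_standard_normal(np.zeros(3), np.ones(3)))
# ... and grows as the mean moves away from the origin.
print(kl_diag_gaussian_to_standard_normal(np.array([1.0]), np.array([1.0])))
```

Because the lower bound subtracts this term, maximizing the bound pulls the approximate posterior toward the prior while the reconstruction term pulls it toward the data.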
In one embodiment, if the data association relationship between the multimedia data cannot be determined directly, $w_{ij}$ can be treated as a hidden variable and inferred by variational inference. When the joint probability distribution function satisfied by the target model is the first joint probability distribution function shown in formula 1.1 and $w_{ij}$ cannot be determined directly, the variational lower bound may be as shown in formula 1.7:
Figure BDA0002739727300000151
where the posterior distribution function $p_\theta(z \mid x)$ is approximated by the introduced function
Figure BDA0002739727300000152
and
Figure BDA0002739727300000153
denotes the expected value under the function
Figure BDA0002739727300000154
and
Figure BDA0002739727300000155
denotes the expected value under the function
Figure BDA0002739727300000156
while
Figure BDA0002739727300000157
is the operator that sums over the values of $w_{ij}$, and
Figure BDA0002739727300000158
is an operator. It will be appreciated that, regardless of whether $w_{ij}$ can be determined directly, when the joint probability distribution function satisfied by the target model is the first joint probability distribution function shown in formula 1.1, the reward function for training the target model is as shown in formula 1.8:
Figure BDA0002739727300000159
In one embodiment,
Figure BDA00027397273000001510
is the sum of the variational lower bounds shown in formula 1.6, and
Figure BDA00027397273000001511
is the sum of the variational lower bounds shown in formula 1.7. It can be understood that, when the joint probability distribution function satisfied by the target model is the first joint probability distribution function shown in formula 1.1, the reward function used to train the target model is the expression given by formula 1.8.
In one embodiment, if the joint probability distribution function satisfied by the constructed target model is the second joint probability distribution function shown in formula 1.2, the server uses a weak-constraint approach, introducing two different association constraints so that the hash codes of multimedia data having a data association relationship are as similar as possible. The introduced association constraints may be expressed as the prior distribution shown in formula 2.1:
Figure BDA00027397273000001512
where $p_1(z_i, z_j)$ and $p_0(z_i, z_j)$ respectively represent the prior distributions of any two sample multimedia data $(x_i, x_j)$ under strong and weak association. $w_{ij} = 1$ indicates that $(x_i, x_j)$ are strongly associated (that is, have a data association relationship), and $w_{ij} = 0$ indicates that they are weakly associated (that is, have no data association relationship).
In one embodiment, the prior distribution above may specifically be as shown in formula 2.2:
Figure BDA0002739727300000161
Figure BDA0002739727300000162
where $p_1(z_i, z_j)$ is the expression for hash code vectors with a data association relationship, $p_0(z_i, z_j)$ is the expression for hash code vectors without one, d is the vector dimension, and $\lambda \in (0, 1]$ is the degree of correlation between corresponding dimensions of $z_i$ and $z_j$. That is, although the prior distribution keeps the dimensions within each of $z_i$ and $z_j$ independent of one another, the correlation between corresponding dimensions of $z_i$ and $z_j$ is determined by the coefficient $\lambda$. Through this special prior distribution, each bit of the hash codes of sample multimedia data with a data association relationship is positively correlated, which introduces the implied data association relationship between multimedia data into the model. Similarly, when the server invokes the target model to determine the hash code z corresponding to input multimedia data x, the determination is also based on the posterior distribution function $p_\theta(z \mid x)$, which the server again approximates by variational inference. In one embodiment, when the joint probability distribution function of the target model is the second joint probability distribution function shown in formula 1.2, the variational lower bound introduced to obtain the approximation function of the posterior distribution function may be as shown in formula 2.3:
Figure BDA0002739727300000163
where $q_0$ is the approximation function obtained by introducing the association constraint into the second joint probability distribution function when no data association relationship exists between the sample multimedia data, and $q_1$ is the approximation function obtained by introducing the association constraint into the second joint probability distribution function when a data association relationship exists between the sample multimedia data. Here
Figure BDA0002739727300000171
denotes the expected value under the approximation function $q_1$, and
Figure BDA0002739727300000172
denotes the expected value under the approximation function $q_0$. Similarly, maximizing the variational lower bound shown in formula 2.3 yields the approximation functions $q_0$ and $q_1$ closest to the posterior distribution function. When the data association relationship between the multimedia data cannot be determined directly, $w_{ij}$ can be treated as a hidden variable and inferred by variational inference. When the joint probability distribution function satisfied by the target model is the second joint probability distribution function shown in formula 1.2 and $w_{ij}$ cannot be determined directly, the variational lower bound may be as shown in formula 2.4:
Figure BDA0002739727300000173
where
Figure BDA0002739727300000174
denotes the expected value under the approximation function q, and KL is the relative entropy, which indicates the information difference between two probability distributions, so that
Figure BDA0002739727300000175
represents the information difference between the approximation function
Figure BDA0002739727300000176
and the distribution function $p(x_i, x_j, w_{ij})$ in the second joint probability distribution function. It will be appreciated that, regardless of whether $w_{ij}$ can be determined directly, when the joint probability distribution function satisfied by the target model is the second joint probability distribution function shown in formula 1.2, the reward function for training the target model is as shown in formula 2.5:
Figure BDA0002739727300000177
where
Figure BDA0002739727300000178
is the sum of the variational lower bounds shown in formula 2.3, and
Figure BDA0002739727300000179
is the sum of the variational lower bounds shown in formula 2.4. It can be understood that, when the joint probability distribution function satisfied by the target model is the second joint probability distribution function shown in formula 1.2, the reward function used to train the target model is the expression given by formula 2.5. After the server determines the reward function for training the target model, the target model may be trained with the sample multimedia data included in the sample multimedia data set in the direction that increases the reward function; that is, steps S306 and S307 may then be performed.
S306, training the joint probability distribution function and the probability calculation function satisfied by the target model, using the sample multimedia data included in the sample multimedia data set, in the direction that increases the reward function.
S307, completing the training of the target model when the function value of the reward function satisfies a preset threshold.
In steps S306 and S307, the reward function determined by the server may be the reward function shown in formula 1.8 or the reward function shown in formula 2.5. When the reward function reaches its maximum value, the server may determine that the function value of the reward function satisfies the preset threshold and complete the training of the target model. In one embodiment, after training is complete, the trained target model may be invoked to generate a plurality of reference hash code vectors for the input target multimedia data. The probability that each reference hash code vector is the theoretical hash code vector of the target multimedia data is determined according to the probability calculation function, and a target probability value that satisfies a preset condition is selected from these probability values. The target hash code vector corresponding to the target probability value is then determined, and the target hash code of the target multimedia data is determined from the hash code indicated by the target hash code vector.
Specifically, when determining the target hash code of the target multimedia data from the hash code indicated by the target hash code vector, the server may binarize the hash code indicated by the target hash code vector and take the binarized hash code as the target hash code of the target multimedia data. To binarize the target hash code vector, the server determines a processing threshold in advance and compares each bit of the target hash code vector z with it: if the bit value is greater than the processing threshold, the corresponding position of the target hash code is set to 1; if it is less than or equal to the processing threshold, the corresponding position is set to 0. The processing threshold should be chosen according to the entropy maximization principle. Through binarization, the target hash code vector z is mapped to a hash code containing only 0s and 1s.
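A minimal sketch of the binarization step follows. The source does not state how the processing threshold is computed; the per-dimension median used here is an assumption, chosen because it roughly balances 0s and 1s in each bit, which is what maximizes the entropy of that bit. The function name and the sample vectors are illustrative.

```python
import numpy as np

def binarize(Z, thresholds=None):
    """Map real-valued hash code vectors (rows of Z) to codes containing only 0 and 1."""
    Z = np.asarray(Z, dtype=float)
    if thresholds is None:
        # Assumed threshold choice: the median of each dimension over the data set.
        thresholds = np.median(Z, axis=0)
    # Strictly greater than the threshold gives 1; less than or equal gives 0.
    return (Z > thresholds).astype(np.uint8)

# Hypothetical 2-dimensional hash code vectors for three samples.
Z = [[0.2, -1.0],
     [0.8, 0.5],
     [-0.3, 0.1]]
print(binarize(Z))
```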
In one embodiment, the probability calculation function is an approximation function of the posterior distribution function. Specifically, to determine the probability calculation function, the server may first determine, from the joint probability distribution function satisfied by the target model, the theoretical probability function (that is, the posterior distribution function) used to obtain a theoretical hash code vector. The theoretical probability function indicates the probability that multimedia data input to the target model generates the theoretical hash code vector when the joint probability distribution function is satisfied. The server can then determine an approximation function of the theoretical probability function and determine the probability calculation function from that approximation function.
In one embodiment, if the joint probability distribution function satisfied by the target model is the first joint probability distribution function shown in formula 1.1, the server uses
Figure BDA0002739727300000181
to approximate the true posterior distribution $p_\theta(z_i, z_j \mid x_i, x_j, e_{ij})$. Since the semantic information of multimedia data may imply the data association relationship between multimedia data, the approximation function may be as shown in formula 2.6:
Figure BDA0002739727300000191
where
Figure BDA0002739727300000192
is a Gaussian distribution:
Figure BDA0002739727300000193
Figure BDA0002739727300000194
and
Figure BDA0002739727300000195
are the outputs of the target model with x as input. Therefore, for multimedia data x, the hash code output by invoking the target model is
Figure BDA0002739727300000196
That is, training the target model means training the model parameter $\theta$ and the model parameter
Figure BDA0002739727300000197
so that maximizing the variational lower bound
Figure BDA0002739727300000198
yields the optimal model parameter $\theta$ and model parameter
Figure BDA0002739727300000199
In one embodiment, if the joint probability distribution function satisfied by the target model is the second joint probability distribution function shown in formula 1.2, the server introduces the data association relationship between multimedia data by applying an association constraint to the hash codes, so the approximation function of the posterior distribution function under the second joint probability distribution function is as shown in formula 2.7:
Figure BDA00027397273000001910
Figure BDA00027397273000001911
where
Figure BDA00027397273000001912
is the $q_1$ described above and
Figure BDA00027397273000001913
is the $q_0$ described above. In addition,
Figure BDA00027397273000001914
and
Figure BDA00027397273000001915
are respectively the mean and the diagonal covariance matrix of the Gaussian distribution; $\gamma_{ij} \in R^{d \times d}$ is also a diagonal matrix, whose entries take values in [0, 1] and indicate whether $z_i$ and $z_j$ are correlated; and the parameters
Figure BDA00027397273000001916
are the outputs of the target model with $(x_i, x_j)$ as input. In one embodiment, the introduced prior distribution function explicitly requires the hidden variables (that is, the hash code vectors) of sample multimedia data with a data association relationship to be as similar as possible, while imposing no association on sample multimedia data without one. This both introduces the data association information between sample multimedia data into the hash codes and addresses the noise introduced by the auxiliary means used to construct the data association relationship. In this case, maximizing the variational lower bound of $\log p(x_i, x_j \mid w_{ij})$ yields the optimal model parameter $\theta$ and model parameter
Figure BDA00027397273000001917
In one embodiment, once the optimal model parameter $\theta$ and model parameter
Figure BDA00027397273000001918
of the target model are obtained, the training of the target model is complete. To assess the quality of the hash codes generated by the trained target model, the trained target model is compared with the best existing models, as shown in Table 1:
TABLE 1
Model                                   8 bits   16 bits  32 bits  64 bits  128 bits
First coding (VDSH) model               0.433    0.6853   0.7108   0.4410   0.5847
Second coding (BMSH) model              0        0.7062   0.7481   0.7519   0.7450
Edge-based target model                 0.7358   0.7982   0.8364   0.8474   0.8491
Target model based on prior knowledge   0.7546   0.8345   0.8563   0.8665   0.8676
As can be seen from Table 1, whether the joint probability distribution function of the target model is the first joint probability distribution function (that is, the target model is an edge-based target model) or the second joint probability distribution function (that is, the target model is a target model based on prior knowledge), the quality of the hash codes generated by the target model is higher than that of the existing models at every code length. The target model constructed based on the data association relationship between multimedia data can therefore significantly improve the quality of hash codes.
After the target model is trained, the trained target model can be applied to information search, and the search results are output in order of their degree of correlation with the query information. Specifically, the server can obtain a query request and call the trained target model to determine the query hash code corresponding to the query information in the query request. The server can then obtain a feedback multimedia database, which includes a plurality of feedback multimedia data and the hash code of each feedback multimedia data, where the hash code of each feedback multimedia data is determined in advance by the server calling the target model. After determining the query hash code and the hash code of each feedback multimedia data, the server can select, from the feedback multimedia database, the target feedback multimedia data matching the query hash code according to the query hash code and the hash code of each feedback multimedia data. In one embodiment, when making this selection, the server may first calculate the Hamming distance between the query hash code and the hash code of each feedback multimedia data, and then select from the feedback multimedia database the feedback multimedia data whose Hamming distance is less than or equal to a preset distance threshold as the target feedback multimedia data.
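The threshold-based selection described above can be illustrated with a minimal Python sketch. It assumes that each hash code is stored as an integer bit pattern; the names `hamming_distance`, `select_matches`, `feedback_db`, the 8-bit codes, and the threshold of 2 are hypothetical and chosen only for illustration.

```python
def hamming_distance(code_a: int, code_b: int) -> int:
    """Count the bit positions in which two integer-encoded hash codes differ."""
    return bin(code_a ^ code_b).count("1")


def select_matches(query_code, database, max_distance):
    """Return (item_id, distance) pairs whose distance is <= max_distance, nearest first."""
    hits = []
    for item_id, code in database.items():
        distance = hamming_distance(query_code, code)
        if distance <= max_distance:
            hits.append((item_id, distance))
    return sorted(hits, key=lambda pair: pair[1])


# Hypothetical 8-bit hash codes for three feedback items (assumed values).
feedback_db = {"doc_a": 0b10110010, "doc_b": 0b10110011, "doc_c": 0b01001101}
matches = select_matches(0b10110110, feedback_db, max_distance=2)
```

With these assumed codes, only the two items within two differing bits of the query are kept, and the nearer one is listed first.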
In one embodiment, as shown in fig. 4, suppose that the trained target model is applied to the field of health question search, and that the query information in the query request acquired by the server is "what is to be noticed by skin allergy", as marked by 40 in fig. 4. After acquiring the query information, the server performs steps s11 to s14. Specifically:
s11, the server converts the query information into a target feature vector. The target feature vector may be, for example, a term frequency-inverse document frequency (TF-IDF) feature vector, as used in information retrieval and data mining;
s12, after obtaining the target feature vector, the server obtains the query hash code of the query information through the trained target model. The target model called by the server may be an edge-based target model, whose joint probability distribution function is the first joint probability distribution function, or a target model based on prior knowledge, whose joint probability distribution function is the second joint probability distribution function. In addition, the server calls the trained target model in advance to determine the hash code of each feedback multimedia data: the feedback multimedia data x is mapped by the function
Figure BDA0002739727300000211
into a hash code vector z, and the hash code vector z is then binarized to obtain the corresponding hash code;
s13, the server determines the Hamming distance between the query hash code and the hash code of each existing feedback multimedia data. Specifically, after determining the query hash code and the hash code of each feedback multimedia data, the server may perform an XOR operation on the query hash code and the hash code of each feedback multimedia data and count the differing bits to obtain the Hamming distance between them. The smaller the Hamming distance, the stronger the correlation between the corresponding hash codes, and hence the stronger the correlation between the multimedia data corresponding to the hash code and the query information;
s14, the server outputs the feedback multimedia data associated with the query information. After determining the Hamming distance between the query hash code and the hash code of each feedback multimedia data, the server can display the feedback multimedia data in the user interface in ascending order of Hamming distance. Specifically, when the input query information is "what is to be noticed by skin allergy", the feedback multimedia data displayed in the user interface may be as shown in fig. 5.
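Steps s11 to s14 can be sketched end to end in Python as follows. This is only an illustration under stated assumptions: the trained target model is not available here, so a seeded random sign projection (`make_encoder`) is used as a hypothetical stand-in for the mapping from a feature vector to a hash code vector; the smoothed TF-IDF formula, the sample corpus, the 16-bit code length, and all function names are likewise assumptions.

```python
import math
import random


def tokenize(text):
    return text.lower().split()


def build_idf(corpus):
    """Smoothed inverse document frequency for every term in the corpus."""
    n_docs = len(corpus)
    doc_freq = {}
    for text in corpus:
        for term in set(tokenize(text)):
            doc_freq[term] = doc_freq.get(term, 0) + 1
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf_vector(text, vocab, idf):
    """s11: convert text into a TF-IDF feature vector over a fixed vocabulary."""
    tokens = tokenize(text)
    counts = {}
    for tok in tokens:
        counts[tok] = counts.get(tok, 0) + 1
    total = max(len(tokens), 1)
    return [counts.get(t, 0) / total * idf[t] for t in vocab]


def make_encoder(dim, n_bits, seed=0):
    """Hypothetical stand-in for the trained target model (NOT the patented model)."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

    def encode(vector):
        # s12: map to a real-valued vector z, then binarize each entry by its sign.
        bits = 0
        for row in weights:
            z = sum(w * x for w, x in zip(row, vector))
            bits = (bits << 1) | (1 if z > 0 else 0)
        return bits

    return encode


corpus = [
    "skin allergy care and precautions",
    "diet advice for seasonal skin allergy",
    "how to treat a sprained ankle",
]
idf = build_idf(corpus)
vocab = sorted(idf)
encode = make_encoder(len(vocab), n_bits=16)
# Hash codes of the feedback data are computed in advance.
db_codes = [encode(tfidf_vector(doc, vocab, idf)) for doc in corpus]


def search(query):
    query_code = encode(tfidf_vector(query, vocab, idf))
    # s13: XOR the codes and count the set bits to get the Hamming distance.
    scored = [(doc, bin(query_code ^ code).count("1")) for doc, code in zip(corpus, db_codes)]
    # s14: output the feedback data in ascending order of Hamming distance.
    return sorted(scored, key=lambda pair: pair[1])


ranked = search("what is to be noticed by skin allergy")
```

Because the stand-in encoder is random rather than trained, the ranking it produces carries no semantic guarantee; the sketch only shows how the four steps fit together.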
In one embodiment, in order to determine the search accuracy and the search speed of the target model, the target model proposed in the embodiment of the present invention is compared with information search based on an existing language model, which may be, for example, a BERT (Bidirectional Encoder Representations from Transformers) model, as shown in Table 2.
TABLE 2
Model                                   Accuracy   Response time      Feature quantity
BERT model                              0.946      200 milliseconds   11 hundred million (1.1 billion)
Edge-based target model                 0.9161     1.2 milliseconds   0.05 hundred million (5 million)
Target model based on prior knowledge   0.9226     1.3 milliseconds   0.05 hundred million (5 million)
As shown in Table 2, the retrieval accuracy of the target model is close to that of the existing model (0.9161 and 0.9226 versus 0.946), while its response time is 1.2 to 1.3 milliseconds versus 200 milliseconds, that is, roughly 150 to 170 times faster according to the response times listed. This greatly improves the data distribution capability of the server and thereby the search speed.
In the embodiment of the invention, after the server determines the data association relationship between any two multimedia data in the sample multimedia data set, it can construct the target model based on the data association relationship. After the target model is constructed, the server can determine different reward functions according to the joint probability distribution function satisfied by the target model and train the target model accordingly, thereby obtaining a trained target model. The hash codes that the server determines for input multimedia data by calling the trained target model are of higher quality, and the data association relationship among the multimedia data is introduced into the hash codes. Therefore, when the server calls the trained target model to generate hash codes of multimedia data, the hash codes of similar multimedia data are more similar to one another, so that the hash codes reflect the relevance of the multimedia data.
Based on the description of the above data processing method embodiment, an embodiment of the present invention further provides a data processing apparatus, which may be a computer program (including program code) running in the server. The data processing apparatus may be used to execute the data processing method shown in fig. 2 and fig. 3. Referring to fig. 6, the data processing apparatus includes: an obtaining unit 601, a determining unit 602 and a constructing unit 603.
An obtaining unit 601, configured to obtain a sample multimedia data set, where the sample multimedia data set includes at least two sample multimedia data;
the obtaining unit 601 is further configured to obtain feature information of each sample multimedia data in the sample multimedia data set, where the feature information includes data content information or vector distribution feature information;
a determining unit 602, configured to perform relevance analysis on feature information of any two sample multimedia data, and determine a data relevance relationship between any two sample multimedia data in the sample multimedia data set according to a result of the relevance analysis;
a constructing unit 603, configured to construct a target model according to the data association relationship, where the target model is configured to process input target multimedia data and generate a target hash code, and the target hash code is configured to obtain multimedia data related to the target multimedia data.
In one embodiment, one sample multimedia data corresponds to one hash code vector. If the characteristic information is vector distribution characteristic information, the distribution of the hash code vector of each sample multimedia data in the sample multimedia data set belongs to a target distribution, and the determining unit 602 is specifically configured to:
determining the scalar product (dot product) of the hash code vectors corresponding to any two sample multimedia data, and comparing the scalar product with a preset scalar product threshold;
and taking the result of comparing the scalar product with the preset scalar product threshold as the correlation analysis result of the any two sample multimedia data.
In an embodiment, the determining unit 602 is specifically configured to:
if the scalar product is greater than the preset scalar product threshold, determining that the any two sample multimedia data have a data association relation;
and if the scalar product is less than or equal to the preset scalar product threshold, determining that no data association relation exists between the any two sample multimedia data.
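The comparison of the scalar product against a preset threshold can be sketched in Python as follows. The hash code vectors, the threshold value of 0.5, and the names `scalar_product` and `associated_pairs` are assumptions made only for this illustration.

```python
from itertools import combinations


def scalar_product(u, v):
    """Scalar (dot) product of two hash code vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def associated_pairs(hash_vectors, threshold):
    """Return index pairs (i, j), i < j, whose scalar product exceeds the threshold."""
    pairs = set()
    for (i, u), (j, v) in combinations(enumerate(hash_vectors), 2):
        # Strictly greater than the threshold: association; otherwise: no association.
        if scalar_product(u, v) > threshold:
            pairs.add((i, j))
    return pairs


# Hypothetical hash code vectors of three sample multimedia data.
vectors = [[0.9, 0.1, -0.2], [0.8, 0.2, -0.1], [-0.7, 0.3, 0.9]]
edges = associated_pairs(vectors, threshold=0.5)
```

With these assumed vectors, only samples 0 and 1 are treated as associated; a pair whose scalar product equals the threshold exactly is treated as not associated, matching the "less than or equal to" branch above.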
In an embodiment, if the characteristic information is data content information, the determining unit 602 is specifically configured to:
detecting the reference relation of any two sample multimedia data according to the data content information of any two sample multimedia data;
and taking the result of the reference relation detection as the correlation analysis result of the any two sample multimedia data.
In an embodiment, the determining unit 602 is specifically configured to:
if the reference relation detection result indicates that the data contents of the any two sample multimedia data have a reference relation, determining that the any two sample multimedia data have a data association relation;
and if the reference relation detection result indicates that the data contents of the any two sample multimedia data do not have a reference relation, determining that no data association relation exists between the any two sample multimedia data.
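The reference relation detection can be illustrated with a short Python sketch. It assumes that the data content information of each sample has already been parsed into the set of other items it references, and that a reference in either direction counts as a reference relation; the identifiers and the function name `reference_associations` are hypothetical.

```python
def reference_associations(references):
    """Map {item_id: set of referenced item ids} to a set of associated pairs.

    A pair is associated if either item references the other (assumed symmetric).
    References to items outside the sample set are ignored.
    """
    pairs = set()
    for source, cited in references.items():
        for target in cited:
            if target != source and target in references:
                pairs.add(tuple(sorted((source, target))))
    return pairs


# Hypothetical reference lists extracted from the data content of three samples.
references = {
    "paper_a": {"paper_b"},
    "paper_b": set(),
    "paper_c": {"paper_a", "paper_x"},  # paper_x is not in the sample set
}
links = reference_associations(references)
```

Any pair that does not appear in `links` is treated as having no data association relation.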
In one embodiment, the constructed target model satisfies a joint probability distribution function, which is a first joint probability distribution function or a second joint probability distribution function;
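if the data association relation is determined according to the association analysis of the vector distribution characteristic information, the target model meets the first joint probability distribution function; and if the data association relation is determined according to the association analysis of the data content information, the target model meets the second joint probability distribution function.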
In one embodiment, the apparatus further comprises: a training unit 604.
The determining unit 602 is further configured to determine a reward function for training the target model;
a training unit 604, configured to train a joint probability distribution function and a probability calculation function that are satisfied by the target model according to a direction in which the reward function increases, by using sample multimedia data included in the sample multimedia data set;
the training unit 604 is further configured to complete training of the target model when the function value of the reward function satisfies a preset threshold.
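Training in the direction in which the reward function increases, and stopping once its value satisfies a preset threshold, can be sketched as plain gradient ascent. The Python example below is a toy under loud assumptions: the quadratic `toy_reward` merely stands in for the reward function of the target model, "satisfies a preset threshold" is interpreted as "greater than or equal to the threshold", and the learning rate, threshold, and all names are hypothetical.

```python
def train(reward, gradient, params, learning_rate=0.1,
          reward_threshold=-1e-4, max_steps=10_000):
    """Gradient ascent: move the parameters along the gradient of the reward."""
    for step in range(max_steps):
        # Training is complete once the reward value meets the preset threshold.
        if reward(params) >= reward_threshold:
            return params, step
        grads = gradient(params)
        params = [p + learning_rate * g for p, g in zip(params, grads)]
    return params, max_steps


# Toy concave reward with its maximum (value 0) at the assumed optimum (1, -2).
optimum = [1.0, -2.0]


def toy_reward(params):
    return -sum((p - o) ** 2 for p, o in zip(params, optimum))


def toy_gradient(params):
    return [-2.0 * (p - o) for p, o in zip(params, optimum)]


trained_params, steps_taken = train(toy_reward, toy_gradient, [0.0, 0.0])
```

In the embodiment the parameters would be those of the joint probability distribution function and the probability calculation function, and the gradient would come from the sample multimedia data rather than from a closed-form expression.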
In one embodiment, the apparatus further comprises: a generating unit 605.
A generating unit 605, configured to invoke the trained target model to generate a plurality of reference hash code vectors of the input target multimedia data;
the determining unit 602 is further configured to determine, according to a probability calculation function, a probability value of each reference hash code vector as a theoretical hash code vector of the target multimedia data, and select a target probability value meeting a preset condition from the probability values;
the determining unit 602 is further configured to determine a target hash code vector corresponding to the target probability value, and determine a target hash code of the multimedia data input into the target model according to a hash code indicated by the target hash code vector.
In an embodiment, the determining unit 602 is further configured to determine, according to a joint probability distribution function satisfied by the target model, a theoretical probability function used when the theoretical hash code vector is obtained, where the theoretical probability function is used to indicate a probability that the target model generates the theoretical hash code vector when the joint probability distribution function is satisfied;
the determining unit 602 is further configured to determine an approximation function of the theoretical probability function, and determine the probability calculation function according to the approximation function.
In an embodiment, the determining unit 602 is specifically configured to:
and carrying out binarization processing on the hash code indicated by the target hash code vector, and taking the hash code after binarization processing as the target hash code of the target multimedia data.
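The selection of a target hash code vector from several reference hash code vectors, followed by binarization, can be sketched in Python as follows. Several points are assumptions made only for illustration: a diagonal Gaussian log density stands in for the probability calculation function, the preset condition is taken to be "the largest probability value", and binarization uses a fixed threshold of 0.

```python
import math


def gaussian_log_density(vector, mean, variance):
    """Log density of a diagonal Gaussian (assumed probability calculation function)."""
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (x - m) ** 2 / var)
        for x, m, var in zip(vector, mean, variance)
    )


def pick_target_vector(candidates, score):
    """Select the reference vector with the highest score (assumed preset condition)."""
    return max(candidates, key=score)


def binarize(vector, threshold=0.0):
    """Map each entry to 1 if it exceeds the threshold, otherwise 0."""
    return [1 if v > threshold else 0 for v in vector]


# Hypothetical reference hash code vectors generated for one input.
candidates = [[0.9, -0.4, 0.2], [0.1, 0.1, 0.1], [-1.5, 2.0, 0.7]]
mean, variance = [1.0, -0.5, 0.0], [1.0, 1.0, 1.0]

best = pick_target_vector(candidates, lambda z: gaussian_log_density(z, mean, variance))
target_hash_code = binarize(best)
```

With these assumed values the first candidate lies closest to the mean and is selected, and its binarized form serves as the target hash code.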
In one embodiment, the apparatus further comprises: a selection unit 606.
The obtaining unit 601 is further configured to obtain a query request, and call a trained target model to determine a query hash code corresponding to query information in the query request;
the obtaining unit 601 is further configured to obtain a feedback multimedia database, where the feedback multimedia database includes a plurality of feedback multimedia data and a hash code of each feedback multimedia data;
a selecting unit 606, configured to select, according to the query hash code and the hash code of each feedback multimedia data, a target feedback multimedia data that matches the query hash code from the feedback multimedia database.
In an embodiment, the selecting unit 606 is specifically configured to:
calculating a Hamming distance between the query hash code and the hash code of each feedback multimedia data;
and selecting feedback multimedia data with the Hamming distance less than or equal to a preset distance threshold value from the feedback multimedia database as target feedback multimedia data.
In the embodiment of the present invention, after the obtaining unit 601 obtains the sample multimedia data set, the determining unit 602 can determine the data association relationship between any two sample multimedia data in the sample multimedia data set either by performing association analysis on the vector distribution characteristic information of the hash code vectors of the sample multimedia data, or by performing association analysis on the data content of the sample multimedia data. After the data association relationship between any two sample multimedia data in the sample multimedia data set is determined, the constructing unit 603 can construct the target model according to the data association relationship, so that the target model takes the data association relationship among the multimedia data into account. When the trained target model is used to generate the hash code of multimedia data, the generated hash code not only contains the semantic features of the multimedia data but also indicates the data association information among different multimedia data. This improves the quality of the hash codes generated by the target model, and also improves the search speed and search accuracy of data search based on the hash codes.
Fig. 7 is a schematic block diagram of a structure of a server according to an embodiment of the present invention, where the server may be an independent physical server, may also be a server cluster or a distributed system formed by a plurality of physical servers, and may also be a cloud server that provides basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a network service, cloud communication, a middleware service, a domain name service, a security service, a CDN, and a big data and artificial intelligence platform. The terminal may be, but is not limited to, a smart phone, a tablet computer, a laptop computer, a desktop computer, a smart speaker, a smart watch, and the like. The server in the present embodiment as shown in fig. 7 may include: one or more processors 701; one or more input devices 702, one or more output devices 703, and memory 704. The processor 701, the input device 702, the output device 703, and the memory 704 are connected by a bus 705. The memory 704 is used to store a computer program comprising program instructions, and the processor 701 is used to execute the program instructions stored by the memory 704.
The memory 704 may include volatile memory (volatile memory), such as random-access memory (RAM); the memory 704 may also include a non-volatile memory (non-volatile memory), such as a flash memory (flash memory), a solid-state drive (SSD), etc.; the memory 704 may also comprise a combination of the above types of memory.
The processor 701 may be a central processing unit (CPU). The processor 701 may further include a hardware chip. The hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or the like. The PLD may be a field-programmable gate array (FPGA), generic array logic (GAL), or the like. The processor 701 may also be a combination of the above structures.
In an embodiment of the present invention, the memory 704 is configured to store a computer program, the computer program includes program instructions, and the processor 701 is configured to execute the program instructions stored in the memory 704, so as to implement the steps of the corresponding methods as described above in fig. 2 and fig. 3.
In one embodiment, the processor 701 is configured to call the program instructions to perform:
obtaining a sample multimedia data set, the sample multimedia data set comprising at least two sample multimedia data;
acquiring characteristic information of each sample multimedia data in the sample multimedia data set, wherein the characteristic information comprises data content information or vector distribution characteristic information;
performing relevance analysis on the characteristic information of any two sample multimedia data, and determining the data relevance relationship between any two sample multimedia data in the sample multimedia data set according to the relevance analysis result;
and constructing a target model according to the data association relation, wherein the target model is used for processing input target multimedia data and generating a target hash code, and the target hash code is used for acquiring multimedia data related to the target multimedia data.
In one embodiment, one sample multimedia data corresponds to one hash code vector, and if the feature information is vector distribution feature information, the distribution of the hash code vector of each sample multimedia data in the multimedia data set belongs to a target distribution; the processor 701 is configured to call the program instructions for performing:
determining the scalar product (dot product) of the hash code vectors corresponding to any two sample multimedia data, and comparing the scalar product with a preset scalar product threshold;
and taking the result of comparing the scalar product with the preset scalar product threshold as the correlation analysis result of the any two sample multimedia data.
In one embodiment, the processor 701 is configured to call the program instructions to perform:
if the scalar product is greater than the preset scalar product threshold, determining that the any two sample multimedia data have a data association relation;
and if the scalar product is less than or equal to the preset scalar product threshold, determining that no data association relation exists between the any two sample multimedia data.
In one embodiment, if the characteristic information is data content information, the processor 701 is configured to call the program instruction to perform:
detecting the reference relation of any two sample multimedia data according to the data content information of any two sample multimedia data;
and taking the result of the reference relation detection as the correlation analysis result of the any two sample multimedia data.
In one embodiment, the processor 701 is configured to call the program instructions to perform:
if the reference relation detection result indicates that the data contents of the any two sample multimedia data have a reference relation, determining that the any two sample multimedia data have a data association relation;
and if the reference relation detection result indicates that the data contents of the any two sample multimedia data do not have a reference relation, determining that no data association relation exists between the any two sample multimedia data.
In one embodiment, the constructed target model satisfies a joint probability distribution function, which is a first joint probability distribution function or a second joint probability distribution function;
and if the data association relation is determined according to the association analysis of the vector distribution characteristic information, the target model meets the first joint probability distribution function, and if the data association relation is determined according to the association analysis of the data content information, the target model meets the second joint probability distribution function.
In one embodiment, the processor 701 is configured to call the program instructions to perform:
determining a reward function for training the target model;
training a joint probability distribution function and a probability calculation function which are satisfied by the target model according to the direction of increasing the reward function by adopting sample multimedia data included in the sample multimedia data set;
and finishing the training of the target model when the function value of the reward function meets a preset threshold value.
In one embodiment, the processor 701 is configured to call the program instructions to perform:
calling the trained target model to generate a plurality of reference hash code vectors of the input target multimedia data;
determining, according to a probability calculation function, the probability value that each reference hash code vector is the theoretical hash code vector of the target multimedia data, and selecting a target probability value meeting a preset condition from the probability values;
and determining a target hash code vector corresponding to the target probability value, and determining a target hash code of the target multimedia data according to the hash code indicated by the target hash code vector.
In one embodiment, the processor 701 is configured to call the program instructions to perform:
determining a theoretical probability function adopted when the theoretical hash code vector is obtained according to a joint probability distribution function met by the target model, wherein the theoretical probability function is used for indicating the probability of generating the theoretical hash code vector when the target model meets the joint probability distribution function;
determining an approximation function of the theoretical probability function, and determining the probability calculation function according to the approximation function.
In one embodiment, the processor 701 is configured to call the program instructions to perform:
and carrying out binarization processing on the hash code indicated by the target hash code vector, and taking the hash code after binarization processing as the target hash code of the target multimedia data.
In one embodiment, the processor 701 is configured to call the program instructions to perform:
acquiring a query request, and calling a trained target model to determine a query hash code corresponding to query information in the query request;
acquiring a feedback multimedia database, wherein the feedback multimedia database comprises a plurality of feedback multimedia data and a hash code of each feedback multimedia data;
and according to the query hash code and the hash code of each feedback multimedia data, selecting the target feedback multimedia data matched with the query hash code from the feedback multimedia database.
In one embodiment, the processor 701 is configured to call the program instructions to perform:
calculating a Hamming distance between the query hash code and the hash code of each feedback multimedia data;
and selecting feedback multimedia data with the Hamming distance less than or equal to a preset distance threshold value from the feedback multimedia database as target feedback multimedia data.
Embodiments of the present invention provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the method embodiments as shown in fig. 2 or fig. 3. The computer-readable storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
While the invention has been described with reference to a particular embodiment, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (15)

1. A data processing method, comprising:
obtaining a sample multimedia data set, the sample multimedia data set comprising at least two sample multimedia data;
acquiring characteristic information of each sample multimedia data in the sample multimedia data set, wherein the characteristic information comprises data content information or vector distribution characteristic information;
performing relevance analysis on the characteristic information of any two sample multimedia data, and determining the data relevance relationship between any two sample multimedia data in the sample multimedia data set according to the relevance analysis result;
and constructing a target model according to the data association relation, wherein the target model is used for processing input target multimedia data and generating a target hash code, and the target hash code is used for acquiring multimedia data related to the target multimedia data.
2. The method of claim 1, wherein a sample multimedia data corresponds to a hash code vector, and if the characteristic information is vector distribution characteristic information, the distribution of the hash code vector of each sample multimedia data in the multimedia data set belongs to a target distribution; the correlation analysis of the characteristic information of any two sample multimedia data includes:
determining the scalar product (dot product) of the hash code vectors corresponding to any two sample multimedia data, and comparing the scalar product with a preset scalar product threshold;
and taking the result of comparing the scalar product with the preset scalar product threshold as the correlation analysis result of the any two sample multimedia data.
3. The method of claim 2, wherein the determining the data association relationship between any two sample multimedia data in the sample multimedia data set according to the result of the association analysis comprises:
if the scalar product is greater than the preset scalar product threshold, determining that the any two sample multimedia data have a data association relation;
and if the scalar product is less than or equal to the preset scalar product threshold, determining that no data association relation exists between the any two sample multimedia data.
4. The method of claim 1, wherein if the characteristic information is data content information, the performing correlation analysis on the characteristic information of any two sample multimedia data comprises:
detecting the reference relation of any two sample multimedia data according to the data content information of any two sample multimedia data;
and taking the result of the reference relation detection as the correlation analysis result of the any two sample multimedia data.
5. The method of claim 4, wherein the determining the data association relationship between any two sample multimedia data in the sample multimedia data set according to the result of the association analysis comprises:
if the reference relation detection result indicates that the data contents of the any two sample multimedia data have a reference relation, determining that the any two sample multimedia data have a data association relation;
and if the reference relation detection result indicates that the data contents of the any two sample multimedia data do not have a reference relation, determining that no data association relation exists between the any two sample multimedia data.
6. The method of claim 1, wherein the constructed target model satisfies a joint probability distribution function, the joint probability distribution function being a first joint probability distribution function or a second joint probability distribution function;
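if the data association relation is determined according to the association analysis of the vector distribution characteristic information, the target model meets the first joint probability distribution function; and if the data association relation is determined according to the association analysis of the data content information, the target model meets the second joint probability distribution function.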
7. The method of claim 1, wherein after the constructing a target model according to the data association relation, the method further comprises:
determining a reward function for training the target model;
training a joint probability distribution function and a probability calculation function which are satisfied by the target model according to the direction of increasing the reward function by adopting sample multimedia data included in the sample multimedia data set;
and finishing the training of the target model when the function value of the reward function meets a preset threshold value.
8. The method of claim 7, further comprising:
calling the trained target model to generate a plurality of reference hash code vectors of the input target multimedia data;
determining, according to a probability calculation function, the probability value that each reference hash code vector is the theoretical hash code vector of the target multimedia data, and selecting a target probability value meeting a preset condition from the probability values;
and determining a target hash code vector corresponding to the target probability value, and determining a target hash code of the target multimedia data according to the hash code indicated by the target hash code vector.
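The selection step of claim 8 might look like the sketch below. It assumes that the "preset condition" is simply "highest probability value", and it uses a made-up scoring function (`toy_prob`) in place of the probability calculation function; neither choice is taken from the claims.

```python
import math


def select_target_vector(candidates, prob_fn):
    """Score every reference hash code vector and return the best one."""
    scored = [(prob_fn(c), c) for c in candidates]
    best_prob, best = max(scored, key=lambda pair: pair[0])
    return best, best_prob


# Hypothetical probability function: closeness to a fixed reference point.
MODE = [1.0, 0.0, 1.0]


def toy_prob(vector):
    return math.exp(-sum((x - m) ** 2 for x, m in zip(vector, MODE)))


candidates = [[0.9, 0.1, 0.8], [0.2, 0.7, 0.4], [0.6, 0.6, 0.6]]
best, best_prob = select_target_vector(candidates, toy_prob)
```

Here the first candidate is selected because it lies closest to the toy reference point.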
9. The method of claim 8, further comprising:
determining a theoretical probability function adopted when the theoretical hash code vector is obtained according to a joint probability distribution function met by the target model, wherein the theoretical probability function is used for indicating the probability of generating the theoretical hash code vector when the target model meets the joint probability distribution function;
determining an approximation function of the theoretical probability function, and determining the probability calculation function according to the approximation function.
10. The method of claim 8, wherein determining the target hash code for the target multimedia data based on the hash code indicated by the target hash code vector comprises:
performing binarization processing on the hash code indicated by the target hash code vector, and taking the binarized hash code as the target hash code of the target multimedia data.
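A minimal sketch of the binarization in claim 10, assuming the hash code vector holds real values in [0, 1] and that 0.5 is used as the cut-off. Both the value range and the threshold are assumptions; the claim only requires that the code be binarized.

```python
def binarize(vector, threshold=0.5):
    """Turn a real-valued hash code vector into a 0/1 hash code."""
    return [1 if x >= threshold else 0 for x in vector]


code = binarize([0.9, 0.1, 0.8, 0.5])  # -> [1, 0, 1, 1]
```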
11. The method of claim 7, further comprising:
acquiring a query request, and calling the trained target model to determine a query hash code corresponding to query information in the query request;
acquiring a feedback multimedia database, wherein the feedback multimedia database comprises a plurality of feedback multimedia data and a hash code of each feedback multimedia data;
and according to the query hash code and the hash code of each feedback multimedia data, selecting the target feedback multimedia data matched with the query hash code from the feedback multimedia database.
12. The method of claim 1, wherein the selecting the target feedback multimedia data matching the query hash code from the feedback multimedia database according to the query hash code and the hash code of each feedback multimedia data comprises:
calculating a Hamming distance between the query hash code and the hash code of each feedback multimedia data;
and selecting feedback multimedia data with the Hamming distance less than or equal to a preset distance threshold value from the feedback multimedia database as target feedback multimedia data.
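The Hamming-distance matching of claim 12 can be sketched as below. The in-memory dictionary standing in for the feedback multimedia database, the item names, and the threshold value of 1 are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length hash codes differ."""
    if len(a) != len(b):
        raise ValueError("hash codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def retrieve(query_code, database, max_distance):
    """Return items whose hash code is within `max_distance` of the query."""
    return [
        item
        for item, code in database.items()
        if hamming(query_code, code) <= max_distance
    ]


# Toy feedback database: item name -> binary hash code.
database = {"a": [1, 0, 1, 1], "b": [1, 0, 0, 1], "c": [0, 1, 0, 0]}
matches = retrieve([1, 0, 1, 1], database, max_distance=1)  # -> ["a", "b"]
```

For long codes and large databases, the codes would normally be packed into integers and compared with XOR and a population count; the list form is used here only for readability.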
13. A data processing apparatus, comprising:
an obtaining unit, configured to obtain a sample multimedia data set, where the sample multimedia data set includes at least two sample multimedia data;
the obtaining unit being further configured to obtain feature information of each sample multimedia data in the sample multimedia data set, where the feature information includes data content information or vector distribution feature information;
a determining unit, configured to perform association analysis on the feature information of any two sample multimedia data, and to determine the data association relationship between the any two sample multimedia data in the sample multimedia data set according to the result of the association analysis;
and a construction unit, configured to construct a target model according to the data association relationship, where the target model is used for processing input target multimedia data and generating a target hash code, and the target hash code is used for acquiring multimedia data related to the target multimedia data.
14. A server comprising a processor, an input device, an output device and a memory, the processor, the input device, the output device and the memory being interconnected, wherein the memory is configured to store a computer program comprising program instructions, the processor being configured to invoke the program instructions to perform the method of any of claims 1 to 12.
15. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program comprising program instructions which, when executed by a processor, cause the processor to carry out the method according to any one of claims 1 to 12.
CN202011153160.1A 2020-10-23 2020-10-23 Data processing method, device, server and storage medium Pending CN112148902A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011153160.1A CN112148902A (en) 2020-10-23 2020-10-23 Data processing method, device, server and storage medium

Publications (1)

Publication Number Publication Date
CN112148902A true CN112148902A (en) 2020-12-29

Family

ID=73954943

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011153160.1A Pending CN112148902A (en) 2020-10-23 2020-10-23 Data processing method, device, server and storage medium

Country Status (1)

Country Link
CN (1) CN112148902A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112800253A (en) * 2021-04-09 2021-05-14 腾讯科技(深圳)有限公司 Data clustering method, related device and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
RU2673385C1 (en) * 2017-05-26 2018-11-26 Максим Львович Лихвинцев Method of data exchange recording control in information - telecommunication network and identification system of electron mail
CN110209867A (en) * 2019-06-05 2019-09-06 腾讯科技(深圳)有限公司 Training method, device, equipment and the storage medium of image encrypting algorithm
US20190332921A1 (en) * 2018-04-13 2019-10-31 Vosai, Inc. Decentralized storage structures and methods for artificial intelligence systems
US20200028993A1 (en) * 2017-02-28 2020-01-23 Samsung Electronics Co., Ltd Method and device for processing multimedia data
CN111026887A (en) * 2019-12-09 2020-04-17 武汉科技大学 Cross-media retrieval method and system

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40034949; Country of ref document: HK)

SE01 Entry into force of request for substantive examination