CN112685452B - Enterprise case retrieval method, device, equipment and storage medium

Info

Publication number: CN112685452B
Application number: CN202011643928.3A
Authority: CN
Inventors: 范凌
Original assignee: Tezign Shanghai Information Technology Co Ltd
Current assignee: Tezign Shanghai Information Technology Co Ltd
Priority date: 2020-12-31
Filing date: 2020-12-31
Publication date: 2021-08-10
Anticipated expiration: 2040-12-31
Also published as: CN112685452A

Abstract

The application discloses an enterprise case retrieval method, apparatus, device and storage medium. The method comprises: receiving a search term; searching a preset enterprise case vector pool using the BM25 algorithm based on the search term to obtain a first recall result ranking of enterprise cases; calculating the cosine distance between the search term vector and each vector in the vector pool; sorting the enterprise case samples corresponding to the vectors in the vector pool according to the cosine distances to obtain a second recall result ranking; and comprehensively ranking the first recall result ranking and the second recall result ranking to obtain a ranked list of enterprise cases. The method and apparatus address the technical problem that retrieval results in the prior art are unsatisfactory.

Description

Enterprise case retrieval method, device, equipment and storage medium

Technical Field

The application relates to the technical field of computers, in particular to an enterprise case retrieval method, an enterprise case retrieval device, enterprise case retrieval equipment and a storage medium.

Background

At present, most retrieval systems in the creative marketing field use the BM25 algorithm. BM25 measures the relevance between a search term and a document probabilistically, mainly from features such as the frequency of the search term in the document and the length of the document. In practice, however, a considerable share of the returned cases are not actually related to the search term, so the retrieval effect is unsatisfactory.

Disclosure of Invention

The present application mainly aims to provide an enterprise case retrieval method, apparatus, device and storage medium to solve the above problems.

In order to achieve the above object, according to an aspect of the present application, there is provided an enterprise case retrieval method, including:

receiving a search term;

searching with the BM25 algorithm based on the search term to obtain a first recall result ranking of enterprise cases;

generating a corresponding search term vector from the search term by means of a case retrieval model;

calculating the cosine distance between the search term vector and each vector in the vector pool;

sorting the enterprise case samples corresponding to the vectors in the vector pool according to the cosine distances to obtain a second recall result ranking;

and comprehensively ranking the first recall result ranking and the second recall result ranking to obtain a ranked list of enterprise cases.

Further, before receiving the search term, the method further includes:

constructing a knowledge graph of the marketing field;

collecting related data of a user in a retrieval process in a preset historical period;

and constructing a case multitask learning model based on the knowledge graph and the related data, and training the case multitask learning model by adopting the knowledge graph and the related data.

Further, the related data comprises behavior data and retrieval data;

collecting behavior data of the user for retrieving the case, comprising:

acquiring the user's behavior data on the case and the ranking position of the case in the recall list; the behavior data comprises the user's clicks on, collections of and shares of the retrieval results;

the retrieval data includes: the search terms collected by the buried-point (event tracking) system, the enterprise cases obtained for those search terms, and the correlation between the enterprise cases and the search terms.

Further, for any of the cases, calculating the correlation includes:

recording the time of each click on the case;

calculating the difference between the click time and the current time, in days;

adjustment factor = (difference between the click time and the current time, in days) / 365;

impact factor of the click = 1 - adjustment factor;

correlation value = [formula not reproduced in the source text];

where x_i is the impact factor of the i-th click, and n is the total number of clicks.

Further, the method further comprises:

acquiring a target case text to be identified;

inputting the target case text to be recognized into a pre-trained case multi-task learning model to obtain the classification information of the target case text to be recognized;

the classification information includes: the industry of the target case text to be recognized, the brand of the target case text to be recognized, the style of the target case text to be recognized, and the type of the target case text to be recognized.

Further, constructing a knowledge graph of the marketing field, comprising:

constructing a knowledge graph according to a preset entity and a relation between the entities based on a case sample library;

wherein the entity comprises: project entity, company entity, case entity, brand entity, designer entity, platform application entity;

the relationship between the platform application entity and the case entity is: the platform application party shares or collects the case;

the relationship between the platform application entity and the designer entity is: the platform application party collects or shares the designer's creative work;

the relationship between the designer entity and the case entity is: the designer publishes the case;

the relationship between the company entity and the case entity is: the company collects the case;

the relationship between the company entity and the designer entity is: the company collects the designer's creative work;

the relationship between the case entity and the brand entity is: the case serves the brand;

the relationship between the company entity and the brand entity is: the company contains the brand;

the relationship between the brand entity and the project entity is: the brand creates the project.

Further, fusing the entities of the knowledge graph to obtain the optimized knowledge graph specifically comprises:

for any company entity and any brand entity, identifying the entity names of the company entity and the brand entity using named entity recognition (NER) technology;

calculating the similarity between the entity name of the company entity and the entity name of the brand entity;

and if the similarity reaches a preset similarity threshold, fusing the company entity with the brand entity.

In order to achieve the above object, according to another aspect of the present application, there is provided an enterprise case search apparatus including:

the receiving module is used for receiving the search terms;

the processing module is used for searching by adopting a BM25 algorithm based on the search terms to obtain a first recall result sequence of the enterprise cases;

Further, the processing module is further configured to:

constructing a knowledge graph of the marketing field;

Further, the related data comprises behavior data and retrieval data;

the processing module is further configured to:

The processing module is further configured to:

for any of the cases, calculating the correlation includes:

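recording the time of each click on the case;

calculating the difference between the click time and the current time, in days;

impact factor of the click = 1 - adjustment factor, where adjustment factor = (difference in days) / 365;

correlation value = [formula not reproduced in the source text];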

The processing module is further configured to:

acquiring a target case text to be identified;

The processing module is further configured to:

In a third aspect, the present application further provides an enterprise case retrieval device, including: at least one processor and at least one memory; the memory is to store one or more program instructions; the processor, configured to execute one or more program instructions, is configured to perform the following steps:

receiving a search term;

Further, the processor is further configured to: before receiving a search word, constructing a knowledge graph of the marketing field;

Further, the related data comprises behavior data and retrieval data; the processor is further configured to: acquiring behavior data of the user on the case; and a ranking position of the case on a recall list; the behavior data comprises clicks, collections and shares of the retrieval results by the user;

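Further, the processor is further configured to: for any of the cases, calculate the correlation by:

recording the time of each click on the case;

calculating the difference between the click time and the current time, in days;

impact factor of the click = 1 - adjustment factor, where adjustment factor = (difference in days) / 365;

correlation value = [formula not reproduced in the source text];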

Further, the processor is further configured to: acquiring a target case text to be identified;

Further, the processor is further configured to:

Further, the processor is further configured to: for any company entity and any brand entity, identify the entity names of the company entity and the brand entity using named entity recognition (NER) technology;

In a fourth aspect, the present application also proposes a computer-readable storage medium having one or more program instructions embodied therein, the one or more program instructions being configured to perform the steps of:

receiving a search term;

Further, before receiving the search term, the method further includes:

constructing a knowledge graph of the marketing field;

Further, the related data comprises behavior data and retrieval data;

collecting behavior data of the user for retrieving the case, comprising:

Further, for any of the cases, calculating the correlation includes:

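recording the time of each click on the case;

calculating the difference between the click time and the current time, in days;

impact factor of the click = 1 - adjustment factor, where adjustment factor = (difference in days) / 365;

correlation value = [formula not reproduced in the source text];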

Further, the method further comprises:

acquiring a target case text to be identified;

Further, constructing a knowledge graph of the marketing field, comprising:

In the embodiments of the application, a search term is received; a corresponding search term vector is generated from the search term by a case retrieval model; the enterprise case samples corresponding to the vectors in the vector pool are sorted to obtain a second recall result ranking; and finally the first recall result ranking, obtained with the existing method, and the second recall result ranking are comprehensively ranked to obtain a ranked list of enterprise cases. The retrieval effect is thereby improved and the user obtains more cases related to the search term, which solves the technical problem of poor retrieval effect when retrieving cases in the prior art.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this application, serve to provide a further understanding of the application and to enable other features, objects, and advantages of the application to be more apparent. The drawings and their description illustrate the embodiments of the invention and do not limit it. In the drawings:

FIG. 1 is a schematic diagram of a case according to an embodiment of the present application;

FIG. 2 is a flow chart of an enterprise case retrieval method according to an embodiment of the present application;

FIG. 3 is a schematic diagram of case ordering according to an embodiment of the application;

FIG. 4 is a schematic diagram of another case ordering according to an embodiment of the application;

FIG. 5 is a schematic diagram of another case ordering according to an embodiment of the application;

FIG. 6 is a schematic diagram of a structure of a case multitask learning model according to an embodiment of the present application;

FIG. 7 is a schematic illustration of a knowledge graph according to an embodiment of the present application;

FIG. 8 is a schematic diagram of an architecture according to an embodiment of the present application;

FIG. 9 is a schematic diagram of a closed loop mechanism according to an embodiment of the present application;

FIG. 10 is a schematic structural diagram of a case retrieval apparatus according to an embodiment of the present application;

FIG. 11 is a schematic structural diagram of a case retrieval device according to an embodiment of the present application.

Detailed Description

In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only partial embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.

For convenience of understanding, terms referred to in the embodiments of the present invention are explained below.

Artificial intelligence is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics and the like. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.

Word vector: a word vector is a way of representing a word of natural language mathematically. Each word of a language is mapped, through training, to a short fixed-length vector; all such vectors together form a word vector space, and by introducing a "distance" into this space, the (lexical and semantic) similarity between words can be judged from the distance between their vectors. For example, the words "benign" and "lucky" are each mapped to a 300-dimensional vector, denoted vector1 and vector2 respectively. Their similarity can be determined by calculating the inner product, which gives a specific metric value.

Sentence vector: similar to a word vector; a sentence is converted into a sentence vector.

Knowledge graph: also called a scientific knowledge map, and known in library and information science as knowledge domain visualization or knowledge domain mapping, it is a family of graphs that display the development process and structural relationships of knowledge. It uses visualization technology to describe knowledge resources and their carriers, and to mine, analyze, construct, draw and display knowledge and the interrelations among them. A knowledge graph is essentially a semantic network: its nodes represent entities or concepts, and its edges represent the semantic relationships between entities/concepts. Knowledge graphs are typically represented with triples, i.e., entity-relationship-entity.

For example: Lu Xun (entity) - spouse (relationship type) - Xu Guangping (entity).

In the creative marketing field, a company's business personnel often need to search for reference material when designing cases for customers, for example cases of the same style or of the same type. Referring to FIG. 1, a schematic illustration of a case is shown. A case includes a picture area and a text area; the text area generally includes the served brand, the industry field and the creative type; creative types include pictorial, cartoon and emoji pack; the style in this example is Chinese style. However, retrieval in the prior art is unsatisfactory. For example, when the search term "BYD" is input, the first few returned cases are BYD cases but the later ones are unrelated to BYD, which does not help the business personnel.

Based on this, the present application proposes an enterprise case retrieval method, which is shown in the flowchart of fig. 2; the method comprises the following steps:

step S101, receiving a search term;

for example, a user who wants to obtain cases related to BYD enters "BYD" in the search interface of the client.

Step S102, based on the search terms, searching by adopting a BM25 algorithm to obtain a first recall result sequence of the enterprise cases;

The BM25 algorithm is a prior-art algorithm based on the probabilistic retrieval model, used to evaluate the relevance between a search term and a document.

Referring to FIG. 3, a schematic diagram of case ranking for prior-art case retrieval is shown; cases 1 to 6 are the BYD-related advertising cases returned, arranged in descending order of relevance to the search term.

The relevance of case 1 to the search term is 0.9; case 2, 0.81; case 3, 0.7; case 4, 0.6; case 5, 0.5; case 6, 0.4. Because cases 5 and 6 have low relevance, they are likely not BYD cases, which shows that prior-art retrieval capability is limited.
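By way of illustration only, the first recall of step S102 can be sketched with a textbook Okapi BM25 scorer. The embodiment does not specify the BM25 variant, its parameters or the tokenization, so the function name `bm25_rank`, the values k1 = 1.5 and b = 0.75, the smoothed IDF form and the toy token lists below are all assumptions of this sketch.

```python
import math
from collections import Counter


def bm25_rank(query_terms, docs, k1=1.5, b=0.75):
    """Rank tokenized documents against query terms with Okapi BM25.

    Returns (document index, score) pairs, highest score first.
    k1 and b are common default values, not values taken from the source.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for idx, doc in enumerate(docs):
        tf = Counter(doc)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            # Smoothed IDF (always non-negative).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append((idx, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical, already tokenized case texts.
cases = [
    ["byd", "electric", "car", "poster"],
    ["coffee", "brand", "illustration"],
    ["byd", "byd", "launch", "campaign", "video", "cut"],
]
first_recall = bm25_rank(["byd"], cases)
```

A document that never mentions the search term scores 0 and falls to the end of the list, which is the purely lexical behavior the embodiment seeks to complement with vector recall.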

The enterprise case vector pool is obtained by vectorizing each enterprise case available on the network and storing the vectors in ES. The cases include those of the customer enterprises owned by the user and the enterprise cases that the user can find on the web.

Step S103, generating a corresponding search term vector by the search term through a case search model;

illustratively, the search term entered by the user is "BYD"; the case retrieval model vectorizes "BYD" as (w1, w2, w3, w4, w5, …, wm);

the vectors in the vector pool are obtained by converting the enterprise case samples. The enterprise case samples are the enterprises' publicity cases stored by the user and the enterprise cases found on the network. As many case samples as possible should be prepared: the more case samples there are, the more accurate the retrieval results.

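Specifically, the vector cosine distance (the cosine of the angle between the two vectors) is calculated as:

$$\cos(x, y) = \frac{x \cdot y}{\lVert x \rVert \, \lVert y \rVert} = \frac{\sum_{i=1}^{m} x_i y_i}{\sqrt{\sum_{i=1}^{m} x_i^2}\,\sqrt{\sum_{i=1}^{m} y_i^2}}$$

where x and y are two different vectors.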

For example, see the case ranking diagram shown in FIG. 4, in which:

case 7, correlation 0.95; case 8, correlation 0.93; case 9, correlation 0.91;

case 10, correlation 0.85; case 11, correlation 0.82; case 12, correlation 0.80.
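As a minimal sketch of the second recall, the cosine between the search term vector and every vector in the pool can be computed and the cases sorted by it in descending order. The names `cosine_similarity`, `vector_recall` and the three-dimensional toy vectors are hypothetical; in the embodiment the vectors would come from the case retrieval model and the pool would be stored in ES.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_x * norm_y)


def vector_recall(query_vec, vector_pool):
    """Sort case ids by cosine similarity to the query, most similar first."""
    scored = [
        (case_id, cosine_similarity(query_vec, vec))
        for case_id, vec in vector_pool.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical vector pool: case id -> case vector.
pool = {
    "case_a": [1.0, 0.0, 0.0],
    "case_b": [0.6, 0.8, 0.0],
    "case_c": [0.0, 0.0, 1.0],
}
second_recall = vector_recall([1.0, 0.0, 0.0], pool)
```

Here `case_a` points in the same direction as the query and ranks first, while `case_c` is orthogonal to it and ranks last.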

And step S104, comprehensively sorting the first recall result sorting and the second recall result sorting to obtain a sorted list of the enterprise cases.

For example, see the comprehensive ranking diagram shown in FIG. 5, in which:

case 7, correlation 0.95; case 8, correlation 0.93; case 9, correlation 0.91;

case 10, correlation 0.85; case 11, correlation 0.82; case 2, correlation 0.81.

Several cases with lower correlation are omitted: case 12, correlation 0.80; case 3, correlation 0.7; case 4, correlation 0.6; case 5, correlation 0.5; case 6, correlation 0.4. This improves the accuracy of case retrieval. It should be noted that cases with the same correlation may be displayed side by side.

In this method, the first recall list, obtained with the prior-art approach, and the second recall list are comprehensively ranked, which improves retrieval relevance: the results are cases closely related to the search term, and irrelevant cases are avoided as far as possible. The retrieval effect is thereby improved.
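The embodiment does not state the rule used for the comprehensive ranking. One possible reading, sketched here purely as an assumption, is to pool both recall lists, keep the higher score when a case appears in both, sort by score, and truncate to the top k. The function `fuse_recalls`, the case ids and the scores are hypothetical, and the sketch presumes that the two score scales are comparable.

```python
def fuse_recalls(first_recall, second_recall, top_k=6):
    """Merge two (case_id, score) lists into one ranked list.

    Assumed rule: keep each case's best score, sort descending, cut to top_k.
    """
    best = {}
    for case_id, score in list(first_recall) + list(second_recall):
        if case_id not in best or score > best[case_id]:
            best[case_id] = score
    ranked = sorted(best.items(), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]


# Hypothetical recall lists with scores on a shared 0..1 scale.
bm25_list = [("case_a", 0.90), ("case_b", 0.60), ("case_c", 0.40)]
vector_list = [("case_d", 0.95), ("case_b", 0.85), ("case_e", 0.80)]
merged = fuse_recalls(bm25_list, vector_list, top_k=4)
```

Other fusion rules, such as weighted sums or rank-based merging, would fit the same interface; the choice would depend on how the two scores are calibrated.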

In one embodiment, the structure of the case multitask learning model is shown schematically in FIG. 6.

The case multitask learning model is named the SentSim model and is a case text encoder model that takes classification as its supervision target.

The case multitask learning model comprises an Embedding layer, an Encode layer, an Attention layer and an OUTPUT layer.

The Embedding layer is implemented with a BERT (Bidirectional Encoder Representations from Transformers) network; this network improves the generalization capability of the word vector model and fully describes character-level, word-level and sentence-level features, and even the relationships between sentences.

The Encode layer is implemented with a bidirectional gated recurrent unit (Bi-GRU). The GRU is an improved version of the widely used recurrent neural network (RNN) and generally expresses long-term dependencies better than the original RNN; it is a variant of the long short-term memory (LSTM) network. The Bi-GRU model consists of two layers, each a unidirectional structure, and each comprising a word vector representation module and a feature extraction module. The forward layer captures the preceding context of the input sequence and the backward layer captures the following context; for the same input node, the hidden states of the forward and backward layers are combined as the input to the final output layer, yielding a final semantic encoding that contains context information.

The Attention layer specifically includes a brand attention layer, a style attention layer, a type attention layer, an industry attention layer and a text attention layer. The text attention layer is used to extract the text in the case.

The OUTPUT layer includes brand LOSS, style LOSS, type LOSS, industry LOSS and search LOSS.

The above layer structure is mainly used for model training. An AM-Softmax loss function is used during training, which increases inter-class differences and decreases intra-class differences.
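To make the effect of the margin concrete, the following sketch computes an additive-margin softmax loss for a single sample from the cosines between its feature and each class weight. The scale of 30 and margin of 0.35 are illustrative values commonly used with this loss, not values stated in the embodiment, and the function name `am_softmax_loss` is hypothetical.

```python
import math


def am_softmax_loss(cosines, target, scale=30.0, margin=0.35):
    """Additive-margin softmax loss for one sample.

    cosines[j] is the cosine between the sample feature and class j's weight.
    The margin is subtracted only from the target class before scaling.
    """
    logits = [
        scale * (c - margin) if j == target else scale * c
        for j, c in enumerate(cosines)
    ]
    # Numerically stable log-sum-exp.
    max_logit = max(logits)
    log_sum = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum - logits[target]


cosines = [0.8, 0.3, 0.1]  # hypothetical cosines for three classes
loss_with_margin = am_softmax_loss(cosines, target=0)
loss_without_margin = am_softmax_loss(cosines, target=0, margin=0.0)
```

With the margin the same sample incurs a larger loss, so training keeps pushing the target cosine further above the others; this is how the loss widens inter-class gaps and tightens each class.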

In order to improve retrieval relevance and retrieve more relevant cases, in one embodiment a knowledge graph of the marketing field is constructed before the search term is received.

The knowledge graph is constructed from preset entities and the relations between them, based on a case sample library. Referring to FIG. 7, a schematic diagram of a knowledge graph is shown; the entities comprise: project entity, company entity, case entity, brand entity, designer entity, platform application entity;

illustratively, the company is Starbucks, and the designer is a design company that has produced advertising cases for Starbucks.

the relationship between the company entity and the brand entity is: the company contains the brand;

illustratively, the company is SAIC, and the brands it contains are Volkswagen and GM.

The relationship between the brand entity and the project entity is: the brand creates the project;

for example, Volkswagen may have a number of different projects that need to be designed; each project is matched with a designer, and the designer is in a cooperative relationship with Volkswagen.

When the knowledge graph is used to train the model, the knowledge graph database is input into the model and the data is divided into two parts: test samples and standard samples. The loss function value obtained on the test samples is determined against the standard samples; after iterative training, once the loss function value falls within a preset threshold, the model is deemed to have converged and training stops.

In one embodiment, different convergence thresholds may be set for brand LOSS, style LOSS, type LOSS, industry LOSS and search LOSS respectively. Training stops when the loss function value has not decreased for a preset number of consecutive epochs; the preset number may be 10 or another value and can be set flexibly.
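The stopping rule described above (stop once the loss has failed to decrease for a preset number of consecutive epochs) can be sketched as follows. The function name `epochs_until_stop`, the parameter name `patience` and the loss values are hypothetical; a per-task variant would simply track one such counter for each of the five losses.

```python
def epochs_until_stop(loss_history, patience=10):
    """Return the 1-based epoch at which training would stop, or None.

    Training stops once the loss has not improved on its best value
    for `patience` consecutive epochs.
    """
    best = float("inf")
    stale = 0
    for epoch, loss in enumerate(loss_history, start=1):
        if loss < best:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return None


losses = [0.9, 0.7, 0.6, 0.61, 0.6, 0.62]  # hypothetical per-epoch losses
stop_epoch = epochs_until_stop(losses, patience=3)
```

In this example the best loss is reached at epoch 3, and epochs 4 to 6 bring no improvement, so training stops at epoch 6.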

For example, during training the text content of the case in FIG. 1 is input into the model, where the brand of the case is Starbucks and the industry is catering, but the model identifies the brand as Pacific and the industry as catering. The brand identification is not yet accurate and needs further training, whereas the industry identification is already accurate and that part need not be trained further.

Through the knowledge graph, when a user searches for a brand such as BYD, other related entities, such as the projects, designers and cases related to BYD, can also be retrieved, so that the user learns more about BYD.
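As an illustrative sketch of such a lookup, the knowledge graph can be held as entity-relationship-entity triples and indexed by entity so that the neighbors of a searched brand are returned. The triples, entity ids and function names below are hypothetical; the embodiment does not specify a storage format or a query interface.

```python
from collections import defaultdict

# Hypothetical triples: (head entity, relationship, tail entity).
TRIPLES = [
    ("company_1", "contains", "BYD"),
    ("BYD", "creates", "project_1"),
    ("case_1", "serves", "BYD"),
    ("designer_1", "publishes", "case_1"),
    ("designer_2", "publishes", "case_2"),
]


def build_neighbor_index(triples):
    """Index each entity by the (relationship, entity) pairs linked to it."""
    index = defaultdict(set)
    for head, relation, tail in triples:
        index[head].add((relation, tail))
        index[tail].add((relation, head))  # edges are followed in both directions
    return index


def related_entities(index, entity, hops=1):
    """Return the entities reachable from `entity` within `hops` edges."""
    seen = {entity}
    frontier = {entity}
    for _ in range(hops):
        frontier = {nbr for node in frontier for rel, nbr in index[node]} - seen
        seen |= frontier
    return sorted(seen - {entity})


index = build_neighbor_index(TRIPLES)
one_hop = related_entities(index, "BYD")
two_hop = related_entities(index, "BYD", hops=2)
```

One hop from the brand yields its company, project and case; a second hop also reaches the designer who published that case.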

According to the technical scheme, the case multi-task learning model is trained through the knowledge graph, so that the model retrieval precision can be improved; related entities in the knowledge graph, such as related designers, can be searched out, and compared with the prior art, better searching effect can be achieved.

In one embodiment, the related data includes behavior data and retrieval data; collecting the behavior data of the user for retrieved cases comprises:

acquiring the user's behavior data on the case and the ranking position of the case in the recall list; the behavior data comprises the user's clicks on, collections of and shares of the retrieval results.

For example, the number of times a given case has been shared and the number of times it has been collected may be counted.

Supposing a case has never been shared or collected, its correlation value is given by the base formula [not reproduced in the source text];

if sharing and collection are taken into account, correlation value = (base correlation value) × (1 + m), where m is a decimal less than 1.

In one embodiment, a table of correspondence between the number of times of sharing and m may be set; see table 1:

Number of times collected	Number of times shared	m
2	2	0.1
3	3	0.2
4	4	0.3
5	5	0.4
6	6	0.5
7	7	0.6

Table 1
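The use of Table 1 can be sketched as a lookup of m followed by multiplication of a base correlation value by (1 + m). The text does not say what happens outside the listed range, so treating counts below 2 as m = 0 and capping counts above 7 at m = 0.6 are assumptions of this sketch, as are the function names; the lookup is keyed on the share count because the text describes the table as relating the number of shares to m.

```python
# Values of m from Table 1, keyed by the number of times a case was shared.
M_BY_SHARE_COUNT = {2: 0.1, 3: 0.2, 4: 0.3, 5: 0.4, 6: 0.5, 7: 0.6}


def boost_factor(share_count):
    """Look up m for a share count.

    Assumptions (not in the source): m = 0 below 2 shares, and counts
    above 7 are capped at the last table row.
    """
    if share_count < 2:
        return 0.0
    return M_BY_SHARE_COUNT[min(share_count, 7)]


def boosted_relevance(base_relevance, share_count):
    """Apply the (1 + m) adjustment to a base correlation value."""
    return base_relevance * (1 + boost_factor(share_count))


example = boosted_relevance(0.5, share_count=4)  # 0.5 * (1 + 0.3)
```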

Specifically, the correlation value is determined as follows: the number of clicks on the case is counted. If the case has been clicked two or more times, it is defined as strongly correlated and the correlation value is 1. If it has been clicked only once, the value is a random number between 0.8 and 0.9. If it has been clicked 0 times, it is defined as weakly correlated and the correlation value is 0.5 or less.

The values can be further subdivided by a preset time period: a single click made two years ago is defined as weak correlation, while a single click made within the last week is defined as strong correlation. The correlation value is defined according to how long ago the click occurred: the longer ago the click, the smaller the correlation value; the more recent the click, the stronger the correlation.
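The click-count rule above can be written down directly. The source only says that an unclicked case receives a value of "0.5 or less", so returning exactly 0.5 here is an assumption, and the function name `click_count_relevance` is hypothetical; the time-based subdivision is not modeled in this sketch.

```python
import random


def click_count_relevance(click_count, rng=random):
    """Map a click count to a correlation value using the stated rule."""
    if click_count >= 2:
        return 1.0  # strong correlation
    if click_count == 1:
        return rng.uniform(0.8, 0.9)  # random value between 0.8 and 0.9
    return 0.5  # weak correlation; upper bound of "0.5 or less" (assumption)


strong = click_count_relevance(3)
single = click_count_relevance(1, rng=random.Random(0))  # seeded for repeatability
weak = click_count_relevance(0)
```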

In one embodiment, the correlation value of a case is calculated as follows:

for each click, recording the time at which the case was clicked;

calculating the difference between the click time and the current time, in days;

adjustment factor = (difference in days) / 365;

impact factor of the click = 1 - adjustment factor;

correlation value = [formula not reproduced in the source text];

where x_i is the impact factor of the i-th click and n is the total number of clicks. This calculation is best suited to cases whose clicks all occurred within the past year, since a click older than 365 days would have a negative impact factor. The correlation value so calculated reflects the influence of time.
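The per-click factors can be sketched as follows: the adjustment factor is the age of the click in days divided by 365, and the impact factor is 1 minus that value. The formula that combines the impact factors x_i over the n clicks is not reproduced in the source text, so the arithmetic mean used in `time_weighted_relevance` is only an assumption made to complete the example; the function names and dates are likewise hypothetical.

```python
from datetime import date


def impact_factor(click_day, today):
    """Impact factor of one click: 1 - (age of the click in days) / 365."""
    days = (today - click_day).days
    adjustment = days / 365
    return 1 - adjustment


def time_weighted_relevance(click_days, today):
    """Combine per-click impact factors into one value.

    ASSUMPTION: the source's combining formula is missing; the mean of
    the impact factors is used here purely for illustration.
    """
    if not click_days:
        return 0.0
    factors = [impact_factor(day, today) for day in click_days]
    return sum(factors) / len(factors)


today = date(2021, 1, 1)
clicks = [date(2021, 1, 1), date(2020, 10, 20)]  # clicked today and 73 days ago
factors = [impact_factor(day, today) for day in clicks]
relevance = time_weighted_relevance(clicks, today)
```

A click made today has an impact factor of 1, one made 73 days ago has 0.8, and a click older than 365 days would go negative, which is consistent with the remark that the calculation suits clicks made within one year.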

To optimize the knowledge graph and avoid redundant entities, in one embodiment the method further includes fusing entities of the knowledge graph to obtain an optimized knowledge graph, specifically:

for any company entity and any brand entity, identifying the entity names of the company entity and the brand entity using named entity recognition (NER) technology;

Illustratively, suppose the name of the brand entity is "Starbucks" and the name of the company entity is "starbuckstarbcks". The Jaro-Winkler distance between the two names is calculated. The Jaro-Winkler distance is a string metric that measures an edit distance between two character sequences; in general, the smaller the distance, the greater the similarity of the two strings, and a Jaro-Winkler distance of 0 corresponds to a similarity of 100%. The similarity threshold may be set to 0.9 or flexibly to another value; this application does not limit it. If the similarity is greater than 0.9, the two entities are fused. Fusion simplifies the knowledge graph.
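A minimal sketch of the name comparison follows, using the standard definition of Jaro-Winkler similarity (the Jaro-Winkler distance is 1 minus this similarity). The prefix scale of 0.1 and the four-character prefix limit are the usual defaults rather than values from the embodiment; `should_fuse`, the lowercasing of names and the example names are assumptions of this sketch.

```python
def jaro_similarity(s1, s2):
    """Jaro similarity between two strings, in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters that appear in a different order count as
    # transpositions, each pair counted once.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2
    return (
        matches / len1 + matches / len2 + (matches - transpositions) / matches
    ) / 3


def jaro_winkler_similarity(s1, s2, prefix_scale=0.1):
    """Jaro similarity boosted by the length of the common prefix (max 4)."""
    jaro = jaro_similarity(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return jaro + prefix * prefix_scale * (1 - jaro)


def should_fuse(company_name, brand_name, threshold=0.9):
    """Fuse two entities when their name similarity reaches the threshold."""
    similarity = jaro_winkler_similarity(company_name.lower(), brand_name.lower())
    return similarity >= threshold


fuse_same_brand = should_fuse("Starbucks Coffee", "Starbucks")
fuse_other_brand = should_fuse("Pacific", "Starbucks")
```

Whether the comparison is inclusive ("reaches the threshold") or strict ("greater than 0.9") is a configuration detail; the inclusive form is used here.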

Compared with the traditional BM25 retrieval algorithm, the model incorporates semantic-level information, so that the query and the recalled texts are semantically related. Compared with BERT-based information retrieval, the model adds knowledge graph information: on the one hand, the embedding information of case nodes in the graph is merged into the BERT embedding; on the other hand, graph classification is added to training as side information, which increases the knowledge constraints on the model.

The method uses the knowledge representation capability of the knowledge graph to increase the semantic relevance between the query and the recalled texts; in the evaluation system, the retrieval performance measured by mAP is about 10 percentage points higher than that of the ES retrieval method. A model closed loop based on user feedback is also constructed, so that data accumulates, the algorithm improves, the business benefits, and changes can be measured.

Referring to FIG. 8, a schematic diagram of the architecture proposed in the present application is shown. The dashed line represents the workflow of ES; the solid line represents the workflow of the retrieval model. OFFLINE, ES stores the case texts, together with the vectors obtained by converting the cases with the model. ONLINE, the search term is processed by the retrieval model and by the BM25 algorithm respectively to obtain a vector recall result and a BM25 recall result; the two recall results are comprehensively ranked to obtain the final output ranking.

Referring to FIG. 9, a schematic diagram of the closed loop proposed in the present application is shown. The retrieval system obtains user behavior data through buried points (event tracking). After data cleaning, one part of the behavior data is used for evaluation, and the computed evaluation results are fed to a BI system; the other part is cleaned and used to train the model algorithm. The trained model is then used by the retrieval system for retrieval.

It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system such as a set of computer-executable instructions and that, although a logical order is illustrated in the flowcharts, in some cases, the steps illustrated or described may be performed in an order different than presented herein.

According to an embodiment of the present invention, there is also provided an apparatus for implementing the above enterprise case retrieval method; as shown in fig. 10, the apparatus includes:

a receiving module 1001, configured to receive a search term;

the processing module 1002 is configured to perform retrieval by using a BM25 algorithm based on the search term to obtain a first recall result ranking of the enterprise case;

Further, the processing module 1002 is further configured to:

constructing a knowledge graph of the marketing field;

Further, the related data comprises behavior data and retrieval data; the processing module 1002 is further configured to:

The processing module 1002 is further configured to: for any of the cases, calculating the correlation includes:

counting the time of clicking;

calculating the difference from the current time in days;

the impact factor for this click = 1-adjustment factor;

correlation value = [formula not reproduced in the source text];

The processing module is further configured to:

acquiring a target case text to be identified;

The processing module 1002 is further configured to:

In a third aspect, the present application further provides an enterprise case retrieval device, a schematic structural diagram of which is shown in fig. 11. The device comprises: at least one processor 1101 and at least one memory 1102; the memory 1102 is used to store one or more program instructions; the processor 1101 is configured to execute the one or more program instructions to perform any one of the methods described above.

In a fourth aspect, the present application also proposes a computer-readable storage medium having embodied therein one or more program instructions for executing the method of any one of the above.

In an embodiment of the invention, the processor may be an integrated circuit chip having signal processing capability. The processor may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

The processor may implement or perform the various methods, steps and logic blocks disclosed in the embodiments of the present invention. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present invention may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The processor reads the information in the storage medium and completes the steps of the method in combination with its hardware.

The storage medium may be a memory, for example, which may be volatile memory or nonvolatile memory, or which may include both volatile and nonvolatile memory.

The nonvolatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory.

The volatile memory may be a random access memory (RAM), which serves as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and direct Rambus RAM (DRRAM).

The storage media described in connection with the embodiments of the invention are intended to comprise, without being limited to, these and any other suitable types of memory.

Those skilled in the art will appreciate that, in one or more of the examples described above, the functionality described in the present invention may be implemented in a combination of hardware and software. When implemented in software, the corresponding functionality may be stored on, or transmitted as one or more instructions or code over, a computer-readable medium. Computer-readable media include both computer storage media and communication media, the latter including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.

It will be apparent to those skilled in the art that the modules or steps of the present invention described above may be implemented by a general purpose computing device, they may be centralized on a single computing device or distributed across a network of multiple computing devices, and they may alternatively be implemented by program code executable by a computing device, such that they may be stored in a storage device and executed by a computing device, or fabricated separately as individual integrated circuit modules, or fabricated as a single integrated circuit module from multiple modules or steps. Thus, the present invention is not limited to any specific combination of hardware and software.

The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims

1. An enterprise case retrieval method, comprising:

receiving a search term;

calculating the cosine distance between the search term vector and each vector in the enterprise case vector pool;

ordering the enterprise case samples corresponding to each vector in the enterprise case vector pool according to each cosine distance to obtain a second recall result ordering;

comprehensively sorting the first recall result sorting and the second recall result sorting to obtain a sorted list of the enterprise cases;

before receiving the search term, the method further comprises:

constructing a knowledge graph of the creative marketing field;

constructing the case retrieval model based on the knowledge graph and the related data, and training the case retrieval model by adopting the knowledge graph and the related data;

the related data comprises behavior data and retrieval data;

collecting behavior data of the user for retrieving the enterprise case, comprising:

acquiring behavior data of the user on the enterprise case, and the ranking position of the enterprise case on the recall list; the behavior data comprises the user's clicks, favorites and shares of the retrieval results;

the retrieval data includes: the search term acquired from the buried point system, the enterprise case obtained according to the search term, and the relevance between the enterprise case and the search term;

for any one enterprise case, calculating the relevance includes:

counting the time of clicking;

calculating the difference from the current time in days;

the impact factor for this click = 1-adjustment factor;

correlation value = [formula not reproduced in the source text];

wherein xi is the impact factor of the i-th click, and n is the total number of clicks;

the method for constructing the knowledge graph of the creative marketing field comprises the following steps:

2. The enterprise case retrieval method of claim 1,

fusing the entities of the knowledge graph to obtain the optimized knowledge graph, which specifically comprises the following steps:

3. An enterprise case retrieval apparatus, comprising:

the receiving module is used for receiving the search terms;

the processing module is further configured to: constructing a knowledge graph of the creative marketing field;

the related data comprises behavior data and retrieval data;

the processing module is further configured to: acquiring behavior data of the user on the enterprise case, and the ranking position of the enterprise case on the recall list; the behavior data comprises the user's clicks, favorites and shares of the retrieval results;

the processing module is further configured to: for any one enterprise case, calculating the relevance includes:

counting the time of clicking;

calculating the difference from the current time in days;

the impact factor for this click = 1-adjustment factor;

correlation value = [formula not reproduced in the source text];

the processing module is further configured to: constructing a knowledge graph according to a preset entity and a relation between the entities based on a case sample library;

4. An enterprise case retrieval apparatus, comprising: at least one processor and at least one memory; the memory is used to store one or more program instructions; the processor is configured to execute the one or more program instructions to perform the method of any of claims 1-2.

5. A computer-readable storage medium having one or more program instructions embodied therein for performing the method of any of claims 1-2.