CN115203275A - Recall result sorting method, apparatus, device, storage medium and program product - Google Patents

Recall result sorting method, apparatus, device, storage medium and program product

Info

Publication number
CN115203275A
Authority
CN
China
Prior art keywords
recall
target
attribute
field name
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210901062.4A
Other languages
Chinese (zh)
Inventor
张建兵
甘露
徐增辉
陈亮辉
张新运
龚建
孙珂
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202210901062.4A
Publication of CN115203275A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2457Query processing with adaptation to user needs
    • G06F16/24578Query processing with adaptation to user needs using ranking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2282Tablespace storage structures; Management thereof

Abstract

The present disclosure provides a recall result sorting method, apparatus, device, storage medium and program product. It relates to artificial intelligence technologies such as natural language processing, big data, knowledge graphs and intelligent search, and can be applied in smart city and city governance scenarios. The method comprises: acquiring initial recall results corresponding to a target query word or a target query sentence; extracting a subject entity from each initial recall result, and determining the target attribute of the slot to which each noun forming the subject entity belongs, together with the matching target table field name; determining a first recall evaluation of the corresponding subject entity based on the mapping relationship between the target attribute and the target table field name; and determining a second recall evaluation of the corresponding initial recall result according to the first recall evaluation of each subject entity, and sorting the initial recall results by the magnitude of the second recall evaluation. With this method, the presentation order of the initial recall results can be adjusted as accurately as possible.

Description

Recall result sorting method, apparatus, device, storage medium and program product
Technical Field
The present disclosure relates to the field of artificial intelligence, in particular to natural language processing, big data, knowledge graph and intelligent search technologies, which can be applied in smart city and city governance scenarios, and more specifically to a recall result ranking method, apparatus, electronic device, computer-readable storage medium, and computer program product.
Background
With the advent of the big data era, in scenarios such as smart cities and city governance, a large number of results are recalled each time a search is performed based on a query word or a query sentence.
Therefore, how to sort a large number of recall results more accurately, so that the most valuable and most likely recall results are obtained in the shortest time, is a problem to be solved by those skilled in the art.
Disclosure of Invention
The embodiment of the disclosure provides a recall result sorting method, a recall result sorting device, electronic equipment, a computer-readable storage medium and a computer program product.
In a first aspect, an embodiment of the present disclosure provides a recall result ranking method, including: acquiring initial recall results corresponding to a target query word or a target query sentence; extracting a subject entity from each initial recall result, and determining the target attribute of the slot to which each noun forming the subject entity belongs, together with the matching target table field name, wherein the subject entity is an entity phrase consisting of the entity serving as the subject together with its qualifiers; determining a first recall evaluation of the corresponding subject entity based on the mapping relationship between the target attribute and the target table field name; and determining a second recall evaluation of the corresponding initial recall result according to the first recall evaluation of each subject entity, and sorting the initial recall results by the magnitude of the second recall evaluation.
In a second aspect, an embodiment of the present disclosure provides a recall result sorting apparatus, including: an initial recall result acquisition unit configured to acquire initial recall results corresponding to a target query word or a target query sentence; an initial recall result processing unit configured to extract a subject entity from each initial recall result, and to determine the target attribute of the slot to which each noun forming the subject entity belongs, together with the matching target table field name, wherein the subject entity is an entity phrase consisting of the entity serving as the subject together with its qualifiers; a first recall evaluation determination unit configured to determine a first recall evaluation of the corresponding subject entity based on the mapping relationship between the target attribute and the target table field name; and a second recall evaluation determining and sorting unit configured to determine a second recall evaluation of the corresponding initial recall result according to the first recall evaluation of each subject entity, and to sort the initial recall results by the magnitude of the second recall evaluation.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, cause the at least one processor to perform the recall result ranking method described in any implementation of the first aspect.
In a fourth aspect, the disclosed embodiments provide a non-transitory computer-readable storage medium storing computer instructions for enabling a computer to implement the recall result ranking method described in any implementation manner of the first aspect.
In a fifth aspect, the disclosed embodiments provide a computer program product comprising a computer program, which when executed by a processor is capable of implementing the steps of the recall result ranking method as described in any of the implementations of the first aspect.
According to the recall result sorting scheme provided by the present disclosure, after a large number of initial recall results have been obtained for a query word or query sentence, the key parts of each recall result are identified by extracting the subject entities, which include their qualifiers. A first recall evaluation of each subject entity is then calculated comprehensively and accurately from the mapping relationship between the attribute of the slot to which each noun forming the subject entity belongs and the table field names in the database. Finally, a second recall evaluation of the corresponding initial recall result is determined from the first recall evaluations of its subject entities, so that the presentation order of the initial recall results can be adjusted as accurately as possible based on the second recall evaluation, improving the efficiency of searches in which a given entity serves as the subject.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
Other features, objects and advantages of the present disclosure will become more apparent upon reading of the detailed description of non-limiting embodiments made with reference to the following drawings:
FIG. 1 is an exemplary system architecture to which the present disclosure may be applied;
FIG. 2 is a flow chart of a recall result ranking method according to an embodiment of the present disclosure;
FIG. 3 is a flowchart of a method for processing a subject entity according to an embodiment of the disclosure;
FIG. 4 is a flow chart of a method of determining a first recall evaluation provided by an embodiment of the present disclosure;
FIG. 5 is a flowchart of a method for adjusting a calculation parameter according to an embodiment of the present disclosure;
FIG. 6 is a block diagram of a recall result sorting apparatus according to an embodiment of the present disclosure;
FIG. 7 is a schematic structural diagram of an electronic device adapted to execute a recall result ranking method according to an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of embodiments of the present disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness. It should be noted that, in the present disclosure, the embodiments and features of the embodiments may be combined with each other without conflict.
In the technical solutions of the present disclosure, the collection, storage, use, processing, transmission, provision and disclosure of the personal information of the users involved all comply with the relevant laws and regulations, and do not violate public order and good customs.
FIG. 1 illustrates an exemplary system architecture 100 to which embodiments of the recall result ranking methods, apparatus, electronic devices, and computer-readable storage media of the present disclosure may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 and the server 105 may be installed with various applications for implementing information communication therebetween, such as a search application, a recall result sorting application, an instant messaging application, and the like.
The terminal devices 101, 102, 103 and the server 105 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices with display screens, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like; when the terminal devices 101, 102, and 103 are software, they may be installed in the electronic devices listed above, and they may be implemented as multiple software or software modules, or may be implemented as a single software or software module, and are not limited in this respect. When the server 105 is hardware, it may be implemented as a distributed server cluster composed of multiple servers, or may be implemented as a single server; when the server is software, it may be implemented as multiple software or software modules, or may be implemented as a single software or software module, which is not limited herein.
The server 105 may provide various services through various built-in applications. Taking a search application that provides a recall result ranking service as an example, the server 105 may achieve the following when running the search application: first, receiving the query word or query sentence transmitted by the terminal devices 101, 102, 103 through the network 104; then, obtaining initial recall results according to the query word or query sentence; next, analyzing the initial recall results to obtain recall evaluations; and finally, sorting all the initial recall results according to the recall evaluations and presenting the sorted results.
It should be noted that the query word, the query sentence, or the initial recall results may be acquired from the terminal devices 101, 102, 103 through the network 104, or may be stored locally in the server 105 in advance in various ways. Thus, when the server 105 detects that these data are already stored locally (e.g., sorting tasks left pending from before processing started), it may obtain them directly from local storage, in which case the exemplary system architecture 100 may not include the terminal devices 101, 102, 103 and the network 104.
Since analyzing and sorting a large number of initial recall results requires considerable computing resources and computing power, the recall result sorting method provided in the following embodiments of the present disclosure is generally executed by the server 105, which has stronger computing power and more computing resources; accordingly, the recall result sorting apparatus is generally disposed in the server 105. However, when the terminal devices 101, 102, 103 also have computing capabilities and resources that meet the requirements, they may complete the above operations otherwise performed by the server 105 through the search application installed on them, and output the same result as the server 105. In particular, when several terminal devices with different computing capabilities are present and the search application determines that the terminal device on which it runs has strong computing capability and ample spare computing resources, that terminal device may perform the above computation to relieve the computing pressure on the server 105; accordingly, the recall result sorting apparatus may also be provided in the terminal devices 101, 102, 103. In such a case, the exemplary system architecture 100 may not include the server 105 and the network 104.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring to fig. 2, fig. 2 is a flowchart of a recall result ranking method according to an embodiment of the disclosure, where the process 200 includes the following steps:
Step 201: acquiring initial recall results corresponding to a target query word or a target query sentence;
In this step, the execution body of the recall result ranking method (e.g., the server 105 shown in FIG. 1) acquires the initial recall results corresponding to a target query word or a target query sentence.
The target query word is a word expressing the query intention, and the target query sentence is a sentence expressing the query intention, generally composed of multiple words or phrases. Examples include: "a man wearing B-colored clothes who took the same flight as Zhang San, who is about thirty and wears A-colored clothes"; "women whose temporary residence is near XX community, XX district, XX city"; and so on.
The initial recall results are the results that a search tool or search engine returns, from the mass data recorded in a preset database, as matching at least part of the target query word or target query sentence. Because the number of returned results is usually large, they need to be ranked in a way that meets the user's needs, so that accurate search results can be obtained more efficiently.
Step 202: respectively extracting a subject entity from each initial recall result, and determining a target attribute of a slot position to which each noun forming the subject entity belongs and a matched target table field name;
On the basis of step 201, in this step the execution body first extracts the subject entities contained in each initial recall result, and then determines the target attribute of the slot to which each noun forming the subject entity belongs and the matching target table field name.
Unlike the conventional concept of an "entity", which denotes things that are distinct from one another, a subject entity takes the entity serving as the subject and additionally includes the qualifiers that restrict it. Take the query sentence "a man wearing B-colored clothes who took the same flight as Zhang San, who is about thirty and wears A-colored clothes" as an example. Its subject entities are "Zhang San, who is about thirty and wears A-colored clothes" and "a man wearing B-colored clothes who took the same flight as Zhang San". In other words, a query word or query sentence contains at least one subject entity, and so does a recall result. Here, for the entity "person" serving as the subject, the qualifiers include "A-colored" and "B-colored", which describe the color of the clothes worn; "about thirty", which describes age; and "took the same flight", which describes a shared experience.
Of course, besides "person", the entity serving as the subject may also be "car", "file", "hotel", "event", and so on. The present application uses subject entities mainly because specific application scenarios such as smart cities and city governance require precise search over certain specific types of entities; predetermining the entity that serves as the subject improves search efficiency and screens out invalid recall results.
After the subject entities are extracted from the initial recall results, the execution body determines each noun forming each subject entity, then the attribute of the slot to which the noun belongs, and finally the target table field name matching the target attribute. The database classifies its stored data in advance and therefore holds a large number of pre-generated table field names, each being the name of a field that stores one type of data in table form. Depending on the identification method used, different attributes may be identified for the slot to which a noun belongs; the target attribute is the more accurate one among these attributes, and the target table field name is the table field name that best matches the target attribute.
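The relationship between a subject entity, its nouns, their slots, and the matched table field names can be pictured as a small data model. The following is only an illustrative sketch: the class names `SlotNoun` and `SubjectEntity`, the method `is_resolved`, and the field names such as `clothing_color` are hypothetical and do not come from the disclosure.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SlotNoun:
    """One noun of a subject entity together with its slot annotations (hypothetical model)."""
    text: str                                    # the noun as it appears in the subject entity
    suspected_attributes: List[str] = field(default_factory=list)
    target_attribute: Optional[str] = None       # attribute finally chosen for the slot
    target_field_name: Optional[str] = None      # table field name matched to that attribute


@dataclass
class SubjectEntity:
    """An entity serving as the subject plus the nouns (including qualifiers) that form it."""
    subject_type: str                            # e.g. "person", "car", "hotel"
    nouns: List[SlotNoun] = field(default_factory=list)

    def is_resolved(self) -> bool:
        # Resolved once every noun has both a target attribute and a table field name.
        return all(n.target_attribute and n.target_field_name for n in self.nouns)


# The person from the example query; attribute and field names are made up for illustration.
entity = SubjectEntity(
    subject_type="person",
    nouns=[
        SlotNoun("A-colored clothes", ["clothing color"], "clothing color", "clothing_color"),
        SlotNoun("thirty", ["age"], "age", "age"),
        SlotNoun("Zhang San", ["name"]),         # slot attribute not yet decided
    ],
)
```

In this sketch the entity stays unresolved until the last noun has also been given a target attribute and a matching table field name.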
The attribute of the slot to which the same noun belongs may be identified differently under different identification methods, for example according to the contextual semantics, hypernym-hyponym relations, membership relations, or user-defined correspondences. When several suspected (candidate) attributes exist, a matching table field name can be determined for each of them. The key to this step is therefore deciding which attribute-to-field-name mapping best fits the actual requirement and is the most probable. For example, the matching probability of each suspected mapping can be determined with different retrieval methods, and the probabilities from the several methods can then be combined to improve accuracy.
Step 203: determining a first recall evaluation of a corresponding subject entity based on a mapping relation between the target attribute and the field name of the target table;
On the basis of step 202, in this step the execution body extracts features for determining the degree of matching from the mapping relationship established between the target attribute and the target table field name, and determines the first recall evaluation of the corresponding subject entity from the extracted features.
Since the mapping relationship describes the association between the target attribute of the slot to which a noun belongs and the target table field name, multiple features relating the noun to the corresponding table field name can be extracted from it, such as: the importance parameter of the noun within the subject entity to which it belongs; the text matching degree and semantic matching degree between the target attribute and the target table field name; the text matching degree and semantic matching degree between the attribute value that the noun represents under the target attribute and the field content under the target table field name; the attribute confidence of the slot; and so on. These features measure how well the two sides match at the text level, the semantic level or other levels, and the recall evaluation of the corresponding subject entity is determined from them. The higher the recall evaluation, the better the subject entity extracted from the initial recall result matches the query word or query sentence.
Step 204: determining a second recall evaluation of the corresponding initial recall result according to the first recall evaluation of each subject entity, and sorting the initial recall results by the magnitude of the second recall evaluation.
On the basis of step 203, in this step the execution body determines the second recall evaluation of each initial recall result from the first recall evaluations of the subject entities it contains, and then sorts the initial recall results by the second recall evaluation: the higher the second recall evaluation, the earlier the result is presented to the user, and the lower it is, the later the result is presented.
Specifically, for an initial recall result containing only one subject entity, the second recall evaluation is determined from the first recall evaluation of that single subject entity. For an initial recall result containing several subject entities, the second recall evaluation is determined from the first recall evaluations of all of them together; to reflect that several subject entities are taken into account at once, this second recall evaluation should differ clearly from the maximum first recall evaluation that a single subject entity can obtain.
One possible implementation, among others, is as follows: in response to a target initial recall result containing a plurality of different subject entities, the product of the first recall evaluations of those subject entities is determined as the second recall evaluation of the target initial recall result. Multiplying the first recall evaluations makes the number of subject entities contained visible in the magnitude of the second recall evaluation, so that recall results containing several subject entities are ranked as far forward as possible.
It should be noted that, when the product is used, a non-zero correction must be applied to each first recall evaluation used as a factor, so that a single first recall evaluation of zero does not make the final result zero (for example, when first recall evaluations are non-negative, a fixed correction value of 1 is added to each).
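The product-based variant with the non-zero correction can be sketched as follows. This is a minimal illustration that assumes non-negative first recall evaluations and the fixed correction value of 1 mentioned above; the function names and the sample scores are hypothetical.

```python
from math import prod
from typing import Dict, List


def second_recall_evaluation(first_evaluations: List[float]) -> float:
    """Multiply the non-zero-corrected first recall evaluations of one recall result."""
    if any(score < 0 for score in first_evaluations):
        raise ValueError("first recall evaluations are assumed to be non-negative")
    # Adding 1 keeps a single zero score from collapsing the whole product to zero.
    return prod(score + 1.0 for score in first_evaluations)


def rank_results(results: Dict[str, List[float]]) -> List[str]:
    """Return result ids ordered from highest to lowest second recall evaluation."""
    return sorted(results, key=lambda rid: second_recall_evaluation(results[rid]), reverse=True)


# Made-up first recall evaluations per initial recall result.
results = {
    "doc_a": [0.8],              # one subject entity
    "doc_b": [0.5, 0.0, 0.6],    # three subject entities, one of them scored zero
    "doc_c": [0.9, 0.7],         # two subject entities
}
ranking = rank_results(results)  # ["doc_c", "doc_b", "doc_a"]
```

With the correction, `doc_b` still ranks above the single-entity `doc_a` despite one zero score, which matches the stated aim of placing results with several subject entities further forward.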
The first and second recall evaluations can be calculated in many ways, as long as different actual situations can be distinguished on a common scale; no specific calculation method is prescribed here.
According to the recall result sorting method provided by this embodiment of the present disclosure, after a large number of initial recall results have been obtained for a query word or query sentence, the key parts of each recall result are identified by extracting the subject entities, which include their qualifiers. A first recall evaluation of each subject entity is then calculated comprehensively and accurately from the mapping relationship between the attribute of the slot to which each noun forming the subject entity belongs and the table field names in the database. Finally, a second recall evaluation of the corresponding initial recall result is determined from the first recall evaluations of its subject entities, so that the presentation order of the initial recall results is adjusted as accurately as possible based on the second recall evaluation, improving the efficiency of searches in which a given entity serves as the subject.
Referring to fig. 3, fig. 3 is a flowchart of a method for processing a subject entity according to an embodiment of the disclosure, where the process 300 includes the following steps:
Step 301: performing word segmentation on the subject entity to obtain each noun forming the subject entity;
In this step the execution body performs word segmentation on each subject entity to split it into the nouns that form it. Taking the subject entity "Zhang San, who is about thirty and wears A-colored clothes" as an example, it can be split into three nouns: "wears A-colored clothes", "thirty" and "Zhang San".
Step 302: determining at least one suspected attribute of the slot to which each noun belongs;
On the basis of step 301, in this step the execution body determines at least one suspected attribute of the slot to which each noun belongs. Taking the noun "A city" as an example, the suspected attributes may be: place, address, place of residence, place of business, destination, place of birth, and so on.
Step 303: determining, by an exact retrieval mode and a full-text retrieval mode respectively, a target attribute among the at least one suspected attribute and the target table field name matching the target attribute.
On the basis of step 302, in this step the execution body uses exact retrieval and full-text retrieval to determine, among the at least one suspected attribute, the target attribute with the higher probability and the target table field name matching it. In other words, by combining the two retrieval modes, the mapping formed by the target attribute and the target table field name is found among the suspected mappings formed by the suspected attributes and their alternative table field names.
Exact retrieval is a full-match retrieval mode: it checks whether an alternative table field name exists that is completely identical to the text of the suspected attribute. Full-text retrieval, by contrast, is a partial-match retrieval mode: it does not require the complete text of the suspected attribute, so any table field name containing only part of that text can serve as an alternative table field name. Compared with exact retrieval, full-text retrieval therefore recalls more candidates, but with lower accuracy.
That is, this embodiment combines full-match exact retrieval with partial-match full-text retrieval to obtain an accurate mapping relationship as comprehensively as possible.
One possible implementation, among others, is as follows:
judging whether a first alternative table field name completely identical to a suspected attribute exists in the database;
if yes, determining the suspected attribute and the corresponding first alternative table field name as the target attribute and the target table field name, respectively;
if not, determining in the database a second alternative table field name comprising at least a part of the corresponding noun, and further determining the matching degree between the corresponding suspected attribute and the second alternative table field name;
and determining the suspected attribute and the second alternative table field name with the highest matching degree as the target attribute and the target table field name, respectively.
In other words, this implementation first uses exact retrieval to check whether a completely identical table field name can be found. If so, the exact match is taken directly as the target attribute and target table field name. Only when exact retrieval finds no match is full-text retrieval used to compute the matching degree between each suspected attribute and the second alternative table field names, and the target attribute and target table field name are then determined from that matching degree.
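The exact-first, full-text-fallback flow can be sketched as below. Several parts are assumptions made for illustration only: the database is reduced to a plain list of field names, partial matching is approximated by shared word tokens with the suspected attribute, and the matching degree (which the text does not define) is stood in for by token-level Jaccard overlap. The names `tokens`, `matching_degree` and `resolve_mapping` are hypothetical.

```python
from typing import Iterable, List, Optional, Set, Tuple


def tokens(text: str) -> Set[str]:
    """Lower-cased word tokens; underscores are treated as separators."""
    return set(text.lower().replace("_", " ").split())


def matching_degree(attribute: str, field_name: str) -> float:
    """Token-level Jaccard overlap, a stand-in for the unspecified matching degree."""
    a, f = tokens(attribute), tokens(field_name)
    return len(a & f) / len(a | f) if a | f else 0.0


def resolve_mapping(suspected: List[str], field_names: Iterable[str]) -> Optional[Tuple[str, str]]:
    """Return (target attribute, target table field name), or None if nothing matches."""
    names = list(field_names)
    # 1) Exact retrieval: a field name identical to a suspected attribute wins outright.
    for attribute in suspected:
        if attribute in names:
            return attribute, attribute
    # 2) Full-text retrieval: keep partially matching pairs and take the best-scoring one.
    candidates = [
        (matching_degree(attribute, name), attribute, name)
        for attribute in suspected
        for name in names
        if tokens(attribute) & tokens(name)
    ]
    if not candidates:
        return None
    _, attribute, name = max(candidates, key=lambda c: c[0])
    return attribute, name


fields = ["place of residence", "birth date", "destination city"]
exact = resolve_mapping(["address", "place of residence"], fields)
partial = resolve_mapping(["address", "place of birth"], fields)
```

Here `exact` resolves through the full-match branch, while `partial` falls back to partial matching and pairs "place of birth" with "place of residence", which illustrates why the full-text mode recalls more candidates but is less accurate.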
Of course, in addition to the determination methods provided in the above embodiments, in other embodiments, the accurate search result and the full-text search result may be obtained in parallel and independently, and the user target attribute and the target table field name may be finally determined based on the results of the accurate search result and the full-text search result.
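The exact-then-full-text fallback described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of the disclosure: the function names `match_degree` and `resolve_target`, the whitespace tokenisation of the noun, and the use of Python's `difflib.SequenceMatcher` as the matching degree are all hypothetical choices; the disclosure does not specify how the matching degree is computed.

```python
from difflib import SequenceMatcher
from typing import Optional


def match_degree(a: str, b: str) -> float:
    """Text similarity in [0, 1]; a stand-in (assumption) for the matching degree."""
    return SequenceMatcher(None, a, b).ratio()


def resolve_target(
    noun: str, suspected_attributes: list[str], table_field_names: list[str]
) -> Optional[tuple[str, str]]:
    """Return (target attribute, target table field name), or None if nothing matches."""
    fields = set(table_field_names)
    # Accurate retrieval: a field name identical to a suspected attribute wins outright.
    for attribute in suspected_attributes:
        if attribute in fields:
            return attribute, attribute
    # Full-text retrieval: keep field names containing at least one part of the noun.
    parts = noun.split()  # hypothetical tokenisation of the noun into parts
    candidates = [f for f in table_field_names if any(p in f for p in parts)]
    scored = [
        (match_degree(attribute, field), attribute, field)
        for attribute in suspected_attributes
        for field in candidates
    ]
    if not scored:
        return None
    # The pair with the highest matching degree becomes the target pair.
    _, attribute, field = max(scored, key=lambda item: item[0])
    return attribute, field
```

For example, with the noun `"street name"`, the suspected attributes `["street", "road_name"]` and the field names `["street_name", "district", "road"]`, no exact match exists, so the fallback selects the pair with the highest similarity among field names that contain `"street"` or `"name"`.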
Referring to fig. 4, fig. 4 is a flowchart of a method for determining a first recall evaluation according to an embodiment of the disclosure, wherein the process 400 includes the following steps:
step 401: establishing a mapping relation between the target attribute and the field name of the target table;
step 402: determining a first matching degree between the target attribute and the field name of the target table, and a second matching degree between the attribute value under the target attribute and the field content under the field name of the target table according to the mapping relation;
This step aims to have the execution body determine, according to the mapping relation, a first matching degree between the target attribute and the target table field name and a second matching degree between the attribute value under the target attribute and the field content under the target table field name. Each of the first matching degree and the second matching degree may include a text matching degree between texts, or a semantic matching degree at the semantic level, that is, a matching degree given by semantic analysis of the context of the noun in the corresponding subject entity against the semantics expressed by the table field name.
Step 403: determining a first recall rating for the respective subject entity based on the first degree of match, the second degree of match, the universal recall score, and the importance of the respective noun within the subject entity to which it pertains.
On the basis of step 402, this step aims to have the execution body determine a first recall evaluation of the corresponding subject entity based on the first matching degree, the second matching degree, the universal recall score, and the importance of the corresponding noun within the subject entity to which it belongs. The universal recall score is the recall score given by the ElasticSearch tool for the corresponding noun according to preset logic, and the importance is an importance parameter calculated from the term frequency (TF) and the inverse document frequency (IDF), i.e., a parameter calculated by the TF-IDF algorithm.
The embodiment provides a calculation method for calculating the recall evaluation of the subject entity by combining a plurality of characteristics, so as to consider the influence of characteristics at different angles on the matching degree as much as possible and improve the accuracy of the recall evaluation.
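A minimal sketch of how these four features might be combined is given below. The disclosure names the features but not the formula, so everything here is an assumption for illustration: the smoothed IDF variant, the `NounFeatures` container, the linear weights, and the importance-weighted average over the nouns of a subject entity are hypothetical, and the universal recall score is assumed to have been normalised to [0, 1] beforehand.

```python
import math
from collections import Counter
from dataclasses import dataclass


def tf_idf_importance(noun: str, entity_nouns: list[str], corpus: list[list[str]]) -> float:
    """TF of the noun within its subject entity times a smoothed IDF over a corpus."""
    tf = Counter(entity_nouns)[noun] / len(entity_nouns)
    containing = sum(1 for doc in corpus if noun in doc)
    idf = math.log((1 + len(corpus)) / (1 + containing)) + 1.0  # smoothed (assumption)
    return tf * idf


@dataclass
class NounFeatures:
    first_match: float             # target attribute vs. target table field name
    second_match: float            # attribute value vs. field content
    universal_recall_score: float  # assumed already normalised to [0, 1]
    importance: float              # TF-IDF importance of the noun


def first_recall_evaluation(
    nouns: list[NounFeatures], weights: tuple[float, float, float] = (0.4, 0.4, 0.2)
) -> float:
    """Importance-weighted average of a linear feature combination (hypothetical)."""
    total_importance = sum(n.importance for n in nouns)
    if total_importance == 0:
        return 0.0
    w1, w2, w3 = weights
    weighted = sum(
        n.importance
        * (w1 * n.first_match + w2 * n.second_match + w3 * n.universal_recall_score)
        for n in nouns
    )
    return weighted / total_importance
```

With this choice, a noun that is rare in the corpus but present in the subject entity contributes more to the first recall evaluation than a noun that appears in every document.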
Referring to fig. 5, fig. 5 is a flowchart of a method for adjusting a calculation parameter according to an embodiment of the disclosure, where the process 500 includes the following steps:
step 501: receiving a sorting adjustment instruction returned for the sorted recall results according to the second recall evaluation;
This step aims to have the execution body receive a sorting adjustment instruction returned for the recall results sorted according to the second recall evaluation. The sorting adjustment instruction indicates that the user who issued it considers, based on actual conditions, the previously presented sorting to be unreasonable.
Step 502: determining a target recall result of abnormal sorting according to the sorting adjustment instruction;
On the basis of step 501, this step aims to have the execution body determine, according to the sorting adjustment instruction, the target recall results whose sorting is abnormal, that is, to determine which recall results (some of them or all of them) the sorting adjustment instruction specifically points to.
Step 503: and adjusting the calculation parameters for calculating the second recall evaluation of the target recall result.
On the basis of step 502, this step is intended to adjust, by the execution main body described above, the calculation parameters for calculating the second recall evaluation that yields the target recall result, so that the adjusted calculation parameters can yield a recall result ranking that is as close as possible to the latest ranking corresponding to the ranking adjustment instruction.
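One way such an adjustment might look is sketched below. The disclosure does not state which calculation parameters exist or how they are updated, so this sketch assumes the parameters are linear feature weights and uses a pairwise, perceptron-style update; `score`, `adjust_weights`, the learning rate, and the step limit are all hypothetical.

```python
def score(weights: list[float], features: list[float]) -> float:
    """Linear score of one recall result (assumed form of the evaluation)."""
    return sum(w * f for w, f in zip(weights, features))


def adjust_weights(
    weights: list[float],
    preferred: list[float],
    other: list[float],
    learning_rate: float = 0.1,
    max_steps: int = 100,
) -> list[float]:
    """Nudge the weights until `preferred` outscores `other`, or the step limit is hit.

    `preferred` holds the features of the result that the sorting adjustment
    instruction moves up; `other` holds those of the result it should overtake.
    """
    weights = list(weights)
    for _ in range(max_steps):
        if score(weights, preferred) > score(weights, other):
            break
        # Move each weight in the direction that favours the preferred result.
        weights = [
            w + learning_rate * (p - o) for w, p, o in zip(weights, preferred, other)
        ]
    return weights
```

The update stops as soon as the requested order is reproduced, which keeps the adjusted parameters close to the original ones.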
On the basis of any of the above embodiments, and in combination with scenarios to which the scheme can specifically be applied, such as smart cities and city governance: once the initial recall results of a target query word or a target query sentence are obtained, the recall results are presented in an order that accurately reflects the subject entity targeted in the current scenario, so that results relevant to that target entity are determined quickly and efficiently and search efficiency is improved.
With further reference to fig. 6, as an implementation of the methods shown in the above-mentioned figures, the present disclosure provides an embodiment of a recall result ranking apparatus, which corresponds to the method embodiment shown in fig. 2, and which can be applied in various electronic devices.
As shown in fig. 6, the recall result sorting apparatus 600 of the present embodiment may include: an initial recall result acquiring unit 601, an initial recall result processing unit 602, a first recall evaluation determining unit 603, and a second recall evaluation determining and sorting unit 604. The initial recall result acquiring unit 601 is configured to acquire an initial recall result corresponding to a target query term or a target query statement; the initial recall result processing unit 602 is configured to extract a subject entity from each initial recall result, and determine a target attribute of a slot to which each noun constituting the subject entity belongs and a matched target table field name, where the subject entity is an entity phrase containing a limiting word of the entity serving as the subject; the first recall evaluation determining unit 603 is configured to determine a first recall evaluation of a corresponding subject entity based on a mapping relationship between the target attribute and the target table field name; and the second recall evaluation determining and sorting unit 604 is configured to determine a second recall evaluation of the corresponding initial recall result according to the first recall evaluation of each subject entity, and sort the initial recall results according to the magnitude of the second recall evaluation.
In the present embodiment, in the recall result ranking apparatus 600: the detailed processing and the technical effects of the initial recall result obtaining unit 601, the initial recall result processing unit 602, the first recall evaluation determining unit 603, and the second recall evaluation determining and sorting unit 604 can refer to the related descriptions of steps 201 to 204 in the corresponding embodiment of fig. 2, and are not repeated herein.
In some optional implementations of this embodiment, the initial recall result processing unit 602 may include a processing subunit configured to determine a target attribute of a slot to which each noun constituting the subject entity belongs and a target table field name that matches, and the processing subunit may include:
the word segmentation module is configured to perform word segmentation on the subject entity to obtain each noun forming the subject entity;
a suspected attribute determination module configured to determine at least one suspected attribute of the slot to which each noun belongs;
and the multi-retrieval mode comprehensive determination module is configured to determine the target attribute and the field name of the target table matched with the target attribute in the at least one suspected attribute respectively through an accurate retrieval mode and a full-text retrieval mode.
In some optional implementations of this embodiment, the multiple search mode comprehensive determination module may be further configured to:
in response to a first alternative table field name which is identical to a suspected attribute and exists in a database, respectively determining the suspected attribute and the corresponding first alternative table field name as a target attribute and a target table field name;
in response to the fact that no alternative table field names identical to one suspected attribute exist in the database, determining a second alternative table field name containing at least one part of a corresponding noun in the database, and determining the matching degree between the corresponding suspected attribute and the second alternative table field name;
and respectively determining the suspected attribute with the highest matching degree and the second alternative table field name as the target attribute and the target table field name.
In some optional implementations of this embodiment, the first recall evaluation determination unit 603 may be further configured to:
establishing a mapping relation between the target attribute and the field name of the target table;
determining a first matching degree between the target attribute and the field name of the target table, and a second matching degree between the attribute value under the target attribute and the field content under the field name of the target table according to the mapping relation; the matching degree comprises a text matching degree and a semantic matching degree;
determining a first recall rating for the respective subject entity based on the first degree of match, the second degree of match, the universal recall score, and the importance of the respective noun within the subject entity to which it pertains.
In some optional implementations of this embodiment, the universal recall score is a recall score given by the ElasticSearch for the corresponding noun; the semantic matching degree is the matching degree given by semantic analysis according to the context of the nouns in the corresponding subject entities; the importance is an importance parameter obtained by calculating word frequency and an inverse text frequency index.
In some optional implementations of this embodiment, the second recall evaluation determining and ranking unit 604 may include a ranking subunit configured to determine a second recall evaluation of the respective initial recall result from the first recall evaluation of each subject entity, and the ranking subunit may be further configured to:
in response to the target initial recall result containing a plurality of different subject entities, a product of first recall evaluations of the plurality of different subject entities comprising the target initial recall result is determined to be a second recall evaluation of the target initial recall result.
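The product rule above, followed by sorting, can be sketched in a few lines. The function names and the dictionary layout are hypothetical, and treating a result with no subject entity as scoring 0.0 is an assumption made here, since the disclosure does not cover that case.

```python
import math


def second_recall_evaluation(first_evaluations: list[float]) -> float:
    """Product of the first recall evaluations of a result's subject entities."""
    if not first_evaluations:
        return 0.0  # assumption: a result without subject entities ranks last
    return math.prod(first_evaluations)


def rank_results(results: dict[str, list[float]]) -> list[str]:
    """Sort result ids by second recall evaluation, highest first."""
    return sorted(
        results,
        key=lambda result_id: second_recall_evaluation(results[result_id]),
        reverse=True,
    )
```

Because the evaluations are multiplied, a single poorly matched subject entity lowers the evaluation of the whole recall result, which a sum or an average would not do to the same extent.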
In some optional implementations of this embodiment, the recall result ranking apparatus 600 may further include:
a sorting adjustment instruction receiving unit configured to receive a sorting adjustment instruction returned for the recall result sorted by the second recall evaluation;
a target recall result determination unit configured to determine a target recall result of the abnormal sorting according to the sorting adjustment instruction;
and a calculation parameter adjusting unit configured to adjust a calculation parameter for calculating a second recall evaluation that results in the target recall result.
The recall result ranking apparatus provided in this embodiment is an apparatus embodiment corresponding to the above method embodiment, and based on obtaining a large number of initial recall results for a query term or a query statement, the apparatus provided in this embodiment extracts a subject entity containing a qualifier to specify an important part of a recall result, and obtains a first recall evaluation of each subject entity through comprehensive and accurate calculation of a mapping relationship between attributes of slots to which each noun constituting the subject entity belongs and name of a table field in a database, and finally determines a second recall evaluation of a corresponding initial recall result according to the first recall evaluation of the subject entity, so as to adjust presentation ranking of each initial recall result as accurately as possible based on the second recall evaluation, and improve search efficiency for a certain entity serving as a subject.
According to an embodiment of the present disclosure, the present disclosure also provides an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to perform the recall result ranking method described in any of the embodiments above.
According to an embodiment of the present disclosure, the present disclosure further provides a readable storage medium storing computer instructions for enabling a computer to implement the recall result ranking method described in any of the above embodiments when executed.
According to an embodiment of the present disclosure, there is also provided a computer program product, which when executed by a processor, is capable of implementing the recall result ranking method described in any of the embodiments above.
FIG. 7 shows a schematic block diagram of an example electronic device 700 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not intended to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 7, the device 700 comprises a computing unit 701, which may perform various suitable actions and processes according to a computer program stored in a Read Only Memory (ROM) 702 or a computer program loaded from a storage unit 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the operation of the device 700 can also be stored. The computing unit 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
A number of components in the device 700 are connected to the I/O interface 705, including: an input unit 706 such as a keyboard, a mouse, or the like; an output unit 707 such as various types of displays, speakers, and the like; a storage unit 708 such as a magnetic disk, optical disk, or the like; and a communication unit 709 such as a network card, modem, wireless communication transceiver, etc. The communication unit 709 allows the device 700 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
Computing unit 701 may be a variety of general purpose and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 701 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 701 performs the respective methods and processes described above, such as the recall result ranking method. For example, in some embodiments, the recall result ranking method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as storage unit 708. In some embodiments, part or all of a computer program may be loaded onto and/or installed onto device 700 via ROM 702 and/or communications unit 709. When the computer program is loaded into the RAM 703 and executed by the computing unit 701, one or more steps of the recall result ranking method described above may be performed. Alternatively, in other embodiments, the computing unit 701 may be configured to perform the recall result ranking method in any other suitable manner (e.g., by way of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on a Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or a cloud host, which is a host product in a cloud computing service system and addresses the high management difficulty and weak service extensibility of conventional physical hosts and Virtual Private Server (VPS) services.
According to the technical scheme of the embodiment of the disclosure, on the basis of obtaining a large number of initial recall results aiming at query words or query sentences, key parts of the recall results are clarified by extracting subject entities containing limiting words, a first recall evaluation of each subject entity is comprehensively and accurately calculated through mapping relations between attributes of slots to which each noun constituting the subject entities belongs and field names in a database, and finally a second recall evaluation of the corresponding initial recall result is determined according to the first recall evaluation of the subject entities, so that the presentation sequence of each initial recall result is adjusted as accurately as possible based on the second recall evaluation, and the search efficiency of a certain entity as a subject is improved.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (17)

1. A recall result ranking method comprising:
acquiring an initial recall result corresponding to a target query word or a target query statement;
respectively extracting a subject entity from each initial recall result, and determining a target attribute of a slot position to which each noun forming the subject entity belongs and a matched target table field name; wherein, the subject entity is an entity phrase containing a limiting word of the entity as the subject;
determining a first recall evaluation of a corresponding subject entity based on a mapping relation between the target attribute and the target table field name;
and determining a second recall evaluation of the corresponding initial recall result according to the first recall evaluation of each subject entity, and sorting the initial recall results according to the magnitude of the second recall evaluation.
2. The method of claim 1, wherein the determining a target attribute of a slot and a matching target table field name to which each noun constituting the subject entity belongs respectively comprises:
performing word segmentation processing on the subject entity to obtain each noun forming the subject entity;
determining at least one suspected attribute of the slot to which each noun belongs;
and determining the target attribute and a target table field name matched with the target attribute in at least one suspected attribute respectively in an accurate retrieval mode and a full-text retrieval mode.
3. The method of claim 2, wherein said determining said target attribute and a target table field name matching said target attribute in at least one of said suspected attributes by means of an accurate retrieval mode and a full-text retrieval mode, respectively, comprises:
in response to a first alternative table field name which is identical to a suspected attribute and exists in a database, determining the suspected attribute and the corresponding first alternative table field name as the target attribute and the target table field name respectively;
in response to the database not having an alternative table field name identical to a suspected attribute, determining a second alternative table field name in the database that contains at least a portion of the corresponding noun, and determining a degree of match between the corresponding suspected attribute and the second alternative table field name;
and respectively determining the suspected attribute with the highest matching degree and the second alternative table field name as the target attribute and the target table field name.
4. The method of claim 1, wherein said determining a first recall evaluation of a respective subject entity based on a mapping between said target attribute and said target table field name comprises:
establishing a mapping relation between the target attribute and the field name of the target table;
determining a first matching degree between the target attribute and the field name of the target table and a second matching degree between the attribute value under the target attribute and the field content under the field name of the target table according to the mapping relation; the matching degree comprises a text matching degree and a semantic matching degree;
determining a first recall rating for a respective subject entity based on the first degree of match, the second degree of match, a universal recall score, and an importance of the respective noun within the subject entity to which it pertains.
5. The method of claim 4, wherein the universal recall score is a recall score given by ElasticSearch for the corresponding noun; the semantic matching degree is the matching degree given by semantic analysis according to the context of the nouns in the corresponding subject entities; the importance is an importance parameter obtained by calculating word frequency and an inverse text frequency index.
6. The method of claim 1, wherein said determining a second recall evaluation of the respective initial recall result from the first recall evaluation of each of the subject entities comprises:
in response to a target initial recall result including a plurality of different subject entities, determining a product of first recall evaluations of the plurality of different subject entities comprising the target initial recall result as a second recall evaluation of the target initial recall result.
7. The method of any of claims 1-6, further comprising:
receiving a sorting adjustment instruction returned for the sorted recall result according to the second recall evaluation;
determining a target recall result of abnormal sorting according to the sorting adjustment instruction;
and adjusting a calculation parameter for calculating a second recall evaluation for obtaining the target recall result.
8. A recall result ranking apparatus comprising:
an initial recall result acquisition unit configured to acquire an initial recall result corresponding to a target query term or a target query sentence;
the initial recall result processing unit is configured to extract a subject entity from each initial recall result respectively and determine a target attribute of a slot position to which each noun forming the subject entity belongs and a matched target table field name; wherein, the subject entity is an entity phrase containing a limiting word of the entity as the subject;
a first recall evaluation determination unit configured to determine a first recall evaluation of a corresponding subject entity based on a mapping relationship between the target attribute and the target table field name;
and the second recall evaluation determining and sorting unit is configured to determine a second recall evaluation of the corresponding initial recall result according to the first recall evaluation of each subject entity and sort the initial recall results according to the magnitude of the second recall evaluation.
9. The apparatus according to claim 8, wherein the initial recall result processing unit includes a processing subunit configured to determine a target attribute of a slot to which each noun constituting the subject entity belongs and a matching target table field name, respectively, the processing subunit including:
the word segmentation module is configured to perform word segmentation on the subject entity to obtain each noun forming the subject entity;
a suspected attribute determination module configured to determine at least one suspected attribute of the slot to which each said noun belongs;
and the multi-retrieval mode comprehensive determining module is configured to determine the target attribute and a target table field name matched with the target attribute in at least one suspected attribute respectively through an accurate retrieval mode and a full-text retrieval mode.
10. The apparatus of claim 9, wherein the multiple retrieval approach comprehensive determination module is further configured to:
in response to a first alternative table field name which is identical to a suspected attribute and exists in a database, determining the suspected attribute and the corresponding first alternative table field name as the target attribute and the target table field name respectively;
in response to the fact that no alternative table field names identical to one suspected attribute exist in the database, determining a second alternative table field name containing at least one part of a corresponding noun in the database, and determining the matching degree between the corresponding suspected attribute and the second alternative table field name;
and respectively determining the suspected attribute with the highest matching degree and the second alternative table field name as the target attribute and the target table field name.
11. The apparatus of claim 8, wherein the first recall evaluation determination unit is further configured to:
establishing a mapping relation between the target attribute and the field name of the target table;
determining a first matching degree between the target attribute and the field name of the target table and a second matching degree between the attribute value under the target attribute and the field content under the field name of the target table according to the mapping relation; the matching degree comprises a text matching degree and a semantic matching degree;
determining a first recall rating for the respective subject entity based on the first degree of match, the second degree of match, a universal recall score, and an importance of the respective noun within the subject entity to which it pertains.
12. The apparatus of claim 11, wherein the universal recall score is a recall score given by ElasticSearch for the corresponding noun; the semantic matching degree is the matching degree given by semantic analysis according to the context of the noun in the corresponding subject entity; the importance is an importance parameter obtained by calculating word frequency and an inverse text frequency index.
13. The apparatus of claim 8, wherein said second recall evaluation determination and ranking unit comprises a ranking subunit configured to determine a second recall evaluation of a respective initial recall result from the first recall evaluation of each of said subject entities, said ranking subunit further configured to:
in response to a target initial recall result including a plurality of different subject entities, determining a product of the first recall evaluations of the plurality of different subject entities that make up the target initial recall result as the second recall evaluation of the target initial recall result.
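The product rule of claim 13 can be sketched in a few lines of Python. The function names, the dictionary layout (result identifier mapped to the first recall evaluations of its subject entities), and the choice to sort from highest to lowest are assumptions made for illustration.

```python
import math


def second_recall_evaluation(first_evaluations):
    """Product of the first recall evaluations of the subject entities in one result."""
    values = list(first_evaluations)
    if not values:
        raise ValueError("a recall result needs at least one subject entity")
    return math.prod(values)


def rank_results(results):
    """Sort result ids by second recall evaluation, highest first (assumed order)."""
    return sorted(
        results,
        key=lambda rid: second_recall_evaluation(results[rid]),
        reverse=True,
    )
```

For example, a result whose two subject entities score 0.9 and 0.8 ranks below a single-entity result scoring 0.95, because the product 0.72 is smaller.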
14. The apparatus of any of claims 8-13, further comprising:
a ranking adjustment instruction receiving unit configured to receive a ranking adjustment instruction returned for the recall result ranked according to the second recall evaluation;
a target recall result determining unit configured to determine, according to the ranking adjustment instruction, a target recall result whose ranking is abnormal;
a calculation parameter adjustment unit configured to adjust a calculation parameter used to calculate the second recall evaluation of the target recall result.
15. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the recall result ranking method of any of claims 1-7.
16. A non-transitory computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the recall result ranking method of any of claims 1-7.
17. A computer program product comprising a computer program which, when executed by a processor, carries out the steps of the recall result ranking method according to any one of claims 1 to 7.
CN202210901062.4A 2022-07-28 2022-07-28 Recall result sorting method, apparatus, device, storage medium and program product Pending CN115203275A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210901062.4A CN115203275A (en) 2022-07-28 2022-07-28 Recall result sorting method, apparatus, device, storage medium and program product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210901062.4A CN115203275A (en) 2022-07-28 2022-07-28 Recall result sorting method, apparatus, device, storage medium and program product

Publications (1)

Publication Number Publication Date
CN115203275A (en) 2022-10-18

Family

ID=83583744

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210901062.4A Pending CN115203275A (en) 2022-07-28 2022-07-28 Recall result sorting method, apparatus, device, storage medium and program product

Country Status (1)

Country Link
CN (1) CN115203275A (en)

Similar Documents

Publication Publication Date Title
CN107436875B (en) Text classification method and device
CN113836314B (en) Knowledge graph construction method, device, equipment and storage medium
CN113326420B (en) Question retrieval method, device, electronic equipment and medium
CN113407850B (en) Method and device for determining and acquiring virtual image and electronic equipment
CN112560461A (en) News clue generation method and device, electronic equipment and storage medium
CN114116997A (en) Knowledge question answering method, knowledge question answering device, electronic equipment and storage medium
CN112926298A (en) News content identification method, related device and computer program product
EP3992814A2 (en) Method and apparatus for generating user interest profile, electronic device and storage medium
CN114444514B (en) Semantic matching model training method, semantic matching method and related device
KR20220024251A (en) Method and apparatus for building event library, electronic device, and computer-readable medium
CN115292506A (en) Knowledge graph ontology construction method and device applied to office field
CN115203275A (en) Recall result sorting method, apparatus, device, storage medium and program product
CN112926297A (en) Method, apparatus, device and storage medium for processing information
CN116628004B (en) Information query method, device, electronic equipment and storage medium
CN115033701B (en) Text vector generation model training method, text classification method and related device
CN113656393B (en) Data processing method, device, electronic equipment and storage medium
CN113595874B (en) Instant messaging group searching method and device, electronic equipment and storage medium
CN114281981B (en) News brief report generation method and device and electronic equipment
CN116610782B (en) Text retrieval method, device, electronic equipment and medium
CN116069914B (en) Training data generation method, model training method and device
CN116258138B (en) Knowledge base construction method, entity linking method, device and equipment
CN113377921B (en) Method, device, electronic equipment and medium for matching information
CN116010571A (en) Knowledge base construction method, information query method, device and equipment
CN116257690A (en) Resource recommendation method and device, electronic equipment and storage medium
CN114218478A (en) Recommendation method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination