CN112989814B - Search map construction method, search device, search apparatus, and storage medium - Google Patents

Info

Publication number
CN112989814B
Authority
CN
China
Prior art keywords
information
sentence
description
search
main
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110212739.9A
Other languages
Chinese (zh)
Other versions
CN112989814A (en)
Inventor
章春芳
王欣晟
余岱
肖涛
张青清
李航
黄珊珊
杨立新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Unionpay Co Ltd
Original Assignee
China Unionpay Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Unionpay Co Ltd
Priority to CN202110212739.9A
Publication of CN112989814A
Application granted
Publication of CN112989814B


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/313 Selection or weighting of terms for indexing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Multimedia (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a search map construction method, a search device, a search apparatus, and a storage medium, and belongs to the field of data processing. The method comprises: acquiring a business information resource; inputting the business information resource into a subject description generation model to obtain a subject description that corresponds to the business information resource and is output by the model, wherein the subject description generation model performs extraction on the input information and outputs a subject description corresponding to the input information, and the subject description characterizes the input information; and constructing, according to the business information resource, the subject description corresponding to the business information resource, and the knowledge information associated with the subject description, a search map of the business domain to which the business information resource belongs, wherein the search map comprises key phrases, knowledge information associated with the key phrases, and information sources of the key phrases, the key phrases comprise the subject description corresponding to the business information resource, and the information sources of the key phrases comprise the business information resource. Embodiments of the application can improve business processing efficiency.

Description

Search map construction method, search device, search apparatus, and storage medium
Technical Field
The application belongs to the field of data processing, and in particular relates to a search map construction method, a search device, a search apparatus, and a storage medium.
Background
Some businesses can only be processed by consulting a large amount of professional material. In such cases, professionals must review the material and combine it with their own experience to reach a business result. For example, during an audit, a professional auditor needs to review regulations, regulatory requirements, and the like to determine the points of focus for the current audit project.
Manually reviewing large amounts of professional material to determine business results takes a significant amount of time, which makes business processing inefficient.
Disclosure of Invention
Embodiments of the application provide a search map construction method, a search device, a search apparatus, and a storage medium, which can improve business processing efficiency.
In a first aspect, an embodiment of the application provides a search map construction method, including: acquiring a business information resource; inputting the business information resource into a subject description generation model to obtain the subject description that corresponds to the business information resource and is output by the subject description generation model, wherein the subject description generation model performs extraction on the input information and outputs a subject description corresponding to the input information, and the subject description characterizes the input information; and constructing, according to the business information resource, the subject description corresponding to the business information resource, and the knowledge information associated with the subject description, a search map of the business domain to which the business information resource belongs, wherein the search map includes key phrases, knowledge information associated with the key phrases, and information sources of the key phrases, the key phrases include the subject description corresponding to the business information resource, and the information sources of the key phrases include the business information resource.
In a second aspect, an embodiment of the application provides a search method, including: receiving search information input by a user; inputting the search information into a subject description generation model to obtain a target subject description that corresponds to the search information and is output by the subject description generation model, wherein the subject description generation model performs extraction on the input information and outputs a subject description corresponding to the input information, and the subject description characterizes the input information; matching the target subject description against a pre-generated search map to obtain a target key phrase in the search map that matches the target subject description, wherein the search map includes key phrases associated with the business domain to which the search information belongs, knowledge information associated with the key phrases, and information sources of the key phrases; and outputting the target key phrase and the knowledge information associated with the target key phrase.
In a third aspect, an embodiment of the application provides a search map construction apparatus, including: a resource acquisition module, configured to acquire a business information resource; a generation module, configured to input the business information resource into a subject description generation model to obtain the subject description that corresponds to the business information resource and is output by the subject description generation model, wherein the subject description generation model performs extraction on the input information and outputs a subject description corresponding to the input information, and the subject description characterizes the input information; and a map construction module, configured to construct, according to the business information resource, the subject description corresponding to the business information resource, and the knowledge information associated with the subject description, a search map of the business domain to which the business information resource belongs, wherein the search map includes key phrases, knowledge information associated with the key phrases, and information sources of the key phrases, the key phrases include the subject description corresponding to the business information resource, and the information sources of the key phrases include the business information resource.
In a fourth aspect, an embodiment of the application provides a search apparatus, including: a receiving module, configured to receive search information input by a user; a generation module, configured to input the search information into a subject description generation model to obtain a target subject description that corresponds to the search information and is output by the subject description generation model, wherein the subject description generation model performs extraction on the input information and outputs a subject description corresponding to the input information, and the subject description characterizes the input information; a matching module, configured to match the target subject description against a pre-generated search map to obtain a target key phrase in the search map that matches the target subject description, wherein the search map includes key phrases associated with the business domain to which the search information belongs, knowledge information associated with the key phrases, and information sources of the key phrases; and an output module, configured to output the target key phrase and the knowledge information associated with the target key phrase.
In a fifth aspect, an embodiment of the application provides a search map construction device, including: a processor and a memory storing computer program instructions; when executing the computer program instructions, the processor implements the search map construction method of the first aspect.
In a sixth aspect, an embodiment of the application provides a search device, including: a processor and a memory storing computer program instructions; when executing the computer program instructions, the processor implements the search method of the second aspect.
In a seventh aspect, an embodiment of the application provides a computer-readable storage medium on which computer program instructions are stored; when executed by a processor, the computer program instructions implement the search map construction method of the first aspect and/or the search method of the second aspect.
The application provides a search map construction method, a search device, a search apparatus, and a storage medium. A search map of the business domain to which a business information resource belongs is constructed according to the business information resource, the subject description corresponding to the business information resource, and the knowledge information in the business information resource that is associated with the subject description. The subject description characterizes the important content of the business information resource. In the search map, the subject description serves as a key phrase, the knowledge information associated with the subject description serves as the knowledge information associated with that key phrase, and the business information resource serves as the information source of that key phrase. When the search map is used for searching, matching key phrases, their information sources, and their associated knowledge information can be found quickly, so a user does not need to manually consult a large amount of professional material to find the content needed to process a business. This shortens business processing time and improves business processing efficiency.
Drawings
To illustrate the technical solutions of the embodiments of the application more clearly, the drawings used in the embodiments are briefly described below. A person skilled in the art can obtain other drawings from these drawings without inventive effort.
FIG. 1 is a flowchart of an embodiment of the search map construction method provided by the first aspect of the application;
FIG. 2 is a schematic diagram of an example of a subject description generation model provided by an embodiment of the application;
FIG. 3 is a schematic diagram of an example of part of a search map provided by an embodiment of the application;
FIG. 4 is a flowchart of another embodiment of the search map construction method provided by the first aspect of the application;
FIG. 5 is a flowchart of an embodiment of the search method provided by the second aspect of the application;
FIG. 6 is a schematic structural diagram of an embodiment of the search map construction apparatus provided by the third aspect of the application;
FIG. 7 is a schematic structural diagram of another embodiment of the search map construction apparatus provided by the third aspect of the application;
FIG. 8 is a schematic structural diagram of an embodiment of the search apparatus provided by the fourth aspect of the application;
FIG. 9 is a schematic structural diagram of an embodiment of the search map construction device provided by the fifth aspect of the application;
FIG. 10 is a schematic structural diagram of an embodiment of the search device provided by the sixth aspect of the application.
Detailed Description
Features and exemplary embodiments of various aspects of the application are described in detail below. To make the objects, technical solutions, and advantages of the application clearer, the application is described in further detail with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described here are intended only to illustrate the application, not to limit it. It will be apparent to a person skilled in the art that the application may be practiced without some of these specific details. The following description of the embodiments is provided only to give a better understanding of the application by showing examples of it.
Some business projects require a professional to manually review a large amount of professional material. This review can take up a significant share of business processing time and so reduces business processing efficiency. For example, in an audit business scenario, a professional must review material such as regulations and industry requirements in order to evaluate the audited party. Consulting regulations and industry requirements takes a lot of time, which affects the efficiency and accuracy of business processing.
The application provides a search map construction method, a search device, a search apparatus, and a storage medium, which can construct a search map and perform searches using it. The search map includes key phrases obtained by processing business information resources, knowledge information associated with the key phrases, and information sources of the key phrases. This makes searching convenient, reduces the time spent consulting related material, shortens the time required for business processing, and improves business processing efficiency.
A first aspect of the application provides a search map construction method, which can be executed by a search map construction apparatus. FIG. 1 is a flowchart of an embodiment of the search map construction method provided by the first aspect of the application. As shown in FIG. 1, the search map construction method may include steps S101 to S103.
In step S101, a business information resource is acquired.
The business information resource includes business data related to the business domain. It may include structured data and/or unstructured data describing the business data, which is not limited here.
In some examples, the business information resource may include business information files and/or business history information.
A business information file may be a structured file and/or an unstructured file, which is not limited here. Business information files are information files related to the business domain and may include regulation files, business requirement files, and the like. For example, where the business domain is auditing, the business information files may include basic regulations, financial regulations, tax regulations, industry regulatory requirements, administrative regulations, internal control rules, and the like.
Business history information is information accumulated from past processing of business in the business domain. It may be structured data and/or unstructured data, which is not limited here. Business history information may include, but is not limited to, experience and knowledge files written by the people who process the business, historical search entries, and the like. For example, where the business domain is auditing, the business history information may include audit methods, audit criteria, audit modification tracking rules, the levels of historical audit terms, the frequency of occurrence of historical audit terms, the root causes of historical audit terms, and the like.
Where the business information resource includes unstructured data, the unstructured data may first be converted into structured data and then processed in the subsequent steps. The technique used for the conversion is not limited here. For example, if the business information resource includes pictures, optical character recognition (OCR) may be used to extract information from the pictures, and the extracted information may then be organized into structured data.
In step S102, the business information resource is input into the subject description generation model, and the subject description that corresponds to the business information resource and is output by the subject description generation model is obtained.
The subject description generation model performs extraction on the input information and outputs a subject description corresponding to it. The subject description characterizes the input information and is much shorter than it. The model may output one subject description or two or more, which is not limited here; the number depends on the information input into the model. For example, if the input includes a regulation file, the model may output a plurality of subject descriptions, each corresponding to part of the content of the regulation file. As another example, if the input includes historical search entries, the model may output one subject description for each entry.
In some examples, the subject description generation model may be implemented with natural language processing (NLP) methods that extract key information from the business information resource and convert it into corresponding subject descriptions, which are then used to construct the search map. The model can be trained on business information resource samples labeled with subject descriptions.
Specifically, the subject description generation model may include a long short-term memory (LSTM) model, an LSTM model combined with a conditional random field (CRF) (abbreviated LSTM+CRF), a bidirectional LSTM model combined with a CRF (abbreviated Bi-LSTM+CRF), and the like, which is not limited here.
For example, FIG. 2 is a schematic diagram of an example of a subject description generation model provided by an embodiment of the application. As shown in FIG. 2, the subject description generation model includes an input layer, an LSTM layer, and an output layer. The LSTM layer includes a plurality of LSTM units. The input layer converts the input business information resource into a vector sequence, which is fed to both the forward LSTM units and the backward LSTM units. The forward and backward LSTM units pass their results to the output layer. The output layer outputs the words corresponding to the input business information resource, each labeled with its part of speech. The subject description corresponding to the business information resource is obtained by combining the words output by the output layer.
In some examples, the business information resource is input into the subject description generation model, which splits it into two or more segments. The model performs word segmentation on each segment and labels the part of speech of each resulting word. The model then combines consecutive nouns obtained by word segmentation into a first combined phrase, and the subject description is obtained from the first combined phrase and output.
The business information resource can be split according to the headings at each level to obtain a plurality of segments. For example, given the headings at each level of each chapter of file A1, each lowest-level heading, or the content under each lowest-level heading, may be treated as one segment. Word segmentation of a segment may also include removing punctuation, spaces, and the like, which is not limited here. Each word obtained by word segmentation is labeled with a part of speech, such as noun, verb, numeral, or measure word, which is not limited here. Because the nouns in a segment best represent its content, the nouns obtained by word segmentation can be combined into a first combined phrase. A segment may have one first combined phrase or two or more, which is not limited here. When a segment has two or more first combined phrases, one of them is selected as the subject description corresponding to the segment.
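The noun-combining step can be sketched as follows. This is a minimal illustration, not the model described in the application: it assumes word segmentation and part-of-speech tagging have already been done, and it forms one candidate phrase per maximal run of adjacent nouns. The function name, the tag labels ("n", "v"), and the tagged input are all hypothetical, and the text does not specify how several candidate phrases are derived from a single segment.

```python
from typing import List, Tuple


def merge_consecutive_nouns(tagged: List[Tuple[str, str]], noun_tag: str = "n") -> List[str]:
    """Join every maximal run of adjacent nouns into one candidate phrase."""
    phrases: List[str] = []
    run: List[str] = []
    for word, tag in tagged:
        if tag == noun_tag:
            run.append(word)              # extend the current run of nouns
        else:
            if run:                       # a non-noun closes the run
                phrases.append(" ".join(run))
            run = []
    if run:                               # flush a run that reaches the end
        phrases.append(" ".join(run))
    return phrases


# Hypothetical tagger output for one chapter title
# (assumed tags: "v" = verb, "n" = noun).
tagged_title = [
    ("standardizing", "v"),
    ("small micro-merchant", "n"),
    ("order-receiving", "n"),
    ("service", "n"),
    ("management", "v"),
]

first_combined_phrases = merge_consecutive_nouns(tagged_title)
print(first_combined_phrases)
```

Words are joined with spaces here because the sketch uses English stand-ins; for Chinese text the words in a run would presumably be concatenated directly.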
For example, the business information resource includes file A1, and one chapter of file A1 is titled "4. Standardizing small micro-merchant order-receiving service management". When file A1 is input into the subject description generation model, the model can output "small micro-merchant order-receiving service" as the subject description corresponding to that chapter title.
In some examples, each first combined phrase is associated with a subject weight that characterizes its importance. The subject weight is positively correlated with importance: the higher the subject weight of a first combined phrase, the more important it is, and the more accurately it characterizes the segment. The subject weight of each first combined phrase is calculated, and the first combined phrase with the highest subject weight for a segment is determined to be the subject description corresponding to that segment and is output.
For example, the first combined phrases corresponding to a segment include "small micro-merchant order-receiving service" and "merchant order-receiving service". The subject weight of "small micro-merchant order-receiving service" is 2.0384, and that of "merchant order-receiving service" is 1.232. Accordingly, "small micro-merchant order-receiving service", which has the higher subject weight, is selected as the subject description and output.
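The selection in this example amounts to taking the candidate with the largest subject weight. The sketch below assumes the weights have already been computed (the text does not say how); the function name is hypothetical, and the two weights are the ones given in the example.

```python
from typing import Dict


def select_subject_description(subject_weights: Dict[str, float]) -> str:
    """Return the first combined phrase with the highest subject weight."""
    if not subject_weights:
        raise ValueError("a segment needs at least one first combined phrase")
    # max() over the phrases, ranked by their subject weight
    return max(subject_weights, key=lambda phrase: subject_weights[phrase])


# Subject weights taken from the example in the text.
weights = {
    "small micro-merchant order-receiving service": 2.0384,
    "merchant order-receiving service": 1.232,
}

print(select_subject_description(weights))
```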
In some examples, when the first combined phrase corresponding to a segment does not include the subject information of the business information resource to which the segment belongs, the subject information is combined with the first combined phrase, and the combination is determined to be the subject description corresponding to the segment and is output.
A segment belongs to a business information resource. A first combined phrase that lacks the subject information of that resource may not describe the segment accurately enough; combining the subject information with the first combined phrase improves the accuracy of the segment's subject description.
For example, in the file "Notice on Standardizing Payment Innovation Business", the first combined phrase for the fourth chapter title, "Standardizing small micro-merchant order-receiving service management", is "small micro-merchant order-receiving service". The subject information of the file, "payment innovation", can be combined with this first combined phrase to obtain and output the subject description "payment innovation small micro-merchant order-receiving service".
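A minimal sketch of this combination step follows. It assumes that "does not include the subject information" can be checked with a plain substring test and that combining means prefixing the subject information; both are simplifications, and the function name is hypothetical.

```python
def attach_subject_information(first_combined_phrase: str, subject_information: str) -> str:
    """Prefix the resource-level subject information when the phrase lacks it."""
    if subject_information in first_combined_phrase:
        # The phrase already carries the subject information: keep it unchanged.
        return first_combined_phrase
    return f"{subject_information} {first_combined_phrase}"


subject_description = attach_subject_information(
    "small micro-merchant order-receiving service",  # first combined phrase
    "payment innovation",                            # subject information of the file
)
print(subject_description)
```

Applied to the first combined phrase selected by subject weight, the same function yields the combined behavior in which both the weight and the subject information contribute to the subject description.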
The subject description can also be obtained by taking into account both the subject weight and the subject information of the business information resource to which the segment belongs, which is not limited here. For example, in the file "Notice on Standardizing Payment Innovation Business", the first combined phrases for the fourth chapter title, "Standardizing small micro-merchant order-receiving service management", include "small micro-merchant order-receiving service" and "merchant order-receiving service". The subject weight of "small micro-merchant order-receiving service" is 2.0384, and that of "merchant order-receiving service" is 1.232. Accordingly, "small micro-merchant order-receiving service", which has the higher subject weight, is selected and combined with the subject information of the file, "payment innovation", to form the subject description "payment innovation small micro-merchant order-receiving service", which is then output.
In step S103, a search map of the service field to which the service information resource belongs is constructed according to the service information resource, the subject description corresponding to the service information resource, and the knowledge information associated with the subject description.
The search map includes keyword sentences, knowledge information associated with the keyword sentences, and information sources of the keyword sentences. The keyword sentences include the subject description corresponding to the service information resource; that is, the subject description can serve as a keyword sentence in the search map. The information sources of the keyword sentences include the service information resource; that is, the service information resource serves as the information source of its corresponding keyword sentences in the search map. The search map may further include attribute information of the information source, such as, but not limited to, the issuing authority of the information source, its effective date, and its expiration date.
A keyword sentence has a subordinate relationship with its information source, that is, the keyword sentence belongs to that information source. A keyword sentence has an association relationship with the knowledge information associated with it. The relationships among the keyword sentences, the associated knowledge information, and the information sources can thus be derived from these three kinds of elements. The search map can then be constructed from the keyword sentences, the associated knowledge information, the information sources, and the relationships among them. The search map is used for retrieval: according to search information input by a user, keyword sentences matching the search information and the indication information associated with those keyword sentences can be obtained from the search map.
For example, the business field is the audit field, and fig. 3 is a schematic diagram of an example of part of a search map provided in an embodiment of the present application. As shown in fig. 3, the search map includes the information source "Document No. 281"; the issuing authority, effective date, and expiration date of Document No. 281; the keyword sentences in Document No. 281; and the knowledge information associated with those keyword sentences. The keyword sentences in Document No. 281 include "payment innovation business" and "payment innovation small micro merchant order-receiving business", among others. The keyword sentence "payment innovation business" is associated with knowledge information in four aspects: admission conditions, forbidden terms, quantized content, and management requirements. The keyword sentence "payment innovation small micro merchant order-receiving business" is likewise associated with knowledge information in these four aspects.
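One way to picture the structure in this example is a small in-memory model. The sketch below is an assumption for illustration only: the class names `InformationSource`, `KeywordSentence`, and `SearchMap`, the field names, and the placeholder knowledge values are hypothetical, and the source does not prescribe any particular storage format.

```python
from dataclasses import dataclass, field


@dataclass
class InformationSource:
    # Attribute information of the source; values are left empty when unknown.
    name: str
    issuing_authority: str = ""
    effective_date: str = ""
    expiration_date: str = ""


@dataclass
class KeywordSentence:
    text: str
    source: InformationSource          # subordinate relationship: belongs to this source
    knowledge: dict = field(default_factory=dict)  # associated knowledge information


class SearchMap:
    """Hypothetical container linking keyword sentences, knowledge, and sources."""

    def __init__(self):
        self.keyword_sentences = {}

    def add(self, text, source, knowledge):
        self.keyword_sentences[text] = KeywordSentence(text, source, dict(knowledge))

    def lookup(self, text):
        return self.keyword_sentences.get(text)


source = InformationSource(name="Document No. 281")
aspects = ["admission conditions", "forbidden terms",
           "quantized content", "management requirements"]
search_map = SearchMap()
# The knowledge values are placeholders; the actual content comes from the document.
search_map.add("payment innovation business", source,
               {aspect: "<placeholder>" for aspect in aspects})

entry = search_map.lookup("payment innovation business")
print(entry.source.name, sorted(entry.knowledge))
```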
In the embodiment of the application, the acquired service information resource is input into the subject description generation model to obtain the subject description corresponding to the service information resource. A search map of the service field to which the service information resource belongs is constructed according to the service information resource, the corresponding subject description, and the knowledge information associated with the subject description in the service information resource. The subject description characterizes important content in the service information resource. In the search map, the subject description serves as a keyword sentence, the indication information associated with the subject description serves as the knowledge information associated with the keyword sentence, and the service information resource serves as the information source of the keyword sentence. The search map is used for retrieval: matching keyword sentences, their information sources, and their associated knowledge information can be found quickly through the search map, so that a user does not need to manually consult a large amount of professional material to find the content needed for service processing. This shortens service processing time and improves service processing efficiency.
In some examples, a subject weight associated with a keyword sentence may also be added to the search map. The subject weight of a keyword sentence represents the importance of the keyword sentence and is positively correlated with it: the greater the subject weight, the higher the importance of the keyword sentence. Importance here can be understood as the degree of influence on service processing.
The subject weight associated with a keyword sentence may vary, for example, with the search information used for retrieval in the search map. If a keyword sentence in the search map is successfully matched with the search information, the subject weight of the matched keyword sentence may be calculated; the specific calculation is described below. If matching between a keyword sentence in the search map and the search information fails, the subject weight associated with the keyword sentence may be set to a default value, for example "0" or "x", which is not limited herein.
Fig. 4 is a flowchart of another embodiment of the search map construction method provided in the first aspect of the present application. Fig. 4 is different from fig. 1 in that the search map construction method shown in fig. 4 further includes steps S104 to S108.
In step S104, the occurrence frequency level, execution completion level, and influence range level of the keyword sentence are acquired.
The occurrence frequency level characterizes the frequency with which the keyword sentence occurs in the service information resource and/or the frequency with which the keyword sentence is confirmed as a valid retrieval result. The occurrence frequency level is obtained by dividing the occurrence frequency into levels. For example, if the keyword sentence occurs 0 to 5 times in the service information resource, its occurrence frequency level is set to level one; if it occurs 5 to 10 times, its occurrence frequency level is set to level two; if it occurs more than 10 times, its occurrence frequency level is set to level three.
The occurrence frequency of the keyword sentence in the service information resource may be the number of times the keyword sentence occurs in historical service information resources within a predetermined period. For example, it may be the number of occurrences of the keyword sentence in service information resources within the 3 months preceding the current time.
In the retrieval process, keywords and sentences matching the retrieval information of the user are output. The user can evaluate the output keyword and sentence to confirm whether the output keyword and sentence is a valid search result. According to the evaluation operation of the user, the frequency with which the keyword sentence is confirmed as a valid search result can be recorded.
In some examples, where the occurrence frequency level characterizes both the frequency with which the keyword sentence occurs in the service information resource and the frequency with which it is confirmed as a valid retrieval result, the occurrence frequency level may specifically characterize the sum of these two frequencies.
Both the frequency with which a keyword sentence occurs in the service information resource and the frequency with which it is confirmed as a valid retrieval result affect the importance of the keyword sentence: the higher these frequencies, the higher the importance. Taking them into account helps ensure that the subject weight accurately reflects the importance of the keyword sentence. Including the frequency with which the keyword sentence is confirmed as a valid retrieval result in the calculation of the subject weight can further improve the accuracy of that calculation.
The execution completion level characterizes the degree to which the service recorded for the keyword sentence in the service information resource has been executed. The service information resource may record how the service was executed, and the degree of completion of service execution can be determined from that record. In some examples, a higher execution completion level indicates a higher degree of completion of service execution.
For example, the service information resource may record descriptions such as "not executed", "non-standard execution", "insufficient execution", and "execution to be enhanced", and a different execution completion level can be set for each. If a description such as "not executed" is recorded in the service information resource, the execution completion level is set to level one; if a description such as "non-standard execution" is recorded, the execution completion level is set to level two; if a description such as "insufficient execution" is recorded, the execution completion level is set to level three; if a description such as "execution to be enhanced" is recorded, the execution completion level is set to level four.
The higher the degree of completion of service execution, the fewer the points requiring attention, and the smaller its influence on the importance of the keyword sentence. Conversely, the lower the degree of completion, the more points require attention, and the greater its influence on the importance of the keyword sentence. Introducing the degree of completion of service execution into the calculation of the subject weight helps ensure that the subject weight accurately reflects the importance of the keyword sentence.
The scope of influence level characterizes the number of business associates affected by the keyword sentence. The scope of influence level may be in positive correlation with the number of business associates affected by the keyword. I.e. the higher the impact range level, the greater the number of business associates that the keyword affects. A business-associated party is an object associated with a business. If the business field is audit business, the business association party can be a merchant or an enterprise, etc., and the business association party is not limited herein.
For example, if the keyword sentence has no influence on any business associate, the influence range level may be set to level one; if the keyword sentence affects 1 to 300,000 business associates, the influence range level may be set to level two; if the keyword sentence affects 300,000 to 500,000 business associates, the influence range level may be set to level three; if the keyword sentence affects more than 500,000 business associates, the influence range level may be set to level four.
The higher the influence range level, that is, the greater the number of business associates affected by the keyword sentence, the greater its influence on the importance of the keyword sentence. Introducing the influence range level into the calculation of the subject weight helps ensure that the subject weight accurately reflects the importance of the keyword sentence.
In step S105, a first characterization value corresponding to the frequency of occurrence level is determined according to the frequency of occurrence level.
The first characterization value is positively correlated with the occurrence frequency level. That is, the first characterization value is positively correlated with the frequency with which the keyword sentence occurs in the service information resource and/or is confirmed as a valid retrieval result: the higher that frequency, the larger the first characterization value. How this positive correlation is embodied in the calculation is not limited herein.
For example, let n be the number of occurrences of the keyword sentence in the service information resource. If n is 0 to 5, the occurrence frequency level is set to level one, and the first characterization value = 0.1 + 0.02 × n. If n is 5 to 10, the occurrence frequency level is set to level two, and the first characterization value = 0.2 + 0.01 × n. If n is more than 10, the occurrence frequency level is set to level three, and the first characterization value = 0.3.
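The piecewise rule in this example can be sketched as a small function. The function name is hypothetical, and the handling of the shared boundaries is an assumption: the example lists "0 to 5" and "5 to 10" without saying which level n = 5 or n = 10 falls into, so the sketch assigns each boundary to the lower level.

```python
def first_characterization_value(n):
    """Return (occurrence frequency level, first characterization value)
    for n occurrences of a keyword sentence.

    Assumption: n = 5 is treated as level one and n = 10 as level two,
    because the example ranges share their endpoints.
    """
    if n < 0:
        raise ValueError("occurrence count must be non-negative")
    if n <= 5:
        return 1, 0.1 + 0.02 * n
    if n <= 10:
        return 2, 0.2 + 0.01 * n
    return 3, 0.3


for count in (0, 3, 8, 25):
    level, value = first_characterization_value(count)
    print(count, level, round(value, 2))
```

With this boundary choice the value does not decrease as n grows, which keeps it consistent with the stated positive correlation.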
In step S106, a second characterization value corresponding to the execution completion level is determined according to the execution completion level.
The second characterization value is negatively correlated with the execution completion level. That is, the lower the degree of completion of service execution, the larger the second characterization value. How this negative correlation is embodied in the calculation is not limited herein.
For example, if a description such as "not executed" is recorded in the service information resource, the execution completion level is set to level one and the second characterization value is 0.4; if a description such as "non-standard execution" is recorded, the execution completion level is set to level two and the second characterization value is 0.3; if a description such as "insufficient execution" is recorded, the execution completion level is set to level three and the second characterization value is 0.2; if a description such as "execution to be enhanced" is recorded, the execution completion level is set to level four and the second characterization value is 0.1.
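The mapping in this example amounts to a lookup table. The sketch below assumes the execution record is available as plain text and detects the level by substring matching; that detection method, the English wording of the four descriptions, and the function and table names are assumptions made for illustration.

```python
# Levels and values as given in the example; description wording is assumed.
LEVEL_BY_DESCRIPTION = {
    "not executed": 1,
    "non-standard execution": 2,
    "insufficient execution": 3,
    "execution to be enhanced": 4,
}
SECOND_VALUE_BY_LEVEL = {1: 0.4, 2: 0.3, 3: 0.2, 4: 0.1}


def second_characterization_value(record_text):
    """Return (execution completion level, second characterization value),
    or None when no known description appears in the record."""
    for description, level in LEVEL_BY_DESCRIPTION.items():
        if description in record_text:
            return level, SECOND_VALUE_BY_LEVEL[level]
    return None


print(second_characterization_value("risk checks: not executed"))
print(second_characterization_value("merchant review: insufficient execution"))
```

The lower the level (the less complete the execution), the larger the returned value, which reflects the negative correlation described above.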
In step S107, a third characterization value corresponding to the influence range class is determined from the influence range class.
The third characterization value has a positive correlation with the range of influence level. That is, the third characterization value and the number of the service association parties affected by the keyword and sentence are in positive correlation, and the larger the number of the service association parties affected by the keyword and sentence is, the larger the third characterization value is.
For example, if the keyword sentence has no influence on any business associate, the influence range level may be set to level one, and the third characterization value is 0; if the keyword sentence affects 0 to 300,000 business associates, the influence range level may be set to level two, and the third characterization value is 0.1; if the keyword sentence affects 300,000 to 500,000 business associates, the influence range level may be set to level three, and the third characterization value is 0.2; if the keyword sentence affects more than 500,000 business associates, the influence range level may be set to level four, and the third characterization value is 0.3.
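The thresholds in this example can be sketched as follows. The function name is hypothetical, and the boundary handling is an assumption: exactly 300,000 and exactly 500,000 affected business associates are assigned to the lower of the two adjacent levels, since the example does not say which level the shared endpoints belong to.

```python
def third_characterization_value(affected_count):
    """Return (influence range level, third characterization value)
    for the number of business associates affected by a keyword sentence."""
    if affected_count < 0:
        raise ValueError("affected count must be non-negative")
    if affected_count == 0:
        return 1, 0.0          # no business associate is affected
    if affected_count <= 300_000:
        return 2, 0.1
    if affected_count <= 500_000:
        return 3, 0.2
    return 4, 0.3


for count in (0, 1_200, 450_000, 2_000_000):
    print(count, third_characterization_value(count))
```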
Step S105, step S106, and step S107 are independent of each other, and may be executed synchronously or sequentially, and the order of execution is not limited.
In step S108, the main weight is calculated by using the first characterization value, the second characterization value, and the third characterization value.
In some examples, the sum of the first, second, and third characterization values may be taken as the subject weight.
In other examples, a weighting algorithm may be utilized to set weights for the first, second, and third characterization values, respectively, with the sum of the products of the respective first, second, and third characterization values and the weights being the subject weight.
In still other examples, a first sum of 1 and the first characterization value, a second sum of 1 and the second characterization value, and a third sum of 1 and the third characterization value may be calculated. The product of the first sum, the second sum, and the third sum is determined as the subject weight. That is, subject weight = (1 + first characterization value) × (1 + second characterization value) × (1 + third characterization value).
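The three ways of computing the subject weight described for step S108 can be compared side by side. In the sketch below the function names and the weighting coefficients (0.5, 0.3, 0.2) are hypothetical; the input values 0.12, 0.4, and 0.3 are illustrative picks that are attainable under the example rules given for steps S105 to S107.

```python
def weight_sum(v1, v2, v3):
    # Subject weight as the plain sum of the three characterization values.
    return v1 + v2 + v3


def weight_weighted(values, coefficients):
    # Subject weight as the sum of each value multiplied by its coefficient.
    return sum(v * c for v, c in zip(values, coefficients))


def weight_product(v1, v2, v3):
    # Subject weight = (1 + v1) * (1 + v2) * (1 + v3).
    return (1 + v1) * (1 + v2) * (1 + v3)


v1, v2, v3 = 0.12, 0.4, 0.3   # illustrative characterization values
print(round(weight_sum(v1, v2, v3), 4))
print(round(weight_weighted((v1, v2, v3), (0.5, 0.3, 0.2)), 4))
print(round(weight_product(v1, v2, v3), 4))
```

In the product form a characterization value of 0 contributes a neutral factor of 1, so a keyword sentence with one zero-valued dimension still receives a non-zero subject weight.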
In the above embodiment, the subject weight of a first combined phrase may be calculated in the same way as the subject weight of a keyword sentence, which is not repeated here.
The occurrence frequency level, the execution completion level, and the influence range level are quantized into the first, second, and third characterization values, respectively, and the subject weight is calculated from these three values. The importance of a keyword sentence is thus quantized along multiple dimensions, which helps ensure that the subject weight accurately reflects that importance. Adding the subject weight to the search map enriches the content of the search map and improves the accuracy and efficiency of retrieval with it.
A second aspect of the present application provides a retrieval method executable by a retrieval device. Fig. 5 is a flowchart of an embodiment of a search method according to the second aspect of the present application. As shown in fig. 5, the search method may include steps S201 to S204.
In step S201, search information input by a user is received.
The search information input by the user may be an entry, a sentence, an article fragment, or the like, which is not limited herein. The specific content of the retrieved information is related to the retrieved service area.
In step S202, the retrieval information is input into the subject description generation model, and the target subject description corresponding to the retrieval information output by the subject description generation model is obtained.
The subject description generation model extracts and processes input information and outputs the subject description corresponding to that information. The subject description characterizes the input information. For details of the subject description generation model, refer to the description of the above embodiments, which is not repeated here.
The target subject description is the subject description corresponding to the search information. Specifically, the subject description generation model segments the search information into words and tags the resulting words with parts of speech. The model then combines consecutive nouns obtained by word segmentation into a second combined phrase. The target subject description is obtained from the second combined phrase and output.
Word segmentation of the search information may also include, but is not limited to, removing punctuation marks, spaces, and the like. Part-of-speech tagging labels each word as, for example, a noun, verb, numeral, or quantifier, which is not limited herein. Because nouns better represent the content of the search information, the nouns obtained by word segmentation can be combined into a second combined phrase, and the second combined phrase may be used as the subject description corresponding to the search information.
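The noun-combining step can be sketched on input that has already been segmented and tagged. The sketch assumes a list of (word, tag) pairs with the tag "n" marking nouns, and it merges each maximal run of consecutive nouns into one phrase; the tag set, the function name, and the choice of maximal runs are assumptions, and the segmentation and tagging themselves are outside the sketch.

```python
def combine_consecutive_nouns(tagged_words, noun_tag="n", sep=" "):
    """Merge each run of consecutive nouns into a single combined phrase.

    tagged_words: list of (word, part_of_speech_tag) pairs, assumed to come
    from an upstream word segmentation and tagging step.
    """
    phrases, run = [], []
    for word, tag in tagged_words:
        if tag == noun_tag:
            run.append(word)                 # extend the current noun run
        elif run:
            phrases.append(sep.join(run))    # a non-noun ends the run
            run = []
    if run:
        phrases.append(sep.join(run))        # flush a run that ends the input
    return phrases


tagged = [
    ("query", "v"),
    ("merchant", "n"), ("order-receiving", "n"), ("service", "n"),
    ("of", "p"),
    ("payment", "n"), ("innovation", "n"),
]
combined = combine_consecutive_nouns(tagged)
print(combined)
```

For Chinese text, `sep=""` would presumably be used so that the nouns are concatenated directly.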
The details of the main description can be found in the related descriptions of the above embodiments, and are not repeated here.
In step S203, the target subject description is matched against a pre-generated search map to obtain the target keyword sentences in the search map that match the target subject description.
The search map includes keyword sentences associated with the service field to which the search information belongs, knowledge information associated with the keyword sentences, and information sources of the keyword sentences. The search map is constructed using the search map construction method of the above embodiments; for details, refer to the related description above, which is not repeated here.
A target keyword sentence is a keyword sentence in the search map that matches the target subject description. Matching may be performed by comparing the similarity between the target subject description and each keyword sentence in the search map: the higher the similarity, the higher the degree of matching. Keyword sentences whose similarity to the target subject description satisfies a preset condition are selected as target keyword sentences. There may be one, two, or more target keyword sentences, which is not limited herein.
In some examples, the target subject description may be converted into a target description vector, and the keyword sentences in the search map may be converted into word-sentence vectors. The similarity between the target description vector and each word-sentence vector is calculated. Based on the similarity, the target word-sentence vectors that match the target description vector are determined among the word-sentence vectors, and the keyword sentences corresponding to the target word-sentence vectors are determined as the target keyword sentences.
Specifically, the word2vec algorithm can be used to convert the target subject description into the target description vector and the keyword sentences into word-sentence vectors. The similarity between the target description vector and a word-sentence vector may be the cosine similarity of the vectors or the like, which is not limited herein.
In some examples, a word-sentence vector whose similarity to the target description vector is higher than a matching threshold may be taken as a target word-sentence vector. In other examples, a predetermined number of word-sentence vectors with the highest similarity to the target description vector may be taken as the target word-sentence vectors. The manner of selecting target word-sentence vectors based on similarity is not limited herein.
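Both selection strategies can be sketched with plain cosine similarity. The sketch assumes the vectors have already been produced (for example by a word2vec model, which is not shown here); the two-dimensional toy vectors, the keyword sentence labels, the threshold 0.9, and the function names are all illustrative assumptions.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0                      # treat a zero vector as dissimilar
    return dot / (norm_a * norm_b)


def match_by_threshold(target, sentence_vectors, threshold):
    # Keep every keyword sentence whose similarity exceeds the threshold.
    return [s for s, v in sentence_vectors.items()
            if cosine_similarity(target, v) > threshold]


def match_top_k(target, sentence_vectors, k):
    # Keep the k keyword sentences with the highest similarity.
    ranked = sorted(sentence_vectors,
                    key=lambda s: cosine_similarity(target, sentence_vectors[s]),
                    reverse=True)
    return ranked[:k]


target_vector = [1.0, 0.0]
vectors = {
    "keyword sentence A": [1.0, 0.1],
    "keyword sentence B": [0.0, 1.0],
    "keyword sentence C": [1.0, 1.0],
}
print(match_by_threshold(target_vector, vectors, 0.9))
print(match_top_k(target_vector, vectors, 2))
```

A threshold can return no matches at all, whereas the top-k form always returns up to k keyword sentences regardless of how similar they are; which behavior is preferable depends on the application.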
In step S204, the target keyword sentence and knowledge information associated with the target keyword sentence are output.
In order to facilitate the user to acquire detailed contents more intuitively, knowledge information associated with the target keyword sentence may be output on the basis of outputting the target keyword sentence. Knowledge information related to the target keyword sentence may be output at one time or may be output separately, and is not limited herein. For example, a target keyword sentence may be output first, and knowledge information associated with the target keyword sentence may be output in case further operation of the user is received.
In the embodiment of the application, the received search information is input into the subject description generation model to obtain the target subject description corresponding to the search information. The target subject description is matched against the search map to obtain the target keyword sentences in the search map that match it, together with the knowledge information associated with those target keyword sentences. The search map supports fast retrieval: by searching the search map with the target subject description, content related to service processing can be obtained quickly, and the user does not need to manually consult a large amount of professional material. This shortens service processing time and improves service processing efficiency.
Moreover, matching on the target subject description filters out information that is irrelevant or only weakly relevant to the service corresponding to the search information, so that such information does not adversely affect service processing. Information related to service processing can thus be determined accurately, improving the accuracy of service processing.
In some embodiments, keywords may be associated with subject weights. The main weight of the keyword sentence is used for representing the importance of the keyword sentence. The main weight of the keyword sentence is determined according to the appearance frequency grade, the execution completion grade and the influence range grade of the keyword sentence.
The occurrence frequency level characterizes the occurrence frequency of the keyword sentence in the service information resource and/or the frequency of the keyword sentence being confirmed as a valid retrieval result. The execution completion level represents the execution completion level of the business recorded corresponding to the keyword sentence in the business information resource. The scope of influence level characterizes the number of business associates affected by the keyword sentence.
The specific content of determining the weight of the main body can be referred to the related description in the above embodiment, and will not be repeated here.
In some examples, step S204 may be refined as follows: the target keyword sentences are output in descending order of subject weight, together with the knowledge information associated with them. Arranging the target keyword sentences in descending order of subject weight lets the user see at a glance the relative importance of the target keyword sentences and their associated knowledge information, so that the information required for service processing can be located more accurately. This further reduces the time spent on service processing and improves both its effect and its accuracy.
In some examples, step S204 may be refined as follows: a preset number of target keyword sentences with the highest subject weights are output, together with the knowledge information associated with them. The subject weight and the preset number further screen the target keyword sentences that satisfy the matching requirement, so that the output consists of the target keyword sentences that both match the search information and have the highest importance. The information required for service processing can thus be located more accurately, which further reduces the time spent on service processing and improves both its effect and its accuracy.
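Both refinements of step S204 reduce to a sort followed by an optional cut-off. In the sketch below the record layout (dictionaries with `keyword_sentence`, `subject_weight`, and `knowledge` keys), the function name, and the placeholder knowledge strings are assumptions for illustration; the two weights reuse the figures quoted earlier in the description.

```python
def rank_by_subject_weight(matches, limit=None):
    """Order matched keyword sentences from highest to lowest subject weight.

    matches: list of dicts with a 'subject_weight' entry (assumed layout).
    limit: when given, keep only that many of the highest-weighted matches.
    """
    ranked = sorted(matches, key=lambda m: m["subject_weight"], reverse=True)
    return ranked if limit is None else ranked[:limit]


matches = [
    {"keyword_sentence": "merchant order-receiving service",
     "subject_weight": 1.232, "knowledge": "<placeholder>"},
    {"keyword_sentence": "small micro merchant order-receiving service",
     "subject_weight": 2.0384, "knowledge": "<placeholder>"},
]

for item in rank_by_subject_weight(matches):
    print(item["subject_weight"], item["keyword_sentence"])

top_one = rank_by_subject_weight(matches, limit=1)
print(top_one[0]["keyword_sentence"])
```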
The search map construction method and the search method in the above-described embodiments may be executed by the same apparatus or may be executed by different apparatuses, and are not limited thereto.
The third aspect of the present application provides a retrieval map construction apparatus. Fig. 6 is a schematic structural diagram of an embodiment of a search map construction apparatus according to a third aspect of the present application. As shown in fig. 6, the search profile construction apparatus 300 may include a resource acquisition module 301, a generation module 302, and a profile construction module 303.
The resource acquisition module 301 may be configured to acquire service information resources.
In some examples, the business information resources include business information files and/or business history information.
The generating module 302 may be configured to input the service information resource into the body description generating model, and obtain a body description corresponding to the service information resource output by the body description generating model.
The subject description generation model extracts and processes input information and outputs the subject description corresponding to that information. The subject description characterizes the input information.
The map construction module 303 may be configured to construct a search map of the service field to which the service information resource belongs according to the service information resource, the subject description corresponding to the service information resource, and the knowledge information associated with the subject description.
The search map comprises keywords and sentences, knowledge information related to the keywords and sentences and an information source of the keywords and sentences. The keyword sentence comprises a main body description corresponding to the service information resource, and the information source of the keyword sentence comprises the service information resource.
In the embodiment of the application, the acquired service information resource is input into the subject description generation model to obtain the subject description corresponding to the service information resource. A search map of the service field to which the service information resource belongs is constructed according to the service information resource, the corresponding subject description, and the knowledge information associated with the subject description in the service information resource. The subject description characterizes important content in the service information resource. In the search map, the subject description serves as a keyword sentence, the indication information associated with the subject description serves as the knowledge information associated with the keyword sentence, and the service information resource serves as the information source of the keyword sentence. The search map is used for retrieval: matching keyword sentences, their information sources, and their associated knowledge information can be found quickly through the search map, so that a user does not need to manually consult a large amount of professional material to find the content needed for service processing. This shortens service processing time and improves service processing efficiency.
In some examples, the generation module 302 described above may be used to: input the business information resource into the subject description generation model; split the business information resource through the subject description generation model to obtain two or more segments of the business information resource; segment each segment into words through the subject description generation model, and label the part of speech of the words obtained by segmentation; combine consecutive nouns obtained by word segmentation through the subject description generation model to obtain a first combined phrase; and obtain and output the subject description according to the first combined phrase.
Specifically, the generation module 302 may be further configured to: determine the first combined phrase with the highest subject weight corresponding to the segment as the subject description corresponding to the segment, and output the subject description, where the first combined phrase is associated with a subject weight and the subject weight of the first combined phrase is used to characterize the importance of the first combined phrase; and/or, when the first combined phrase corresponding to the segment does not contain the subject information of the business information resource to which the segment belongs, combine the subject information with the first combined phrase, determine the combined subject information and first combined phrase as the subject description corresponding to the segment, and output the subject description.
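As a non-limiting illustration, the noun-combination and selection steps described above may be sketched as follows. The sketch assumes the words have already been segmented and tagged by some external tool and arrive as `(word, tag)` pairs; the tag value `"n"` for nouns, the function names, and the space used to join words are assumptions made for this example. The second function applies both optional branches (highest weight, then adding missing subject information) together, which is only one of the combinations the text allows.

```python
def combine_consecutive_nouns(tagged_words, noun_tag="n"):
    """Merge each run of adjacent nouns into one combined phrase."""
    phrases, run = [], []
    for word, tag in tagged_words:
        if tag == noun_tag:
            run.append(word)
        elif run:
            # A non-noun ends the current run of nouns.
            phrases.append(" ".join(run))
            run = []
    if run:
        phrases.append(" ".join(run))
    return phrases


def pick_subject_description(phrases, weights, resource_subject=None):
    """Pick the highest-weight phrase; prepend the resource subject if it is missing."""
    if not phrases:
        return None
    best = max(phrases, key=lambda p: weights.get(p, 0.0))
    if resource_subject and resource_subject not in best:
        best = f"{resource_subject} {best}"
    return best
```

For example, the tagged words `payment/n gateway/n returned/v timeout/n error/n` yield the combined phrases `payment gateway` and `timeout error`; if `timeout error` carries the higher weight and the resource subject is `QuickPass`, the resulting subject description is `QuickPass timeout error`.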
In some examples, the keyword sentences are associated with a subject weight. The subject weight of a keyword sentence is used to characterize the importance of the keyword sentence. Fig. 7 is a schematic structural diagram of another embodiment of the search map construction apparatus according to the third aspect of the present application. Fig. 7 differs from fig. 6 in that the search map construction device 300 shown in fig. 7 may further include a factor acquisition module 304 and a calculation module 305.
The factor acquisition module 304 may be configured to obtain an occurrence frequency level, an execution completion level, and an influence range level of the keyword sentence.
The occurrence frequency level characterizes the frequency with which the keyword sentence occurs in the business information resource and/or the frequency with which the keyword sentence is confirmed as a valid retrieval result. The execution completion level characterizes the degree of completion of the business recorded for the keyword sentence in the business information resource. The influence range level characterizes the number of business-related parties affected by the keyword sentence.
The calculation module 305 may be configured to: determine, according to the occurrence frequency level, a first characterization value corresponding to the occurrence frequency level, where the first characterization value is positively correlated with the occurrence frequency level; determine, according to the execution completion level, a second characterization value corresponding to the execution completion level, where the second characterization value is negatively correlated with the execution completion level; determine, according to the influence range level, a third characterization value corresponding to the influence range level, where the third characterization value is positively correlated with the influence range level; and calculate the subject weight by using the first characterization value, the second characterization value, and the third characterization value.
Specifically, the calculation module 305 may be configured to: calculate a first sum of 1 and the first characterization value; calculate a second sum of 1 and the second characterization value; calculate a third sum of 1 and the third characterization value; and determine the product of the first sum, the second sum, and the third sum as the subject weight.
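As a non-limiting illustration, with a, b and c denoting the first, second and third characterization values, the calculation described above amounts to W = (1 + a)(1 + b)(1 + c). The sketch below assumes three levels per factor and uses lookup tables whose numeric values are purely hypothetical; the text only requires that the first and third values rise with their levels and that the second value falls as the execution completion level rises.

```python
# Hypothetical level-to-value tables (the numbers are assumptions for illustration).
FREQUENCY_VALUE = {1: 0.1, 2: 0.3, 3: 0.5}    # positive correlation with the level
COMPLETION_VALUE = {1: 0.5, 2: 0.3, 3: 0.1}   # negative correlation with the level
INFLUENCE_VALUE = {1: 0.1, 2: 0.3, 3: 0.5}    # positive correlation with the level


def subject_weight(frequency_level, completion_level, influence_level):
    """Return (1 + a) * (1 + b) * (1 + c) for the three characterization values."""
    a = FREQUENCY_VALUE[frequency_level]
    b = COMPLETION_VALUE[completion_level]
    c = INFLUENCE_VALUE[influence_level]
    return (1 + a) * (1 + b) * (1 + c)
```

Adding 1 to each characterization value before multiplying means that a factor whose value is 0 leaves the weight unchanged instead of reducing it to 0, so no single factor can cancel the contribution of the other two.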
A fourth aspect of the present application provides a retrieval device. Fig. 8 is a schematic structural diagram of an embodiment of a search device according to a fourth aspect of the present application. As shown in fig. 8, the retrieval device 400 may include a receiving module 401, a generating module 402, a matching module 403, and an output module 404.
The receiving module 401 may be used to receive the retrieval information input by the user.
The generating module 402 may be configured to input the search information into the subject description generating model, and obtain a target subject description corresponding to the search information output by the subject description generating model.
The subject description generation model is used to extract and process input information so as to output a subject description corresponding to the input information. The subject description is used to characterize the input information.
The matching module 403 may be configured to match the target subject description against a pre-generated search map, so as to obtain a target keyword sentence in the search map that matches the target subject description.
The search map comprises keyword sentences associated with the business field to which the search information belongs, knowledge information associated with the keyword sentences, and information sources of the keyword sentences.
The output module 404 may be used to output the target keyword sentence and knowledge information associated with the target keyword sentence.
In the embodiment of the application, the received search information is input into the subject description generation model, and the target subject description corresponding to the search information output by the subject description generation model is obtained. The target subject description is matched against the search map to obtain the target keyword sentences in the search map that match the target subject description, together with the knowledge information associated with the target keyword sentences. The search map facilitates quick searching: by searching the search map with the target subject description corresponding to the search information, content related to business processing can be obtained quickly, and the user does not need to manually consult a large amount of professional material to find the content needed for business processing. This shortens the business processing time and improves the business processing efficiency.
In some examples, the generating module 402 may be configured to: segment the search information into words through the subject description generation model, and label the part of speech of the words obtained by segmentation; combine consecutive nouns obtained by word segmentation through the subject description generation model to obtain a second combined noun; and obtain and output the target subject description according to the second combined noun.
In some examples, the matching module 403 may be used to: convert the target subject description into a target description vector; convert the keyword sentences in the search map into word and sentence vectors; calculate the similarity between the target description vector and each word and sentence vector; determine, based on the similarity, a target word and sentence vector that matches the target description vector among the word and sentence vectors; and determine the keyword sentence corresponding to the target word and sentence vector as the target keyword sentence.
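As a non-limiting illustration, the vector-based matching described above may be sketched as follows. The text does not specify how descriptions are converted into vectors or which similarity measure is used; this sketch assumes a simple bag-of-words vector and cosine similarity, and the function names and the `threshold` parameter are hypothetical.

```python
import math
from collections import Counter


def to_vector(text):
    """Assumed vectorization: word counts of the whitespace-split, lowercased text."""
    return Counter(text.lower().split())


def cosine_similarity(u, v):
    """Cosine similarity of two sparse count vectors; 0.0 if either is empty."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0


def match_keyword_sentences(target_description, keyword_sentences, threshold=0.5):
    """Return keyword sentences whose similarity reaches the threshold, best first."""
    target = to_vector(target_description)
    scored = [(cosine_similarity(target, to_vector(k)), k) for k in keyword_sentences]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [k for score, k in scored if score >= threshold]
```

A learned sentence embedding could be substituted for `to_vector` without changing the rest of the sketch, since only the similarity between the two vectors is used to select the target keyword sentences.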
In some examples, the keywords are associated with a subject weight. The main weight of the keyword sentence is used for representing the importance of the keyword sentence.
The output module 404 may be configured to output the target keyword sentences arranged in descending order of subject weight, together with the knowledge information associated with the target keyword sentences; and/or to output a preset number of target keyword sentences having the highest subject weights, together with the knowledge information associated with those target keyword sentences.
Specifically, the subject weight of the keyword sentence is determined according to the occurrence frequency level, the execution completion level, and the influence range level of the keyword sentence.
Wherein the occurrence frequency level characterizes the occurrence frequency of the keyword sentence in the service information resource and/or the frequency of the keyword sentence confirmed as an effective search result. The execution completion level represents the execution completion level of the business recorded corresponding to the keyword sentence in the business information resource. The scope of influence level characterizes the number of business associates affected by the keyword sentence.
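As a non-limiting illustration, the two output options described above (ordering by subject weight and/or keeping only a preset number of results) may be sketched as follows. The function name `rank_results` and its parameters are assumptions introduced for this example; `weights` and `knowledge` stand for lookups into the search map.

```python
def rank_results(matches, weights, knowledge, top_n=None):
    """Order matched keyword sentences by subject weight, optionally keeping the top N."""
    # Sentences without a recorded weight are treated as weight 0.0 (an assumption).
    ordered = sorted(matches, key=lambda k: weights.get(k, 0.0), reverse=True)
    if top_n is not None:
        ordered = ordered[:top_n]
    # Pair each keyword sentence with its associated knowledge information.
    return [(k, knowledge.get(k, [])) for k in ordered]
```

Passing `top_n=None` corresponds to outputting all target keyword sentences in descending order of weight, while a positive `top_n` corresponds to outputting only the preset number with the highest weights.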
The search map construction device 300 and the search device 400 in the above embodiment may be the same device or different devices, and will not be described herein.
A fifth aspect of the present application provides a retrieval map construction apparatus. Fig. 9 is a schematic structural view of an embodiment of a search map construction apparatus according to a fifth aspect of the present application. As shown in fig. 9, the search map construction apparatus 500 includes a memory 501, a processor 502, and a computer program stored on the memory 501 and executable on the processor 502.
In one example, the processor 502 may include a Central Processing Unit (CPU), or an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), or may be configured as one or more integrated circuits that implement embodiments of the present application.
Memory 501 may include Read-Only Memory (ROM), random-access Memory (Random Access Memory, RAM), magnetic disk storage media devices, optical storage media devices, flash Memory devices, electrical, optical, or other physical/tangible Memory storage devices. Thus, in general, the memory comprises one or more tangible (non-transitory) computer-readable storage media (e.g., memory devices) encoded with software comprising computer-executable instructions and when the software is executed (e.g., by one or more processors) it is operable to perform the operations described with reference to the search map construction method according to the present application.
The processor 502 runs a computer program corresponding to the executable program code by reading the executable program code stored in the memory 501 for realizing the search map construction method in the above-described embodiment.
In one example, the retrieval map construction device 500 may also include a communication interface 503 and a bus 504. As shown in fig. 9, the memory 501, the processor 502, and the communication interface 503 are connected to each other via a bus 504 and perform communication with each other.
The communication interface 503 is mainly used to implement communication between each module, apparatus, unit and/or device in the embodiments of the present application. Input devices and/or output devices may also be accessed through communication interface 503.
Bus 504 includes hardware, software, or both, coupling the components of the search map construction device 500 to one another. By way of example, and not limitation, bus 504 may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association Local Bus (VLB), or other suitable bus, or a combination of two or more of these. Bus 504 may include one or more buses, where appropriate. Although embodiments of the application have been described and illustrated with respect to a particular bus, the application contemplates any suitable bus or interconnect.
A sixth aspect of the present application provides a retrieval device. Fig. 10 is a schematic structural diagram of an embodiment of a retrieval device according to the sixth aspect of the present application. As shown in fig. 10, the retrieval device 600 includes a memory 601, a processor 602, and a computer program stored on the memory 601 and executable on the processor 602.
In one example, the processor 602 may include a Central Processing Unit (CPU), or an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), or may be configured as one or more integrated circuits that implement embodiments of the present application.
The Memory 601 may include Read-Only Memory (ROM), random-access Memory (Random Access Memory, RAM), magnetic disk storage media devices, optical storage media devices, flash Memory devices, electrical, optical, or other physical/tangible Memory storage devices. Thus, in general, the memory comprises one or more tangible (non-transitory) computer-readable storage media (e.g., memory devices) encoded with software comprising computer-executable instructions and when the software is executed (e.g., by one or more processors) it is operable to perform the operations described with reference to the retrieval method in accordance with the present application.
The processor 602 executes a computer program corresponding to the executable program code by reading the executable program code stored in the memory 601 for realizing the retrieval method in the above-described embodiment.
In one example, the retrieval device 600 may also include a communication interface 603 and a bus 604. As shown in fig. 10, the memory 601, the processor 602, and the communication interface 603 are connected to each other through the bus 604 and communicate with each other.
The communication interface 603 is mainly used for implementing communication between each module, apparatus, unit and/or device in the embodiment of the present application. Input devices and/or output devices may also be accessed through the communication interface 603.
Bus 604 includes hardware, software, or both, coupling the components of the retrieval device 600 to one another. By way of example, and not limitation, bus 604 may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association Local Bus (VLB), or other suitable bus, or a combination of two or more of these. Bus 604 may include one or more buses, where appropriate. Although embodiments of the application have been described and illustrated with respect to a particular bus, the application contemplates any suitable bus or interconnect.
A seventh aspect of the present application provides a computer readable storage medium, on which computer program instructions are stored, which when executed by a processor, can implement the search map construction method and/or the search method in the above embodiments, and achieve the same technical effects, and in order to avoid repetition, a detailed description is omitted herein. The computer readable storage medium may include a non-transitory computer readable storage medium, such as Read-Only Memory (ROM), random access Memory (Random Access Memory, RAM), magnetic disk or optical disk, and the like, but is not limited thereto.
It should be understood that the embodiments in this specification are described in a progressive manner: the same or similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. For the apparatus embodiments, device embodiments, and computer-readable storage medium embodiments, the relevant points may be found in the description of the method embodiments. The application is not limited to the specific steps and structures described above and shown in the drawings. Those skilled in the art may make various changes, modifications, and additions, or change the order of the steps, after appreciating the spirit of the present application. Detailed descriptions of known methods and techniques are omitted here for the sake of brevity.
Aspects of the present application are described above with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, enable the implementation of the functions/acts specified in the flowchart and/or block diagram block or blocks. Such a processor may be, but is not limited to being, a general purpose processor, a special purpose processor, an application specific processor, or a field programmable logic circuit. It will also be understood that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware which performs the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that the above-described embodiments are exemplary and not limiting. The different technical features presented in the different embodiments may be combined to advantage. Other variations to the disclosed embodiments can be understood and effected by those skilled in the art in view of the drawings, the description, and the claims. In the claims, the term "comprising" does not exclude other means or steps; the word "a" does not exclude a plurality; the terms "first," "second," and the like, are used for designating a name and not for indicating any particular order. Any reference signs in the claims shall not be construed as limiting the scope. The functions of the various elements presented in the claims may be implemented by means of a single hardware or software module. The presence of certain features in different dependent claims does not imply that these features cannot be combined to advantage.

Claims (14)

1. The search map construction method is characterized by comprising the following steps of:
acquiring service information resources;
inputting the business information resource into a main description generation model to obtain main description corresponding to the business information resource output by the main description generation model, wherein the main description generation model is used for extracting and processing input information to output the main description corresponding to the input information, and the main description is used for representing the input information;
constructing a search map of the service field to which the service information resource belongs according to the service information resource, the main description corresponding to the service information resource and the knowledge information associated with the main description, wherein the search map comprises a keyword sentence, knowledge information associated with the keyword sentence and an information source of the keyword sentence, the keyword sentence comprises the main description corresponding to the service information resource, and the information source of the keyword sentence comprises the service information resource;
the key words and sentences are associated with main body weights, and the main body weights of the key words and sentences are used for representing the importance of the key words and sentences;
The method further comprises the steps of:
obtaining the occurrence frequency grade, execution completion degree grade and influence range grade of the keywords, wherein the occurrence frequency grade represents the occurrence frequency of the keywords in the business information resource and/or the frequency of the keywords confirmed as effective retrieval results, the execution completion degree grade represents the business execution completion degree recorded corresponding to the keywords in the business information resource, and the influence range grade represents the number of business association parties influenced by the keywords;
determining a first characterization value corresponding to the frequency level according to the frequency level, wherein the first characterization value and the frequency level are in positive correlation;
determining a second characterization value corresponding to the execution completion degree grade according to the execution completion degree grade, wherein the second characterization value and the execution completion degree grade are in negative correlation;
determining a third characterization value corresponding to the influence range grade according to the influence range grade, wherein the third characterization value and the influence range grade are in positive correlation;
calculating to obtain the main body weight by using the first characterization value, the second characterization value and the third characterization value;
The calculating to obtain the main body weight by using the first characterization value, the second characterization value and the third characterization value includes:
calculating a first sum of 1 and the first characterization value;
calculating a second sum of 1 and the second characterization value;
calculating a third sum of 1 and the third characterization value;
and determining the product of the first sum, the second sum and the third sum as the main body weight.
2. The method according to claim 1, wherein inputting the service information resource into a body description generation model to obtain a body description corresponding to the service information resource output by the body description generation model includes:
inputting the business information resource into a main description generation model;
splitting the service information resource through the main body description generation model to obtain more than two fragments of the service information resource;
segmenting each fragment into words through the main body description generation model, and labeling the part of speech of the words obtained by segmentation;
combining continuous nouns obtained by segmentation through the main body description generation model to obtain a first combined word and sentence;
And obtaining and outputting the main description according to the first combined words and sentences.
3. The method of claim 2, wherein the deriving and outputting the subject description from the first combined phrase includes:
determining the first combined word and sentence with the highest main body weight corresponding to the fragment as the main body description corresponding to the fragment and outputting the main body description, wherein the main body weight is associated with the first combined word and sentence, and the main body weight of the first combined word and sentence is used for representing the importance of the first combined word and sentence;
and/or,
and when the first combined word and sentence corresponding to the fragment does not contain the main body information of the service information resource to which the fragment belongs, combining the main body information and the first combined word and sentence, determining the combined main body information and the first combined word and sentence as the main body description corresponding to the fragment, and outputting.
4. The method according to claim 1, wherein the service information resource comprises a service information file and/or service history information.
5. A retrieval method, comprising:
receiving search information input by a user;
Inputting the search information into a main description generation model to obtain a target main description corresponding to the search information, which is output by the main description generation model, wherein the main description generation model is used for extracting and processing the input information to output a main description corresponding to the input information, and the main description is used for representing the input information;
matching the target subject description with a pre-generated search map to obtain target keywords matched with the target subject description in the search map, wherein the search map comprises keywords associated with the service field to which the search information belongs, knowledge information associated with the keywords and information sources of the keywords;
and outputting the target keyword sentence and knowledge information associated with the target keyword sentence.
6. The method of claim 5, wherein inputting the search information into a subject description generation model to obtain a target subject description corresponding to the search information output by the subject description generation model, comprises:
segmenting the search information into words through the main body description generation model, and labeling the part of speech of the words obtained by segmentation;
Combining continuous nouns obtained by word segmentation through the main body description generation model to obtain a second combined noun;
and obtaining and outputting the target subject description according to the second combined noun.
7. The method according to claim 5, wherein the matching the target subject description with a pre-generated search map to obtain a target keyword sentence in the search map that matches the target subject description comprises:
converting the target subject description into a target description vector;
converting the keyword sentence in the search map into a word sentence vector;
calculating the similarity between the target description vector and the word and sentence vector;
determining a target sentence vector matched with the target description vector in the sentence vectors based on the similarity;
and determining the keyword sentence corresponding to the target word sentence vector as the target keyword sentence.
8. The method of claim 5, wherein the keywords are associated with a subject weight, the subject weight of the keywords being used to characterize the importance of the keywords;
the outputting the target keyword sentence and knowledge information associated with the target keyword sentence includes:
Outputting the target keyword sentences and knowledge information associated with the target keyword sentences which are arranged according to the sequence of the main weight from high to low;
and/or,
outputting a preset number of the target keyword sentences having the highest main body weights, together with the knowledge information associated with those target keyword sentences.
9. The method of claim 8, wherein the subject weight of the keyword sentence is determined based on an occurrence frequency level, an execution completion level, and an influence range level of the keyword sentence,
the occurrence frequency grade represents the occurrence frequency of the keyword sentence in the service information resource and/or the frequency of the keyword sentence confirmed as an effective search result, the execution completion degree grade represents the service execution completion degree recorded corresponding to the keyword sentence in the service information resource, and the influence range grade represents the number of service association parties influenced by the keyword sentence.
10. A search map construction apparatus, comprising:
the resource acquisition module is used for acquiring service information resources;
the generation module is used for inputting the business information resource into a main description generation model to obtain main description corresponding to the business information resource, which is output by the main description generation model, wherein the main description generation model is used for extracting and processing the input information to output the main description corresponding to the input information, and the main description is used for representing the input information;
The map construction module is used for constructing a search map of the service field to which the service information resource belongs according to the service information resource, the main description corresponding to the service information resource and the knowledge information related to the main description, wherein the search map comprises a keyword sentence, knowledge information related to the keyword sentence and an information source of the keyword sentence, the keyword sentence comprises the main description corresponding to the service information resource, the information source of the keyword sentence comprises the service information resource, the main weight is related to the keyword sentence, and the main weight of the keyword sentence is used for representing the importance of the keyword sentence;
the factor obtaining module is used for obtaining the appearance frequency grade, the execution completion degree grade and the influence range grade of the keyword and sentence, wherein the appearance frequency grade represents the appearance frequency of the keyword and sentence in the service information resource and/or the frequency of the keyword and sentence confirmed as an effective retrieval result, the execution completion degree grade represents the service execution completion degree recorded corresponding to the keyword and sentence in the service information resource, and the influence range grade represents the number of service association parties influenced by the keyword and sentence;
The computing module is used for: determining a first characterization value corresponding to the occurrence frequency grade according to the occurrence frequency grade, wherein the first characterization value and the occurrence frequency grade are in positive correlation; determining a second characterization value corresponding to the execution completion degree grade according to the execution completion degree grade, wherein the second characterization value and the execution completion degree grade are in negative correlation; determining a third characterization value corresponding to the influence range grade according to the influence range grade, wherein the third characterization value and the influence range grade are in positive correlation; and calculating the main body weight by using the first characterization value, the second characterization value and the third characterization value;
the computing module is specifically configured to: calculate a first sum of 1 and the first characterization value; calculate a second sum of 1 and the second characterization value; calculate a third sum of 1 and the third characterization value; and determine the product of the first sum, the second sum and the third sum as the main body weight.
11. A search device, comprising:
the receiving module is configured to receive search information input by a user;
the generation module is configured to input the search information into a main body description generation model and obtain the target main body description that the model outputs for the search information, wherein the main body description generation model extracts and processes input information to output a main body description that corresponds to and represents the input information;
the matching module is configured to match the target main body description against a pre-generated search map to obtain a target keyword sentence in the search map that matches the target main body description, wherein the search map comprises keyword sentences associated with the service field to which the search information belongs, knowledge information associated with the keyword sentences, and information sources of the keyword sentences;
and the output module is configured to output the target keyword sentence and the knowledge information associated with the target keyword sentence.
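The four modules of the search device can be pictured as a short pipeline: receive, generate a description, match against the map, output. The sketch below is illustrative only. The claim does not disclose how the description generation model or the matching works, so the word-set stand-in for the model, the overlap-count matching, and the names `MapEntry`, `generate_description`, `match` and `retrieve` are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class MapEntry:
    """One search-map record: keyword sentence, its knowledge, and its source."""
    key_sentence: str
    knowledge: str
    source: str


def generate_description(text: str) -> Set[str]:
    """Stand-in for the description generation model (assumption):
    reduce the input to a set of lowercased words without punctuation."""
    words = (w.strip(".,?!;:").lower() for w in text.split())
    return {w for w in words if w}


def match(description: Set[str], search_map: List[MapEntry]) -> Optional[MapEntry]:
    """Return the entry sharing the most words with the description, if any."""
    best, best_score = None, 0
    for entry in search_map:
        score = len(description & generate_description(entry.key_sentence))
        if score > best_score:
            best, best_score = entry, score
    return best


def retrieve(search_info: str, search_map: List[MapEntry]) -> Optional[Tuple[str, str]]:
    """Receive -> generate -> match -> output (keyword sentence, knowledge)."""
    entry = match(generate_description(search_info), search_map)
    if entry is None:
        return None
    return entry.key_sentence, entry.knowledge


if __name__ == "__main__":
    demo_map = [
        MapEntry("card payment failed", "Check the card limit and retry.", "ticket log"),
        MapEntry("refund delayed", "Refunds settle within several days.", "FAQ"),
    ]
    print(retrieve("Why did my card payment fail?", demo_map))
```

A real system would replace `generate_description` with the trained model and would likely rank candidates by more than word overlap; the sketch only fixes the order in which the modules hand data to one another.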
12. A search map construction apparatus, characterized by comprising: a processor and a memory storing computer program instructions;
the processor, when executing the computer program instructions, implements the search map construction method according to any one of claims 1 to 4.
13. A search apparatus, characterized by comprising: a processor and a memory storing computer program instructions;
the processor, when executing the computer program instructions, implements the search method according to any one of claims 5 to 9.
14. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon computer program instructions, which when executed by a processor, implement the search map construction method according to any one of claims 1 to 4 and/or the search method according to any one of claims 5 to 9.
CN202110212739.9A 2021-02-25 2021-02-25 Search map construction method, search device, search apparatus, and storage medium Active CN112989814B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110212739.9A CN112989814B (en) 2021-02-25 2021-02-25 Search map construction method, search device, search apparatus, and storage medium

Publications (2)

Publication Number Publication Date
CN112989814A CN112989814A (en) 2021-06-18
CN112989814B CN112989814B (en) 2023-08-18

Family

ID=76350788

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110212739.9A Active CN112989814B (en) 2021-02-25 2021-02-25 Search map construction method, search device, search apparatus, and storage medium

Country Status (1)

Country Link
CN (1) CN112989814B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109145102A (en) * 2018-09-06 2019-01-04 杭州安恒信息技术股份有限公司 Intelligent answer method and its knowledge mapping system constituting method, device, equipment
WO2019212729A1 (en) * 2018-05-03 2019-11-07 Microsoft Technology Licensing, Llc Generating response based on user's profile and reasoning on contexts
CN110909364A (en) * 2019-12-02 2020-03-24 西安工业大学 Source code bipolar software security vulnerability map construction method

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017100970A1 (en) * 2015-12-14 2017-06-22 Microsoft Technology Licensing, Llc Facilitating discovery of information items using dynamic knowledge graph
US11593671B2 (en) * 2016-09-02 2023-02-28 Hithink Financial Services Inc. Systems and methods for semantic analysis based on knowledge graph
US10984333B2 (en) * 2016-11-08 2021-04-20 Microsoft Technology Licensing, Llc Application usage signal inference and repository
US20180129372A1 (en) * 2016-11-08 2018-05-10 Microsoft Technology Licensing, Llc Dynamic insight objects for user application data
US11663497B2 (en) * 2019-04-19 2023-05-30 Adobe Inc. Facilitating changes to online computing environment by assessing impacts of actions using a knowledge base representation
US20200342056A1 (en) * 2019-04-26 2020-10-29 Tencent America LLC Method and apparatus for natural language processing of medical text in chinese
US11475059B2 (en) * 2019-08-16 2022-10-18 The Toronto-Dominion Bank Automated image retrieval with graph neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
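Design and Implementation of an Intelligent, Internet-Based Customer Service Support System; Qiu Jianfei; Electronic Technology & Software Engineering (11); full text *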

Also Published As

Publication number Publication date
CN112989814A (en) 2021-06-18

Similar Documents

Publication Publication Date Title
CN111222305B (en) Information structuring method and device
CN109872162B (en) Wind control classification and identification method and system for processing user complaint information
CN109299228B (en) Computer-implemented text risk prediction method and device
CN112035599B (en) Query method and device based on vertical search, computer equipment and storage medium
CN107102993B (en) User appeal analysis method and device
CN107229627B (en) Text processing method and device and computing equipment
JP2012118977A (en) Method and system for machine-learning based optimization and customization of document similarity calculation
CN110675269B (en) Text auditing method and device
CN113254643B (en) Text classification method and device, electronic equipment and text classification program
CN114116973A (en) Multi-document text duplicate checking method, electronic equipment and storage medium
CN111723192B (en) Code recommendation method and device
Huynh et al. When to use OCR post-correction for named entity recognition?
CN113986950A (en) SQL statement processing method, device, equipment and storage medium
CN115840808A (en) Scientific and technological project consultation method, device, server and computer-readable storage medium
CN111325033A (en) Entity identification method, entity identification device, electronic equipment and computer readable storage medium
CN113934834A (en) Question matching method, device, equipment and storage medium
CN113011156A (en) Quality inspection method, device and medium for audit text and electronic equipment
CN112989814B (en) Search map construction method, search device, search apparatus, and storage medium
CN116029290A (en) Text matching method, device, equipment, medium and product
CN109344388A (en) A kind of comment spam recognition methods, device and computer readable storage medium
CN111753540B (en) Method and system for collecting text data to perform Natural Language Processing (NLP)
KR102321871B1 (en) Pattern Dictionary Establish Method by Data Classify and Analysis
CN114117047A (en) Method and system for classifying illegal voice based on C4.5 algorithm
CN107729509A (en) The chapter similarity decision method represented based on recessive higher-dimension distributed nature
JP5824429B2 (en) Spam account score calculation apparatus, spam account score calculation method, and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant