CN112989814A - Retrieval map construction method, retrieval device, retrieval equipment and storage medium

Info

Publication number
CN112989814A
Authority
CN
China
Prior art keywords
sentence
information
retrieval
keyword
service
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110212739.9A
Other languages
Chinese (zh)
Other versions
CN112989814B (en)
Inventor
章春芳
王欣晟
余岱
肖涛
张青清
李航
黄珊珊
杨立新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Unionpay Co Ltd
Original Assignee
China Unionpay Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Unionpay Co Ltd
Priority to CN202110212739.9A
Publication of CN112989814A
Application granted
Publication of CN112989814B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/313 Selection or weighting of terms for indexing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Animal Behavior & Ethology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Multimedia (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a retrieval map construction method, a retrieval device, retrieval equipment and a storage medium, and belongs to the field of data processing. The method comprises: acquiring a service information resource; inputting the service information resource into a subject description generation model to obtain the subject description that the model outputs for the service information resource, wherein the subject description generation model performs extraction on the input information and outputs the corresponding subject description, and the subject description characterizes the input information; and constructing a retrieval map of the service field to which the service information resource belongs according to the service information resource, the subject description corresponding to the service information resource, and the knowledge information associated with the subject description, wherein the retrieval map comprises a keyword sentence, the knowledge information associated with the keyword sentence, and an information source of the keyword sentence, the keyword sentence comprises the subject description corresponding to the service information resource, and the information source of the keyword sentence comprises the service information resource. Embodiments of the application can improve the efficiency of service processing.

Description

Retrieval map construction method, retrieval device, retrieval equipment and storage medium
Technical Field
The application belongs to the field of data processing, and particularly relates to a retrieval map construction method, a retrieval method, a retrieval device, retrieval equipment and a storage medium.
Background
When handling services that require reference to a large amount of professional data, professionals must consult that data and combine it with their own experience to obtain a service result. For example, during an audit, professional auditors need to consult regulations, regulatory requirements and the like to determine the focus of the current audit project.
Manually reviewing large amounts of professional data to determine service results takes considerable time, which makes service processing inefficient.
Disclosure of Invention
The embodiment of the application provides a retrieval map construction method, a retrieval device, retrieval equipment and a storage medium, and can improve the efficiency of service processing.
In a first aspect, an embodiment of the present application provides a retrieval map construction method, including: acquiring a service information resource; inputting the service information resource into a subject description generation model to obtain the subject description that the model outputs for the service information resource, wherein the subject description generation model performs extraction on the input information and outputs the corresponding subject description, and the subject description characterizes the input information; and constructing a retrieval map of the service field to which the service information resource belongs according to the service information resource, the subject description corresponding to the service information resource, and the knowledge information associated with the subject description, wherein the retrieval map comprises a keyword sentence, the knowledge information associated with the keyword sentence, and an information source of the keyword sentence, the keyword sentence comprises the subject description corresponding to the service information resource, and the information source of the keyword sentence comprises the service information resource.
In a second aspect, an embodiment of the present application provides a retrieval method, including: receiving retrieval information input by a user; inputting the retrieval information into a subject description generation model to obtain the target subject description that the model outputs for the retrieval information, wherein the subject description generation model performs extraction on the input information and outputs the corresponding subject description, and the subject description characterizes the input information; matching the target subject description against a pre-generated retrieval map to obtain a target keyword sentence in the retrieval map that matches the target subject description, wherein the retrieval map comprises a keyword sentence associated with the service field to which the retrieval information belongs, knowledge information associated with the keyword sentence, and an information source of the keyword sentence; and outputting the target keyword sentence and the knowledge information associated with the target keyword sentence.
In a third aspect, an embodiment of the present application provides a retrieval map construction apparatus, including: a resource acquisition module for acquiring a service information resource; a generation module for inputting the service information resource into a subject description generation model to obtain the subject description that the model outputs for the service information resource, wherein the subject description generation model performs extraction on the input information and outputs the corresponding subject description, and the subject description characterizes the input information; and a map construction module for constructing a retrieval map of the service field to which the service information resource belongs according to the service information resource, the subject description corresponding to the service information resource, and the knowledge information associated with the subject description, wherein the retrieval map comprises a keyword sentence, the knowledge information associated with the keyword sentence, and an information source of the keyword sentence, the keyword sentence comprises the subject description corresponding to the service information resource, and the information source of the keyword sentence comprises the service information resource.
In a fourth aspect, an embodiment of the present application provides a retrieval apparatus, including: a receiving module for receiving retrieval information input by a user; a generation module for inputting the retrieval information into a subject description generation model to obtain the target subject description that the model outputs for the retrieval information, wherein the subject description generation model performs extraction on the input information and outputs the corresponding subject description, and the subject description characterizes the input information; a matching module for matching the target subject description against a pre-generated retrieval map to obtain a target keyword sentence in the retrieval map that matches the target subject description, wherein the retrieval map comprises a keyword sentence associated with the service field to which the retrieval information belongs, knowledge information associated with the keyword sentence, and an information source of the keyword sentence; and an output module for outputting the target keyword sentence and the knowledge information associated with the target keyword sentence.
In a fifth aspect, an embodiment of the present application provides a retrieval map construction apparatus, including: a processor and a memory storing computer program instructions; the processor, when executing the computer program instructions, implements the retrieval map construction method of the first aspect.
In a sixth aspect, an embodiment of the present application provides a retrieval device, including: a processor and a memory storing computer program instructions; the processor, when executing the computer program instructions, implements the retrieval method of the second aspect.
In a seventh aspect, an embodiment of the present application provides a computer-readable storage medium, on which computer program instructions are stored, and when the computer program instructions are executed by a processor, the retrieval map construction method of the first aspect and/or the retrieval method of the second aspect are/is implemented.
The application provides a retrieval map construction method, a retrieval device, retrieval equipment and a storage medium, in which an acquired service information resource is input into a subject description generation model to obtain the subject description corresponding to the service information resource. A retrieval map of the service field to which the service information resource belongs is then constructed according to the service information resource, the subject description corresponding to the service information resource, and the knowledge information associated with the subject description. The subject description characterizes the important content of the service information resource. In the retrieval map, the subject description serves as a keyword sentence, the associated knowledge information serves as the knowledge information associated with the keyword sentence, and the service information resource serves as the information source of the keyword sentence. When the retrieval map is used for retrieval, the matched keyword sentences, their information sources, and the knowledge information associated with them can be retrieved quickly, so the user does not need to manually look through a large amount of professional data to find the content needed to process a service. This shortens service processing time and improves service processing efficiency.
Drawings
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings used in the embodiments are briefly described below. Those skilled in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a flowchart of an embodiment of a retrieval map construction method provided in the first aspect of the present application;
FIG. 2 is a schematic diagram of an example of a subject description generation model provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of an example of a portion of a retrieval map provided by an embodiment of the present application;
FIG. 4 is a flowchart of another embodiment of a retrieval map construction method provided in the first aspect of the present application;
FIG. 5 is a flowchart of an embodiment of a retrieval method provided in the second aspect of the present application;
FIG. 6 is a schematic structural diagram of an embodiment of a retrieval map construction apparatus provided in the third aspect of the present application;
FIG. 7 is a schematic structural diagram of another embodiment of a retrieval map construction apparatus provided in the third aspect of the present application;
FIG. 8 is a schematic structural diagram of an embodiment of a retrieval apparatus provided in the fourth aspect of the present application;
FIG. 9 is a schematic structural diagram of an embodiment of a retrieval map construction apparatus provided in the fifth aspect of the present application;
FIG. 10 is a schematic structural diagram of an embodiment of a retrieval device provided in the sixth aspect of the present application.
Detailed Description
Features and exemplary embodiments of various aspects of the present application will be described in detail below, and in order to make objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail below with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described herein are intended to be illustrative only and are not intended to be limiting. It will be apparent to one skilled in the art that the present application may be practiced without some of these specific details. The following description of the embodiments is merely intended to provide a better understanding of the present application by illustrating examples thereof.
Some service projects require professionals to manually consult a large amount of professional data, and this consultation takes up a large share of the processing time, which reduces service processing efficiency. For example, in an audit service scenario, a professional must review regulations, industry requirements and similar data in order to evaluate the audited party. Consulting the regulations and industry requirements takes a lot of time and affects both the efficiency and the accuracy of service processing.
The application provides a retrieval map construction method, a retrieval device, retrieval equipment and a storage medium, which can construct a retrieval map and use it for retrieval. The retrieval map contains the keyword sentences extracted from the service information resources, the knowledge information associated with the keyword sentences, and the information sources of the keyword sentences. This makes retrieval convenient, reduces the time spent consulting related data, shortens the time required for service processing, and improves service processing efficiency.
A first aspect of the present application provides a retrieval map construction method that can be executed by a retrieval map construction apparatus. FIG. 1 is a flowchart of an embodiment of the retrieval map construction method provided in the first aspect of the present application. As shown in FIG. 1, the retrieval map construction method may include steps S101 to S103.
In step S101, a service information resource is acquired.
The service information resources include service data related to the service domain. The service information resource may specifically include, but is not limited to, structured data describing service profiles and/or unstructured data describing service profiles.
In some examples, the service information resource may include a service information file and/or service history information.
The service information file may be a structured file and/or an unstructured file, which is not limited herein. The service information file may include information files related to the service field, such as regulation files, service requirement files, and the like. For example, where the service field is the audit field, the service information file may include basic regulations, financial regulations, tax regulations, industry regulatory requirements, administrative regulations, internal enterprise control regulations, and the like.
The service history information is history information accumulated while processing services in the service field. The service history information may be structured data and/or unstructured data, which is not limited herein. The service history information may include, but is not limited to, empirical knowledge files written by the personnel who process the services, historical search terms, and the like. For example, where the service field is the audit field, the service history information may include audit methods, audit criteria, audit rectification tracking rules, the level of a historical audit entry, the occurrence frequency of a historical audit entry, the root cause of a historical audit entry, and the like.
Where the service information resource includes unstructured data, the unstructured data can first be converted into structured data and then processed further. The method and technique used for this conversion are not limited herein. For example, if the service information resource includes a picture, information can be extracted from the picture using Optical Character Recognition (OCR) and then organized accordingly to convert it into structured data.
In step S102, the service information resource is input into the subject description generation model, and a subject description corresponding to the service information resource and output by the subject description generation model is obtained.
The subject description generation model performs extraction on the input information and outputs a subject description corresponding to the input information. The subject description characterizes the input information and is much shorter than it. The subject description generation model may output one subject description, or two or more, which is not limited herein. The number of subject descriptions output depends on the information input into the model. For example, if the input information includes a regulation file, the subject description generation model may output a plurality of subject descriptions, each of which may correspond to a part of the contents of the regulation file. As another example, if the input information includes a historical search term, the subject description generation model may output a single subject description.
In some examples, the subject description generation model may be implemented with a Natural Language Processing (NLP) method that extracts the key information in the service information resource and converts it into the corresponding subject descriptions, which are then used to construct the retrieval map. The subject description generation model can be obtained by training on service information resource samples labeled with subject descriptions.
Specifically, the subject description generation model may include a Long Short-Term Memory (LSTM) model, a Long Short-Term Memory + Conditional Random Field (CRF) model (which may be abbreviated as LSTM + CRF model), a bidirectional Long Short-Term Memory + Conditional Random Field model (which may be abbreviated as Bi-LSTM + CRF model), and the like, which are not limited herein.
For example, FIG. 2 is a schematic diagram of an example of a subject description generation model provided in an embodiment of the present application. As shown in FIG. 2, the subject description generation model includes an input layer, an LSTM layer, and an output layer. The LSTM layer includes a plurality of LSTM units. The input layer converts the input service information resource into vector sequences, which are fed into the forward LSTM units and the backward LSTM units respectively. The forward and backward LSTM units pass their results to the output layer. The output layer outputs the words corresponding to the input service information resource, each tagged with its part of speech. The subject description corresponding to the service information resource is obtained by combining the words output by the output layer.
In some examples, a service information resource is input into the subject description generation model, and the model splits the service information resource into two or more fragments. The model then performs word segmentation on each fragment and tags the part of speech of each word obtained. The model combines the consecutive nouns obtained by word segmentation into a first combined phrase, and the subject description is obtained from the first combined phrase and output.
The service information resource can be split according to the headings at each level in the service information resource to obtain a plurality of fragments. For example, following the headings at each level of each chapter of document No. a1, the content under each lowest-level heading, or the lowest-level heading itself, may be taken as a fragment. Word segmentation of a fragment may also include removing punctuation marks, spaces and the like, which is not limited herein. Part-of-speech tagging marks each word obtained by word segmentation as a noun, verb, numeral, quantifier, and so on, which is not limited herein. The nouns in a fragment best reflect its content, so the consecutive nouns obtained by word segmentation are combined into a first combined phrase. A fragment may yield one first combined phrase, or two or more, which is not limited herein. When a fragment yields two or more first combined phrases, one of them is selected as the subject description corresponding to the fragment.
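The splitting step can be sketched as follows. This is only an illustrative sketch, not the method of the application: it assumes that headings are lines of the form "4 Title" or "4.1 Title" (a hypothetical numbering convention), and the function name `split_by_headings` is invented for this example. Only lowest-level headings produce fragments, and a heading with no content under it is used as its own fragment.

```python
import re

# Assumption: a heading is a line such as "4 Title" or "4.1 Title".
# Real regulation files may use other conventions (e.g. Chinese numerals).
HEADING = re.compile(r"^(\d+(?:\.\d+)*)\s+(.+)$")


def split_by_headings(text):
    """Split a document into fragments, one per lowest-level heading."""
    sections = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = HEADING.match(line)
        if match:
            sections.append({"number": match.group(1),
                             "title": match.group(2),
                             "body": []})
        elif sections:
            # Non-heading lines belong to the most recent heading.
            sections[-1]["body"].append(line)

    numbers = [s["number"] for s in sections]
    fragments = []
    for s in sections:
        # Skip headings that have sub-headings; they are not lowest-level.
        if any(n.startswith(s["number"] + ".") for n in numbers):
            continue
        # Use the content under the heading, or the heading itself if empty.
        fragment_text = " ".join(s["body"]) or s["title"]
        fragments.append({"number": s["number"],
                          "title": s["title"],
                          "text": fragment_text})
    return fragments


document = """4 Regulate billing
4.1 Scope
Applies to small merchants.
Second line.
4.2 Duties
"""
for fragment in split_by_headings(document):
    print(fragment["number"], "->", fragment["text"])
```

A body line that happens to start with a number would be misread as a heading under this pattern, so a real implementation would need a stricter heading detector.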
For example, the service information resource includes document No. a1, and a chapter in document No. a1 is titled "Four, regulate small and micro merchant billing service management". When document No. a1 is input into the subject description generation model, the model can output "small and micro merchant billing service" as the subject description corresponding to the title "Four, regulate small and micro merchant billing service management".
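The noun-combination step can be illustrated with a short sketch. It assumes the words have already been segmented and tagged, and that the tag "n" marks a noun; the tag set, the function name `merge_consecutive_nouns`, and the sample words are all hypothetical. Each maximal run of consecutive nouns becomes one candidate phrase.

```python
def merge_consecutive_nouns(tagged_words, noun_tag="n", joiner=" "):
    """Combine each run of consecutive nouns into one candidate phrase.

    tagged_words: list of (word, tag) pairs in document order.
    joiner: separator between words; "" would suit Chinese text.
    """
    phrases = []
    run = []
    for word, tag in tagged_words:
        if tag == noun_tag:
            run.append(word)
        else:
            # A non-noun ends the current run of nouns.
            if run:
                phrases.append(joiner.join(run))
            run = []
    if run:
        phrases.append(joiner.join(run))
    return phrases


tagged = [("regulate", "v"), ("merchant", "n"), ("billing", "n"),
          ("service", "n"), ("and", "c"), ("risk", "n"), ("control", "n")]
print(merge_consecutive_nouns(tagged))
# ['merchant billing service', 'risk control']
```

When a fragment produces several candidates, as here, a further selection step is needed to pick one of them as the subject description.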
In some examples, each first combined phrase is associated with a subject weight, which represents the importance of that phrase. The subject weight is positively correlated with importance: the higher the subject weight of a first combined phrase, the more important the phrase and the more accurately it reflects the fragment. The subject weight associated with each first combined phrase is calculated, and the first combined phrase with the highest subject weight for the fragment is determined as the subject description corresponding to the fragment and output.
For example, the first combined phrases corresponding to a certain fragment include "small and micro merchant billing service" and "merchant billing service". The subject weight of "small and micro merchant billing service" is 2.0384, and the subject weight of "merchant billing service" is 1.232. Accordingly, "small and micro merchant billing service", which has the higher subject weight, is selected as the subject description and output.
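The selection rule in this example amounts to taking the candidate with the maximum weight. The sketch below reuses the two weights quoted above; how the weights themselves are computed is not described in this passage, so they are simply given as input, and the function name `select_subject_description` is hypothetical.

```python
def select_subject_description(weighted_phrases):
    """Return the candidate phrase with the highest subject weight.

    weighted_phrases: dict mapping candidate phrase -> subject weight.
    Returns None when there are no candidates.
    """
    if not weighted_phrases:
        return None
    return max(weighted_phrases, key=weighted_phrases.get)


candidates = {
    "small and micro merchant billing service": 2.0384,
    "merchant billing service": 1.232,
}
print(select_subject_description(candidates))
# small and micro merchant billing service
```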
In some examples, where the first combined phrase corresponding to a fragment does not include the subject information of the service information resource to which the fragment belongs, the subject information and the first combined phrase are combined, and the combination is determined as the subject description corresponding to the fragment and output.
A fragment belongs to a service information resource. A first combined phrase that lacks the subject information of that service information resource does not represent the fragment accurately enough, so the subject description is obtained by combining the subject information with the first combined phrase, which improves how accurately the subject description represents the fragment.
For example, for the fourth chapter, "regulate small and micro merchant billing service management", of the document "Notice on regulating payment innovation services", the first combined phrase may be "small and micro merchant billing service". The subject information "payment innovation" of the document is combined with this first combined phrase, and the subject description "payment innovation small and micro merchant billing service" is obtained and output.
The subject description may also be obtained by considering both the subject weight and the subject information of the service information resource to which the fragment belongs, which is not limited herein. For example, the first combined phrases for the fourth chapter, "regulate small and micro merchant billing service management", of the document "Notice on regulating payment innovation services" include "small and micro merchant billing service" and "merchant billing service". The subject weight of "small and micro merchant billing service" is 2.0384, and the subject weight of "merchant billing service" is 1.232. Accordingly, "small and micro merchant billing service", which has the higher subject weight, is selected, and the subject information "payment innovation" of the document is combined with it to form the subject description "payment innovation small and micro merchant billing service", which is output.
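The combined rule can be sketched as below, under two assumptions that the text does not state: "does not include the subject information" is tested by simple substring containment, and the subject information is placed in front of the phrase, separated by a space. The function name `build_subject_description` is hypothetical.

```python
def build_subject_description(weighted_phrases, document_subject):
    """Pick the highest-weight candidate, then add the document's
    subject information in front of it if the candidate lacks it."""
    best = max(weighted_phrases, key=weighted_phrases.get)
    # Assumption: containment is checked as a plain substring match.
    if document_subject and document_subject not in best:
        return f"{document_subject} {best}"
    return best


candidates = {
    "small and micro merchant billing service": 2.0384,
    "merchant billing service": 1.232,
}
print(build_subject_description(candidates, "payment innovation"))
# payment innovation small and micro merchant billing service
```

If the winning candidate already contains the subject information, it is returned unchanged, so the subject information is never duplicated.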
In step S103, a retrieval map of the service domain to which the service information resource belongs is constructed according to the service information resource, the subject description corresponding to the service information resource, and the knowledge information associated with the subject description.
The retrieval map comprises keyword sentences, knowledge information associated with the keyword sentences, and information sources of the keyword sentences. The keyword sentences comprise the subject description corresponding to the service information resource, that is, the subject description corresponding to the service information resource can be used as a keyword sentence in the retrieval map. The information sources of the keyword sentences comprise the service information resource, that is, the service information resource is used in the retrieval map as the information source of the keyword sentences corresponding to it. The retrieval map may further include attribute information of the information source, such as the publishing organization, the activation date and the expiration date of the information source, which is not limited herein.
An affiliation exists between a keyword sentence and its information source, i.e., the keyword sentence belongs to its information source. An association exists between a keyword sentence and the knowledge information associated with it. The relationships among the keyword sentences, the associated knowledge information and the information sources are obtained from these elements, and the retrieval map is constructed from the keyword sentences, the associated knowledge information, the information sources and the relationships among them. The retrieval map is used for retrieval: according to retrieval information input by a user, the keyword sentences matching the retrieval information and the knowledge information associated with those keyword sentences can be obtained from the retrieval map.
For example, the business domain is the auditing domain, and fig. 3 is a schematic diagram of an example of part of a retrieval map provided by the embodiment of the present application. As shown in fig. 3, the retrieval map includes the information source "document No. 281", the publishing organization, activation date and expiration date of document No. 281, the keyword sentences in document No. 281, and the knowledge information associated with those keyword sentences. The keyword sentences in document No. 281 include "payment innovation service", "payment innovation small and micro merchant receipt service" and the like. The keyword sentence "payment innovation service" is associated with knowledge information in four aspects: admission conditions, forbidden clauses, quantitative contents and management requirements. The keyword sentence "payment innovation small and micro merchant receipt service" is likewise associated with knowledge information in these four aspects.
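One possible in-memory representation of such a retrieval map is sketched below. The class and field names (`InformationSource`, `RetrievalMap`, `belongs_to`, `knowledge`) are hypothetical, and the attribute values and knowledge contents are left empty because the source does not give them. A graph database could serve equally well; plain dictionaries are used here only to keep the sketch self-contained.

```python
from dataclasses import dataclass, field


@dataclass
class InformationSource:
    name: str
    publishing_organization: str = ""  # attribute information, left empty here
    activation_date: str = ""
    expiration_date: str = ""


@dataclass
class RetrievalMap:
    sources: dict = field(default_factory=dict)     # source name -> InformationSource
    belongs_to: dict = field(default_factory=dict)  # keyword sentence -> source name
    knowledge: dict = field(default_factory=dict)   # keyword sentence -> {aspect: text}

    def add_source(self, source):
        self.sources[source.name] = source

    def add_keyword_sentence(self, sentence, source_name, knowledge):
        # The affiliation edge requires the information source to exist first.
        if source_name not in self.sources:
            raise KeyError(f"unknown information source: {source_name}")
        self.belongs_to[sentence] = source_name
        self.knowledge[sentence] = dict(knowledge)

    def lookup(self, sentence):
        """Return the information source and knowledge of a keyword sentence."""
        return self.sources[self.belongs_to[sentence]], self.knowledge[sentence]


retrieval_map = RetrievalMap()
retrieval_map.add_source(InformationSource(name="document No. 281"))
aspects = ["admission conditions", "forbidden clauses",
           "quantitative contents", "management requirements"]
# Knowledge texts are placeholders; only the four aspects come from the text.
retrieval_map.add_keyword_sentence(
    "payment innovation service", "document No. 281",
    {aspect: "" for aspect in aspects},
)
source, knowledge = retrieval_map.lookup("payment innovation service")
print(source.name, sorted(knowledge))
```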
In the embodiment of the application, the obtained service information resource is input into the subject description generation model to obtain the subject description corresponding to the service information resource. A retrieval map of the service field to which the service information resource belongs is constructed according to the service information resource, the subject description corresponding to the service information resource, and the knowledge information associated with the subject description in the service information resource. The subject description can characterize important content in the service information resource. The subject description is used as a keyword sentence in the retrieval map, the associated knowledge information is used as the knowledge information associated with the keyword sentence in the retrieval map, and the service information resource is used as the information source of the keyword sentence in the retrieval map. The retrieval map is used for retrieval: matched keyword sentences, their information sources, their associated knowledge information and the like can be quickly retrieved through the retrieval map, so that a user does not need to manually consult a large amount of professional material to find the content needed for processing a service. This shortens the service processing time and improves the service processing efficiency.
In some examples, a subject weight associated with the keyword sentence may also be added to the retrieval map. The subject weight of a keyword sentence represents the importance of the keyword sentence and is positively correlated with it. That is, the greater the subject weight of a keyword sentence, the higher its importance. Importance here is understood as the degree of impact on service processing.
The subject weight associated with a keyword sentence may vary, for example, with the retrieval information used to search the retrieval map. If a keyword sentence in the retrieval map is successfully matched with the retrieval information, the subject weight of that keyword sentence may be calculated; the specific calculation is described below. If a keyword sentence in the retrieval map fails to match the retrieval information, the subject weight associated with that keyword sentence may be set to a default value, for example "0" or "x", which is not limited herein.
Fig. 4 is a flowchart of another embodiment of a search map construction method provided in the first aspect of the present application. Fig. 4 is different from fig. 1 in that the retrieval map construction method shown in fig. 4 further includes step S104 to step S108.
In step S104, the appearance frequency level, the execution completion degree level, and the influence range level of the keyword sentence are acquired.
The occurrence frequency level represents the frequency with which the keyword sentence occurs in the service information resources and/or the frequency with which the keyword sentence is confirmed as a valid retrieval result. The occurrence frequency level is obtained by grading the occurrence frequency. For example, if the keyword sentence occurs 0 to 5 times in the service information resources, its occurrence frequency level is set to level one; if it occurs 5 to 10 times, its occurrence frequency level is set to level two; if it occurs more than 10 times, its occurrence frequency level is set to level three.
The frequency of the keyword sentence appearing in the service information resource may be the number of times the keyword sentence appears in the historical service information resource within a predetermined time. For example, the frequency of occurrence of a keyword sentence in a business information resource may be the number of occurrences of a keyword sentence in a business information resource within 3 months from the current time.
In the process of retrieval, keyword sentences matched with the retrieval information of the user are output. The user can evaluate the output keyword sentence and confirm whether the output keyword sentence is a valid retrieval result. According to the evaluation operation of the user, the frequency of confirming the keyword sentence as a valid retrieval result can be recorded.
In some examples, where the frequency of occurrence ranking characterizes the frequency with which a keyword sentence occurs in a business information resource and the frequency with which a keyword sentence is confirmed as a valid search result, the frequency of occurrence ranking may specifically characterize the sum of the frequency with which a keyword sentence occurs in a business information resource and the frequency with which the keyword sentence is confirmed as a valid search result.
The frequency with which a keyword sentence occurs in the service information resources and/or the frequency with which it is confirmed as a valid retrieval result, as represented by the occurrence frequency level, affects the importance of the keyword sentence. The higher these frequencies, the higher the importance of the keyword sentence, which ensures the accuracy of the importance represented by the subject weight of the keyword sentence. Including the frequency with which the keyword sentence is confirmed as a valid retrieval result in the calculation of the subject weight can further improve the accuracy of that calculation.
The execution completion degree level represents the service execution completion degree recorded in the service information resource corresponding to the keyword sentence. The service information resource may record how the service was executed, and the service execution completion degree can be determined from that record. In some examples, a higher execution completion degree level indicates a higher service execution completion degree.
For example, the service information resource may record descriptions such as "not executed", "execution non-standard", "insufficient execution" and "execution to be strengthened", for which different execution completion degree levels may be set. If the service information resource records a description such as "not executed", the execution completion degree level is set to level one; if it records a description such as "execution non-standard", level two; if it records a description such as "insufficient execution", level three; if it records a description such as "execution to be strengthened", level four.
The higher the service execution completion degree, the fewer the matters requiring attention, and the smaller the contribution to the importance of the keyword sentence. Conversely, the lower the service execution completion degree, the more the matters requiring attention, and the greater the contribution to the importance of the keyword sentence. Introducing the service execution completion degree into the calculation of the subject weight ensures the accuracy of the importance represented by the subject weight of the keyword sentence.
The influence range level represents the number of business related parties affected by the keyword sentence, and may be positively correlated with that number. That is, the higher the influence range level, the greater the number of business related parties affected by the keyword sentence. A business related party is an object associated with the service. If the business field is the auditing field, the business related party may be a merchant, an enterprise or the like, which is not limited herein.
For example, if the keyword sentence has no influence on any business related party, the influence range level may be set to level one; if the keyword sentence affects 1 to 300,000 business related parties, the influence range level may be set to level two; if the keyword sentence affects 300,000 to 500,000 business related parties, the influence range level may be set to level three; if the keyword sentence affects more than 500,000 business related parties, the influence range level may be set to level four.
The higher the influence range level, i.e. the greater the number of business related parties affected by the keyword sentence, the greater the contribution to the importance of the keyword sentence. Introducing the influence range level into the calculation of the subject weight ensures the accuracy of the importance represented by the subject weight of the keyword sentence.
In step S105, a first characterization value corresponding to the occurrence frequency level is determined according to the occurrence frequency level.
The first characterization value is positively correlated with the occurrence frequency level. That is, the first characterization value is positively correlated with the frequency with which the keyword sentence occurs in the service information resources and/or the frequency with which the keyword sentence is confirmed as a valid retrieval result: the higher these frequencies, the larger the first characterization value. The manner in which this positive correlation is embodied in the calculation is not limited herein.
For example, if the keyword sentence occurs 0 to 5 times in the service information resources, its occurrence frequency level is set to level one and the first characterization value is 0.1 + 0.02 × n, where n is the number of times the keyword sentence occurs in the service information resources; if the keyword sentence occurs 5 to 10 times, its occurrence frequency level is set to level two and the first characterization value is 0.2 + 0.01 × n; if the keyword sentence occurs more than 10 times, its occurrence frequency level is set to level three and the first characterization value is 0.3.
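The grading and the piecewise formula in this example can be sketched as follows. The function names are hypothetical. Because the ranges "0 to 5" and "5 to 10" share the boundary 5, assigning exactly 5 occurrences to level one is an assumption made here; the source does not resolve it.

```python
def occurrence_frequency_level(n):
    """Grade an occurrence count n into level 1, 2 or 3."""
    if n < 0:
        raise ValueError("occurrence count must be non-negative")
    if n <= 5:    # assumption: the shared boundary 5 falls in level one
        return 1
    if n <= 10:
        return 2
    return 3


def first_characterization_value(n):
    """Map an occurrence count n to the first characterization value."""
    level = occurrence_frequency_level(n)
    if level == 1:
        return 0.1 + 0.02 * n
    if level == 2:
        return 0.2 + 0.01 * n
    return 0.3


for count in (1, 8, 20):
    print(count, occurrence_frequency_level(count),
          round(first_characterization_value(count), 4))
```

Where the occurrence frequency level also reflects confirmed valid retrieval results, `n` could be the sum of the two counts, in line with the summation described earlier.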
In step S106, a second characterization value corresponding to the execution completion degree level is determined according to the execution completion degree level.
The second characterization value is negatively correlated with the execution completion degree level. That is, the second characterization value is negatively correlated with the execution completion degree: the lower the execution completion degree, the larger the second characterization value. The manner in which this negative correlation is embodied in the calculation is not limited herein.
For example, if the service information resource records a description such as "not executed", the execution completion degree level is set to level one and the second characterization value is 0.4; if it records a description such as "execution non-standard", the execution completion degree level is set to level two and the second characterization value is 0.3; if it records a description such as "insufficient execution", the execution completion degree level is set to level three and the second characterization value is 0.2; if it records a description such as "execution to be enhanced", the execution completion degree level is set to level four and the second characterization value is 0.1.
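A minimal sketch of this mapping is given below. Detecting the description by substring matching, and choosing the lowest level when several descriptions occur, are both assumptions made for illustration; the source does not specify how the description is recognized. All names are hypothetical.

```python
# Descriptions and values follow the example in the text; dict order is
# from the lowest completion level to the highest.
EXECUTION_LEVELS = {
    "not executed": 1,
    "execution non-standard": 2,
    "insufficient execution": 3,
    "execution to be enhanced": 4,
}
SECOND_VALUES = {1: 0.4, 2: 0.3, 3: 0.2, 4: 0.1}


def execution_completion_level(resource_text):
    """Return the lowest matching level, or None if nothing matches."""
    for phrase, level in EXECUTION_LEVELS.items():
        # Assumption: a simple substring test detects the description.
        if phrase in resource_text:
            return level
    return None


def second_characterization_value(level):
    """Map an execution completion degree level to its value."""
    if level not in SECOND_VALUES:
        raise ValueError(f"unknown execution completion level: {level}")
    return SECOND_VALUES[level]


level = execution_completion_level("the audit found insufficient execution")
print(level, second_characterization_value(level))
```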
In step S107, a third characterization value corresponding to the influence range level is determined according to the influence range level.
The third characterization value is positively correlated with the influence range level. That is, the third characterization value is positively correlated with the number of business related parties affected by the keyword sentence: the greater that number, the larger the third characterization value.
For example, if the keyword sentence has no influence on any business related party, the influence range level may be set to level one and the third characterization value is 0; if the keyword sentence affects 0 to 300,000 business related parties, the influence range level may be set to level two and the third characterization value is 0.1; if the keyword sentence affects 300,000 to 500,000 business related parties, the influence range level may be set to level three and the third characterization value is 0.2; if the keyword sentence affects more than 500,000 business related parties, the influence range level may be set to level four and the third characterization value is 0.3.
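The influence range grading can be sketched in the same way. The thresholds read "30 ten thousand" and "50 ten thousand" as 300,000 and 500,000. Assigning exactly 300,000 to level two and exactly 500,000 to level three is an assumption, since the ranges in the example share their boundaries. Function names are hypothetical.

```python
THIRD_VALUES = {1: 0.0, 2: 0.1, 3: 0.2, 4: 0.3}


def influence_range_level(affected_parties):
    """Grade the number of affected business related parties."""
    if affected_parties < 0:
        raise ValueError("number of affected parties must be non-negative")
    if affected_parties == 0:
        return 1
    if affected_parties <= 300_000:  # assumption: boundary belongs to level two
        return 2
    if affected_parties <= 500_000:  # assumption: boundary belongs to level three
        return 3
    return 4


def third_characterization_value(affected_parties):
    """Map the number of affected parties to the third characterization value."""
    return THIRD_VALUES[influence_range_level(affected_parties)]


for parties in (0, 12_000, 400_000, 800_000):
    print(parties, influence_range_level(parties),
          third_characterization_value(parties))
```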
Step S105, step S106 and step S107 are independent of each other, and may be executed synchronously, or sequentially, and the execution order is not limited.
In step S108, the subject weight is calculated from the first characterization value, the second characterization value and the third characterization value.
In some examples, the sum of the first characterization value, the second characterization value and the third characterization value may be taken as the subject weight.
In other examples, a weighting algorithm may be used: a weighting coefficient is set for each of the first, second and third characterization values, and the sum of the products of each characterization value and its weighting coefficient is taken as the subject weight.
In still other examples, a first sum of 1 and the first characterization value, a second sum of 1 and the second characterization value, and a third sum of 1 and the third characterization value may be calculated, and the product of the first sum, the second sum and the third sum is determined as the subject weight. That is, subject weight = (1 + first characterization value) × (1 + second characterization value) × (1 + third characterization value).
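The three alternatives of step S108 can be written side by side, as in the sketch below. Function names and the weighting coefficients are hypothetical. The inputs 0.12, 0.4 and 0.3 are illustrative values drawn from the ranges in the preceding examples. In the product form they give 2.0384, which equals the weight quoted in the earlier "small and micro merchant" example, although the source does not say that weight was derived this way.

```python
import math


def weight_by_sum(v1, v2, v3):
    """Subject weight as the plain sum of the three values."""
    return v1 + v2 + v3


def weight_by_weighted_sum(values, coefficients):
    """Subject weight as the sum of each value times its coefficient."""
    if len(values) != len(coefficients):
        raise ValueError("values and coefficients must have equal length")
    return sum(v * c for v, c in zip(values, coefficients))


def weight_by_product(v1, v2, v3):
    """Subject weight as (1 + v1) * (1 + v2) * (1 + v3)."""
    return (1 + v1) * (1 + v2) * (1 + v3)


values = (0.12, 0.4, 0.3)
print(round(weight_by_sum(*values), 4))
# The coefficients below are arbitrary illustrative choices.
print(round(weight_by_weighted_sum(values, (0.5, 0.3, 0.2)), 4))
print(round(weight_by_product(*values), 4))
```

The product form keeps the weight at 1 or above whenever all three values are non-negative, whereas the sum forms can be 0.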
In the above embodiments, the subject weight of the first combined phrase may be calculated in the same way as the subject weight of the keyword sentence, which is not described herein again.
The occurrence frequency level, the execution completion degree level and the influence range level are quantified as the first, second and third characterization values respectively, and the subject weight is obtained by a combined calculation over the three values. The importance of the keyword sentence is thus quantified along multiple dimensions, which ensures the accuracy of the importance represented by the subject weight. Adding the subject weight to the retrieval map enriches the content of the retrieval map and improves the accuracy and efficiency of retrieval using it.
A second aspect of the present application provides a retrieval method, which is executable by a retrieval apparatus. Fig. 5 is a flowchart of an embodiment of a retrieval method provided in the second aspect of the present application. As shown in fig. 5, the retrieving method may include steps S201 to S204.
In step S201, retrieval information input by a user is received.
The search information input by the user may be a vocabulary entry, a sentence, an article fragment, etc., and is not limited herein. The specific content of the retrieval information is related to the retrieved business domain.
In step S202, the search information is input to the subject description generation model, and a target subject description corresponding to the search information output by the subject description generation model is obtained.
The subject description generation model is used for performing extraction processing on the input information to output a subject description corresponding to the input information. The subject description is used to characterize the input information. For the specific content of the subject description generation model, reference may be made to the relevant description in the above embodiments, and details are not repeated here.
The target subject description is the subject description corresponding to the retrieval information. Specifically, the retrieval information is segmented into words through the subject description generation model, and part-of-speech tagging is performed on the words obtained by the segmentation. Consecutive nouns obtained by the segmentation are combined through the subject description generation model to obtain a second combined phrase. The target subject description is obtained and output according to the second combined phrase.
Segmenting the retrieval information may further include removing punctuation marks, spaces and the like, which is not limited herein. Part-of-speech tagging labels each word obtained by the segmentation as a noun, verb, numeral, quantifier or the like, which is not limited herein. The nouns among the segmented words can better reflect the content of the retrieval information. Consecutive nouns obtained by the segmentation can be combined to obtain the second combined phrase, which can be used as the subject description corresponding to the retrieval information.
For the specific content of the subject description, reference may be made to the related description in the above embodiments, which is not repeated herein.
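The noun-combining step can be illustrated on input that has already been segmented and tagged. The sketch below does not perform segmentation or tagging itself. The tag label `"n"` for nouns, the function name, and the sample query are all assumptions; a real tagger may use a different tag set.

```python
def combine_consecutive_nouns(tagged_words, noun_tag="n", joiner=" "):
    """Merge each run of consecutive nouns into one combined phrase.

    tagged_words: list of (word, part_of_speech_tag) pairs in text order
    """
    phrases = []
    run = []
    for word, tag in tagged_words:
        if tag == noun_tag:
            run.append(word)
        elif run:
            # A non-noun ends the current run of nouns.
            phrases.append(joiner.join(run))
            run = []
    if run:
        phrases.append(joiner.join(run))
    return phrases


# Hypothetical tagged retrieval information ("v" verb, "n" noun, "p" preposition).
tagged = [
    ("query", "v"),
    ("merchant", "n"), ("billing", "n"), ("service", "n"),
    ("of", "p"),
    ("rules", "n"),
]
phrases = combine_consecutive_nouns(tagged)
print(phrases)
```

For text in which words are not separated by spaces, `joiner=""` would be the natural choice.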
In step S203, the target subject description is matched with a pre-generated search map, and a target keyword sentence matched with the target subject description in the search map is obtained.
The retrieval map comprises keyword sentences related to the business field to which the retrieval information belongs, knowledge information associated with the keyword sentences, and information sources of the keyword sentences. The retrieval map is the one constructed by the retrieval map construction method in the above embodiments; for specific contents, reference may be made to the relevant descriptions of the above embodiments, which are not described herein again.
The target keyword sentences comprise keyword sentences matched with the target subject description in the retrieval map. The matching can be embodied as comparing the similarity between the target subject description and the keyword sentence in the retrieval map. The higher the similarity, the higher the matching degree. And selecting the keyword sentence with the similarity meeting the preset condition with the target subject description as the target keyword sentence. The number of target keyword sentences may be one, or two or more, and is not limited herein.
In some examples, the target subject description may be converted into a target description vector, and the keyword sentences in the retrieval map may be converted into word and sentence vectors. The similarity between the target description vector and each word and sentence vector is calculated. Based on the similarity, the target word and sentence vectors that match the target description vector are determined among the word and sentence vectors, and the keyword sentences corresponding to the target word and sentence vectors are determined as the target keyword sentences.
Specifically, word2vec algorithm can be adopted to convert the target subject description into a target description vector and convert the keyword sentence into a word and sentence vector. The similarity between the target description vector and the sentence vector may include a cosine similarity of the vector, and is not limited herein.
In some examples, the word and sentence vectors whose similarity to the target description vector is higher than a matching threshold may be used as the target word and sentence vectors. In other examples, a preset matching number of word and sentence vectors with the highest similarity to the target description vector may be used as the target word and sentence vectors. The method of selecting the target word and sentence vectors based on the similarity is not limited herein.
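Both selection strategies can be sketched with cosine similarity over vectors that are assumed to have been computed already. The conversion itself (for example with word2vec) is outside this sketch. The function names and the toy two-dimensional vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_by_threshold(target, sentence_vectors, threshold):
    """Keyword sentences whose similarity exceeds the matching threshold."""
    return [s for s, v in sentence_vectors.items()
            if cosine_similarity(target, v) > threshold]


def match_top_k(target, sentence_vectors, k):
    """The k keyword sentences most similar to the target description vector."""
    ranked = sorted(sentence_vectors,
                    key=lambda s: cosine_similarity(target, sentence_vectors[s]),
                    reverse=True)
    return ranked[:k]


# Toy vectors standing in for word and sentence vectors.
sentence_vectors = {
    "sentence A": [1.0, 0.0],
    "sentence B": [0.0, 1.0],
    "sentence C": [1.0, 1.0],
}
target_vector = [1.0, 0.0]
print(match_by_threshold(target_vector, sentence_vectors, 0.5))
print(match_top_k(target_vector, sentence_vectors, 1))
```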
In step S204, the target keyword sentence and knowledge information associated with the target keyword sentence are output.
In order to facilitate a user to more intuitively acquire detailed contents, on the basis of outputting the target keyword sentence, knowledge information associated with the target keyword sentence can be output. The target keyword sentence and the knowledge information related to the target keyword sentence may be output at one time or may be output separately, and this is not limited herein. For example, the target keyword sentence may be output first, and in a case where a further operation by the user is received, the knowledge information associated with the target keyword sentence may be output.
In the embodiment of the application, the received retrieval information is input into the subject description generation model, and the target subject description corresponding to the retrieval information and output by the subject description generation model is obtained. The target subject description is matched against the retrieval map to obtain the target keyword sentences in the retrieval map that match the target subject description, together with the knowledge information associated with the target keyword sentences. The retrieval map facilitates quick retrieval: by searching the retrieval map with the target subject description corresponding to the retrieval information, content related to service processing can be obtained quickly, without the user having to manually consult a large amount of professional material. This shortens the service processing time and improves the service processing efficiency.
In addition, matching against the target subject description removes the adverse effect on service processing of information that is irrelevant, or only weakly relevant, to the service corresponding to the retrieval information. Information relevant to the service processing can thus be determined accurately, which improves the accuracy of the service processing.
In some embodiments, the keyword sentences may be associated with subject weights. The subject weight of a keyword sentence represents the importance of the keyword sentence and is determined according to the occurrence frequency level, the execution completion degree level and the influence range level of the keyword sentence.
The occurrence frequency level represents the frequency with which the keyword sentence occurs in the service information resources and/or the frequency with which the keyword sentence is confirmed as a valid retrieval result. The execution completion degree level represents the service execution completion degree recorded in the service information resource corresponding to the keyword sentence. The influence range level represents the number of business related parties affected by the keyword sentence.
For the specific manner of determining the subject weight, reference may be made to the relevant description in the above embodiments, and details are not repeated herein.
In some examples, step S204 may be refined as outputting the target keyword sentences arranged in descending order of subject weight, together with the knowledge information associated with the target keyword sentences. Arranging the target keyword sentences in descending order of subject weight lets the user see more directly the importance of each target keyword sentence and its associated knowledge information, so that the information required for service processing is located more accurately. This further reduces the time spent on service processing, improves the service processing efficiency, and can also improve the accuracy of the service processing.
In some examples, step S204 may be refined as outputting a preset number of target keyword sentences with the highest subject weights, together with the knowledge information associated with those target keyword sentences. The subject weight and the preset number further screen the target keyword sentences that satisfy the matching requirement: the preset number of target keyword sentences with the highest subject weights are those that match the retrieval information and have the highest importance. The information required for service processing can thus be located more accurately, which further reduces the time spent on service processing, improves the service processing efficiency, and can also improve the accuracy of the service processing.
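Both refinements of step S204 reduce to sorting the matched keyword sentences by subject weight and optionally truncating the result. The sketch below uses a hypothetical function name and illustrative weights. Giving a keyword sentence without a recorded weight the default 0.0 is an assumption, following the default value "0" mentioned earlier.

```python
def rank_by_subject_weight(target_sentences, weights, limit=None):
    """Order target keyword sentences by descending subject weight.

    limit: if given, keep only that many highest-weight sentences.
    """
    ranked = sorted(target_sentences,
                    key=lambda s: weights.get(s, 0.0),  # assumed default weight
                    reverse=True)
    return ranked if limit is None else ranked[:limit]


# Illustrative weights only; "sentence C" has no recorded weight.
weights = {"sentence A": 1.232, "sentence B": 2.0384}
matched = ["sentence A", "sentence B", "sentence C"]
print(rank_by_subject_weight(matched, weights))
print(rank_by_subject_weight(matched, weights, limit=2))
```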
The search map construction method and the search method in the above embodiments may be executed by the same apparatus, or may be executed by different apparatuses, and are not limited thereto.
A third aspect of the present application provides a retrieval map construction apparatus. Fig. 6 is a schematic structural diagram of an embodiment of a search map construction apparatus according to a third aspect of the present application. As shown in fig. 6, the retrieval map constructing apparatus 300 may include a resource acquiring module 301, a generating module 302, and a map constructing module 303.
The resource obtaining module 301 may be configured to obtain service information resources.
In some examples, the service information resource includes a service information file and/or service history information.
The generating module 302 may be configured to input the service information resource into the subject description generating model, and obtain a subject description corresponding to the service information resource and output by the subject description generating model.
The subject description generation model is used for performing extraction processing on the input information to output a subject description corresponding to the input information. The subject description is used to characterize the input information.
The map construction module 303 may be configured to construct a retrieval map of the service field to which the service information resource belongs, according to the service information resource, the subject description corresponding to the service information resource, and the knowledge information associated with the subject description.
The retrieval map comprises keyword sentences, knowledge information associated with the keyword sentences, and information sources of the keyword sentences. The keyword sentences comprise the subject description corresponding to the service information resource, and the information sources of the keyword sentences comprise the service information resource.
In the embodiment of the application, the obtained service information resource is input into the subject description generation model to obtain the subject description corresponding to the service information resource. A retrieval map of the service field to which the service information resource belongs is constructed according to the service information resource, the subject description corresponding to the service information resource, and the knowledge information associated with the subject description in the service information resource. The subject description can characterize important content in the service information resource. The subject description is used as a keyword sentence in the retrieval map, the associated knowledge information is used as the knowledge information associated with the keyword sentence in the retrieval map, and the service information resource is used as the information source of the keyword sentence in the retrieval map. The retrieval map is used for retrieval: matched keyword sentences, their information sources, their associated knowledge information and the like can be quickly retrieved through the retrieval map, so that a user does not need to manually consult a large amount of professional material to find the content needed for processing a service. This shortens the service processing time and improves the service processing efficiency.
In some examples, the generation module 302 described above may be configured to: input the service information resource into the subject description generation model; split the service information resource through the subject description generation model to obtain two or more segments of the service information resource; perform word segmentation on the segments through the subject description generation model, and perform part-of-speech tagging on the words obtained by word segmentation; combine consecutive nouns obtained by word segmentation through the subject description generation model to obtain a first combined word and sentence; and obtain and output the subject description according to the first combined word and sentence.
Specifically, the generation module 302 may further be configured to: determine the first combined word and sentence with the highest body weight corresponding to a segment as the subject description corresponding to the segment, and output the subject description, wherein the body weight is associated with the first combined word and sentence and represents the importance of the first combined word and sentence; and/or, in a case where the first combined word and sentence corresponding to a segment does not contain the subject information of the service information resource to which the segment belongs, combine the subject information with the first combined word and sentence, determine the combination as the subject description corresponding to the segment, and output the subject description.
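The noun-combination and selection steps above can be sketched as follows. This is an illustrative sketch only: it assumes the segment has already been segmented and tagged by some external tool, that the tag `"n"` marks a noun, and that body weights are supplied in a dictionary. The function names, the tag set, and the choice to apply both optional branches (highest weight, then prepending missing subject information) are assumptions, not the method of the application.

```python
from typing import Dict, List, Optional, Tuple

Tagged = Tuple[str, str]  # (word, part-of-speech tag); "n" = noun (assumed tag set)


def combine_consecutive_nouns(tagged: List[Tagged], sep: str = " ") -> List[str]:
    """Merge each run of adjacent nouns into one combined phrase."""
    phrases: List[str] = []
    run: List[str] = []
    for word, tag in tagged:
        if tag == "n":
            run.append(word)
        elif run:
            # A non-noun ends the current run of nouns.
            phrases.append(sep.join(run))
            run = []
    if run:
        phrases.append(sep.join(run))
    return phrases


def subject_description(
    tagged: List[Tagged],
    weights: Dict[str, float],
    subject_info: Optional[str] = None,
    sep: str = " ",
) -> Optional[str]:
    """Pick the highest-weight combined phrase; prepend subject info if absent."""
    phrases = combine_consecutive_nouns(tagged, sep)
    if not phrases:
        return None
    best = max(phrases, key=lambda p: weights.get(p, 0.0))
    if subject_info and subject_info not in best:
        best = sep.join([subject_info, best])
    return best
```

For Chinese text the separator would typically be an empty string, since combined nouns are written without spaces.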
In some examples, the keyword sentences are associated with body weights. The body weight of a keyword sentence represents the importance of the keyword sentence. Fig. 7 is a schematic structural diagram of another embodiment of a retrieval map construction apparatus according to the third aspect of the present application. Fig. 7 differs from fig. 6 in that the retrieval map construction apparatus 300 shown in fig. 7 may further include a factor acquisition module 304 and a calculation module 305.
The factor acquisition module 304 may be configured to obtain an occurrence frequency level, an execution completion degree level, and an influence range level of the keyword sentence.
The occurrence frequency level represents the frequency with which the keyword sentence appears in the service information resource and/or the frequency with which the keyword sentence is confirmed as a valid retrieval result. The execution completion degree level represents the service execution completion degree recorded in the service information resource corresponding to the keyword sentence. The influence range level represents the number of service-related parties affected by the keyword sentence.
The calculation module 305 may be configured to: determine a first characterization value corresponding to the occurrence frequency level according to the occurrence frequency level, where the first characterization value is positively correlated with the occurrence frequency level; determine a second characterization value corresponding to the execution completion degree level according to the execution completion degree level, where the second characterization value is negatively correlated with the execution completion degree level; determine a third characterization value corresponding to the influence range level according to the influence range level, where the third characterization value is positively correlated with the influence range level; and calculate the body weight using the first characterization value, the second characterization value, and the third characterization value.
Specifically, the calculation module 305 may be configured to: calculate a first sum of 1 and the first characterization value; calculate a second sum of 1 and the second characterization value; calculate a third sum of 1 and the third characterization value; and determine the product of the first sum, the second sum, and the third sum as the body weight.
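The body weight calculation can be illustrated with a short Python sketch. The product of the three sums follows the description above. The mapping from a level to a characterization value is not specified here, so the sketch assumes integer levels from 1 to `max_level` and a simple linear mapping; `body_weight` and `max_level` are hypothetical names.

```python
def body_weight(
    frequency_level: int,
    completion_level: int,
    scope_level: int,
    max_level: int = 5,
) -> float:
    """Body weight = (1 + a) * (1 + b) * (1 + c).

    a, b, c are the first, second and third characterization values.
    The linear level-to-value mapping below is an assumption.
    """
    for level in (frequency_level, completion_level, scope_level):
        if not 1 <= level <= max_level:
            raise ValueError(f"level {level} outside 1..{max_level}")
    a = frequency_level / max_level         # positively correlated with frequency
    b = 1.0 - completion_level / max_level  # negatively correlated with completion
    c = scope_level / max_level             # positively correlated with influence range
    return (1.0 + a) * (1.0 + b) * (1.0 + c)
```

Adding 1 to each characterization value keeps every factor at or above 1, so a characterization value of zero leaves the product unchanged instead of reducing the whole weight to zero.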
A fourth aspect of the present application provides a retrieval apparatus. Fig. 8 is a schematic structural diagram of an embodiment of a retrieval apparatus according to the fourth aspect of the present application. As shown in fig. 8, the retrieval apparatus 400 may include a receiving module 401, a generating module 402, a matching module 403, and an output module 404.
The receiving module 401 may be configured to receive retrieval information input by a user.
The generating module 402 may be configured to input the search information into the subject description generating model, and obtain a target subject description corresponding to the search information and output by the subject description generating model.
The subject description generation model is used for performing extraction processing on the input information to output a subject description corresponding to the input information. The subject description is used to characterize the input information.
The matching module 403 may be configured to match the target subject description against a pre-generated retrieval map to obtain a target keyword sentence in the retrieval map that matches the target subject description.
The retrieval map comprises keyword sentences associated with the service field to which the retrieval information belongs, knowledge information associated with the keyword sentences, and information sources of the keyword sentences.
The output module 404 may be used to output the target keyword sentence and knowledge information associated with the target keyword sentence.
In the embodiments of the present application, the received retrieval information is input into the subject description generation model to obtain the target subject description that corresponds to the retrieval information and is output by the subject description generation model. The target subject description is matched against the retrieval map to obtain the target keyword sentence in the retrieval map that matches the target subject description, together with the knowledge information associated with the target keyword sentence. The retrieval map supports fast retrieval: by searching the retrieval map with the target subject description corresponding to the retrieval information, content related to service processing can be obtained quickly, and the user does not need to manually consult a large amount of professional material to find it. Service processing time is thereby shortened and service processing efficiency is improved.
In some examples, the generating module 402 may be configured to: perform word segmentation on the retrieval information through the subject description generation model, and perform part-of-speech tagging on the words obtained by word segmentation; combine consecutive nouns obtained by word segmentation through the subject description generation model to obtain a second combined noun; and obtain and output the target subject description according to the second combined noun.
In some examples, the matching module 403 may be configured to: convert the target subject description into a target description vector; convert the keyword sentences in the retrieval map into word and sentence vectors; calculate the similarity between the target description vector and each word and sentence vector; determine, based on the similarity, a target word and sentence vector among the word and sentence vectors that matches the target description vector; and determine the keyword sentence corresponding to the target word and sentence vector as the target keyword sentence.
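The vector matching step can be sketched as below. The description does not name a vectorization method or a similarity measure, so this sketch assumes a whitespace bag-of-words vector and cosine similarity with a fixed threshold. The names `to_vector`, `cosine`, `match`, and `threshold` are hypothetical, and a real system would likely substitute a trained embedding model.

```python
import math
from collections import Counter
from typing import List, Tuple


def to_vector(text: str) -> Counter:
    """Stand-in vectorizer: term counts over whitespace tokens (an assumption)."""
    return Counter(text.lower().split())


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


def match(
    target_description: str,
    keyword_sentences: List[str],
    threshold: float = 0.5,
) -> List[Tuple[str, float]]:
    """Return keyword sentences whose similarity reaches the threshold, best first."""
    target = to_vector(target_description)
    scored = [(k, cosine(target, to_vector(k))) for k in keyword_sentences]
    hits = [(k, s) for k, s in scored if s >= threshold]
    return sorted(hits, key=lambda ks: ks[1], reverse=True)
```

The keyword sentence vectors could be computed once when the retrieval map is built and cached, so that only the target description needs to be vectorized at query time.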
In some examples, the keyword sentences are associated with body weights. The body weight of a keyword sentence represents the importance of the keyword sentence.
The output module 404 may be configured to output the target keyword sentences arranged in descending order of body weight, together with the knowledge information associated with the target keyword sentences; and/or to output a preset number of target keyword sentences with the highest body weights, together with the knowledge information associated with those target keyword sentences.
Specifically, the body weight of a keyword sentence is determined according to the occurrence frequency level, the execution completion degree level, and the influence range level of the keyword sentence.
The occurrence frequency level represents the frequency with which the keyword sentence appears in the service information resource and/or the frequency with which the keyword sentence is confirmed as a valid retrieval result. The execution completion degree level represents the service execution completion degree recorded in the service information resource corresponding to the keyword sentence. The influence range level represents the number of service-related parties affected by the keyword sentence.
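The two output options of the output module 404 (full descending order, or only a preset number of highest-weight results) can be sketched in a few lines. The tuple layout `(keyword sentence, knowledge information, body weight)` and the name `rank_by_body_weight` are assumptions for illustration.

```python
from typing import List, Optional, Tuple

# (keyword sentence, associated knowledge information, body weight); assumed layout
Match = Tuple[str, str, float]


def rank_by_body_weight(matches: List[Match], top_n: Optional[int] = None) -> List[Match]:
    """Sort matches by body weight, highest first; keep only top_n if given."""
    ranked = sorted(matches, key=lambda m: m[2], reverse=True)
    return ranked if top_n is None else ranked[:top_n]
```

Passing `top_n=None` gives the first option (all target keyword sentences in descending order), and passing an integer gives the second (a preset number of results).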
The retrieval map construction apparatus 300 and the retrieval apparatus 400 in the above embodiments may be the same apparatus or different apparatuses, which is not limited herein.
A fifth aspect of the present application provides a retrieval map construction apparatus. Fig. 9 is a schematic structural diagram of an embodiment of a retrieval map construction apparatus provided in the fifth aspect of the present application. As shown in fig. 9, the retrieval map construction apparatus 500 includes a memory 501, a processor 502, and a computer program stored on the memory 501 and executable on the processor 502.
In one example, the processor 502 may include a Central Processing Unit (CPU) or an Application Specific Integrated Circuit (ASIC), or may be configured as one or more integrated circuits that implement the embodiments of the present application.
The memory 501 may include Read-Only Memory (ROM), Random Access Memory (RAM), magnetic disk storage media devices, optical storage media devices, flash memory devices, or electrical, optical, or other physical/tangible memory storage devices. Thus, in general, the memory includes one or more tangible (non-transitory) computer-readable storage media (e.g., a memory device) encoded with software comprising computer-executable instructions, and when the software is executed (e.g., by one or more processors), it is operable to perform the operations described with reference to the retrieval map construction method according to the present application.
By reading the executable program code stored in the memory 501, the processor 502 runs the computer program corresponding to the executable program code, so as to implement the retrieval map construction method in the above embodiments.
In one example, the retrieval map construction apparatus 500 may also include a communication interface 503 and a bus 504. As shown in fig. 9, the memory 501, the processor 502, and the communication interface 503 are connected via the bus 504 and communicate with one another through it.
The communication interface 503 is mainly used for implementing communication between modules, apparatuses, units and/or devices in the embodiments of the present application. Input devices and/or output devices may also be accessed through communication interface 503.
The bus 504 includes hardware, software, or both, and couples the components of the retrieval map construction apparatus 500 to each other. By way of example, and not limitation, the bus 504 may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front-Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association Local Bus (VLB), or other suitable bus, or a combination of two or more of these. The bus 504 may include one or more buses, where appropriate. Although specific buses are described and shown in the embodiments of the application, any suitable buses or interconnects are contemplated by the application.
A sixth aspect of the present application provides a retrieval apparatus. Fig. 10 is a schematic structural diagram of an embodiment of a retrieval apparatus according to a sixth aspect of the present application. As shown in fig. 10, the retrieval apparatus 600 includes a memory 601, a processor 602, and a computer program stored on the memory 601 and executable on the processor 602.
In one example, the processor 602 may include a Central Processing Unit (CPU) or an Application Specific Integrated Circuit (ASIC), or may be configured as one or more integrated circuits that implement the embodiments of the present application.
The Memory 601 may include Read-Only Memory (ROM), Random Access Memory (RAM), magnetic disk storage media devices, optical storage media devices, flash Memory devices, electrical, optical, or other physical/tangible Memory storage devices. Thus, in general, the memory includes one or more tangible (non-transitory) computer-readable storage media (e.g., a memory device) encoded with software comprising computer-executable instructions and when the software is executed (e.g., by one or more processors), it is operable to perform operations described with reference to retrieval methods according to the application.
By reading the executable program code stored in the memory 601, the processor 602 runs the computer program corresponding to the executable program code, so as to implement the retrieval method in the above embodiments.
In one example, the retrieval device 600 may also include a communication interface 603 and a bus 604. As shown in fig. 10, the memory 601, the processor 602, and the communication interface 603 are connected via a bus 604 to complete communication therebetween.
The communication interface 603 is mainly used for implementing communication between modules, apparatuses, units and/or devices in the embodiments of the present application. Input devices and/or output devices are also accessible through communication interface 603.
The bus 604 includes hardware, software, or both, and couples the components of the retrieval apparatus 600 to one another. By way of example, and not limitation, the bus 604 may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front-Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association Local Bus (VLB), or other suitable bus, or a combination of two or more of these. The bus 604 may include one or more buses, where appropriate. Although specific buses are described and shown in the embodiments of the application, any suitable buses or interconnects are contemplated by the application.
A seventh aspect of the present application provides a computer-readable storage medium on which computer program instructions are stored. When executed by a processor, the computer program instructions can implement the retrieval map construction method and/or the retrieval method in the foregoing embodiments and achieve the same technical effects, which are not described again here to avoid repetition. The computer-readable storage medium may include a non-transitory computer-readable storage medium, such as a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk, which is not limited herein.
It should be clear that the embodiments in this specification are described in a progressive manner, and the same or similar parts in the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. For apparatus embodiments, device embodiments, computer-readable storage medium embodiments, reference may be made in the descriptive section to method embodiments. The present application is not limited to the particular steps and structures described above and shown in the drawings. Those skilled in the art may make various changes, modifications and additions or change the order between the steps after appreciating the spirit of the present application. Also, a detailed description of known process techniques is omitted herein for the sake of brevity.
Aspects of the present application are described above with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, enable the implementation of the functions/acts specified in the flowchart and/or block diagram block or blocks. Such a processor may be, but is not limited to, a general purpose processor, a special purpose processor, an application specific processor, or a field programmable logic circuit. It will also be understood that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware for performing the specified functions or acts, or combinations of special purpose hardware and computer instructions.
It will be appreciated by persons skilled in the art that the above embodiments are illustrative and not restrictive. Different features which are present in different embodiments may be combined to advantage. Other variations to the disclosed embodiments can be understood and effected by those skilled in the art upon studying the drawings, the specification, and the claims. In the claims, the term "comprising" does not exclude other means or steps; the word "a" or "an" does not exclude a plurality; the terms "first" and "second" are used to denote a name and not to denote any particular order. Any reference signs in the claims shall not be construed as limiting the scope. The functions of the various parts appearing in the claims may be implemented by a single hardware or software module. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.

Claims (16)

1. A retrieval map construction method, characterized by comprising the following steps:
acquiring service information resources;
inputting the service information resource into a subject description generation model to obtain a subject description corresponding to the service information resource and output by the subject description generation model, wherein the subject description generation model is used for extracting input information to output the subject description corresponding to the input information, and the subject description is used for representing the input information;
and constructing a retrieval map of the service field to which the service information resource belongs according to the service information resource, the subject description corresponding to the service information resource and the knowledge information associated with the subject description, wherein the retrieval map comprises a keyword sentence, the knowledge information associated with the keyword sentence and an information source of the keyword sentence, the keyword sentence comprises the subject description corresponding to the service information resource, and the information source of the keyword sentence comprises the service information resource.
2. The method according to claim 1, wherein the keyword sentence is associated with a body weight, and the body weight of the keyword sentence is used for representing the importance of the keyword sentence;
the method further comprises the following steps:
acquiring an occurrence frequency level, an execution completion degree level and an influence range level of the keyword sentence, wherein the occurrence frequency level represents the frequency of the keyword sentence appearing in the service information resource and/or the frequency of the keyword sentence confirmed as a valid retrieval result, the execution completion degree level represents the service execution completion degree recorded in the service information resource corresponding to the keyword sentence, and the influence range level represents the number of service-related parties influenced by the keyword sentence;
determining a first characterization value corresponding to the occurrence frequency level according to the occurrence frequency level, wherein the first characterization value is positively correlated with the occurrence frequency level;
determining a second characterization value corresponding to the execution completion degree level according to the execution completion degree level, wherein the second characterization value is negatively correlated with the execution completion degree level;
determining a third characterization value corresponding to the influence range level according to the influence range level, wherein the third characterization value is positively correlated with the influence range level;
and calculating the body weight by using the first characterization value, the second characterization value and the third characterization value.
3. The method of claim 2, wherein the calculating the body weight by using the first characterization value, the second characterization value and the third characterization value comprises:
calculating a first sum of 1 and the first characterization value;
calculating a second sum of 1 and the second characterization value;
calculating a third sum of 1 and the third characterization value;
determining a product of the first sum, the second sum, and the third sum as the body weight.
4. The method of claim 1, wherein the inputting the service information resource into a subject description generation model to obtain a subject description corresponding to the service information resource and output by the subject description generation model, comprises:
inputting the service information resource into a subject description generation model;
splitting the service information resource through the subject description generation model to obtain two or more segments of the service information resource;
performing word segmentation on the segments through the subject description generation model, and performing part-of-speech tagging on the words obtained by word segmentation;
combining consecutive nouns obtained by word segmentation through the subject description generation model to obtain a first combined word and sentence;
and obtaining and outputting the subject description according to the first combined word and sentence.
5. The method according to claim 4, wherein the obtaining and outputting the subject description according to the first combined word and sentence comprises:
determining the first combined word and sentence with the highest body weight corresponding to the segment as the subject description corresponding to the segment and outputting the subject description, wherein the body weight is associated with the first combined word and sentence, and the body weight of the first combined word and sentence is used for representing the importance of the first combined word and sentence;
and/or,
combining the subject information and the first combined word and sentence under the condition that the first combined word and sentence corresponding to the segment does not contain the subject information of the service information resource to which the segment belongs, and determining the combined subject information and first combined word and sentence as the subject description corresponding to the segment and outputting the subject description.
6. The method according to claim 1, wherein the service information resource comprises a service information file and/or service history information.
7. A retrieval method, comprising:
receiving retrieval information input by a user;
inputting the retrieval information into a subject description generation model to obtain a target subject description which is output by the subject description generation model and corresponds to the retrieval information, wherein the subject description generation model is used for extracting input information to output a subject description corresponding to the input information, and the subject description is used for representing the input information;
matching the target subject description with a pre-generated retrieval map to obtain a target keyword sentence matched with the target subject description in the retrieval map, wherein the retrieval map comprises a keyword sentence associated with the service field to which the retrieval information belongs, knowledge information associated with the keyword sentence and an information source of the keyword sentence;
and outputting the target keyword sentence and knowledge information associated with the target keyword sentence.
8. The method of claim 7, wherein inputting the search information into a subject description generation model to obtain a target subject description corresponding to the search information output by the subject description generation model comprises:
performing word segmentation on the retrieval information through the subject description generation model, and performing part-of-speech tagging on the words obtained by word segmentation;
combining consecutive nouns obtained by word segmentation through the subject description generation model to obtain a second combined noun;
and obtaining and outputting the target subject description according to the second combined noun.
9. The method according to claim 7, wherein the matching of the target subject description with a pre-generated search map to obtain a target keyword sentence in the search map that matches the target subject description comprises:
converting the target subject description into a target description vector;
converting the keyword sentences in the retrieval map into word and sentence vectors;
calculating the similarity between the target description vector and the word and sentence vector;
determining a target word and sentence vector matched with the target description vector in the word and sentence vectors based on the similarity;
and determining the keyword sentence corresponding to the target word sentence vector as the target keyword sentence.
10. The method according to claim 7, wherein the keyword sentence is associated with a body weight, and the body weight of the keyword sentence is used for representing the importance of the keyword sentence;
the outputting the target keyword sentence and the knowledge information associated with the target keyword sentence includes:
outputting the target keyword sentences arranged in descending order of body weight and knowledge information associated with the target keyword sentences;
and/or,
outputting a preset number of target keyword sentences with the highest body weight and knowledge information associated with the target keyword sentences.
11. The method of claim 10, wherein the body weight of the keyword sentence is determined according to an occurrence frequency level, an execution completion degree level, and an influence range level of the keyword sentence,
the occurrence frequency level represents the frequency of the keyword sentence appearing in the service information resource and/or the frequency of the keyword sentence confirmed as a valid retrieval result, the execution completion degree level represents the service execution completion degree recorded in the service information resource corresponding to the keyword sentence, and the influence range level represents the number of service association parties influenced by the keyword sentence.
12. A retrieval map construction apparatus characterized by comprising:
the resource acquisition module is used for acquiring service information resources;
the generation module is used for inputting the business information resources into a subject description generation model to obtain a subject description which is output by the subject description generation model and corresponds to the business information resources, the subject description generation model is used for extracting input information to output the subject description corresponding to the input information, and the subject description is used for representing the input information;
the map construction module is used for constructing a retrieval map of the service field to which the service information resource belongs according to the service information resource, the subject description corresponding to the service information resource and the knowledge information associated with the subject description, wherein the retrieval map comprises a keyword sentence, the knowledge information associated with the keyword sentence and an information source of the keyword sentence, the keyword sentence comprises the subject description corresponding to the service information resource, and the information source of the keyword sentence comprises the service information resource.
13. A retrieval apparatus, comprising:
the receiving module is used for receiving retrieval information input by a user;
the generating module is used for inputting the retrieval information into a subject description generating model to obtain a target subject description which is output by the subject description generating model and corresponds to the retrieval information, the subject description generating model is used for extracting input information to output a subject description which corresponds to the input information, and the subject description is used for representing the input information;
the matching module is used for matching the target subject description with a pre-generated retrieval map to obtain a target keyword sentence matched with the target subject description in the retrieval map, wherein the retrieval map comprises a keyword sentence associated with the service field to which the retrieval information belongs, knowledge information associated with the keyword sentence and an information source of the keyword sentence;
and the output module is used for outputting the target keyword sentence and the knowledge information associated with the target keyword sentence.
14. A retrieval map construction apparatus characterized by comprising: a processor and a memory storing computer program instructions;
the processor, when executing the computer program instructions, implements the retrieval map construction method according to any one of claims 1 to 6.
15. A retrieval device, characterized by comprising: a processor and a memory storing computer program instructions;
the processor, when executing the computer program instructions, implements a retrieval method as recited in any of claims 7 to 11.
16. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon computer program instructions which, when executed by a processor, implement the retrieval map construction method according to any one of claims 1 to 6 and/or the retrieval method according to any one of claims 7 to 11.
CN202110212739.9A 2021-02-25 2021-02-25 Search map construction method, search device, search apparatus, and storage medium Active CN112989814B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110212739.9A CN112989814B (en) 2021-02-25 2021-02-25 Search map construction method, search device, search apparatus, and storage medium

Publications (2)

Publication Number Publication Date
CN112989814A (en) 2021-06-18
CN112989814B (en) 2023-08-18

Family

ID=76350788

Country Status (1)

Country Link
CN (1) CN112989814B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180129372A1 (en) * 2016-11-08 2018-05-10 Microsoft Technology Licensing, Llc Dynamic insight objects for user application data
US20180129946A1 (en) * 2016-11-08 2018-05-10 Microsoft Technology Licensing, Llc Application usage signal inference and repository
CN109145102A (en) * 2018-09-06 2019-01-04 杭州安恒信息技术股份有限公司 Intelligent answer method and its knowledge mapping system constituting method, device, equipment
US20190026372A1 (en) * 2015-12-14 2019-01-24 Microsoft Technology Licensing, Llc Facilitating discovery of information items using dynamic knowledge graph
US20190213488A1 (en) * 2016-09-02 2019-07-11 Hithink Financial Services Inc. Systems and methods for semantic analysis based on knowledge graph
WO2019212729A1 (en) * 2018-05-03 2019-11-07 Microsoft Technology Licensing, Llc Generating response based on user's profile and reasoning on contexts
CN110909364A (en) * 2019-12-02 2020-03-24 西安工业大学 Source code bipolar software security vulnerability map construction method
US20200334545A1 (en) * 2019-04-19 2020-10-22 Adobe Inc. Facilitating changes to online computing environment by assessing impacts of actions using a knowledge base representation
US20200342056A1 (en) * 2019-04-26 2020-10-29 Tencent America LLC Method and apparatus for natural language processing of medical text in chinese
US20210049202A1 (en) * 2019-08-16 2021-02-18 The Toronto-Dominion Bank Automated image retrieval with graph neural network

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
WENRUI CHEN, DONGPAO HONG, CHAO ZHENG: "Learning knowledge graph embedding with entity descriptions based on LSTM networks", 2020 IEEE ISPCE-CN *
仇建飞;: "打造智能互联网化的支撑客服系统的设计与实现", 电子技术与软件工程, no. 11 *
吕亿林;田宏韬;高建伟;万怀宇;: "结合百科知识与句子语义特征的关系抽取方法", 计算机科学, no. 1 *
葛召华;张中坤;李博;: "基于知识图谱的水利数据垂直搜索应用", 山东水利, no. 05 *
陶兴;张向先;张莉曼;卢恒;: "网络学术社区跨平台用户生成内容知识聚合研究", 情报理论与实践, no. 07 *

Also Published As

Publication number Publication date
CN112989814B (en) 2023-08-18

Similar Documents

Publication Publication Date Title
CN109299228B (en) Computer-implemented text risk prediction method and device
CN107102993B (en) User appeal analysis method and device
CN107229627B (en) Text processing method and device and computing equipment
WO2020113225A1 (en) Systems and methods for identifying an event in data
Al-Ash et al. Fake news identification characteristics using named entity recognition and phrase detection
CN111190997A (en) Question-answering system implementation method using neural network and machine learning sequencing algorithm
CN113254643B (en) Text classification method and device, electronic equipment and text classification program
CN108228567B (en) Method and device for extracting short names of organizations
CN111783450B (en) Phrase extraction method and device in corpus text, storage medium and electronic equipment
CN112765974B (en) Service assistance method, electronic equipment and readable storage medium
EP4060548A1 (en) Method and device for presenting prompt information and storage medium
CN110675269A (en) Text auditing method and device
CN111625621A (en) Document retrieval method and device, electronic equipment and storage medium
CN112347223A (en) Document retrieval method, document retrieval equipment and computer-readable storage medium
CN112183102A (en) Named entity identification method based on attention mechanism and graph attention network
CN114756675A (en) Text classification method, related equipment and readable storage medium
CN114327609A (en) Code completion method, model and tool
CN112668305B (en) Attention mechanism-based thesis reference quantity prediction method and system
CN113011156A (en) Quality inspection method, device and medium for audit text and electronic equipment
CN112989814B (en) Search map construction method, search device, search apparatus, and storage medium
CN115840808A (en) Scientific and technological project consultation method, device, server and computer-readable storage medium
CN107729509A (en) The chapter similarity decision method represented based on recessive higher-dimension distributed nature
KR102321871B1 (en) Pattern Dictionary Establish Method by Data Classify and Analysis
CN114117047A (en) Method and system for classifying illegal voice based on C4.5 algorithm
CN113011162A (en) Reference resolution method, device, electronic equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant