CN112269816B - Government affair appointment correlation retrieval method - Google Patents

Info

Publication number
CN112269816B
Authority
CN
China
Prior art keywords
index
search
appointment
log
relevance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011244701.1A
Other languages
Chinese (zh)
Other versions
CN112269816A (en
Inventor
张超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Cloud Information Technology Co Ltd
Original Assignee
Inspur Cloud Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Cloud Information Technology Co Ltd
Priority to CN202011244701.1A
Publication of CN112269816A
Application granted
Publication of CN112269816B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2468 Fuzzy queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a relevance retrieval method for government affairs appointment items, belonging to the technical field of government affairs appointment booking. Based on user operation records, a scheduled task periodically summarizes the logs to generate relevance-type indexes, while basic information maintenance generates ordinary (basic) indexes. The relevance-type indexes are generated with a score-based statistical model, so that appointment service items are searched through a combination of keyword search, associated-word search, and ranking by the relevance between keywords and appointment services. When a member of the public books an appointment online, the method can pinpoint what that person needs and display related services, which improves the efficiency of booking and handling. Because the method rests on statistical analysis of the collected data, query accuracy can be optimized continuously, improving both performance and user experience.

Description

Government affair appointment correlation retrieval method
Technical Field
The invention relates to the technical field of government affairs appointment booking, and in particular to a relevance retrieval method for government affairs appointment items.
Background
With the continuing development of government services and the maturing of the mobile internet, online appointment booking through web pages, apps, and mini programs gives the public a simple, convenient, and efficient way to handle government services. The demand for intelligent handling is nevertheless increasingly urgent: more and more people expect service handling that is intelligent, personalized, and accurate. To move government service capability from "can be handled" through "handled quickly" to "handled intelligently", the service mode also needs to change: online services should draw on statistical analysis of user data to raise the keyword hit rate and improve government service management.
Disclosure of Invention
The technical task of the invention is to address the above shortcomings by providing a relevance retrieval method for government affairs appointment items. When a member of the public books an appointment online, the method can pinpoint that person's needs and display related services, improving the efficiency of booking and handling. Because it rests on statistical analysis of the data, query accuracy can be optimized continuously, improving performance and user experience.
The technical solution adopted by the invention to solve this technical problem is as follows:
in the relevance retrieval method for government affairs appointment items, relevance-type indexes are generated from user operation records by periodic summarization in a scheduled task, and ordinary (basic) indexes are generated by basic information maintenance;
the relevance-type indexes are generated with a score-based statistical model, so that appointment service items are searched through a combination of keyword search, associated-word search, and ranking by the relevance between keywords and appointment services. In a multi-channel online appointment scenario, this enables fast retrieval of appointment service items, analysis-based recommendation of related services, and accurate prediction of which appointment items a user needs. The method thereby meets the requirements of personalization, intelligence, and high accuracy, reduces the search load on the database, and improves the efficiency with which the public handles its affairs.
In short, the method locates what the person booking online needs, displays related services, and keeps improving query accuracy through statistical analysis of the data.
Preferably, the Elasticsearch search engine and the Chinese IK analyzer (word segmenter) are used for retrieval. Elasticsearch is an open-source, highly scalable, distributed full-text search engine that can store and search data in near real time and holds a leading position in the open-source search field; the Chinese IK analyzer extracts keywords accurately. A service analysis and retrieval method can therefore be built on Elasticsearch.
A search engine is used in place of plain database search, and Elasticsearch is a good choice. It is an open-source, distributed, RESTful search and data analysis engine built on the open-source library Apache Lucene. As a distributed full-text search engine it scales well and supports PB-scale structured and unstructured data, so it is well suited to quickly locating appointment service items in a large, centrally deployed system with a huge volume of data.
Elasticsearch supports many good analyzers. The Chinese IK analyzer is chosen here; it provides two segmentation modes, ik_smart and ik_max_word. To maximize the chance of locating the user's target data, the finest-grained mode, ik_max_word, is used: it splits the text at multiple semantic levels, which produces more index entries and therefore higher positioning accuracy.
Preferably, RocketMQ is used as the message queue to decouple normal business processing from the recording of results.
To achieve higher search precision, the user's search words and the corresponding search results must be collected and aggregated, and this data then waits to be pulled into the analysis model service by a subsequent scheduled task. So that the normal business flow is not affected, a message queue is used for asynchronous decoupling. RocketMQ is a good choice: it provides a transactional message solution, which ensures that each result set is consumed and stored correctly.
Preferably, three index types are used: the retrieval result set is assembled from three kinds of index plus a database SQL query. The three index types are:
the index generated by associating each keyword, obtained by word segmentation of the appointment service name, with the current appointment service, denoted type N (Normal);
the index of relevance patterns obtained by periodically summarizing and analyzing user feedback and trigger-behavior logs, denoted type C (Correlation);
and the index of appointment service items reached through the associated words of a keyword, denoted type R (Related).
The three index types differ in importance. Search results are ordered with type-C results first, then type-N results, then type-R results, and duplicates are removed, so that appointment items are displayed in order of relevance and the search is more reliable.
Further, each record in the search result set contains the fields service item name, service item ID, service item department, keyword, index type, and index ID; the result set as a whole also carries the UUID of the current search, which provides the data needed for subsequent user log collection and recall records.
Preferably, the ordinary index is generated through basic information maintenance: when the management service maintains appointment service items, each addition, modification, or deletion affects the basic index (that is, the type-N index):
after an appointment item is added, its service name is segmented into keywords, and each keyword together with the ID of the current appointment service item forms an index entry that is stored in the ES service;
after an appointment item is modified, the original basic index entries are deleted by appointment item ID and new basic index entries are regenerated;
after an appointment item is deleted, the original basic index entries are deleted by appointment item ID, and the entries of the other two index types are likewise deleted by appointment item ID, which keeps the data accurate.
Preferably, log collection of the user's operation records comprises:
after the search request from the client has been processed, each record of the search result set is assembled into a log message and put into the message queue, and the log service, acting as the consumer of these messages, records the log information; the assembled log fields comprise the search word, keyword, ES index ID, item ID, index type, and current search UUID;
after the client user obtains the search results and clicks to browse one of them, a click-positioning recall log is formed, and the client sends this data to the log service for storage; repeated clicks are recorded only once, to prevent distortion of the analysis data; the log fields comprise the current search UUID, index ID, item ID, keyword, index type, and search word;
after the client user obtains the search results, clicks to browse one of them, and successfully completes the service, a successful-handling recall log is formed, and the client sends this data to the log service for storage; the log fields comprise the current search UUID, index ID, index type, item ID, keyword, and search word;
once the three kinds of log have been collected, logs that share the same search UUID are treated as one group of search-flow logs; after they have each entered the log service, they wait to be scanned by the scheduled task, which passes the group to the analysis model service for analysis.
Preferably, the relevance index is generated as follows:
the scheduled task of the log service scans the collected logs, packages the three kinds of log by search UUID, and sends them to the analysis model service, which generates the relevance index. The relevance of each relevance index is computed by a numerical statistical rule: the attributes of a newly generated relevance index include a relevance field whose default value is 100 and whose range is 0 to 1000.
The key information in a relevance index is the associated keyword, the corresponding basic index ID, and the relevance value. Keywords and basic indexes are mapped many-to-many, with a relevance value attached to each mapping; a search of this index type follows the associated basic index ID and so finally points to a basic index entry. Different step values are set for the different log types: browse recall (+1), handling recall (+2), and miss recall (-1). The relevance value changes accordingly, and indexes whose relevance value is 0 are periodically scanned and deleted. In this way the analysis service module continually corrects the relevance value of each relevance index, which improves the hit rate;
the logs carry different index types, but by default every log carries both a keyword and a basic index ID, so the processing is largely the same: check whether the relevance index already exists; if it does, modify its value according to the step rule, and if it does not, generate a new index according to the generation rule;
as the model is continually calibrated on a large amount of data, the relevance values of all keywords of one appointment service item come to follow an approximately normal distribution, and the relevance values determine the display sorting priority.
The associated word index is generated as follows:
the associated word index contains a keyword and its array of associated words. Its data source is the appointment service search interface, which, after processing the search results, puts the IK segmentation result into the message queue; the analysis model service, acting as the consumer, processes it and generates the associated word index, with the message queue providing asynchronous decoupling.
Searching with keywords and relevance alone may still give inaccurate results, so the associated word index is used to provide inference and recommendation, which improves search accuracy.
The retrieval method matches both keywords and associated-word results when a search term is queried fuzzily, improving the hit rate for what the public needs;
based on the log records of what users browse and successfully handle after the search results are displayed, relevance retrieval is driven by recall rate, and results are displayed by relevance weight, which improves accuracy;
the relevance analysis model based on recall rate improves the accuracy of the relevance between keywords and item information;
and the association index between keywords and associated words strengthens the personalized recommendation function.
The invention also claims a government affair appointment correlation retrieval device, which comprises: at least one memory and at least one processor;
the at least one memory for storing a machine readable program;
the at least one processor is configured to invoke the machine-readable program to perform the method described above.
The invention also claims a computer readable medium having stored thereon computer instructions which, when executed by a processor, cause the processor to perform the above-described method.
Compared with the prior art, the government affair appointment correlation retrieval method has the following beneficial effects:
when the system performs service retrieval and fuzzy keyword search, it no longer relies entirely on database retrieval, which reduces the load on the database under concurrency and increases concurrent capacity;
the data returned by the retrieval interface comes from Elasticsearch indexes plus a partial database query, so millisecond-level response times can be provided, reducing waiting for the public and improving the user experience.
Based on statistical analysis of user operation logs and periodic summarization by scheduled tasks, query results become more intelligent, accurate, and personalized; the method replaces the original database-based fuzzy search and improves how accurately the public can locate the service it needs.
Drawings
FIG. 1 is a flow chart of the relevance retrieval method for government affairs appointment items provided by one embodiment of the invention;
FIG. 2 is a schematic diagram of retrieval between the client and the appointment server according to an embodiment of the invention.
Detailed Description
The invention will be further described with reference to the drawings and the specific examples.
In current government service scenarios, members of the public need to find the standard service item that matches their needs, but when they are unfamiliar with the services, both the efficiency and the accuracy of locating it are often low, and database-based search increasingly fails to meet the needs of production environments. When service query and retrieval is based on database search, keywords can only be matched fuzzily with the database LIKE operator. In a province-wide unified appointment scenario with a huge number of service items, a slight discrepancy in the keywords a user enters can produce very different results, making it hard to locate the service item that matches the need. To improve this situation, service query and retrieval needs word segmentation, fuzzy query, query with input that does not correspond exactly to the item name, and millisecond-level response.
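The weakness of LIKE-style matching described above can be illustrated with a small sketch. It is not taken from the patent: `like_match` is only a rough stand-in for SQL `LIKE '%...%'`, the item names are made up, and whitespace splitting is a placeholder for a real word segmenter such as the IK analyzer.

```python
def like_match(query, names):
    """Rough stand-in for SQL `name LIKE '%query%'`: plain substring containment."""
    return [name for name in names if query in name]


def token_match(query, names, tokenize):
    """Match when the query and the item name share at least one token."""
    query_tokens = set(tokenize(query))
    return [name for name in names if query_tokens & set(tokenize(name))]


# Hypothetical item names; whitespace splitting is NOT the IK analyzer.
names = ["vehicle annual inspection", "driving licence renewal"]
whitespace = str.split

# The user types the words in a slightly different order.
substring_hits = like_match("annual vehicle inspection", names)
token_hits = token_match("annual vehicle inspection", names, whitespace)
```

With the reordered query, the substring match finds nothing while the token-based match still finds the inspection item, which is the behavior that motivates segmenting both item names and search words.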
The embodiment of the invention provides a relevance retrieval method for government affairs appointment items. Building on the basic functions of the Elasticsearch search engine and the indexing capabilities of the Chinese IK analyzer, relevance-type indexes are generated from user operation records by periodic summarization in a scheduled task, and ordinary (basic) indexes are generated by basic information maintenance;
the relevance-type indexes are generated with a score-based statistical model, so that appointment service items are searched through a combination of keyword search, associated-word search, and ranking by the relevance between keywords and appointment services. In a multi-channel online appointment scenario, this enables fast retrieval of appointment service items, analysis-based recommendation of related services, and accurate prediction of which appointment items a user needs. The method thereby meets the requirements of personalization, intelligence, and high accuracy, reduces the search load on the database, and improves the efficiency with which the public handles its affairs.
When a member of the public books an appointment online, the method can pinpoint that person's needs and display related services, improving the efficiency of booking and handling; because it rests on statistical analysis of the data, query accuracy can be optimized continuously, improving performance and user experience.
Elasticsearch is an open-source, highly scalable, distributed full-text search engine that can store and search data in near real time and holds a leading position in the open-source search field; the Chinese IK analyzer extracts keywords accurately. A service analysis and retrieval method can therefore be built on Elasticsearch.
A search engine is used in place of plain database search, and Elasticsearch is a good choice. It is an open-source, distributed, RESTful search and data analysis engine built on the open-source library Apache Lucene. As a distributed full-text search engine it scales well and supports PB-scale structured and unstructured data, so it is well suited to quickly locating appointment service items in a large, centrally deployed system with a huge volume of data.
Elasticsearch supports many good analyzers. The Chinese IK analyzer is chosen here; it provides two segmentation modes, ik_smart and ik_max_word. To maximize the chance of locating the user's target data, the finest-grained mode, ik_max_word, is used: it splits the text at multiple semantic levels, which produces more index entries and therefore higher positioning accuracy.
RocketMQ is used as the message queue to decouple normal business processing from the recording of results. To achieve higher search precision, the user's search words and the corresponding search results must be collected and aggregated, and this data then waits to be pulled into the analysis model service by a subsequent scheduled task. So that the normal business flow is not affected, a message queue is used for asynchronous decoupling. RocketMQ is a good choice: it provides a transactional message solution, which ensures that each result set is consumed and stored correctly.
The terms used herein are:
search word: the search sentence or term entered by the user at the client;
keyword: each segmentation result produced by the IK analyzer in ik_max_word mode;
associated word: a keyword whose search results are the same as or similar to those of a given keyword is called an associated word of that keyword.
The search result set is assembled from three kinds of index plus a database SQL query. The three index types are:
Type N ("Normal"): the index generated by associating each keyword, obtained by word segmentation of the appointment service name, with the current appointment service;
Type C ("Correlation"): the index of relevance patterns obtained by periodically summarizing and analyzing user feedback and trigger-behavior logs;
Type R ("Related"): the index of appointment service items reached through the associated words of a keyword.
The search flow between the client and the appointment server is, briefly, as follows: the appointment server receives the search words entered by the user and segments them into several keywords using the Chinese IK analyzer in ik_max_word mode; for each keyword it calls the ES container service to obtain results from the three index types, takes the primary-key IDs of the service items from those index entries, retrieves the basic information of the service items from the database, and assembles the result set that is returned to the client. See FIG. 2.
The three index types differ in importance. Search results are ordered with type-C results first, then type-N results, then type-R results, and duplicates are removed, so that appointment items are displayed in order of relevance and the search is more reliable.
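The ordering and de-duplication step can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Hit` record, its field names, and the choice to sort by relevance value within a type and to de-duplicate by item ID are all assumptions made for the example.

```python
from dataclasses import dataclass

# Priority of index types as described in the text: C first, then N, then R.
TYPE_PRIORITY = {"C": 0, "N": 1, "R": 2}


@dataclass(frozen=True)
class Hit:
    item_id: str        # appointment service item ID
    keyword: str
    index_type: str     # "C", "N" or "R"
    relevance: int = 0  # assumed to be meaningful only for type-C hits


def merge_hits(hits):
    """Order hits C > N > R (higher relevance first within a type), then keep
    only the first hit for each item ID."""
    ordered = sorted(hits, key=lambda h: (TYPE_PRIORITY[h.index_type], -h.relevance))
    seen = set()
    merged = []
    for hit in ordered:
        if hit.item_id not in seen:
            seen.add(hit.item_id)
            merged.append(hit)
    return merged
```

An item reached through both a type-C and a type-N entry is therefore shown once, at the position of its type-C entry.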
Each record in the search result set contains the fields service item name, service item ID, service item department, keyword, index type, and index ID; the result set as a whole also carries the UUID of the current search, which provides the data needed for subsequent user log collection and recall records.
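The search flow and result fields described above can be sketched roughly as below. The functions `tokenize`, `lookup_index`, and `load_item` are hypothetical placeholders for the IK analyzer, the ES index query, and the database lookup; none of them is a real library API, and the dictionary key names are assumptions.

```python
import uuid

INDEX_TYPES = ("C", "N", "R")


def search(search_words, tokenize, lookup_index, load_item):
    """Segment the search words, query the three index types per keyword,
    and assemble a result set tagged with a fresh search UUID.

    tokenize(text) -> list of keywords                      (stand-in for IK)
    lookup_index(index_type, keyword) -> list of entries    (stand-in for ES)
    load_item(item_id) -> dict with "name", "department"    (stand-in for DB)
    """
    rows = []
    for keyword in tokenize(search_words):
        for index_type in INDEX_TYPES:
            for entry in lookup_index(index_type, keyword):
                item = load_item(entry["item_id"])
                rows.append({
                    "item_name": item["name"],
                    "item_id": entry["item_id"],
                    "department": item["department"],
                    "keyword": keyword,
                    "index_type": index_type,
                    "index_id": entry["index_id"],
                })
    # The UUID ties later click and handling logs back to this search.
    return {"search_uuid": str(uuid.uuid4()), "results": rows}
```

Ranking and de-duplication are left out here to keep the sketch focused on how the result fields and the search UUID are assembled.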
The analysis model service needs the support of a large volume of user behavior logs, so the full log chain of search analysis contains three kinds of search log. Their generation, contents, and storage are as follows:
After the search request from the client has been processed, each record of the current search result set is assembled into a log message and put into the message queue, and the log service, acting as the consumer of these messages, records the log information. The assembled log fields comprise the search word, keyword, ES index ID, item ID, index type, and UUID of the current search.
After the client user obtains the search results and clicks to browse one of them, a click-positioning recall log is formed, and the client sends this data to the log service for storage. The log fields comprise the current search UUID, index ID, item ID, keyword, index type, and search word; repeated clicks are recorded only once, to prevent distortion of the analysis data.
After the client user obtains the search results, clicks to browse one of them, and successfully completes the service, a successful-handling recall log is formed, and the client sends this data to the log service for storage; the log fields comprise the current search UUID, index ID, index type, item ID, keyword, and search word.
Once the three kinds of log have been collected, logs that share the same search UUID are treated as one group of search-flow logs; after they have each entered the log service, they wait to be scanned by the scheduled task, which passes the group to the analysis model service for analysis.
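Grouping the three kinds of log by search UUID might look like the sketch below. The `kind` field and its values (`"search"`, `"click"`, `"handled"`) are assumptions introduced for the example, as is treating a click as a repeat when it has the same search UUID and index ID.

```python
from collections import defaultdict


def group_by_search(logs):
    """Group log records by search UUID, recording repeated clicks only once."""
    groups = defaultdict(lambda: {"search": [], "click": [], "handled": []})
    seen_clicks = set()
    for log in logs:
        if log["kind"] == "click":
            click_key = (log["search_uuid"], log["index_id"])
            if click_key in seen_clicks:
                continue  # repeated click on the same result: skip
            seen_clicks.add(click_key)
        groups[log["search_uuid"]][log["kind"]].append(log)
    return dict(groups)
```

Each value of the returned dictionary is one group of search-flow logs that a scheduled task could hand to the analysis step.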
The management service maintains the basic index:
when the management service maintains appointment service items, each addition, modification, or deletion affects the basic index (that is, the type-N index):
after an appointment item is added, its service name is segmented into keywords, and each keyword together with the ID of the current appointment service item forms an index entry that is stored in the ES service;
after an appointment item is modified, the original basic index entries are deleted by appointment item ID and new basic index entries are regenerated;
after an appointment item is deleted, the original basic index entries are deleted by appointment item ID, and the entries of the other two index types are likewise deleted by appointment item ID, which keeps the data accurate.
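The add, modify, and delete rules can be sketched with an in-memory stand-in for the ES service. The `IndexStore` class, its attribute names, and the use of a caller-supplied `tokenize` function in place of the IK analyzer are all assumptions made for illustration.

```python
import itertools


class IndexStore:
    """In-memory stand-in for the ES service (illustration only)."""

    def __init__(self, tokenize):
        self.tokenize = tokenize          # placeholder for the IK analyzer
        self.base = {}                    # type-N entries: index_id -> entry
        self.derived = {}                 # type-C and type-R entries
        self._ids = itertools.count(1)

    def add_item(self, item_id, name):
        # One basic index entry per keyword of the service name.
        for keyword in self.tokenize(name):
            self.base[next(self._ids)] = {"keyword": keyword, "item_id": item_id}

    @staticmethod
    def _drop(table, item_id):
        for index_id in [i for i, e in table.items() if e["item_id"] == item_id]:
            del table[index_id]

    def modify_item(self, item_id, new_name):
        # Delete the old basic entries by item ID, then regenerate them.
        self._drop(self.base, item_id)
        self.add_item(item_id, new_name)

    def delete_item(self, item_id):
        # Delete basic entries and the entries of the other two index types.
        self._drop(self.base, item_id)
        self._drop(self.derived, item_id)
```

Note that modification only rebuilds the basic entries, while deletion also clears the derived ones, mirroring the three rules above.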
The analysis model service generates the relevance index:
the scheduled task of the log service scans the collected logs, packages the three kinds of log by search UUID, and sends them to the analysis model service, which generates the relevance index. The relevance of each relevance index is computed by a numerical statistical rule: the attributes of a newly generated relevance index include a relevance field whose default value is 100 and whose range is 0 to 1000.
The key information in a relevance index is the associated keyword, the corresponding basic index ID, and the relevance value. In essence, keywords and basic indexes are mapped many-to-many, with a relevance value attached to each mapping; a search of this index type follows the associated basic index ID and so finally points to a basic index entry. Different step values are set for the different log types: browse recall (+1), handling recall (+2), and miss recall (-1). The relevance value changes accordingly, and indexes whose relevance value is 0 are periodically scanned and deleted. In this way the analysis service module continually corrects the relevance value of each relevance index, which improves the hit rate;
the logs carry different index types, but by default every log carries both a keyword and a basic index ID, so the processing is largely the same: check whether the relevance index already exists; if it does, modify its value according to the step rule, and if it does not, generate a new index according to the generation rule.
As the model is continually calibrated on a large amount of data, the relevance values of all keywords of one appointment service item come to follow an approximately normal distribution, and the relevance values determine the display sorting priority.
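The step rule can be sketched as a small update function. The default of 100, the range 0 to 1000, and the steps +1, +2, and -1 come from the description; the dictionary layout, the log type names, clamping at the range bounds, and creating a new entry at the default without applying a step are assumptions of this sketch.

```python
DEFAULT_RELEVANCE = 100
MIN_RELEVANCE, MAX_RELEVANCE = 0, 1000

# Step values per log type: browse recall, handling recall, miss recall.
STEP = {"browse": 1, "handled": 2, "miss": -1}


def apply_log(index, keyword, base_index_id, log_type):
    """Update the relevance value for one (keyword, basic index ID) mapping."""
    key = (keyword, base_index_id)
    if key not in index:
        # Assumption: a new entry starts at the default, with no step applied.
        index[key] = DEFAULT_RELEVANCE
    else:
        value = index[key] + STEP[log_type]
        index[key] = max(MIN_RELEVANCE, min(MAX_RELEVANCE, value))
    return index[key]


def prune(index):
    """Periodic cleanup: delete entries whose relevance value has reached 0."""
    for key in [k for k, v in index.items() if v == MIN_RELEVANCE]:
        del index[key]
```

Because a handled search weighs twice as much as a browse and a miss subtracts one, entries that users keep ignoring drift toward 0 and are eventually removed by `prune`.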
The analysis model generates the associated word index:
the associated word index contains a keyword and its array of associated words. Its data source is the appointment service search interface, which, after processing the search results, puts the IK segmentation result into the message queue; the analysis model service, acting as the consumer, processes it and generates the associated word index, with the message queue providing asynchronous decoupling.
The IK segmentation result includes the segmentation of the search words. For example, when the search word is "vehicle annual inspection", segmentation yields keywords such as "vehicle", "annual inspection", "vehicle annual inspection", and "vehicle inspection", which are associated words of one another. An associated word index entry is built for each of these keywords in turn; each entry contains a keyword and its array of associated words. When a later search is triggered, the associated words are found from the keyword, and the basic index is then searched with those associated words. The associated word array of an entry is continually supplemented from IK segmentation results, but it holds at most 20 words so that searching the basic index does not become slow.
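Building and capping the associated word arrays can be sketched as follows. The cap of 20 words comes from the description; the dictionary layout and the rule that words beyond the cap are simply not added are assumptions of this sketch.

```python
MAX_RELATED = 20  # cap on the associated word array, as stated in the text


def update_related(index, keywords):
    """Record that the keywords from one segmentation result are associated
    with one another, keeping each array unique and at most MAX_RELATED long."""
    for keyword in keywords:
        related = index.setdefault(keyword, [])
        for other in keywords:
            if other == keyword or other in related:
                continue
            if len(related) >= MAX_RELATED:
                break  # array is full; further words are not added
            related.append(other)
    return index
```

Calling `update_related` again with a later segmentation result supplements the existing arrays, which matches the continual supplementation described above.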
Searching with keywords and relevance alone may still return inaccurate results, so the associated word index is used to provide inference and recommendation and thereby improve search accuracy.
The retrieval method matches both keywords and their associated words when a fuzzy query is performed on the search terms, which improves the hit rate for users' needs;
based on the logs recording what users browse and successfully transact after search results are displayed, a relevance retrieval mode driven by recall is provided, and results are displayed by relevance weight to improve accuracy;
the recall-based relevance analysis model improves the accuracy of the relevance between keywords and item information;
and the associated index between keywords and associated words is implemented, which strengthens the personalized recommendation function.
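The combined lookup summarized above (keyword match, expansion through associated words, display by relevance value) can be sketched roughly as follows. This is a simplified illustration under stated assumptions: plain dictionaries replace the Elasticsearch indexes, the function name `search` is hypothetical, the separate index types of the claims are not modeled, and an item reached through several keywords is assumed to keep its highest relevance value.

```python
def search(term_keywords, relevance, associated):
    """Return basic index IDs ranked by relevance value, de-duplicated.

    term_keywords: keywords from segmenting the search term
    relevance:     (keyword, base_index_id) -> relevance value
    associated:    keyword -> list of associated words
    """
    # Expand the query: direct keywords first, then their associated words.
    expanded = list(term_keywords)
    for kw in term_keywords:
        for word in associated.get(kw, []):
            if word not in expanded:
                expanded.append(word)

    # Keep the best relevance value seen for each basic index ID.
    best = {}
    for (kw, base_id), value in relevance.items():
        if kw in expanded and value > best.get(base_id, float("-inf")):
            best[base_id] = value

    # Highest relevance first; ties broken by ID for a stable order.
    return sorted(best, key=lambda base_id: (-best[base_id], base_id))
```

With an empty associated word index the function degrades to a plain keyword lookup, which shows what the associated words add: items that the literal keywords alone would not reach.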
The invention also claims a government affair appointment correlation retrieval device, which comprises: at least one memory and at least one processor;
the at least one memory for storing a machine readable program;
the at least one processor is configured to invoke the machine-readable program and execute the government affair appointment correlation retrieval method.
The embodiment of the invention also provides a computer readable medium storing computer instructions which, when executed by a processor, cause the processor to execute the government affair appointment correlation retrieval method described in the above embodiment of the invention. Specifically, a system or apparatus may be provided with a storage medium on which software program code realizing the functions of any of the above embodiments is stored, and a computer (or CPU or MPU) of the system or apparatus may be caused to read out and execute the program code stored in the storage medium.
In this case, the program code itself read from the storage medium may realize the functions of any of the above-described embodiments, and thus the program code and the storage medium storing the program code form part of the present invention.
Examples of the storage medium for providing the program code include a floppy disk, a hard disk, a magneto-optical disk, an optical disk (e.g., CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, DVD+RW), a magnetic tape, a nonvolatile memory card, and a ROM. Alternatively, the program code may be downloaded from a server computer via a communication network.
Further, it should be apparent that the functions of any of the above-described embodiments may be implemented not only by executing the program code read out by the computer, but also by causing an operating system or the like operating on the computer to perform part or all of the actual operations based on the instructions of the program code.
Further, it is understood that the program code read out from the storage medium may be written into a memory provided in an expansion board inserted into a computer or into a memory provided in an expansion unit connected to the computer, and a CPU or the like mounted on the expansion board or the expansion unit may then be caused to perform part or all of the actual operations based on instructions of the program code, thereby realizing the functions of any of the above embodiments.
While the invention has been illustrated and described in detail in the drawings and in the preferred embodiments, the invention is not limited to the disclosed embodiments, and it will be appreciated by those skilled in the art that the code audits of the various embodiments described above may be combined to produce further embodiments of the invention, which are also within the scope of the invention.

Claims (9)

1. A government affair appointment correlation retrieval method, characterized in that a relevance type index is generated by timed-task induction based on user operation records, and a common type index is generated by basic information maintenance;
the relevance type index is generated by a statistical model in scoring form, yielding a combined retrieval mode for reservation service item retrieval that comprises keyword retrieval, associated word retrieval, and relevance ranking of keywords and reservation services;
logs of user operation records are collected and a relevance index is generated:
the log service timed task scans the collected logs, packages them by search UUID, and sends them to the analysis model service for processing to generate the relevance index:
the relevance index comprises a keyword, a corresponding basic index ID, and a relevance value; keywords and basic indexes are mapped in a many-to-many association with the relevance value attached, and the basic index ID associated in the index is used, when this type of index is searched, to point ultimately to the basic index; meanwhile, different step values are set for different log types: browse recall +1, transaction recall +2, and miss recall -1; the relevance value changes accordingly in sequence, and indexes whose relevance value is 0 are scanned and deleted periodically;
whether a relevance index exists is checked; if so, its value is modified according to the step rule, and if not, a new index is generated according to the generation rule;
an associated word index is generated:
the associated word index contains as fields a keyword and its associated word array; the reservation service search interface puts the IK word segmentation result into a message queue after processing the search result, and the analysis model service acts as the consumer to process it and generate the index, the message queue thereby providing asynchronous decoupling.
2. The government affair appointment correlation retrieval method according to claim 1, wherein retrieval is performed using the Elasticsearch search engine and the Chinese IK word segmenter.
3. The government affair appointment correlation retrieval method according to claim 1 or 2, wherein RocketMQ is selected as the message queue to decouple normal business processing from result recording.
4. The government affair appointment correlation retrieval method according to claim 1, wherein the relevance type index comprises three index types, namely:
an index generated by associating the keywords obtained from word segmentation of the appointment business name with the current appointment business, denoted type N;
an index of relevance patterns periodically induced and analyzed from user feedback and trigger behavior logs, denoted type C;
and a reservation service item index reached through the associated words of a keyword, denoted type R;
the search results are sorted by relevance in the order of C-type index results, N-type index results, and R-type index results, and the results are de-duplicated.
5. The government affair appointment correlation retrieval method according to claim 4, wherein each record in the search result set contains the fields business item name, business item ID, business item department, keyword, index type, and index ID, and the overall result set also contains the UUID of the search, so as to provide data for subsequent user log collection and recall records.
6. The government affair appointment correlation retrieval method according to claim 2, wherein generating the common type index by basic information maintenance comprises:
after a reservation item is added, its business name is word-segmented, and each keyword together with the current reservation business item ID forms index data that is stored in the ES service;
after the appointment is modified, the original basic type index is deleted according to the appointment ID and a new basic type index is regenerated;
after the appointment is deleted, the original basic type index is deleted according to the appointment ID, and the other two types of indexes are likewise deleted according to the appointment ID, so that the accuracy of the data is ensured.
7. The government affair appointment correlation retrieval method according to claim 1, 2, 4, 5 or 6, wherein the step of collecting logs of user operation records comprises:
after processing of the client search request is finished, each record of the search result set is assembled as log information and put into a message queue, and the log service, acting as the consumer of the messages, records the log information; the assembled log information fields comprise the search word, keyword, ES index ID, item ID, index type, and current search UUID;
after the client user obtains the search results, clicking and browsing a search result forms a click positioning recall log, and the client sends the data to the log service for storage, repeated clicks being recorded only once; the log information comprises the current search UUID, index ID, item ID, keyword, index type, and search word;
after the client user obtains the search results, clicking and browsing a search result and successfully transacting the business forms a successful transaction recall log, and the client sends the data to the log service for storage; the log information comprises the current search UUID, index ID, index type, item ID, keyword, and search word;
the three kinds of logs are collected, and logs sharing the same search UUID are regarded as one group of search flow logs; after the logs have each entered the log service, they wait for the timed task scan to submit the group of search flow logs to the analysis model service for analysis.
8. A government affair appointment correlation retrieval device, comprising: at least one memory and at least one processor;
the at least one memory for storing a machine readable program;
said at least one processor for invoking said machine readable program to perform the method of any of claims 1 to 7.
9. A computer readable medium having stored thereon computer instructions which, when executed by a processor, cause the processor to perform the method of any of claims 1 to 7.
CN202011244701.1A 2020-11-10 2020-11-10 Government affair appointment correlation retrieval method Active CN112269816B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011244701.1A CN112269816B (en) 2020-11-10 2020-11-10 Government affair appointment correlation retrieval method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011244701.1A CN112269816B (en) 2020-11-10 2020-11-10 Government affair appointment correlation retrieval method

Publications (2)

Publication Number Publication Date
CN112269816A CN112269816A (en) 2021-01-26
CN112269816B true CN112269816B (en) 2023-04-21

Family

ID=74339950

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011244701.1A Active CN112269816B (en) 2020-11-10 2020-11-10 Government affair appointment correlation retrieval method

Country Status (1)

Country Link
CN (1) CN112269816B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112883246A (en) * 2021-03-09 2021-06-01 数字广东网络建设有限公司 Business item display method, device, equipment and storage medium
CN113377896A (en) * 2021-05-19 2021-09-10 朗新科技集团股份有限公司 Full-text quick retrieval method and device, electronic equipment and storage medium
CN116243833B (en) * 2023-05-08 2023-07-14 北京国信新网通讯技术有限公司 Cloud data-based electronic government platform communication management method and system
CN116975697B (en) * 2023-09-25 2023-12-15 广东赛博威信息科技有限公司 Main data management method, system, equipment and medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107562726A (en) * 2017-09-06 2018-01-09 国家电网公司 A kind of electric service search engine based on hot word
CN110569273A (en) * 2019-07-26 2019-12-13 南京邮电大学 Patent retrieval system and method based on relevance sorting
CN110807138B (en) * 2019-09-10 2022-07-05 国网电子商务有限公司 Method and device for determining search object category
CN111611268A (en) * 2020-05-21 2020-09-01 腾讯科技(深圳)有限公司 Government affair service search processing method and device
CN111859042A (en) * 2020-07-30 2020-10-30 上海妙一生物科技有限公司 Retrieval method and device and electronic equipment

Also Published As

Publication number Publication date
CN112269816A (en) 2021-01-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant