CN116361361A - Data processing method, device, equipment and storage medium - Google Patents

Data processing method, device, equipment and storage medium

Info

Publication number
CN116361361A
Authority
CN
China
Prior art keywords
query
semantic
target
data
phrase
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310405541.1A
Other languages
Chinese (zh)
Inventor
冀慎华
吴婷
舒昭
王晓晨
文蓉蓉
赵娥
刘荣
苏宁
王京鹏
郑金中
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
CCB Finetech Co Ltd
Original Assignee
China Construction Bank Corp
CCB Finetech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp, CCB Finetech Co Ltd
Priority to CN202310405541.1A
Publication of CN116361361A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 Approximate or statistical queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/248 Presentation of query results
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Fuzzy Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure provides a data processing method, which can be applied to the technical fields of big data and finance. The method comprises: in response to a user query request, obtaining a reference field by parsing the user query request, wherein the reference field comprises at least a target query period field and a target institution field of the institution to which the user belongs; reading, from a database and based on the target query period field and the target institution field, target historical query log data that was generated before the target query period and is associated with the target institution; associating the target historical query log data with a plurality of predefined query semantic phrases to obtain a target query semantic phrase; and associating the target query semantic phrase with the data tags of a plurality of pieces of marked data, so as to determine a plurality of pieces of target data from the plurality of pieces of marked data. The present disclosure also provides a data processing apparatus, device, storage medium, and program product.

Description

Data processing method, device, equipment and storage medium
Technical Field
The present disclosure relates to the field of big data technology and the field of financial technology, and in particular, to a data processing method, apparatus, device, medium, and program product.
Background
In scenarios involving statistical analysis of various kinds of data, staff generate large volumes of business result data across many different business areas in the course of statistical work. Historical business result data is often retrieved for reference in subsequent statistical work, and when a project is summarized the results must be classified and counted so that the business result data can be reviewed from an overall perspective.
In the course of realizing the concept of the present disclosure, it was found that, because business result data is huge in volume and consists of unstructured information, business staff often need to spend a great deal of effort searching years of accumulated business results for the information they need. Labor costs are therefore high and working efficiency is low, and processing such huge volumes of data also increases the consumption of computing resources. Moreover, a user often cannot predict suitable query search terms in advance when querying, so it is difficult to match suitable data using the search terms the user enters directly.
Disclosure of Invention
In view of the foregoing, the present disclosure provides a data processing method, apparatus, device, medium, and program product.
In one aspect of the present disclosure, there is provided a data processing method including:
in response to a user query request, obtaining a reference field by parsing the user query request, wherein the reference field comprises at least a target query period field and a target institution field of the institution to which the user belongs;
reading, from a database and based on the target query period field and the target institution field, target historical query log data that was generated before the target query period and is associated with the target institution;
associating the target historical query log data with a plurality of predefined query semantic phrases to obtain a target query semantic phrase;
and associating the target query semantic phrase with the data tags of a plurality of pieces of marked data, so as to determine a plurality of pieces of target data from the plurality of pieces of marked data.
According to an embodiment of the present disclosure, associating the target historical query log data with a plurality of predefined query semantic phrases to obtain the target query semantic phrase includes:
matching the target historical query log data with the plurality of query semantic phrases, and calculating an attention value corresponding to each query semantic phrase;
and determining a query semantic phrase whose attention value is larger than a first preset threshold as the target query semantic phrase.
According to an embodiment of the disclosure, the query semantic phrase includes multiple levels of semantic units, and a business association relationship exists between the multiple levels of semantic units;
matching the target historical query log data with the plurality of query semantic phrases and calculating the attention value corresponding to each query semantic phrase includes:
determining the number of occurrences, in the target historical query log data, of the semantic unit at each level of the query semantic phrase;
and calculating the attention value corresponding to each query semantic phrase according to the number of occurrences of the semantic units at each level.
According to an embodiment of the disclosure, the query semantic phrases include first-type query semantic phrases and second-type query semantic phrases;
a first-type query semantic phrase includes multiple levels of first-type semantic units, and the multiple levels of first-type semantic units include: control element, control content, and control effect;
a second-type query semantic phrase includes multiple levels of second-type semantic units, and the multiple levels of second-type semantic units include: risk subject, risk action, and risk object.
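As a non-limiting illustration, the two types of query semantic phrase described above could be held in a small record type such as the following sketch. The level names are taken from the text; the class name, the constant names, and the choice of tuples are assumptions made for illustration only and are not part of the disclosure.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Level names come from the text; the identifiers are hypothetical.
FIRST_TYPE_LEVELS = ("control element", "control content", "control effect")
SECOND_TYPE_LEVELS = ("risk subject", "risk action", "risk object")


@dataclass(frozen=True)
class QuerySemanticPhrase:
    levels: Tuple[str, ...]  # names of the levels, in order
    units: Tuple[str, ...]   # semantic unit at each level, in the same order

    def __post_init__(self) -> None:
        # Every level must be paired with exactly one semantic unit.
        if len(self.levels) != len(self.units):
            raise ValueError("each level needs exactly one semantic unit")

    def as_dict(self) -> Dict[str, str]:
        """Map each level name to its semantic unit."""
        return dict(zip(self.levels, self.units))
```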
According to an embodiment of the disclosure, the data tag of the marked data includes multi-level tag fields, and the query semantic phrase includes multi-level semantic units;
associating the target query semantic phrase with the data tags of the plurality of pieces of marked data, so as to determine the plurality of pieces of target data from the plurality of pieces of marked data, includes:
performing semantic matching between the multi-level tag fields in the data tag of the marked data and the multi-level semantic units in the target query semantic phrase, and calculating a matching degree value for each piece of marked data;
and determining the marked data whose matching degree value is larger than a second preset threshold as the target data.
According to an embodiment of the present disclosure, performing semantic matching between the multi-level tag fields in the data tag of the marked data and the multi-level semantic units in the target query semantic phrase, and calculating the matching degree value for each piece of marked data, includes:
performing feature conversion on each level of tag field in the multi-level tag fields to obtain a plurality of tag feature vectors;
performing feature conversion on each level of semantic unit in the multi-level semantic units to obtain a plurality of semantic feature vectors;
calculating similarity values between the plurality of tag feature vectors and the plurality of semantic feature vectors;
and calculating the matching degree value of each piece of marked data according to the similarity values between the plurality of tag feature vectors and the plurality of semantic feature vectors.
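The matching steps above can be sketched as follows, under stated assumptions. The disclosure does not specify the feature conversion or the similarity measure at this point, so this sketch substitutes character-bigram count vectors and cosine similarity, pairs tag fields with semantic units level by level, and takes the mean similarity as the matching degree value. All function names are hypothetical.

```python
import math
from collections import Counter
from typing import Sequence


def to_feature_vector(text: str) -> Counter:
    """Assumed feature conversion: counts of character bigrams."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + 2] for i in range(len(padded) - 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def matching_degree(tag_fields: Sequence[str], semantic_units: Sequence[str]) -> float:
    """Mean of the level-by-level similarities (an assumed aggregation)."""
    if len(tag_fields) != len(semantic_units):
        raise ValueError("tag fields and semantic units must have the same number of levels")
    sims = [
        cosine_similarity(to_feature_vector(t), to_feature_vector(u))
        for t, u in zip(tag_fields, semantic_units)
    ]
    return sum(sims) / len(sims) if sims else 0.0
```

A piece of marked data would then be kept as target data when `matching_degree(...)` exceeds the second preset threshold. A production system would more plausibly use learned text embeddings for the feature conversion; bigram counts are used here only to keep the sketch self-contained.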
According to an embodiment of the present disclosure, the above data processing method further includes:
obtaining a scoring value from a user client, wherein the scoring value represents the user's degree of satisfaction with the retrieval result reflected by the plurality of pieces of target data;
receiving a custom query semantic phrase from the user client when the scoring value is smaller than a third preset threshold;
and associating the custom query semantic phrase with the data tags of the plurality of pieces of marked data, so as to determine a plurality of pieces of custom query data from the plurality of pieces of marked data.
According to an embodiment of the present disclosure, the above data processing method further includes:
carrying out statistical analysis on the multiple target data and outputting a statistical result;
and visually displaying the statistical result.
Another aspect of the present disclosure provides a data processing apparatus comprising: a first acquisition module, a generation module, a first association module, and a second association module. The first acquisition module is configured to obtain, in response to a user query request, a reference field by parsing the user query request, where the reference field includes at least a target query period field and a target institution field of the institution to which the user belongs. The generation module is configured to read, from a database and based on the target query period field and the target institution field, target historical query log data that was generated before the target query period and is associated with the target institution. The first association module is configured to associate the target historical query log data with a plurality of predefined query semantic phrases to obtain a target query semantic phrase. The second association module is configured to associate the target query semantic phrase with the data tags of a plurality of pieces of marked data, so as to determine a plurality of pieces of target data from the plurality of pieces of marked data.
Another aspect of the present disclosure provides an electronic device, comprising: one or more processors; and a memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the data processing method described above.
Another aspect of the present disclosure also provides a computer-readable storage medium having stored thereon executable instructions that, when executed by a processor, cause the processor to perform the above-described data processing method.
Another aspect of the present disclosure also provides a computer program product comprising a computer program which, when executed by a processor, implements the above-described data processing method.
According to the data processing method, apparatus, device, medium and program product provided by the present disclosure, in response to a user query request, at least a target query period field and a target institution field of the institution to which the user belongs can be obtained by parsing the user query request. Based on the target query period field and the target institution field, target historical query log data that was generated before the target query period and is associated with the target institution can be read from a database. By associating the target historical query log data with a plurality of predefined query semantic phrases, a target query semantic phrase can be obtained from among them. Finally, the target query semantic phrase is used as a search term and associated with the data tags of a plurality of pieces of marked data, so that a plurality of pieces of target data matching the target query semantic phrase are determined from the marked data as the information required by the user.
Drawings
The foregoing and other objects, features and advantages of the disclosure will be more apparent from the following description of embodiments of the disclosure with reference to the accompanying drawings, in which:
FIG. 1 schematically illustrates an application scenario diagram of a data processing method, apparatus, device, medium and program product according to an embodiment of the present disclosure;
FIG. 2 schematically illustrates a flow chart of a data processing method according to an embodiment of the disclosure;
FIG. 3 schematically illustrates a schematic diagram of a record table of a first type of query semantic phrase according to an embodiment of the present disclosure;
FIG. 4 schematically illustrates a schematic diagram of a record table of a second type of query semantic phrase according to an embodiment of the present disclosure;
FIG. 5 schematically illustrates a flow chart of deriving a matching degree value for marked data according to an embodiment of the present disclosure;
FIG. 6 schematically illustrates a block diagram of a data processing apparatus according to an embodiment of the present disclosure; and
FIG. 7 schematically illustrates a block diagram of an electronic device adapted to implement a data processing method according to an embodiment of the disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is only exemplary and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the present disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. In addition, in the following description, descriptions of well-known structures and techniques are omitted so as not to unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and/or the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It should be noted that the terms used herein should be construed to have meanings consistent with the context of the present specification and should not be construed in an idealized or overly formal manner.
Where expressions like "at least one of A, B and C, etc." are used, the expressions should generally be interpreted in accordance with the meaning as commonly understood by those skilled in the art (e.g., "a system having at least one of A, B and C" shall include, but not be limited to, a system having A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, C together, etc.).
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, disclosure, and application of the data involved (including but not limited to users' personal information) all comply with relevant laws and regulations, necessary security measures are taken, and public order and good morals are not violated.
In the course of realizing the concept of the present disclosure, it was found that in the related art, business results are only recorded descriptively and stored in a simple structured form, and no specialized business result data system is established. As a result, the classification of business result data is disordered; when processing huge volumes of data, only simple queries can be performed, information cannot be classified and summarized clearly and efficiently, and multi-perspective data analysis cannot be carried out. Moreover, a user often cannot predict suitable query search terms in advance when querying, so it is difficult to match suitable data using the search terms the user enters directly.
To this end, an embodiment of the present disclosure provides a data processing method, including: in response to a user query request, obtaining a reference field by parsing the user query request, wherein the reference field includes at least a target query period field and a target institution field of the institution to which the user belongs; reading, from a database and based on the target query period field and the target institution field, target historical query log data that was generated before the target query period and is associated with the target institution; associating the target historical query log data with a plurality of predefined query semantic phrases to obtain a target query semantic phrase; and associating the target query semantic phrase with the data tags of a plurality of pieces of marked data, so as to determine a plurality of pieces of target data from the plurality of pieces of marked data.
Fig. 1 schematically illustrates an application scenario diagram of data processing according to an embodiment of the present disclosure.
As shown in fig. 1, an application scenario 100 according to this embodiment may include a first terminal device 101, a second terminal device 102, a third terminal device 103, a network 104, and a server 105. The network 104 is a medium used to provide a communication link between the first terminal device 101, the second terminal device 102, the third terminal device 103, and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user may interact with the server 105 through the network 104 using at least one of the first terminal device 101, the second terminal device 102, the third terminal device 103, to receive or send messages, etc. Various communication client applications, such as a shopping class application, a web browser application, a search class application, an instant messaging tool, a mailbox client, social platform software, etc. (by way of example only) may be installed on the first terminal device 101, the second terminal device 102, and the third terminal device 103.
The first terminal device 101, the second terminal device 102, the third terminal device 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.
For example, the user may input the inquiry time period information and the department information of the user through the first terminal device 101, the second terminal device 102, and the third terminal device 103, for generating the user inquiry request.
The server 105 may be a server providing various services, such as a background management server (merely an example) providing support for a website browsed by the user using the first terminal device 101, the second terminal device 102, or the third terminal device 103. The background management server may analyze and process received data such as a user request, and feed back the processing result (e.g., the web page, information, or data obtained or generated according to the user request) to the terminal device.
For example, in response to a user query request, the server 105 may obtain a reference field by parsing the user query request, the reference field including at least a target query period field and a target institution field of the institution to which the user belongs; read, from a database and based on the target query period field and the target institution field, target historical query log data that was generated before the target query period and is associated with the target institution; associate the target historical query log data with a plurality of predefined query semantic phrases to obtain a target query semantic phrase; and finally associate the target query semantic phrase with the data tags of a plurality of pieces of marked data, so as to determine a plurality of pieces of target data from the plurality of pieces of marked data.
It should be noted that the data processing method provided in the embodiments of the present disclosure may be generally performed by the server 105. Accordingly, the data processing apparatus provided by the embodiments of the present disclosure may be generally provided in the server 105. The data processing method provided by the embodiments of the present disclosure may also be performed by a server or a server cluster that is different from the server 105 and is capable of communicating with the first terminal device 101, the second terminal device 102, the third terminal device 103, and/or the server 105. Accordingly, the data processing apparatus provided by the embodiments of the present disclosure may also be provided in a server or a server cluster that is different from the server 105 and is capable of communicating with the first terminal device 101, the second terminal device 102, the third terminal device 103, and/or the server 105.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
The data processing method of the disclosed embodiment will be described in detail below with reference to fig. 2 to 5 based on the scenario described in fig. 1.
Fig. 2 schematically illustrates a flow chart of a data processing method according to an embodiment of the present disclosure.
As shown in fig. 2, the method 200 includes operations S210 to S240.
In operation S210, in response to the user query request, a reference field is acquired by parsing the user query request, the reference field including at least a target query period field and a target institution field to which the user belongs.
According to the embodiment of the disclosure, a user needs to find required information among a huge volume of business results. When the user cannot predict suitable query search terms in advance, the user can input the time period to be queried and the department or institution to which the user belongs, from which the user query request is generated. A reference field is obtained by parsing the user query request. The reference field represents the fields corresponding to the information input by the user, as obtained by parsing the user query request, and can be used to query related historical query data in the database.
According to an embodiment of the present disclosure, the reference field may include at least a target query period field and a target institution field of the institution to which the user belongs, and may further include a responsibility field of that target institution. The target query period field may characterize the query period information entered by the user, and the target institution field may characterize the department or institution to which the user belongs, as entered by the user.
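As a minimal sketch of operation S210, the parsing of a user query request into a reference field might look like the following. The request layout (a dictionary with the keys `period_start`, `period_end`, `institution`, and `responsibility`) and all identifiers are assumptions introduced for illustration; the disclosure does not define a request format.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ReferenceField:
    period_start: date                    # start of the target query period
    period_end: date                      # end of the target query period
    institution: str                      # target institution of the user
    responsibility: Optional[str] = None  # optional responsibility field


def parse_query_request(request: dict) -> ReferenceField:
    """Extract the reference field from a user query request.

    The key names used here are hypothetical.
    """
    start = date.fromisoformat(request["period_start"])
    end = date.fromisoformat(request["period_end"])
    if start > end:
        raise ValueError("query period start must not be after its end")
    return ReferenceField(
        period_start=start,
        period_end=end,
        institution=request["institution"],
        responsibility=request.get("responsibility"),
    )
```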
In operation S220, the target history query log data whose generation time is before the target query period and associated with the target institution is read from the database based on the target query period field and the target institution field.
According to the embodiment of the disclosure, historical query data generated before the target query period is read from the database according to the target query period field, and from this historical query data the data associated with the target institution is further read according to the target institution field, thereby obtaining the target historical query log data. For example, the user may input query period information of June 2020 to June 2021 and indicate that the department or institution where the user is located is a first-level branch; the historical query data of the first-level branch in the period from June 2020 to June 2021 may then be read from the database.
According to the embodiment of the disclosure, historical query data related to the information required by the user can be read from the database according to the reference field obtained by parsing the user query request. The historical query data in the database may include historical query search terms and the time information matching each historical query search term. For example, a query search term may be "first-level branch + waste + cash", and the time information matching the query search term may be 13:00:00.
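The read in operation S220 can be sketched as a two-condition filter. This sketch follows the wording of operation S220 (entries generated before the target query period and associated with the target institution) and uses an in-memory list as a stand-in for the database. The record layout and all names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List


@dataclass(frozen=True)
class QueryLogEntry:
    institution: str        # institution associated with the historical query
    search_term: str        # historical query search term
    generated_at: datetime  # time at which the log entry was generated


def read_target_logs(
    entries: Iterable[QueryLogEntry],
    period_start: datetime,
    institution: str,
) -> List[QueryLogEntry]:
    """Keep entries generated before the period start for the given institution."""
    return [
        e for e in entries
        if e.generated_at < period_start and e.institution == institution
    ]
```

In a real deployment the same two conditions would more likely be pushed down into the database query rather than applied in application code.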
In operation S230, the target historical query log data is associated with a plurality of predefined query semantic phrases to obtain a target query semantic phrase.
According to the embodiment of the disclosure, a predefined query semantic phrase is a query semantic phrase obtained by dividing a query search term according to a certain rule, and is configured in advance according to that rule. The historical query search terms in the database and the predefined query semantic phrases follow the same division rule.
According to an embodiment of the present disclosure, the target historical query log data read from the database is associated with a plurality of predefined query semantic phrases from which the target query semantic phrase may be derived. The target query semantic phrase may characterize a query term that matches information desired by the user.
In operation S240, the target query semantic phrase and the data tags of the plurality of pieces of marked data are associated such that the plurality of pieces of target data are determined from the plurality of pieces of marked data.
According to embodiments of the present disclosure, the marked data may characterize data that has been tagged with a query semantic phrase, and the data tag may characterize that query semantic phrase.
According to the embodiment of the disclosure, the target query semantic phrase is used as a query search term and associated with the data tags of the plurality of pieces of marked data, so that a plurality of pieces of target data matching the target query semantic phrase can be determined from the plurality of pieces of marked data. The target data may characterize the information required by the user, matched according to the target query semantic phrase.
According to the embodiment of the disclosure, in response to a user query request, at least a target query period field and a target institution field of the institution to which the user belongs can be obtained by parsing the user query request. Based on the target query period field and the target institution field, target historical query log data that was generated before the target query period and is associated with the target institution can be read from a database. By associating the target historical query log data with a plurality of predefined query semantic phrases, a target query semantic phrase can be obtained from among them. Finally, the target query semantic phrase is used as a search term and associated with the data tags of a plurality of pieces of marked data, so that a plurality of pieces of target data matching the target query semantic phrase are determined from the marked data as the information required by the user.
According to an embodiment of the present disclosure, associating the target historical query log data with a plurality of predefined query semantic phrases to obtain the target query semantic phrase includes: matching the target historical query log data with the plurality of query semantic phrases, and calculating an attention value corresponding to each query semantic phrase; and determining a query semantic phrase whose attention value is larger than the first preset threshold as a target query semantic phrase.
According to embodiments of the present disclosure, the plurality of predefined query semantic phrases may be stored in tabular form. The target historical query log data is matched against each of the query semantic phrases acquired from the table, so as to obtain the attention value corresponding to each query semantic phrase.
According to the embodiment of the disclosure, the obtained attention value corresponding to each query semantic phrase is compared with the first preset threshold value, and the query semantic phrase with the attention value larger than the first preset threshold value is determined to be the target query semantic phrase. For example, the attention value corresponding to each query semantic phrase may be 300, 200 and 280, and if the first preset threshold is set to 220, the query semantic phrases with attention values of 300 and 280 are determined to be target query semantic phrases.
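The threshold comparison in this example can be sketched as follows. The function name and the dictionary representation of attention values are assumptions for illustration; the phrase labels are placeholders.

```python
from typing import Dict, List


def select_target_phrases(attention_values: Dict[str, float], first_threshold: float) -> List[str]:
    """Return the phrases whose attention value is strictly larger than the threshold."""
    return [phrase for phrase, value in attention_values.items() if value > first_threshold]


# Values from the example: attention values 300, 200 and 280, threshold 220.
values = {"phrase 1": 300, "phrase 2": 200, "phrase 3": 280}
targets = select_target_phrases(values, 220)
# targets == ["phrase 1", "phrase 3"]
```

The comparison is strict, so a phrase whose attention value equals the first preset threshold is not selected.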
According to the embodiment of the disclosure, the attention value of each query semantic phrase is calculated and compared with the first preset threshold, and each query semantic phrase whose attention value is larger than the first preset threshold is determined to be a target query semantic phrase. Using the target query semantic phrases as the search terms for the information required by the user facilitates the subsequent query and avoids the repeated useless queries caused by the user frequently switching search terms, so that both the accuracy and the efficiency of the query can be improved.
According to an embodiment of the present disclosure, matching target historical query log data with a plurality of query semantic phrases, and calculating a focus value corresponding to each query semantic phrase includes: determining the occurrence times of all levels of semantic units in the query semantic phrase in the target historical query log data; and calculating to obtain the attention value corresponding to each query semantic phrase according to the occurrence times of each level of semantic units.
According to an embodiment of the disclosure, the query semantic phrase may include multiple levels of semantic units, between which a business association exists.
According to the embodiment of the disclosure, the number of times that each level of semantic unit in each query semantic phrase appears in the obtained target historical query log data is determined, and the attention value corresponding to each query semantic phrase can be calculated from these occurrence counts. For example, for the query semantic phrase "first-level branch+waste+cash", the semantic units at each level are "first-level branch", "waste", and "cash". Searching the target historical query log data for each of them may show that "first-level branch" appears 100 times, "waste" appears 50 times, and "cash" appears 50 times; these counts can be added together to give the attention value corresponding to the query semantic phrase, namely 200. Alternatively, the sum of the occurrence counts of the semantic units may be a preset multiple of the corresponding attention value. For example, if the preset multiple is 50 and the occurrence counts of the semantic units in one query semantic phrase sum to 300, the corresponding attention value may be 6.
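The counting described above can be sketched as follows. The sketch assumes that the logs are plain strings and that an occurrence is a substring match, neither of which is specified in the disclosure; the function name `attention_value` and its `preset_multiple` parameter are illustrative.

```python
def attention_value(semantic_units, log_texts, preset_multiple=1):
    """Sum the occurrences of every semantic unit over all logs, then divide
    by the preset multiple (1 means the raw sum is the attention value)."""
    total = sum(
        text.count(unit)          # non-overlapping substring occurrences
        for unit in semantic_units
        for text in log_texts
    )
    return total / preset_multiple


units = ["first-level branch", "waste", "cash"]
# Synthetic logs built so that the counts match the example above:
# "first-level branch" 100 times, "waste" 50 times, "cash" 50 times.
log_texts = ["first-level branch waste cash"] * 50 + ["first-level branch"] * 50

raw = attention_value(units, log_texts)                         # sum of counts
scaled = attention_value(units, log_texts, preset_multiple=50)  # sum / 50
```

With these synthetic logs the raw attention value is 200, as in the example above, and the scaled variant divides the same sum by the preset multiple.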
According to an embodiment of the present disclosure, the first preset threshold is set on the same scale as the attention value. In a case where the attention value is the sum of the occurrence counts of the semantic units in the query semantic phrase, for example where two query semantic phrases have attention values of 300 and 200 respectively, the first preset threshold may be set to 260. In a case where the sum of the occurrence counts is a preset multiple of the corresponding attention value, for example where the preset multiple is 50 and the occurrence counts of the semantic units in the query semantic phrase sum to 300 so that the corresponding attention value is 6, the first preset threshold may be set to 5.
According to the embodiment of the disclosure, the attention value corresponding to each query semantic phrase is calculated by determining the occurrence times of each level of semantic units in the query semantic phrases in the target historical query log data, so that the target query semantic phrases are determined from a plurality of query semantic phrases to serve as the subsequent query retrieval words according to the calculated attention values.
According to embodiments of the present disclosure, the query semantic phrases may include first-type query semantic phrases and second-type query semantic phrases. A first-type query semantic phrase may include multiple levels of first-type semantic units, which may include: a control element, control content, and a control effect. A second-type query semantic phrase may include multiple levels of second-type semantic units, which may include: a risk subject, a risk action, and a risk object.
According to the embodiment of the disclosure, the first category of query semantic phrases is obtained by dividing according to rules of control elements, control contents and control effects, and the second category of query semantic phrases is obtained by dividing according to rules of risk subjects, risk actions and risk objects.
FIG. 3 schematically illustrates a schematic diagram of a configuration table of a first type of query semantic phrase according to an embodiment of the present disclosure.
As shown in fig. 3, a configuration table 300 is provided for a first type of query semantic phrase. The first type of query semantic phrase may characterize semantic phrases related to internal control defect issues. The first-class query semantic phrase can comprise a plurality of levels of first-class semantic units, wherein each level of first-class semantic units can comprise three parts of control elements, control contents and control effects, namely, the first-class query semantic phrase can consist of three parts of control elements, control contents and control effects.
According to embodiments of the present disclosure, the control elements may include internal environment, internal management, specification class control, object right class control, and technology class control. Control content may include organizational architecture, division of work, coordinated linkage, policy landing, and strategic implementation. Control effects may include uncoordinated, not deep, unreasonable, out of specification, and not in place.
According to an embodiment of the disclosure, as shown in fig. 3, a first-type query semantic phrase may be composed of one phrase from the control element part, one phrase from the control content part, and one phrase from the control effect part. For example, the first-type query semantic phrases may include "internal management+division of work+uncoordinated" and may also include "internal management+policy landing+not in place".
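As an illustration of how such a configuration table yields query semantic phrases, the sketch below stores the three parts as columns and enumerates every combination of one entry per column. The dictionary layout, the name `CONTROL_TABLE`, and the use of "+" as the joining character in code are assumptions; the entries are the English renderings listed above.

```python
from itertools import product

# Assumed in-memory form of configuration table 300: one column per part.
CONTROL_TABLE = {
    "control element": [
        "internal environment", "internal management",
        "specification class control", "object right class control",
        "technology class control",
    ],
    "control content": [
        "organizational architecture", "division of work",
        "coordinated linkage", "policy landing", "strategic implementation",
    ],
    "control effect": [
        "uncoordinated", "not deep", "unreasonable",
        "out of specification", "not in place",
    ],
}


def enumerate_phrases(table: dict) -> list:
    """Build every phrase made of exactly one entry from each column, in column order."""
    return ["+".join(combo) for combo in product(*table.values())]


first_type_phrases = enumerate_phrases(CONTROL_TABLE)
```

With five entries per column this produces 5 x 5 x 5 = 125 candidate phrases; whether every combination is meaningful in practice is not stated in the disclosure.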
FIG. 4 schematically illustrates a schematic diagram of a configuration table of a second type of query semantic phrase according to an embodiment of the present disclosure.
As shown in fig. 4, a configuration table 400 of second class query semantic phrases is provided. The second category of query semantic phrases may characterize semantic phrases related to important risk issues. The second-class query semantic phrase can comprise a plurality of levels of second-class semantic units, wherein each level of second-class semantic units can comprise three parts of a risk main body, a risk action and a risk object, namely the second-class query semantic phrase can be composed of three parts of the risk main body, the risk action and the risk object.
According to embodiments of the present disclosure, the risk subject may include a headquarters, a first-level branch, a second-level branch, and an infrastructure. Risk actions may include waste, misuse, occupancy, and misinformation. Risk objects may include cash, items, products, and systems.
According to embodiments of the present disclosure, as shown in fig. 4, a second-type query semantic phrase may be composed of one phrase from the risk subject part, one phrase from the risk action part, and one phrase from the risk object part. For example, the second-type query semantic phrases may include "first-level branch+waste+cash" and may also include "second-level branch+waste+cash".
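A phrase written in the "subject+action+object" form can be turned into a structured record and checked against the configuration table, as in the following sketch. The names `RiskPhrase`, `RISK_TABLE`, and `parse_risk_phrase` are hypothetical, and the table entries are English renderings of the labels listed above.

```python
from typing import NamedTuple


class RiskPhrase(NamedTuple):
    """One second-type query semantic phrase, one field per level."""
    subject: str
    action: str
    obj: str


# Assumed in-memory form of configuration table 400, keyed by field name.
RISK_TABLE = {
    "subject": {"headquarters", "first-level branch", "second-level branch", "infrastructure"},
    "action": {"waste", "misuse", "occupancy", "misinformation"},
    "obj": {"cash", "items", "products", "systems"},
}


def parse_risk_phrase(text: str) -> RiskPhrase:
    """Split 'subject+action+object' and verify each level against the table."""
    parts = [part.strip() for part in text.split("+")]
    if len(parts) != 3:
        raise ValueError(f"expected three levels, got {len(parts)}: {text!r}")
    phrase = RiskPhrase(*parts)
    for field, value in zip(RiskPhrase._fields, phrase):
        if value not in RISK_TABLE[field]:
            raise ValueError(f"{value!r} is not a configured {field}")
    return phrase


parsed = parse_risk_phrase("first-level branch+waste+cash")
```

Keeping the levels as named fields makes the later level-by-level matching straightforward, since each field can be compared with the tag field of the same level.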
According to the embodiment of the disclosure, the first category of query semantic phrases is divided according to the rules of the control elements, the control contents and the control effects, and the second category of query semantic phrases is divided according to the rules of the risk main body, the risk actions and the risk objects, so that the query search terms can be structured, and information required by a user can be conveniently queried.
According to an embodiment of the present disclosure, associating the target query semantic phrase with the data tags of the plurality of pieces of tagged data such that determining the plurality of pieces of target data from the plurality of pieces of tagged data includes: carrying out semantic matching on the multi-level tag fields in the data tags of the marked data and the multi-level semantic units in the target query semantic phrase, and calculating to obtain the matching degree value of each marked data; and determining the marked data with the matching degree value larger than a second preset threshold value as target data.
According to embodiments of the present disclosure, the data tag of the marked data may include a multi-level tag field and the query semantic phrase may include multi-level semantic units.
According to the embodiment of the disclosure, the marked data can represent the data with the query semantic phrase as the label, namely, the corresponding marked data with the query semantic phrase as the label can be obtained by querying through the query semantic phrase.
According to embodiments of the present disclosure, the multi-level tag field may characterize multi-level semantic units. Each level of the multi-level semantic units in the target query semantic phrase is matched with the corresponding level of the multi-level tag fields in the data tag of the marked data, and the matching degree value between each piece of marked data and the target query semantic phrase is calculated. For example, the target query semantic phrase may be "first-level branch+waste+cash". The semantic unit "first-level branch" in the target query semantic phrase is semantically matched with the risk subject part of the multi-level tag field in the data tag of each piece of marked data; the semantic unit "waste" is semantically matched with the risk action part of the multi-level tag field; and the semantic unit "cash" is semantically matched with the risk object part of the multi-level tag field. The matching degree value of each piece of marked data is obtained from the results of these three matches.
According to the embodiment of the disclosure, the matching degree value corresponding to the marked data is compared with a second preset threshold value, and the marked data with the matching degree value larger than the second preset threshold value is determined as target data. The second preset threshold is set manually and is used for determining target data which can be obtained by querying by taking the target query semantic phrase as a search term, wherein the target data can represent information required by a user.
According to the embodiment of the disclosure, the multi-level tag fields in the data tags of the marked data are semantically matched with the multi-level semantic units in the target query semantic phrase, so that the matching degree value of each piece of marked data can be calculated and the degree to which the data tag of each piece of marked data matches the target query semantic phrase can be determined. The marked data whose matching degree value is larger than the second preset threshold is determined as target data, so that the information required by the user, namely the target data, is obtained by querying with the target query semantic phrase as the search term.
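The level-by-level matching and the second threshold can be sketched as follows. In this sketch exact string equality stands in for semantic matching, and the matching degree value is taken to be the fraction of levels that match; both are simplifying assumptions, as are the record layout and the function names.

```python
def matching_degree(tag_fields, semantic_units) -> float:
    """Fraction of levels whose tag field equals the semantic unit of the same level."""
    hits = sum(tag == unit for tag, unit in zip(tag_fields, semantic_units))
    return hits / len(semantic_units)


def select_target_data(records, target_phrase, second_threshold):
    """Return the ids of records whose matching degree exceeds the threshold."""
    return [
        record["id"] for record in records
        if matching_degree(record["tags"], target_phrase) > second_threshold
    ]


target_phrase = ("first-level branch", "waste", "cash")
records = [
    {"id": 1, "tags": ("first-level branch", "waste", "cash")},    # 3 of 3 levels match
    {"id": 2, "tags": ("second-level branch", "waste", "cash")},   # 2 of 3 levels match
    {"id": 3, "tags": ("headquarters", "misuse", "cash")},         # 1 of 3 levels match
]
target_ids = select_target_data(records, target_phrase, second_threshold=0.5)
```

With a threshold of 0.5, the first two records are selected and the third is rejected.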
Fig. 5 schematically illustrates a flowchart of obtaining a matching degree value of marked data according to an embodiment of the present disclosure.
As shown in fig. 5, the method 500 includes operations S510 to S540.
In operation S510, feature conversion is performed on each level of tag fields in the multi-level tag fields, so as to obtain a plurality of tag feature vectors.
According to an embodiment of the present disclosure, since the multi-level tag field may characterize multi-level semantic units, the multi-level tag field may include multi-level first-type semantic units and multi-level second-type semantic units. Feature conversion is performed on each level of tag field in the multi-level tag field, so as to obtain a plurality of tag feature vectors respectively corresponding to the tag fields at each level. For example, in a case where the multi-level tag field is "first-level branch+waste+cash", performing feature conversion on each level of tag field yields a tag feature vector corresponding to "first-level branch", a tag feature vector corresponding to "waste", and a tag feature vector corresponding to "cash".
In operation S520, each level of semantic units in the multi-level semantic units is subjected to feature conversion to obtain a plurality of semantic feature vectors.
According to embodiments of the present disclosure, the multi-level semantic units may be multi-level first-type semantic units or multi-level second-type semantic units. Feature conversion is performed on each level of semantic unit in the multi-level semantic units, so as to obtain a semantic feature vector corresponding to each level of semantic unit. For example, in a case where the multi-level semantic units are "first-level branch+disuse+cash", performing feature conversion on each level of semantic unit yields a semantic feature vector corresponding to "first-level branch", a semantic feature vector corresponding to "disuse", and a semantic feature vector corresponding to "cash".
In operation S530, similarity values between the plurality of tag feature vectors and the plurality of semantic feature vectors are calculated.
According to an embodiment of the present disclosure, a similarity value is calculated for each pair of corresponding feature vectors among the plurality of tag feature vectors and the plurality of semantic feature vectors. The multi-level tag field and the multi-level semantic units can each be divided into three parts: a similarity value is calculated between the tag feature vector of the first part of the multi-level tag field and the semantic feature vector of the first part of the multi-level semantic units; another between the tag feature vector of the second part and the semantic feature vector of the second part; and another between the tag feature vector of the third part and the semantic feature vector of the third part. For example, in a case where the multi-level tag field is "first-level branch+waste+cash" and the multi-level semantic units are "first-level branch+disuse+cash", a similarity value is calculated between the tag feature vector of the first part "first-level branch" and the semantic feature vector of the first part "first-level branch"; between the tag feature vector of the second part "waste" and the semantic feature vector of the second part "disuse"; and between the tag feature vector of the third part "cash" and the semantic feature vector of the third part "cash".
In operation S540, a matching degree value of each marked data is calculated according to the similarity values between the plurality of tag feature vectors and the plurality of semantic feature vectors.
According to the embodiment of the disclosure, the matching degree value corresponding to each piece of marked data can be obtained from the similarity values respectively corresponding to the three parts of its multi-level tag field and of the multi-level semantic units.
According to the embodiment of the disclosure, through the tag feature vectors corresponding to each level of tag fields in the multi-level tag fields and the semantic feature vectors corresponding to each level of semantic units in the multi-level semantic units, similarity values between a plurality of tag feature vectors and a plurality of semantic feature vectors can be calculated; and calculating the matching degree value of each marked data according to the similarity values between the plurality of tag feature vectors and the plurality of semantic feature vectors so as to determine the target data from the plurality of marked data according to the matching degree value of the marked data.
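Operations S510 to S540 can be sketched end to end as follows. The disclosure does not specify the feature conversion, the similarity measure, or how the per-level similarities are combined; the sketch assumes character-bigram count vectors, cosine similarity, and a plain average, all of which are stand-ins chosen for illustration.

```python
import math
from collections import Counter


def to_feature_vector(text: str) -> Counter:
    """Assumed feature conversion: counts of character bigrams in the padded text."""
    padded = f" {text} "
    return Counter(padded[i:i + 2] for i in range(len(padded) - 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def matching_degree(tag_fields, semantic_units) -> float:
    tag_vectors = [to_feature_vector(t) for t in tag_fields]            # S510
    semantic_vectors = [to_feature_vector(s) for s in semantic_units]   # S520
    similarities = [                                                    # S530
        cosine_similarity(t, s) for t, s in zip(tag_vectors, semantic_vectors)
    ]
    return sum(similarities) / len(similarities)                        # S540 (assumed: mean)


tag_fields = ("first-level branch", "waste", "cash")
semantic_units = ("first-level branch", "disuse", "cash")
degree = matching_degree(tag_fields, semantic_units)
```

Because the first and third levels are identical while "waste" and "disuse" share few bigrams, the resulting matching degree value lies below 1. A real system would more plausibly use learned embeddings, so that near-synonyms score higher than they do under this character-level stand-in.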
According to an embodiment of the present disclosure, the above data processing method further includes: obtaining a scoring value from a user client, wherein the scoring value is used for representing the satisfaction degree of a user on search results reflected by multiple target data; and under the condition that the scoring value is smaller than a third preset threshold value, receiving a custom query semantic phrase from the client, and associating the custom query semantic phrase with the data labels of the multiple pieces of marked data so as to determine the multiple pieces of custom query data from the multiple pieces of marked data.
According to the embodiment of the disclosure, a user can obtain search results reflected by multiple target data by taking the target query semantic phrase as a search word, and a scoring value can be obtained according to the satisfaction degree of the user on the search results.
According to embodiments of the present disclosure, the third preset threshold may characterize the minimum satisfaction of the user with the search result. The custom query semantic phrase may represent a query semantic phrase defined by the user according to the user's own needs. In a case where the scoring value is smaller than the third preset threshold, that is, where the user is not satisfied with the search results reflected by the multiple pieces of target data, a custom query semantic phrase defined by the user is received from the client and associated with the data tags of the multiple pieces of marked data, so that multiple pieces of custom query data are determined from the marked data. The custom query data may represent the query results obtained by querying with the custom query semantic phrase as the search term.
According to an embodiment of the present disclosure, associating the custom query semantic phrase with the data tags of the plurality of pieces of tagged data such that determining the plurality of pieces of custom query data from the plurality of pieces of tagged data may include: carrying out semantic matching on the multi-level tag fields in the data tags of the marked data and the multi-level semantic units in the custom query semantic phrase, and calculating to obtain the matching degree value of each marked data; and determining the marked data with the matching degree value larger than a second preset threshold value as custom query data.
According to the embodiment of the disclosure, the target query semantic phrase is used as a search word to obtain search results reflected by a plurality of target data, and under the condition that a user is not satisfied with the search results, the user can query by using the user-defined query semantic phrase as the search word, so that the user-defined query data can be obtained.
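The fallback from the target query semantic phrase to a custom query semantic phrase can be sketched as a small control flow. The callables `search`, `get_score`, and `get_custom_phrase` are hypothetical placeholders for the matching step and for the two interactions with the client, and the dictionary lookup only stands in for the real matching.

```python
def search_with_feedback(target_phrase, search, get_score, get_custom_phrase, third_threshold):
    """Search with the target phrase; if the user's score is below the third
    threshold, search again with a phrase supplied by the user."""
    results = search(target_phrase)
    if get_score(results) < third_threshold:
        custom_phrase = get_custom_phrase()
        return search(custom_phrase)
    return results


# Toy index standing in for the matching against data tags.
index = {
    ("first-level branch", "waste", "cash"): ["record 1", "record 2"],
    ("second-level branch", "misuse", "items"): ["record 7"],
}


def search(phrase):
    return index.get(phrase, [])


target = ("first-level branch", "waste", "cash")
custom = ("second-level branch", "misuse", "items")

satisfied = search_with_feedback(target, search, lambda r: 4, lambda: custom, third_threshold=3)
unsatisfied = search_with_feedback(target, search, lambda r: 2, lambda: custom, third_threshold=3)
```

In the first call the score of 4 is not below the threshold, so the original results are kept; in the second call the score of 2 triggers the custom query.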
According to an embodiment of the present disclosure, the above data processing method further includes: carrying out statistical analysis on a plurality of target data, and outputting a statistical result; and visually displaying the statistical result.
According to the embodiment of the disclosure, statistics are computed over the multiple pieces of target data obtained according to the target query semantic phrase, and the statistical result can be displayed through a visual interface. Likewise, the statistical result of the custom query data obtained by the user according to the custom query semantic phrase can be visually displayed.
According to the embodiment of the disclosure, visualizing the statistical result allows the user to understand the statistical result of the query intuitively and clearly, and to judge from it whether the result is satisfactory.
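As a minimal illustration of the statistics and display steps, the sketch below counts the target data by one tag field and renders the counts as a text bar chart. The disclosure does not say which statistics are computed or how they are drawn, so the choice of a frequency count, the field name `risk_action`, and the text rendering are all assumptions.

```python
from collections import Counter


def summarize(target_data, field: str) -> Counter:
    """Count how many pieces of target data carry each value of the given field."""
    return Counter(row[field] for row in target_data)


def render_bar_chart(stats: Counter) -> str:
    """Render the counts as aligned text bars, most frequent value first."""
    if not stats:
        return ""
    width = max(len(label) for label in stats)
    return "\n".join(
        f"{label.ljust(width)} | {'#' * count} {count}"
        for label, count in stats.most_common()
    )


target_data = [
    {"id": 1, "risk_action": "waste"},
    {"id": 2, "risk_action": "waste"},
    {"id": 3, "risk_action": "misuse"},
]
stats = summarize(target_data, "risk_action")
chart = render_bar_chart(stats)
```

The same two functions could be applied unchanged to custom query data, since they depend only on the record layout.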
Based on the data processing method, the disclosure also provides a data processing device. The device will be described in detail below in connection with fig. 6.
Fig. 6 schematically shows a block diagram of a data processing apparatus according to an embodiment of the present disclosure.
As shown in fig. 6, the data processing apparatus 600 of this embodiment includes a first acquisition module 610, a generation module 620, a first association module 630, and a second association module 640.
The first acquisition module 610 is configured to obtain, in response to a user query request, a reference field by parsing the user query request, where the reference field includes at least a target query period field and a target institution field identifying the institution to which the user belongs. In an embodiment, the first acquisition module 610 may be configured to perform the operation S210 described above, which is not repeated here.
The generation module 620 is configured to read, from the database, based on the target query period field and the target institution field, target historical query log data that was generated before the target query period and is associated with the target institution. In an embodiment, the generation module 620 may be configured to perform the operation S220 described above, which is not repeated here.
The first association module 630 is configured to associate the target historical query log data with a plurality of predefined query semantic phrases to obtain a target query semantic phrase. In an embodiment, the first association module 630 may be configured to perform the operation S230 described above, which is not repeated here.
The second association module 640 is configured to associate the target query semantic phrase with the data tags of the plurality of pieces of marked data, so as to determine the plurality of pieces of target data from the plurality of pieces of marked data. In an embodiment, the second association module 640 may be configured to perform the operation S240 described above, which is not repeated here.
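The way the four modules chain together can be sketched as a simple composition, with each module represented by a callable. The class name, the method name `handle`, and the toy lambdas are assumptions made for illustration; they show only the order in which the outputs of operations S210 to S240 are passed along.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DataProcessingApparatus:
    first_acquisition: Callable    # module 610, operation S210
    generation: Callable           # module 620, operation S220
    first_association: Callable    # module 630, operation S230
    second_association: Callable   # module 640, operation S240

    def handle(self, request):
        """Run the four modules in order, feeding each output to the next module."""
        reference_fields = self.first_acquisition(request)
        target_logs = self.generation(reference_fields)
        target_phrases = self.first_association(target_logs)
        return self.second_association(target_phrases)


# Toy stand-ins for the real modules.
apparatus = DataProcessingApparatus(
    first_acquisition=lambda req: (req["period_start"], req["institution"]),
    generation=lambda fields: [f"log of {fields[1]} before {fields[0]}"],
    first_association=lambda logs: [("first-level branch", "waste", "cash")] if logs else [],
    second_association=lambda phrases: [f"record tagged {'+'.join(p)}" for p in phrases],
)
result = apparatus.handle({"period_start": "2023-02-01", "institution": "Branch A"})
```

Because each module is an independent callable, modules can be merged or split without changing the others, provided the data passed between them keeps the same shape.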
According to an embodiment of the present disclosure, the first association module 630 includes a first obtaining unit and a first determining unit.
The first obtaining unit is used for matching the target historical query log data with a plurality of query semantic phrases and calculating to obtain the attention value corresponding to each query semantic phrase.
The first determining unit is used for determining the query semantic phrase with the attention value larger than a first preset threshold value as a target query semantic phrase.
According to an embodiment of the present disclosure, the first obtaining unit includes a determining subunit and an obtaining subunit.
The determining subunit is used for determining the occurrence times of all levels of semantic units in the query semantic phrase in the target historical query log data.
The obtaining subunit is used for calculating the attention value corresponding to each query semantic phrase according to the occurrence times of each level of semantic units.
According to an embodiment of the present disclosure, the second association module 640 includes a second obtaining unit and a second determining unit.
The second obtaining unit is used for carrying out semantic matching on the multi-level tag field in the data tag of the marked data and the multi-level semantic unit in the target query semantic phrase, and calculating to obtain the matching degree value of each marked data.
The second determining unit is used for determining marked data with the matching degree value larger than a second preset threshold value as target data.
According to an embodiment of the present disclosure, the second obtaining unit includes a first converting subunit, a second converting subunit, a first calculating subunit, and a second calculating subunit.
The first converting subunit is configured to perform feature conversion on each level of tag fields in the multi-level tag fields, so as to obtain a plurality of tag feature vectors.
The second converting subunit is used for respectively carrying out feature conversion on each level of semantic units in the multi-level semantic units to obtain a plurality of semantic feature vectors.
The first calculating subunit is used for calculating similarity values between the plurality of tag feature vectors and the plurality of semantic feature vectors.
The second calculating subunit is used for calculating the matching degree value of each marked data according to the similarity values between the plurality of tag feature vectors and the plurality of semantic feature vectors.
According to an embodiment of the present disclosure, the data processing apparatus further includes a second acquisition module, a receiving module, and a third association module.
The second acquisition module is used for acquiring a scoring value from the user client, wherein the scoring value is used for representing the satisfaction degree of the user with the search results reflected by the multiple pieces of target data.
The receiving module is used for receiving the custom query semantic phrase from the client under the condition that the scoring value is smaller than a third preset threshold value.
The third association module is used for associating the custom query semantic phrase with the data tags of the multiple pieces of marked data so as to determine multiple pieces of custom query data from the multiple pieces of marked data.
According to an embodiment of the disclosure, the data processing apparatus further includes an output module and a display module.
The output module is used for carrying out statistical analysis on the multiple pieces of target data and outputting a statistical result.
The display module is used for visually displaying the statistical result.
According to embodiments of the present disclosure, any of the first acquisition module 610, the generation module 620, the first association module 630, and the second association module 640 may be combined in one module to be implemented, or any of the modules may be split into a plurality of modules. Alternatively, at least some of the functionality of one or more of the modules may be combined with at least some of the functionality of other modules and implemented in one module. According to embodiments of the present disclosure, at least one of the first acquisition module 610, the generation module 620, the first association module 630, and the second association module 640 may be implemented at least in part as hardware circuitry, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented in hardware or firmware in any other reasonable manner of integrating or packaging the circuitry, or in any one of or a suitable combination of three of software, hardware, and firmware. Alternatively, at least one of the first acquisition module 610, the generation module 620, the first association module 630, and the second association module 640 may be at least partially implemented as computer program modules that, when executed, perform the corresponding functions.
Fig. 7 schematically illustrates a block diagram of an electronic device adapted to implement a data processing method according to an embodiment of the disclosure.
As shown in fig. 7, an electronic device 700 according to an embodiment of the present disclosure includes a processor 701 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. The processor 701 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or an associated chipset and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), or the like. The processor 701 may also include on-board memory for caching purposes. The processor 701 may comprise a single processing unit or a plurality of processing units for performing different actions of the method flows according to embodiments of the disclosure.
In the RAM 703, various programs and data necessary for the operation of the electronic apparatus 700 are stored. The processor 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704. The processor 701 performs various operations of the method flow according to the embodiments of the present disclosure by executing programs in the ROM 702 and/or the RAM 703. Note that the program may be stored in one or more memories other than the ROM 702 and the RAM 703. The processor 701 may also perform various operations of the method flow according to embodiments of the present disclosure by executing programs stored in the one or more memories.
According to an embodiment of the present disclosure, the electronic device 700 may further include an input/output (I/O) interface 705, the input/output (I/O) interface 705 also being connected to the bus 704. The electronic device 700 may also include one or more of the following components connected to the input/output (I/O) interface 705: an input section 706 including a keyboard, a mouse, and the like; an output section 707 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), a speaker, and the like; a storage section 708 including a hard disk or the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the internet. A drive 710 is also connected to the input/output (I/O) interface 705 as needed. A removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 710 as necessary, so that a computer program read therefrom is installed into the storage section 708 as necessary.
The present disclosure also provides a computer-readable storage medium that may be embodied in the apparatus/device/system described in the above embodiments; or may exist alone without being assembled into the apparatus/device/system. The computer-readable storage medium carries one or more programs which, when executed, implement methods in accordance with embodiments of the present disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example, but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the present disclosure, the computer-readable storage medium may include the ROM 702 and/or the RAM 703 and/or one or more memories other than the ROM 702 and the RAM 703 described above.
Embodiments of the present disclosure also include a computer program product comprising a computer program containing program code for performing the methods shown in the flowcharts. When the computer program product is run on a computer system, the program code causes the computer system to carry out the data processing methods provided by the embodiments of the present disclosure.
The above-described functions defined in the system/apparatus of the embodiments of the present disclosure are performed when the computer program is executed by the processor 701. The systems, apparatus, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the disclosure.
In one embodiment, the computer program may be based on a tangible storage medium such as an optical storage device, a magnetic storage device, or the like. In another embodiment, the computer program may also be transmitted, distributed over a network medium in the form of signals, downloaded and installed via the communication section 709, and/or installed from the removable medium 711. The computer program may include program code that may be transmitted using any appropriate network medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
In such an embodiment, the computer program may be downloaded and installed from a network via the communication section 709, and/or installed from the removable medium 711. The above-described functions defined in the system of the embodiments of the present disclosure are performed when the computer program is executed by the processor 701. The systems, devices, apparatus, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the disclosure.
According to embodiments of the present disclosure, program code for performing computer programs provided by embodiments of the present disclosure may be written in any combination of one or more programming languages, and in particular, such computer programs may be implemented in high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. Programming languages include, but are not limited to, Java, C++, Python, C, or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., via the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that the features recited in the various embodiments of the disclosure and/or in the claims may be combined and/or integrated in a variety of ways, even if such combinations or integrations are not explicitly recited in the disclosure. In particular, the features recited in the various embodiments of the present disclosure and/or the claims may be variously combined and/or integrated without departing from the spirit and teachings of the present disclosure. All such combinations and/or integrations fall within the scope of the present disclosure.
The embodiments of the present disclosure are described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described above separately, this does not mean that the measures in the embodiments cannot be used advantageously in combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be made by those skilled in the art without departing from the scope of the disclosure, and such alternatives and modifications are intended to fall within the scope of the disclosure.

Claims (12)

1. A data processing method, comprising:
responding to a user query request, and acquiring a reference field by analyzing the user query request, wherein the reference field at least comprises a target query period field and a target institution field to which a user belongs;
reading, from a database and based on the target query period field and the target institution field, target historical query log data whose generation time precedes the target query period and which is associated with the target institution;
correlating the target historical query log data with a plurality of predefined query semantic phrases to obtain target query semantic phrases;
and associating the target query semantic phrase with the data tags of the plurality of pieces of marked data so as to determine the plurality of pieces of target data from the plurality of pieces of marked data.
2. The method of claim 1, wherein associating the target historical query log data with a plurality of predefined query semantic phrases to obtain a target query semantic phrase comprises:
matching the target historical query log data with a plurality of query semantic phrases, and calculating to obtain attention values corresponding to the query semantic phrases;
and determining the query semantic phrase with the attention value larger than a first preset threshold value as the target query semantic phrase.
3. The method according to claim 2, wherein:
the query semantic phrase comprises multi-level semantic units, and business association relations exist among the multi-level semantic units;
matching the target historical query log data with a plurality of query semantic phrases, and calculating to obtain an attention value corresponding to each query semantic phrase comprises the following steps:
determining the occurrence times of the semantic units of each level in the query semantic phrase in the target historical query log data;
and calculating to obtain the attention value corresponding to each query semantic phrase according to the occurrence times of the semantic units at each level.
4. A method according to claim 3, wherein:
the query semantic phrase comprises a first type query semantic phrase and a second type query semantic phrase;
the first-type query semantic phrase comprises a plurality of levels of first-type semantic units, and the plurality of levels of first-type semantic units comprise: control elements, control contents and control effects;
the second-type query semantic phrase comprises a plurality of levels of second-type semantic units, and the plurality of levels of second-type semantic units comprise: risk subject, risk action, risk object.
5. The method of claim 1, wherein the data tag of the marked data comprises a multi-level tag field and the query semantic phrase comprises multi-level semantic units;
associating the target query semantic phrase with the data tags of the plurality of pieces of marked data so as to determine the plurality of pieces of target data from the plurality of pieces of marked data comprises:
carrying out semantic matching on the multi-level tag field in the data tag of the marked data and the multi-level semantic unit in the target query semantic phrase, and calculating to obtain the matching degree value of each marked data;
and determining the marked data with the matching degree value larger than a second preset threshold value as the target data.
6. The method of claim 5, wherein semantically matching the multi-level tag field in the data tag of the marked data with the multi-level semantic unit in the target query semantic phrase, and calculating a matching degree value of each marked data comprises:
performing feature conversion on each level of tag fields in the multi-level tag fields to obtain a plurality of tag feature vectors;
performing feature conversion on each level of semantic units in the multi-level semantic units to obtain a plurality of semantic feature vectors;
calculating similarity values between the plurality of tag feature vectors and the plurality of semantic feature vectors;
and calculating to obtain the matching degree value of each marked data according to the similarity value between the plurality of tag feature vectors and the plurality of semantic feature vectors.
7. The method of any of claims 1-6, further comprising:
obtaining a scoring value from a user client, wherein the scoring value is used for representing the satisfaction degree of a user on the retrieval results reflected by the multiple target data;
under the condition that the scoring value is smaller than a third preset threshold value, receiving a custom query semantic phrase from a client;
and associating the custom query semantic phrase with the data tags of the plurality of pieces of marked data so as to determine the plurality of pieces of custom query data from the plurality of pieces of marked data.
8. The method of any of claims 1-6, further comprising:
carrying out statistical analysis on the target data, and outputting a statistical result;
and visually displaying the statistical result.
9. A data processing apparatus comprising:
the first acquisition module is used for responding to a user query request and acquiring a reference field by analyzing the user query request, wherein the reference field at least comprises a target query period field and a target institution field to which a user belongs;
a generation module for reading target historical query log data, which is generated before the target query period and is associated with the target institution, from a database based on the target query period field and the target institution field;
the first association module is used for associating the target historical query log data with a plurality of predefined query semantic phrases to obtain target query semantic phrases;
and the second association module is used for associating the target query semantic phrase with the data labels of the multiple pieces of marked data so as to determine multiple pieces of target data from the multiple pieces of marked data.
10. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-8.
11. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method according to any of claims 1-8.
12. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 8.
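The flow recited in claims 1 to 3 and 5 to 6 can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the claims do not fix how the attention value is derived from the occurrence counts, how feature conversion is done, or how similarity values are combined, so the sketch assumes a plain sum of per-level counts, character-bigram count vectors with cosine similarity, and a level-by-level average. All function names, the in-memory log rows, and the sample phrases are hypothetical.

```python
from collections import Counter
from datetime import date
from math import sqrt


def read_target_logs(log_rows, institution, period_start):
    # Claim 1 sketch: keep log text generated before the target query
    # period and associated with the target institution.
    # log_rows holds (institution, generated_on, text) tuples (assumed layout).
    return [text for inst, day, text in log_rows
            if inst == institution and day < period_start]


def attention_value(units, logs):
    # Claim 3 sketch: count occurrences of each level's semantic unit in
    # the logs; summing the per-level counts is an assumption.
    return sum(line.count(unit) for unit in units for line in logs)


def select_phrases(phrases, logs, first_threshold):
    # Claim 2 sketch: keep phrases whose attention value exceeds the threshold.
    return [p for p in phrases if attention_value(p, logs) > first_threshold]


def to_vector(text):
    # Stand-in "feature conversion": character-bigram counts (assumption).
    text = text.lower()
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[k] * b[k] for k in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def matching_degree(tag_fields, units):
    # Claim 6 sketch: compare level i of the tag with level i of the phrase
    # and average the similarities (the pairing and averaging are assumptions).
    sims = [cosine(to_vector(t), to_vector(u)) for t, u in zip(tag_fields, units)]
    return sum(sims) / len(sims) if sims else 0.0


def select_data(records, target_phrases, second_threshold):
    # Claim 5 sketch: records are (record_id, tag_fields) pairs; keep those
    # whose matching degree with any target phrase exceeds the threshold.
    return [rid for rid, tags in records
            if any(matching_degree(tags, p) > second_threshold for p in target_phrases)]


if __name__ == "__main__":
    rows = [
        ("branch-a", date(2023, 1, 5), "loan officer falsify income proof case"),
        ("branch-a", date(2023, 2, 9), "falsify income proof review"),
        ("branch-b", date(2023, 2, 1), "teller misuse customer seal"),
    ]
    phrases = [("loan officer", "falsify", "income proof"),
               ("teller", "misuse", "customer seal")]
    records = [("r1", ("loan officer", "falsify", "income proof")),
               ("r2", ("teller", "misuse", "customer seal"))]
    logs = read_target_logs(rows, "branch-a", date(2023, 4, 1))
    targets = select_phrases(phrases, logs, first_threshold=2)
    print(select_data(records, targets, second_threshold=0.8))
```

The sample phrases follow the second-type structure of claim 4 (risk subject, risk action, risk object). In practice the thresholds would be tuned, and the bigram vectors could be replaced by any text-embedding method without changing the surrounding flow.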
CN202310405541.1A 2023-04-17 2023-04-17 Data processing method, device, equipment and storage medium Pending CN116361361A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310405541.1A CN116361361A (en) 2023-04-17 2023-04-17 Data processing method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310405541.1A CN116361361A (en) 2023-04-17 2023-04-17 Data processing method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116361361A (en) 2023-06-30

Family

ID=86904749

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310405541.1A Pending CN116361361A (en) 2023-04-17 2023-04-17 Data processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116361361A (en)

Similar Documents

Publication Publication Date Title
US10521404B2 (en) Data transformations with metadata
US9323736B2 (en) Natural language metric condition alerts generation
CN111401777B (en) Enterprise risk assessment method, enterprise risk assessment device, terminal equipment and storage medium
US20140100923A1 (en) Natural language metric condition alerts orchestration
US11263207B2 (en) Performing root cause analysis for information technology incident management using cognitive computing
US20140100901A1 (en) Natural language metric condition alerts user interfaces
US20220284067A1 (en) Method for pushing information, electronic device
CN112017062B (en) Resource quota distribution method and device based on guest group subdivision and electronic equipment
CN110942392A (en) Service data processing method, device, equipment and medium
CN116594683A (en) Code annotation information generation method, device, equipment and storage medium
US20230110941A1 (en) Data processing for enterprise application chatbot
CN113297287B (en) Automatic user policy deployment method and device and electronic equipment
US9443214B2 (en) News mining for enterprise resource planning
CN116955856A (en) Information display method, device, electronic equipment and storage medium
CN116225848A (en) Log monitoring method, device, equipment and medium
CN116048463A (en) Intelligent recommendation method and device for content of demand item based on label management
CN116361361A (en) Data processing method, device, equipment and storage medium
CN114201964A (en) Public opinion risk identification method and device, electronic equipment and storage medium
CN113095078A (en) Associated asset determination method and device and electronic equipment
CN113297139A (en) Metadata query method and system and electronic equipment
US20200334595A1 (en) Company size estimation system
CN113177116B (en) Information display method and device, electronic equipment, storage medium and program product
CN115689721A (en) Credit system information processing method, device, equipment and medium
CN115687284A (en) Information processing method, device, equipment and storage medium
CN115689263A (en) Information generation method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination