CN113449506A

CN113449506A - Data detection method, device and equipment and readable storage medium

Info

Publication number: CN113449506A
Application number: CN202110732679.3A
Authority: CN
Inventors: 王义文; 施炳航
Original assignee: Weikun Shanghai Technology Service Co Ltd
Current assignee: Weikun Shanghai Technology Service Co Ltd
Priority date: 2021-06-29
Filing date: 2021-06-29
Publication date: 2021-09-28
Also published as: WO2023272833A1

Abstract

The embodiment of the application discloses a data detection method, a device, equipment and a readable storage medium, which relate to natural language processing technology in artificial intelligence, wherein the method comprises the following steps: acquiring target text data and determining a target scene identifier corresponding to the target text data; extracting a target object identifier associated with a target scene in the target text data, detecting the target text data, and extracting first key information associated with the target scene in the target text data, wherein the target scene identifier is used for identifying the target scene; and acquiring second key information corresponding to the target object identifier, and determining the violation result of the target text data based on the first key information and the second key information. By adopting the embodiment of the application, the accuracy of data detection can be improved.

Description

Data detection method, device and equipment and readable storage medium

Technical Field

The present application relates to the field of natural language processing technology in artificial intelligence, and in particular, to a data detection method, apparatus, device, and readable storage medium.

Background

In many fields, supervision is required to improve service quality. For example, in the sales field, when company supervision is insufficient, a salesperson selling a product to a customer may use false information or conceal risk information in order to improve performance, so that the customer purchases the product without knowing the facts. If the risk later materializes, the company may receive complaints, which damages its reputation. How to improve supervision in such fields is therefore an urgent problem. In the prior art, supervision is generally performed manually, which is costly and yields low data detection accuracy.

Disclosure of Invention

The embodiment of the application provides a data detection method, a data detection device, data detection equipment and a readable storage medium, and the data detection accuracy can be improved.

In a first aspect, the present application provides a data detection method, including:

acquiring target text data and determining a target scene identifier corresponding to the target text data;

extracting a target object identifier associated with a target scene in the target text data, detecting the target text data, and extracting first key information associated with the target scene in the target text data, wherein the target scene identifier is used for identifying the target scene;

and acquiring second key information corresponding to the target object identifier, and determining the violation result of the target text data based on the first key information and the second key information.

With reference to the first aspect, in a possible implementation manner, the determining a target scene identifier corresponding to target text data includes:

receiving a data acquisition response sent by a data providing terminal, wherein the data acquisition response comprises the target text data and the target scene identifier; or,

and acquiring interface information of the data providing terminal, and determining a scene identifier corresponding to the interface information as a target scene identifier corresponding to the target text data.

With reference to the first aspect, in a possible implementation manner, the first key information includes a plurality of first keywords;

the detecting the target text data and extracting first key information associated with the target scene in the target text data includes:

acquiring a target recognition model associated with the target scene, recognizing the target text data based on the target recognition model, and extracting a plurality of first keywords associated with the target scene from the target text data.

With reference to the first aspect, in a possible implementation manner, the second key information includes a plurality of second keywords;

the obtaining of the second key information corresponding to the target object identifier includes:

acquiring attribute information of the target object identified by the target object identification from a data storage library, wherein the data storage library is used for storing the attribute information of at least one object;

and extracting a plurality of second keywords belonging to the same category as the plurality of first keywords from the attribute information of the target object, wherein one first keyword corresponds to one second keyword.
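The lookup described above can be sketched as follows. This is a minimal illustration, not the implementation of the application: the in-memory `DATA_STORE`, the category names and the function name `second_keywords` are all hypothetical, and keywords are assumed to be represented as a mapping from category to keyword.

```python
from typing import Dict

# Hypothetical data store: object identifier -> attribute information.
# A real deployment would query a database instead of a dictionary.
DATA_STORE: Dict[str, Dict[str, str]] = {
    "Fund A": {"risk": "high risk", "return": "high return", "manager": "J. Doe"},
}


def second_keywords(
    object_id: str,
    first_keywords: Dict[str, str],
    store: Dict[str, Dict[str, str]] = DATA_STORE,
) -> Dict[str, str]:
    """Return one second keyword for each first-keyword category found in the store."""
    attributes = store.get(object_id)
    if attributes is None:
        raise KeyError(f"unknown object identifier: {object_id!r}")
    # Keep only the attributes whose category also appears among the first keywords,
    # so that each first keyword is paired with at most one second keyword.
    return {cat: attributes[cat] for cat in first_keywords if cat in attributes}


first = {"risk": "no risk", "return": "high return"}
print(second_keywords("Fund A", first))
```

Attributes in categories that the text never mentions (here, `manager`) are left out, which reflects the one-to-one pairing between first and second keywords.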

With reference to the first aspect, in a possible implementation manner, the determining a violation result of the target text data based on the first key information and the second key information includes:

determining whether the first key information and the second key information match;

if the first key information is matched with the second key information, determining that the violation result of the target text data is not violation;

and if the first key information does not match with the second key information, determining that the violation result of the target text data is a violation.
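As an illustration of the matching rule, the sketch below compares the two sets of keywords category by category. The application does not define how a match is computed; exact string equality per category is an assumption made here, and the function names are hypothetical.

```python
from typing import Dict, List


def find_mismatches(first: Dict[str, str], second: Dict[str, str]) -> List[str]:
    """Return the first keywords that disagree with the second keyword of the same category."""
    # Assumption: categories with no second keyword are skipped rather than flagged.
    return [kw for cat, kw in first.items() if cat in second and second[cat] != kw]


def violation_result(first: Dict[str, str], second: Dict[str, str]) -> str:
    """Map the match outcome to the two results named in the text."""
    return "violation" if find_mismatches(first, second) else "no violation"


said = {"risk": "no risk", "return": "high return"}
actual = {"risk": "high risk", "return": "high return"}
print(violation_result(said, actual), find_mismatches(said, actual))
```

Returning the mismatching keywords separately makes them available as the violation keywords used in the following implementation manner.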

With reference to the first aspect, in a possible implementation manner, the method further includes:

if the violation result of the target text data is violation, acquiring violation keywords contained in the target text data, wherein the violation keywords belong to keywords contained in the first key information;

acquiring an identifier of a data providing terminal corresponding to the target text data, and determining violation information of a target user associated with the data providing terminal, wherein the violation information of the target user comprises at least one of the identifier of the target user, historical violation data of the target user and violation level of the target user, and the identifier of the data providing terminal is used for identifying the data providing terminal;

and outputting the violation keywords contained in the target text data and the violation information of the target user.
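One possible shape for the output is sketched below. The record layout, the `TERMINAL_USERS` table, the level labels and the function name `build_report` are hypothetical; the application only states which pieces of information may be included.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class UserViolationInfo:
    """Violation information of a target user (all fields illustrative)."""
    user_id: str
    history: List[str] = field(default_factory=list)  # historical violation keywords
    level: str = "none"                                # hypothetical level label


# Hypothetical table: data providing terminal identifier -> associated target user.
TERMINAL_USERS: Dict[str, UserViolationInfo] = {
    "terminal-001": UserViolationInfo("agent-42", ["no risk"], "warning"),
}


def build_report(
    terminal_id: str,
    violation_keywords: List[str],
    users: Dict[str, UserViolationInfo] = TERMINAL_USERS,
) -> Dict[str, object]:
    """Combine the violation keywords with the target user's violation information."""
    user = users[terminal_id]
    return {
        "violation_keywords": list(violation_keywords),
        "user_id": user.user_id,
        "history": list(user.history),
        "level": user.level,
    }


print(build_report("terminal-001", ["no risk"]))
```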

With reference to the first aspect, in a possible implementation manner, the acquiring target text data includes:

acquiring initial data from a data providing terminal, the initial data including at least one of initial voice data or initial text data;

performing voice recognition processing on the initial voice data to obtain text data corresponding to the initial voice data, and determining the text data corresponding to the initial voice data as the target text data; and/or,

performing text splicing processing on at least one piece of initial text data to obtain target text data; and/or,

and performing text screening processing on the initial text data to obtain text data associated with a target user, and determining the text data associated with the target user as the target text data.

In a second aspect, the present application provides a data detection apparatus, comprising:

the data acquisition module is used for acquiring target text data and determining a target scene identifier corresponding to the target text data;

the data extraction module is used for extracting a target object identifier associated with a target scene in the target text data, detecting the target text data, and extracting first key information associated with the target scene in the target text data, wherein the target scene identifier is used for identifying the target scene;

and the violation determining module is used for acquiring second key information corresponding to the target object identifier and determining a violation result of the target text data based on the first key information and the second key information.

With reference to the second aspect, in a possible implementation manner, the data obtaining module includes:

the data response unit is used for receiving a data acquisition response sent by the data providing terminal, wherein the data acquisition response comprises the target text data and the target scene identifier; or,

and the interface determining unit is used for acquiring interface information of the data providing terminal and determining the scene identifier corresponding to the interface information as the target scene identifier corresponding to the target text data.

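With reference to the second aspect, in a possible implementation manner, the first key information includes a plurality of first keywords; the data extraction module is specifically configured to:

acquire a target recognition model associated with the target scene, recognize the target text data based on the target recognition model, and extract a plurality of first keywords associated with the target scene from the target text data.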

With reference to the second aspect, in a possible implementation manner, the second key information includes a plurality of second keywords; the violation determination module includes:

an attribute obtaining unit, configured to obtain attribute information of the target object identified by the target object identifier from a data store, where the data store is configured to store attribute information of at least one object;

and the word extraction unit is used for extracting a plurality of second keywords which belong to the same category as the plurality of first keywords from the attribute information of the target object, wherein one first keyword corresponds to one second keyword.

With reference to the second aspect, in a possible implementation manner, the violation determining module includes:

an information matching unit, configured to determine whether the first key information matches the second key information;

a result determining unit, configured to determine that a violation result of the target text data is a non-violation result if the first key information matches the second key information;

the result determining unit is further configured to determine that the violation result of the target text data is a violation if the first key information does not match the second key information.

With reference to the second aspect, in a possible implementation manner, the data detection apparatus further includes:

the result output module is used for acquiring the violation keywords contained in the target text data if the violation result of the target text data is a violation, wherein the violation keywords belong to the keywords contained in the first key information;

the result output module is used for acquiring an identifier of a data providing terminal corresponding to the target text data and determining violation information of a target user associated with the data providing terminal, wherein the violation information of the target user comprises at least one of the identifier of the target user, historical violation data of the target user and violation level of the target user, and the identifier of the data providing terminal is used for identifying the data providing terminal;

and the result output module is used for outputting the violation keywords contained in the target text data and the violation information of the target user.

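With reference to the second aspect, in a possible implementation manner, the data obtaining module is specifically configured to:

acquire initial data from a data providing terminal, the initial data including at least one of initial voice data or initial text data;

perform voice recognition processing on the initial voice data to obtain text data corresponding to the initial voice data, and determine the text data corresponding to the initial voice data as the target text data; and/or,

perform text splicing processing on at least one piece of initial text data to obtain target text data; and/or,

perform text screening processing on the initial text data to obtain text data associated with a target user, and determine the text data associated with the target user as the target text data.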

In a third aspect, the present application provides a computer device comprising: a processor, a memory, a network interface;

the processor is connected to a memory and a network interface, wherein the network interface is used for providing a data communication function, the memory is used for storing a computer program, and the processor is used for calling the computer program to enable a computer device comprising the processor to execute the method of the first aspect.

In a fourth aspect, the present application provides a computer-readable storage medium having stored thereon a computer program adapted to be loaded and executed by a processor to cause a computer device having the processor to perform the method of the first aspect.

In a fifth aspect, the present application provides a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the method provided in the various alternatives of the first aspect described above.

In the embodiment of the application, target text data is acquired, and a target scene identifier corresponding to the target text data is determined; a target object identifier associated with a target scene is extracted from the target text data, the target text data is detected, and first key information associated with the target scene is extracted from the target text data, wherein the target scene identifier is used for identifying the target scene; and second key information corresponding to the target object identifier is acquired, and the violation result of the target text data is determined based on the first key information and the second key information. Determining the target scene and the target object identifier makes it possible to extract the first key information associated with the target scene; checking that information a second time against the second key information of the target object (such as attribute information of the target object) then determines whether the target text data contains a violation, which can improve the accuracy of violation detection. For example, in the process of selling a product, target text data corresponding to a salesperson is acquired and detected as described above to determine its key information, and that key information is checked a second time against the attribute information of the product to determine whether the target text data contains a violation, and hence whether the salesperson has committed a violation.

Drawings

In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the embodiments are briefly described below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic view of an application scenario of a data detection method provided in an embodiment of the present application;

Fig. 2 is a schematic flowchart of a data detection method provided in an embodiment of the present application;

Fig. 3 is a schematic flowchart of another data detection method provided in an embodiment of the present application;

Fig. 4 is a schematic structural diagram of a data detection apparatus provided in an embodiment of the present application;

Fig. 5 is a schematic structural diagram of a computer device provided in an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

Artificial intelligence is a comprehensive discipline that covers a wide range of fields, including both hardware-level and software-level technologies. The basic artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include computer vision, speech technology, natural language processing, and machine learning/deep learning.

Natural Language Processing (NLP) is an important direction in the fields of computer science and artificial intelligence. It studies theories and methods that enable effective communication between humans and computers using natural language. Natural language processing is a science that integrates linguistics, computer science and mathematics. Research in this field involves natural language, that is, the language people use every day, so it is closely related to linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, question answering, knowledge graphs, and the like. The key technologies of speech technology are automatic speech recognition (ASR), speech synthesis (text-to-speech, TTS), and voiceprint recognition. Enabling computers to listen, see, speak and feel is the development direction of future human-computer interaction, and speech is regarded as one of the most promising interaction modes.

The application relates to natural language processing technology in artificial intelligence. Recognizing the target text data with natural language processing technology can improve the accuracy of recognition, and in turn the accuracy of violation detection on the target text data. In addition, automatic machine recognition of the target text data can improve the efficiency of data detection. The application is applicable to fields such as smart government affairs and smart education, and is conducive to the construction of smart cities.

Referring to Fig. 1, Fig. 1 is a schematic view of an application scenario of a data detection method according to an embodiment of the present application. As shown in Fig. 1, a computer device may obtain initial data 12 from a data providing terminal 11, where the initial data 12 may include call recording data, chat text data in a social program, chat voice data in the social program, and the like. The computer device may process the initial data using the data processing module 13 to obtain target text data. The data processing module 13 may include a voice recognition module, a text processing module, and the like. For example, the voice recognition module may be used to process call recording data and chat voice data to obtain target text data, and the text processing module may be used to process the chat text data to obtain the target text data. Further, the computer device may determine a target scene identifier corresponding to the target text data, thereby determining the target scene to which the target text data belongs, and extract a target object identifier 14 associated with the target scene in the target text data, where the target object identifier 14 may be, for example, an object name (product name). Further, the computer device may obtain a target recognition model 15 associated with the target scene, recognize the target text data based on the target recognition model 15, and extract the first key information 16 associated with the target scene in the target text data. Further, the computer device may obtain second key information corresponding to the target object identifier 14, for example, attribute information of the product, from the data storage 17 based on the target object identifier 14, and determine a violation result of the target text data based on the first key information and the second key information.

If the violation result is no violation, prompt information indicating that there is no violation is output. If the violation result is a violation, the violation keywords in the target text data and the violation information of the target user associated with the target text data (such as the name of the worker, the number of historical violations, historical violation keywords, and the like) may be obtained and output, so that the relevant management user can manage the target user accordingly.
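The flow of Fig. 1 can be summarized as a composition of three pluggable steps. The sketch below is only an outline under stated assumptions: the function `detect`, its parameters and the stub callables are hypothetical, keywords are assumed to be a mapping from category to keyword, and a mismatch is assumed to mean unequal strings in the same category.

```python
from typing import Callable, Dict, List, Tuple

Keywords = Dict[str, str]  # category -> keyword (assumed representation)


def detect(
    text: str,
    scene_id: str,
    extract_object_id: Callable[[str, str], str],
    extract_first: Callable[[str, str], Keywords],
    lookup_second: Callable[[str], Keywords],
) -> Tuple[bool, List[str]]:
    """Run object extraction, first-keyword extraction, lookup and comparison in order."""
    object_id = extract_object_id(text, scene_id)   # target object identifier 14
    first = extract_first(text, scene_id)           # first key information 16
    second = lookup_second(object_id)               # second key information from data storage 17
    violating = [kw for cat, kw in first.items() if cat in second and second[cat] != kw]
    return bool(violating), violating


# Stub callables stand in for the recognition model and the data storage.
is_violation, keywords = detect(
    "Fund A is no risk",
    "fund_sales",
    extract_object_id=lambda text, scene: "Fund A",
    extract_first=lambda text, scene: {"risk": "no risk"},
    lookup_second=lambda object_id: {"risk": "high risk"},
)
print(is_violation, keywords)
```

Passing the steps in as callables keeps the scene-specific parts (model, catalog, data store) replaceable without changing the control flow.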

It is understood that the computer device mentioned in the embodiments of the present application includes, but is not limited to, a terminal device or a server. In other words, the computer device may be a server or a terminal device, or may be a system composed of a server and a terminal device. The terminal device may be an electronic device, including but not limited to a mobile phone, a tablet computer, a desktop computer, a notebook computer, a palmtop computer, a vehicle-mounted device, an Augmented Reality/Virtual Reality (AR/VR) device, a head-mounted display, a wearable device, a smart speaker, a digital camera, and other Mobile Internet Devices (MID) with network access capability. The server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, vehicle-road cooperation, a Content Delivery Network (CDN), and big data and artificial intelligence platforms.

Further, referring to Fig. 2, Fig. 2 is a schematic flowchart of a data detection method according to an embodiment of the present application. As shown in Fig. 2, the method includes, but is not limited to, the following steps:

S101, acquiring target text data and determining a target scene identifier corresponding to the target text data.

In the embodiment of the application, the computer device can acquire the target text data and determine the target scene identifier corresponding to the target text data. The target text data refers to text data that requires violation detection, and may include, for example, text data from any of multiple application scenarios. The embodiment of the present application takes the detection of one piece of target text data as an example; if there are multiple pieces of target text data, each can be detected in the same way. The target scene identifier is used to uniquely indicate the target scene, and may include, for example, the name of the target scene, the number of the target scene, or another identifier that uniquely indicates the target scene. That is, once the computer device determines the target scene identifier, it can determine the target scene identified by it, that is, the target scene to which the target text data belongs. The target scene may include any of a fund sales agent scene, an insurance sales agent scene, a service scene, a credit card transaction scene, or another scene. Optionally, the computer device may obtain the target text data from the data providing terminal, or from local storage, which is not limited in this embodiment of the application.

Optionally, the computer device may obtain initial data, and process the initial data to obtain target text data. Specifically, the computer device may obtain initial data from the data providing terminal, where the initial data includes at least one of initial voice data or initial text data; performing voice recognition processing on the initial voice data to obtain text data corresponding to the initial voice data, and determining the text data corresponding to the initial voice data as target text data; and/or performing text splicing processing on at least one piece of initial text data to obtain target text data; and/or performing text screening processing on the initial text data to obtain text data associated with the target user, and determining the text data associated with the target user as the target text data.

That is, if the initial data is voice data, the computer device may perform voice recognition processing on the initial voice data based on the voice recognition module in Fig. 1, convert the initial voice data into text data, and determine the obtained text data as the target text data. Optionally, the computer device may perform the voice recognition processing using Automatic Speech Recognition (ASR) technology to obtain the text data corresponding to the voice data. Alternatively, the computer device may use other technologies for voice recognition processing, which is not limited in this embodiment of the application. Optionally, when the initial data is voice data, the initial data may include call recording data, chat voice data in the social program, chat recording data in the social program, chat video data in the social program, and the like.

If the initial data is text data, text splicing processing can be performed on at least one piece of initial text data to obtain the target text data. For example, when the initial text data is chat text data of the target user, at least one piece of initial text data of the target user may be obtained and spliced to obtain the chat text data of the target user, which is determined as the target text data. The target user may be, for example, a worker in any of the sales agent scenes or a worker providing any of various services. That is, the target text data may be the chat text data sent from the terminal used by the worker in the current chat.

If the initial data is text data, text screening processing can be performed on the initial text data to obtain the text data associated with the target user, which is determined as the target text data. For example, if the initial data is the text of a chat between a worker and a client, text screening may be performed on the initial text data so that the text data associated with the client is discarded and the remaining text data, which is associated with the target user, is determined as the target text data. The text data associated with the client may be the chat text data sent by the terminal used by the client in the current chat.

Optionally, the above processing steps may be combined. If the initial data is voice data, the computer device performs voice recognition processing on the initial voice data to obtain the corresponding text data; where there is at least one piece of initial text data, text splicing processing may be performed on the at least one piece of initial text data to obtain the target text data. Alternatively, the computer device may perform text screening processing on the initial text data to obtain the text data associated with the target user, and determine that text data as the target text data. For example, if the initial text data includes chat text data between the target user and the client, text screening may be performed to keep only the text data associated with the target user. In the subsequent violation detection, only the chat text data of the target user (such as a worker) then needs to be detected, which improves data detection efficiency.
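Text screening followed by text splicing can be sketched as below. The `Message` record, the `sender` role labels and the function names are hypothetical; the application does not specify how messages are attributed to the worker or the client, so a per-message role field is assumed.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Message:
    sender: str  # hypothetical role label: "agent" (target user) or "client"
    text: str


def screen(messages: Iterable[Message], keep_sender: str = "agent") -> List[Message]:
    """Text screening: keep only messages sent by the target user."""
    return [m for m in messages if m.sender == keep_sender]


def splice(messages: Iterable[Message], sep: str = " ") -> str:
    """Text splicing: join the non-empty message texts in their original order."""
    return sep.join(m.text.strip() for m in messages if m.text.strip())


def to_target_text(messages: Iterable[Message]) -> str:
    """Screen first, then splice, so only the target user's text is detected."""
    return splice(screen(messages))


chat = [
    Message("agent", "Fund A has no risk."),
    Message("client", "Are you sure?"),
    Message("agent", " It also has a high return. "),
]
print(to_target_text(chat))
```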

Optionally, the computer device may determine the target scene identifier corresponding to the target text data as follows: the computer device receives a data acquisition response sent by the data providing terminal, where the data acquisition response includes the target text data and the target scene identifier. Specifically, the data providing terminal may label the target text data with its scene in advance to obtain the target scene identifier corresponding to the target text data, and send the target text data and the target scene identifier to the computer device. The computer device thus receives the data acquisition response (that is, the target text data and the target scene identifier) and determines the target scene identifier corresponding to the target text data from it.

Optionally, the computer device may further obtain interface information of the data providing terminal, and determine a scene identifier corresponding to the interface information as a target scene identifier corresponding to the target text data. Specifically, the data providing terminal may perform data interaction with the computer device through the interface, so that the computer device may obtain interface information of the data providing terminal, and determine a scene identifier corresponding to the data providing terminal through the interface information, thereby obtaining a target scene identifier corresponding to the target text data. For example, the scene identifier corresponding to the data providing terminal may include any one of a fund sales agent scene identifier, a service scene identifier, an insurance sales agent scene identifier, or other scene identifiers, that is, the data providing terminal is a terminal in any one of the scenes, and when the target user uses the data providing terminal to perform service processing, the obtained target text data also belongs to the target text data associated with the scene. Therefore, the computer device can determine the scene identifier corresponding to the data providing terminal by acquiring the interface information of the data providing terminal, and determine the scene identifier corresponding to the data providing terminal as the target scene identifier corresponding to the target text data. That is, by determining which terminal in which scene the data providing terminal belongs to, when the target text data is subsequently acquired from the data providing terminal, the scene corresponding to the data providing terminal may be determined as the target scene corresponding to the target text data, so as to determine the target scene identifier corresponding to the target text data.
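The two ways of determining the target scene identifier can be sketched together. The response field name `scene_id`, the interface paths and the `INTERFACE_SCENES` table are hypothetical; the application does not specify what the interface information looks like, and trying the labeled response before the interface lookup is a choice made for this sketch.

```python
from typing import Dict, Optional

# Hypothetical table: interface information -> scene identifier.
INTERFACE_SCENES: Dict[str, str] = {
    "/api/fund-sales": "fund_sales",
    "/api/insurance-sales": "insurance_sales",
}


def resolve_scene_id(
    response: Dict[str, str],
    interface: Optional[str] = None,
    table: Dict[str, str] = INTERFACE_SCENES,
) -> str:
    """Use the scene identifier carried in the response, else derive it from the interface."""
    scene_id = response.get("scene_id")
    if scene_id:
        # The data providing terminal labeled the text with its scene in advance.
        return scene_id
    if interface is not None and interface in table:
        # Fall back to the scene associated with the terminal's interface.
        return table[interface]
    raise ValueError("cannot determine the target scene identifier")


print(resolve_scene_id({"text": "...", "scene_id": "service"}))
print(resolve_scene_id({"text": "..."}, interface="/api/fund-sales"))
```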

S102, extracting a target object identification associated with a target scene in the target text data, detecting the target text data, and extracting first key information associated with the target scene in the target text data.

In the embodiment of the application, the computer device can extract the target object identifier associated with the target scene from the target text data by keyword extraction. Specifically, the computer device may recognize the object keyword in the target text data that identifies an object associated with the target scene, and extract that object keyword as the target object identifier. The target object identifier may include the name of the target object, the number of the target object, and the like. For example, if the computer device determines that the target scene is a fund sales agent scene, the target object may be a specific fund, and the target object identifier may be the name of the fund, the number of the fund, and the like. Or, if the computer device determines that the target scene is an insurance sales agent scene, the target object may be a specific insurance product, and the target object identifier may be the name of the insurance product, the number of the insurance product, and the like. That is to say, having determined the target scene identifier corresponding to the target text data, the computer device can recognize and extract the target object identifier associated with the target scene from the target text data.
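A simple keyword-based extraction of the target object identifier might look like the following. The per-scene catalog `SCENE_OBJECTS`, its entries and the function name are hypothetical, and plain substring matching against known names and numbers is an assumption; the application does not say how object keywords are recognized.

```python
from typing import Dict, List, Optional

# Hypothetical catalog: scene identifier -> known object names and numbers.
SCENE_OBJECTS: Dict[str, List[str]] = {
    "fund_sales": ["Fund A", "Fund B", "F-0001"],
    "insurance_sales": ["Policy X"],
}


def extract_object_id(
    text: str,
    scene_id: str,
    catalog: Dict[str, List[str]] = SCENE_OBJECTS,
) -> Optional[str]:
    """Return the catalog entry of the given scene that appears earliest in the text."""
    candidates = catalog.get(scene_id, [])
    # Pair each matching entry with its position so the earliest mention wins.
    hits = [(text.find(name), name) for name in candidates if name in text]
    return min(hits)[1] if hits else None


print(extract_object_id("I recommend Fund B over Fund A", "fund_sales"))
```

Restricting the candidates to the catalog of the target scene is what ties the extraction to the scene identifier: the same text yields no object if it is checked against the wrong scene.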

Further, the computer device may detect the target text data, and extract first key information associated with the target scene in the target text data. The target scene identification is used for identifying the target scene. Optionally, the first key information may include a plurality of first key words, and the computer device may obtain a target recognition model associated with the target scene, recognize the target text data based on the target recognition model, and extract the plurality of first key words associated with the target scene in the target text data.

In a specific implementation, the target recognition model may be a Natural Language Processing (NLP) model, or the target recognition model may be another model. A first keyword may refer to a keyword associated with the target scene in the target text data, for example, a keyword describing the risk, return, principal, or other information of the target object, and may specifically include no risk, low risk, high return, low return, principal-guaranteed, and the like. The computer device determines the first key information by identifying the plurality of first keywords associated with the target scene in the target text data. By using the NLP model to recognize the target text data, the computer device can not only identify the first keywords in the target text data but also understand the target text data semantically and recognize its implied meaning, so that the extracted first key information reflects the content of the target text data more accurately.
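The following is a deliberately simple stand-in for the target recognition model: a phrase lexicon lookup rather than an NLP model, so it captures none of the semantic understanding mentioned above. The lexicon contents, the category names, and the function name are hypothetical.

```python
from typing import Dict, List

# Hypothetical lexicon for a fund sales scene: category -> candidate phrases.
FUND_LEXICON: Dict[str, List[str]] = {
    "risk": ["no risk", "low risk", "medium risk", "high risk"],
    "return": ["high return", "low return"],
    "principal": ["non-principal-guaranteed", "principal-guaranteed"],
}


def extract_first_keywords(text: str, lexicon: Dict[str, List[str]]) -> Dict[str, str]:
    """Return at most one matched phrase per category found in the text."""
    lowered = text.lower()
    result: Dict[str, str] = {}
    for category, phrases in lexicon.items():
        # Try longer phrases first so that "non-principal-guaranteed"
        # is not mistaken for "principal-guaranteed".
        for phrase in sorted(phrases, key=len, reverse=True):
            if phrase in lowered:
                result[category] = phrase
                break
    return result
```

Keying the result by category makes it straightforward to compare each first keyword with the keyword that describes the same attribute of the target object.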

Optionally, before the target text data is recognized by using the target recognition model associated with the target scene, initial recognition models corresponding to various scenes may be trained in advance, where each scene corresponds to one initial recognition model. The sample key information of each scene, which may differ from scene to scene, is obtained and labeled, and the initial recognition model of that scene is trained by using the labeled sample key information. The model is stored when it converges and reaches a certain precision (that is, the loss function value of the model is smaller than a loss threshold and the precision is greater than a precision threshold), and the stored model is the recognition model of that scene. For example, the sample key information corresponding to the fund sales agent scene may include key information such as risk, return, and principal; the sample key information corresponding to the insurance sales agent scene may include key information such as insurance age, the premium to be paid every year, insurance category, and payable amount; the sample key information corresponding to the service scene may include key information of categories such as insults and abuse; the sample key information corresponding to the credit card transaction scene may include key information such as credit card limit and annual fee.
It can be seen that the sample key information of each scene is different, and therefore the first key information that the recognition model of each scene extracts from the corresponding text data is also different. In the embodiment of the present application, a recognition model is trained for each scene. When the target text data is to be recognized, the target recognition model associated with the target scene is obtained, the target text data is recognized based on the target recognition model, and the first key information in the target text data is extracted. Because a dedicated recognition model is trained for each scene and used to recognize the text data of that scene, the recognition accuracy of the model can be improved.
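The storage condition and the per-scene model lookup might be sketched as below. The threshold values, the `MODELS` registry, and the representation of a model as a plain callable are assumptions; the application only states that a loss threshold and a precision threshold exist.

```python
from typing import Callable, Dict, List

# Hypothetical thresholds; the application gives no concrete values.
LOSS_THRESHOLD = 0.05
PRECISION_THRESHOLD = 0.90

Recognizer = Callable[[str], List[str]]
MODELS: Dict[str, Recognizer] = {}


def should_store_model(loss: float, precision: float) -> bool:
    """Store only when loss is below and precision is above its threshold."""
    return loss < LOSS_THRESHOLD and precision > PRECISION_THRESHOLD


def store_model(scene_id: str, model: Recognizer, loss: float, precision: float) -> bool:
    """Register the trained model for a scene if it meets the criterion."""
    if should_store_model(loss, precision):
        MODELS[scene_id] = model
        return True
    return False


def target_model_for(scene_id: str) -> Recognizer:
    """Fetch the recognition model associated with the target scene."""
    return MODELS[scene_id]
```

In this sketch a model that fails either threshold is simply not registered, so a later lookup for that scene raises `KeyError`.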

Optionally, because the target scene identifier corresponding to the target text data has already been determined, the computer device may extract only the target object identifier associated with the target scene and may skip object identifiers that are not associated with the target scene. For example, when the target scene is a fund sales agent scene, the target object identifier extracted from the target text data is a fund identifier, such as a fund name; if identifiers of other scenes, such as an insurance identifier, are detected in the target text data, they may be left unextracted, so that the data detection efficiency can be improved.

Optionally, when the computer device extracts the target object identifier associated with the target scene, if the number of the object identifiers associated with the target scene in the target text data is at least one, the computer device may determine the at least one object identifier, and determine the target object identifier from the at least one object identifier, so as to extract the target object identifier in the at least one object identifier, and then subsequently, when the violation detection is performed on the target text data, the efficiency of data detection may be improved. In a specific implementation, the computer device may extract text information in the target text data, determine that the target text data is text data for a certain object identifier, and thereby determine the object identifier as the target object identifier.

For example, if the target scene identifier is a fund sales agent scene identifier, and it is determined that the object identifiers associated with the fund sales agent scene in the target text data include a fund A identifier and a fund B identifier, the computer device may determine the target object identifier from the fund A identifier and the fund B identifier; for example, if the target object identifier is determined to be the fund A identifier, the fund A identifier is extracted. Further, the computer device detects the target text data and determines the first key information associated with the fund sales agent scene in the target text data. Because the target text data includes both the fund A identifier and the fund B identifier, the key information associated with the fund sales agent scene in the target text data includes key information corresponding to fund A and key information corresponding to fund B, and the first key information that is extracted is the key information corresponding to fund A. Further, the computer device may obtain attribute information of fund A, determine second key information corresponding to fund A according to the attribute information of fund A, and determine a violation result of the target text data based on the second key information corresponding to fund A and the first key information corresponding to fund A, that is, determine whether the target user committed a violation in the process of selling fund A.

That is to say, when a plurality of object identifiers associated with the target scene are present, the computer device may determine the target object identifier from the plurality of object identifiers and extract only that identifier, without extracting the other object identifiers associated with the target scene, so as to reduce the subsequent detection workload and improve data detection efficiency. For example, in an actual application scenario, when a salesperson sells a product A, another product (e.g., product B) is often used for comparison with the product being sold, so as to strengthen the user's desire to purchase product A.
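The application does not state how the target object is chosen among several candidates. One possible heuristic, shown here purely as an assumption, is to pick the candidate mentioned most often, on the reasoning that a product used only for comparison tends to be mentioned less than the product being sold.

```python
from collections import Counter
from typing import List, Optional


def choose_target_object(text: str, candidates: List[str]) -> Optional[str]:
    """Pick the most frequently mentioned candidate (hypothetical heuristic)."""
    if not candidates:
        return None
    lowered = text.lower()
    counts = Counter({name: lowered.count(name.lower()) for name in candidates})
    name, count = counts.most_common(1)[0]
    # Return None when no candidate is mentioned at all.
    return name if count > 0 else None
```

A production system would more likely rely on the recognition model to decide which object the conversation is about; mention frequency is only a stand-in.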

S103, second key information corresponding to the target object identification is obtained, and the violation result of the target text data is determined based on the first key information and the second key information.

In the embodiment of the application, the computer device may determine the violation result of the target text data based on the first key information and the second key information by obtaining the second key information corresponding to the target object identifier, and the violation result of the target text data may include violation or no violation. Optionally, the computer device may determine whether the first key information matches the second key information; and if the first key information is matched with the second key information, determining that the violation result of the target text data is not violation. And if the first key information is not matched with the second key information, determining that the violation result of the target text data is a violation.

The first key information matching the second key information may mean that the two have the same meaning, and the first key information not matching the second key information may mean that the two have opposite meanings. For example, in a fund sales agent scene, if the first key information is no risk and principal-guaranteed, and the second key information is also no risk and principal-guaranteed, the first key information and the second key information have the same meaning and therefore match, and the violation result of the target text data is determined to be no violation. That is, the target user (e.g., a staff member) correctly informed the customer of the risk of the fund product, and the sale is compliant. Conversely, if the first key information is no risk and principal-guaranteed, and the second key information is medium-high risk and non-principal-guaranteed, the first key information and the second key information have opposite meanings and therefore do not match, and the violation result of the target text data is determined to be a violation. That is, the target user did not correctly inform the customer of the risk of the target object, and the sale is non-compliant.

For example, in an insurance sales agent scene, if the first key information is insurable category A with a reimbursable amount of more than one hundred thousand, and the second key information is also insurable category A with a reimbursable amount of more than one hundred thousand, the first key information and the second key information have the same meaning and therefore match, and the violation result of the target text data is determined to be no violation. That is, the target user correctly informed the customer of the insurable category and the reimbursable amount of the target object, and the sale is compliant. Conversely, if the first key information is insurable category A with a reimbursable amount of more than one hundred thousand, and the second key information is insurable category B with a reimbursable amount of less than one hundred thousand, the first key information and the second key information have different meanings and therefore do not match, and the violation result of the target text data is determined to be a violation. That is, the target user did not correctly inform the customer of the insurable category and the reimbursable amount of the target object, and the sale is non-compliant.

For example, in a credit card transaction scenario, if the first key information includes a credit card limit greater than 20,000 yuan and an annual fee less than 500 yuan, and the second key information also includes a credit card limit greater than 20,000 yuan and an annual fee less than 500 yuan, the first key information and the second key information have the same meaning and therefore match, and the violation result of the target text data is determined to be no violation. That is, the target user correctly informed the customer of the limit and the annual fee of the credit card, and the transaction is compliant. Conversely, if the first key information includes a credit card limit greater than 20,000 yuan and an annual fee less than 500 yuan, and the second key information includes a credit card limit less than 20,000 yuan and an annual fee greater than 500 yuan, the first key information and the second key information have different meanings and therefore do not match, and the violation result of the target text data is determined to be a violation. That is, the target user did not correctly inform the customer of the limit and the annual fee of the credit card, and the transaction is non-compliant.

Optionally, the second key information may include a plurality of second keywords. The computer device may obtain attribute information of the target object identified by the target object identifier from a data repository, and extract, from the attribute information of the target object, a plurality of second keywords belonging to the same categories as the plurality of first keywords. The data repository is used for storing attribute information of at least one object. One first keyword corresponds to one second keyword, and a first keyword and its corresponding second keyword may be keywords that describe the same attribute of the target object. For example, the first keyword and the second keyword may both be keywords describing the return of the target object, such as low return, medium return, high return, and the like; keywords describing the risk of the target object, such as low risk, medium-low risk, medium-high risk, and the like; or keywords describing the principal of the target object, such as principal-guaranteed, non-principal-guaranteed, and so forth. The data repository may store a name, an object number, and detail information for each of one or more objects, where the detail information may include a description of information such as the risk, return, and principal of each object. The attribute information of the target object may include the name of the target object, the number of the target object, and detail information, and the detail information of the target object may include a description of information such as the risk, return, and principal of the target object.
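A data repository of this shape, and the extraction of second keywords in the same categories as the first keywords, could be sketched as follows. The record layout, the in-memory dictionary `DATA_STORE`, the object number `F-001`, and the category names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ObjectRecord:
    """Attribute information of one object: name, number, and details."""
    name: str
    number: str
    details: Dict[str, str] = field(default_factory=dict)


# Hypothetical in-memory repository keyed by object identifier.
DATA_STORE: Dict[str, ObjectRecord] = {
    "Fund A": ObjectRecord(
        name="Fund A",
        number="F-001",
        details={
            "risk": "medium-high risk",
            "return": "high return",
            "principal": "non-principal-guaranteed",
        },
    ),
}


def second_keywords_for(object_id: str, first_keywords: Dict[str, str]) -> Dict[str, str]:
    """Extract the stored keywords for the categories seen in first_keywords."""
    record = DATA_STORE[object_id]
    return {
        category: record.details[category]
        for category in first_keywords
        if category in record.details
    }
```

Only the categories that actually appear among the first keywords are read from the record, which keeps the one-to-one correspondence between first and second keywords.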

That is to say, the data repository stores attribute information of a plurality of objects in advance, and when the computer device obtains a target object identifier from target text data, the computer device may obtain the attribute information of the target object from the attribute information of the plurality of objects stored in the data repository, and extract a second keyword belonging to the same category as the first keyword from the attribute information of the target object, so that when it is subsequently determined whether the first key information matches the second key information, it is determined whether the first key information matches the second key information according to whether meanings of the first keyword and the second keyword are the same or similar. Optionally, if the meanings of the corresponding keywords in the plurality of first keywords and the plurality of second keywords are the same or similar, it indicates that the first key information is matched with the second key information. And if the meanings of any one of the first keywords are different from the meanings of the corresponding keywords in the second keywords, the first key information is not matched with the second key information.
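The matching rule can be illustrated with a small comparison function. As a simplifying assumption, "same or similar meaning" is reduced here to exact string equality, and a first keyword with no corresponding second keyword is treated as a mismatch; neither choice comes from the application.

```python
from typing import Dict


def key_information_matches(first: Dict[str, str], second: Dict[str, str]) -> bool:
    """True only if every first keyword equals its corresponding second keyword."""
    return all(second.get(category) == keyword for category, keyword in first.items())


def violation_result(first: Dict[str, str], second: Dict[str, str]) -> str:
    """Map the match outcome to the violation result of the text data."""
    return "no violation" if key_information_matches(first, second) else "violation"
```

For instance, `violation_result({"risk": "low risk"}, {"risk": "medium-high risk"})` yields `"violation"` because a single differing keyword is enough to make the key information mismatch.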

For example, the computer device acquires the target object identifier, fund A, from the target text data, detects the target text data, and extracts first keywords including, for example, low risk and principal-guaranteed. The computer device then acquires the attribute information of fund A from the data repository and extracts from it the second keywords belonging to the same categories as the first keywords, including medium-high risk and non-principal-guaranteed. That is, the first keywords and the second keywords both describe the risk and principal of fund A, and their meanings are opposite, which indicates that the first key information does not match the second key information, so the violation result of the target text data is determined to be a violation.

In the embodiment of the application, whether the target text data contains violation content can be determined by determining the target scene corresponding to the target text data and detecting the target text data by using the target recognition model corresponding to the target scene. For example, in a fund sales agent scene, it may be detected whether the target user (a staff member) informed the customer of the risk, return, and the like of the fund, and when it is determined that the target user did not correctly inform the customer of the risk, return, and the like of the fund, it is determined that the target text data contains violation content. Alternatively, in an insurance sales agent scene, it may be detected whether the target user informed the customer of the insurable category, the reimbursable amount, and the like of the insurance, and when it is determined that the target user did not correctly inform the customer of these, it is determined that the target text data contains violation content. Alternatively, in a credit card transaction scenario, it may be detected whether the target user informed the customer of the credit card limit, annual fee, and the like, and when it is determined that the target user did not correctly inform the customer of these, it is determined that the target text data contains violation content. Alternatively, in a service scenario, it may be detected whether the target user used insulting or abusive words, and if it is determined that the target text data contains insulting or abusive keywords, it is determined that the target text data contains violation content. By detecting the target text data automatically, cost can be saved.

Optionally, the computer device may further obtain an adjustment operation for the attribute information of the target object in the data repository, adjust the attribute information of the target object based on the adjustment operation, and determine, by combining the adjusted attribute information of the target object with the first key information in the target text data, whether the target text data includes a violation. For example, if the attribute information of the target object is a high-risk, non-principal-guaranteed fund before a target time and is adjusted to a low-risk, non-principal-guaranteed fund after the target time, the computer device may update the attribute information of the target object in the data repository. When it is subsequently determined whether the target text data includes a violation, the determination can be made with the latest attribute information of the target object, which improves the accuracy of violation detection.

Further, please refer to fig. 3, fig. 3 is a schematic flow chart of another data detection method according to an embodiment of the present application; as shown in fig. 3, the method includes, but is not limited to, the following steps:

S201, acquiring target text data and determining a target scene identifier corresponding to the target text data.

S202, extracting a target object identifier associated with a target scene in the target text data, detecting the target text data, and extracting first key information associated with the target scene in the target text data.

S203, second key information corresponding to the target object identification is obtained, and the violation result of the target text data is determined based on the first key information and the second key information.

In the embodiment of the present application, the specific implementation manner of step S201 to step S203 may refer to the implementation manner of step S101 to step S103 in fig. 2, and details are not repeated here.

S204, if the violation result of the target text data is violation, acquiring violation keywords contained in the target text data.

In the embodiment of the application, if the violation result of the target text data is violation, the computer device may obtain the violation keyword included in the target text data. The violation keywords may refer to one or more keywords included in the first key information. As can be seen from the foregoing embodiment, the first key information may include one or more first keywords, and the violation keyword may refer to a keyword having a meaning opposite to that of the second keyword in the one or more first keywords, that is, if it is detected that the violation result of the target text data is a violation, the computer device may obtain a specific violation keyword included in the target text data, so that subsequent related managers may determine violation content of the target user, and may also manage the target user.
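Selecting the violation keywords can be sketched as filtering the first keywords against their corresponding second keywords. Treating "opposite meaning" as simple inequality of the keyword strings is an assumption made for brevity, and the function name is hypothetical.

```python
from typing import Dict, List


def violation_keywords(first: Dict[str, str], second: Dict[str, str]) -> List[str]:
    """Return the first keywords that disagree with the stored second keywords."""
    return [
        keyword
        for category, keyword in first.items()
        if second.get(category) != keyword
    ]
```

First keywords that agree with the stored attribute information are left out, so only the statements that actually contradict the record are reported.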

S205, acquiring the identifier of the data providing terminal corresponding to the target text data, and determining violation information of the target user associated with the data providing terminal.

In the embodiment of the application, the computer device can acquire the identifier of the data providing terminal corresponding to the target text data and determine violation information of a target user associated with the data providing terminal. That is to say, after the computer device obtains the identifier of the data providing terminal corresponding to the target text data, the computer device can determine the target user using the data providing terminal, so as to obtain the historical violation information of the target user, and further quickly know the historical violation condition of the target user, thereby facilitating the relevant processing of the target user.

The violation information of the target user may include at least one of an identifier of the target user, historical violation data of the target user, and a violation level of the target user, where the identifier of the data providing terminal is used to identify the data providing terminal. The identifier of the target user may be used to uniquely indicate the target user, and may include, for example, the name of the target user, the job number of the target user, the department to which the target user belongs, the identifier of the data providing terminal used by the target user, and so on. The historical violation data of the target user may include historical violation results of the target user, such as whether the target user has a historical violation record, the historical number of violations, historical violation keywords, a comparison of the numbers of violations across historical periods, and so forth. The violation level of the target user may be used to indicate the historical number of violations of the target user: the higher the violation level of the target user, the greater the number of violations of the target user; the lower the violation level of the target user, the smaller the number of violations of the target user.
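One way to picture the violation information is a small record looked up through the terminal identifier, with a level that grows with the historical number of violations. The level boundaries, the lookup tables, and all identifiers below are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ViolationInfo:
    """Violation information of a target user."""
    user_id: str
    history: List[str] = field(default_factory=list)  # historical violation keywords

    @property
    def level(self) -> int:
        # Hypothetical scale: 0 = none, 1 = one or two, 2 = three or more.
        count = len(self.history)
        if count == 0:
            return 0
        return 1 if count < 3 else 2


# Hypothetical lookup tables: terminal -> user, user -> violation history.
TERMINAL_USERS: Dict[str, str] = {"terminal-01": "agent-1001"}
USER_HISTORY: Dict[str, List[str]] = {
    "agent-1001": ["low risk", "principal-guaranteed", "no risk"],
}


def violation_info_for_terminal(terminal_id: str) -> ViolationInfo:
    """Resolve the user of a terminal and collect that user's history."""
    user_id = TERMINAL_USERS[terminal_id]
    return ViolationInfo(user_id, list(USER_HISTORY.get(user_id, [])))
```

Deriving the level from the history, rather than storing it separately, keeps the two fields from drifting apart in this sketch.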

S206, outputting the violation keywords contained in the target text data and the violation information of the target user.

In the embodiment of the present application, the computer device may output the violation keywords contained in the target text data and the violation information of the target user as text, or may output them as voice, which is not limited in the embodiment of the present application. By outputting the violation keywords and the violation information of the target user, the target user and the relevant management users can learn of them, which makes it easier to manage the target user accordingly. Other users can also learn of the violation information, which serves as a warning and thereby further strengthens supervision.

In the embodiment of the application, when the violation result of the target text data is determined to be violation, the violation result can be output, and corresponding violation information of the target user, namely the violation information of the violation user, can also be output, so that the related management user can know the violation condition of the target user, the related user can be conveniently managed, and further the supervision in the target scene is improved.
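For the text form of output, a report line could be assembled as in the sketch below. The wording of the report, the function name, and the parameters are hypothetical; voice output is not covered.

```python
from typing import List


def format_violation_report(user_id: str, keywords: List[str], history_count: int) -> str:
    """Build a one-line text report of a detected violation."""
    joined = ", ".join(keywords)
    return (
        f"User {user_id}: violation keywords: {joined}; "
        f"historical violations: {history_count}"
    )


report = format_violation_report("agent-1001", ["low risk", "principal-guaranteed"], 3)
```

The same fields could equally be passed to a notification or text-to-speech component; plain text is used here only because it is the simplest form to show.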

The method of the embodiments of the present application is described above, and the apparatus of the embodiments of the present application is described below.

Referring to fig. 4, fig. 4 is a schematic structural diagram of a data detection apparatus according to an embodiment of the present application. The data detection means may be a computer program (comprising program code) running on a computer device, for example the data detection means is an application software; the data detection device can be used for executing corresponding steps in the method provided by the embodiment of the application. The data detection device 40 includes:

a data obtaining module 41, configured to obtain target text data, and determine a target scene identifier corresponding to the target text data;

a data extraction module 42, configured to extract a target object identifier associated with a target scene in the target text data, detect the target text data, and extract first key information associated with the target scene in the target text data, where the target scene identifier is used to identify the target scene;

and the violation determining module 43 is configured to obtain second key information corresponding to the target object identifier, and determine a violation result of the target text data based on the first key information and the second key information.

Optionally, the data obtaining module 41 includes:

a data response unit 411, configured to receive a data acquisition response sent by the data providing terminal, where the data acquisition response includes the target text data and the target scene identifier; or,

an interface determining unit 412, configured to obtain interface information of the data providing terminal, and determine a scene identifier corresponding to the interface information as a target scene identifier corresponding to the target text data.

Optionally, the first key information includes a plurality of first keywords; the data extraction module 42 is specifically configured to:

Optionally, the second key information includes a plurality of second key words; the violation determination module 43 includes:

an attribute obtaining unit 431, configured to obtain attribute information of the target object identified by the target object identifier from a data store, where the data store is configured to store attribute information of at least one object;

a word extracting unit 432, configured to extract a plurality of second keywords belonging to the same category as the plurality of first keywords from the attribute information of the target object, where one first keyword corresponds to one second keyword.

Optionally, the violation determining module 43 includes:

an information matching unit 433, configured to determine whether the first key information matches the second key information;

a result determining unit 434, configured to determine that the violation result of the target text data is not violation if the first key information matches the second key information;

the result determining unit 434 is further configured to determine that the violation result of the target text data is a violation if the first key information does not match the second key information.

Optionally, the data detecting device 40 further includes:

a result output module 44, configured to, if the violation result of the target text data is a violation, obtain a violation keyword included in the target text data, where the violation keyword belongs to a keyword included in the first key information;

the result output module 44 is configured to obtain an identifier of a data providing terminal corresponding to the target text data, and determine violation information of a target user associated with the data providing terminal, where the violation information of the target user includes at least one of the identifier of the target user, historical violation data of the target user, and a violation level of the target user, and the identifier of the data providing terminal is used to identify the data providing terminal;

the result output module 44 is configured to output the violation keywords included in the target text data and the violation information of the target user.

Optionally, the data obtaining module 41 is specifically configured to:

It should be understood that the data detection apparatus shown in fig. 4 may correspondingly execute any method embodiment, and the above operations or functions of each unit/module in the data detection apparatus are respectively for implementing corresponding operations in any method embodiment, and are not described herein again for brevity.

Referring to fig. 5, fig. 5 is a schematic structural diagram of a computer device according to an embodiment of the present disclosure. As shown in fig. 5, the computer device 50 may include a processor 501, a network interface 504, and a memory 505, and may further include a user interface 503 and at least one communication bus 502. The communication bus 502 is used to enable connection and communication between these components. The user interface 503 may include a display screen (Display) and a keyboard (Keyboard), and optionally the user interface 503 may also include a standard wired interface and a standard wireless interface. The network interface 504 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface). The memory 505 may be a high-speed RAM memory or a non-volatile memory, such as at least one magnetic disk memory. The memory 505 may alternatively be at least one memory device located remotely from the processor 501. As shown in fig. 5, the memory 505, which is a kind of computer-readable storage medium, may include an operating system, a network communication module, a user interface module, and a device control application program.

In the computer device 50 shown in fig. 5, the network interface 504 may provide network communication functions; while the user interface 503 is primarily an interface for providing input to a user; and processor 501 may be used to invoke a device control application stored in memory 505 to implement:

It should be understood that the computer device 50 described in this embodiment may perform the data detection method described in the embodiments corresponding to fig. 2 and fig. 3, and may also implement the data detection apparatus described in the embodiment corresponding to fig. 4, which is not described herein again. In addition, the beneficial effects of the same method are not described again.

Embodiments of the present application also provide a computer-readable storage medium storing a computer program. The computer program comprises program instructions which, when executed by a computer, cause the computer to perform the method according to the foregoing embodiments. The computer may be a part of the above-mentioned computer device, such as the processor 501 described above. By way of example, the program instructions may be executed on one computer device, or on multiple computer devices located at one site, or on multiple computer devices distributed across multiple sites and interconnected by a communication network, which may comprise a blockchain network.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.

The above disclosure merely illustrates preferred embodiments of the present application and is not to be construed as limiting its scope. Equivalent variations and modifications made to the present application therefore still fall within the scope of the present application.

Claims

1. A method for data detection, comprising:

and acquiring second key information corresponding to the target object identifier, and determining the violation result of the target text data based on the first key information and the second key information.

2. The method of claim 1, wherein the determining the target scene identifier corresponding to the target text data comprises:

acquiring interface information of a data providing terminal, and determining a scene identifier corresponding to the interface information as the target scene identifier corresponding to the target text data.

3. The method of claim 1, wherein the first key information comprises a plurality of first keywords;

4. The method of claim 3, wherein the second key information comprises a plurality of second keywords;

5. The method according to claim 1 or 4, wherein the determining the violation result of the target text data based on the first key information and the second key information comprises:

and if the first key information does not match the second key information, determining that the violation result of the target text data is a violation.

6. The method of claim 5, further comprising:

7. The method of claim 1, wherein the obtaining target text data comprises:

acquiring initial data from a data providing terminal, wherein the initial data comprises at least one of initial voice data or initial text data;

8. A data detection apparatus, comprising:

9. A computer device, comprising: a processor, a memory, and a network interface;

the processor is coupled to the memory and the network interface, wherein the network interface is configured to provide data communication functionality, the memory is configured to store program code, and the processor is configured to invoke the program code to cause the computer device to perform the method of any of claims 1-7.

10. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program adapted to be loaded and executed by a processor to cause a computer device having the processor to perform the method of any of claims 1-7.