CN116738198A - Information identification method, device, equipment, medium and product - Google Patents

Information identification method, device, equipment, medium and product

Info

Publication number
CN116738198A
Authority
CN
China
Prior art keywords
text
tag
information
training
data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310796666.1A
Other languages
Chinese (zh)
Inventor
王宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC filed Critical Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202310796666.1A priority Critical patent/CN116738198A/en
Publication of CN116738198A publication Critical patent/CN116738198A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211Selection of the most significant subset of features
    • G06F18/2113Selection of the most significant subset of features by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/0895Weakly supervised learning, e.g. semi-supervised or self-supervised learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/096Transfer learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/02Banking, e.g. interest calculation or account maintenance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03Credit; Loans; Processing thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/06Asset management; Financial planning or analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Economics (AREA)
  • Computing Systems (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Operations Research (AREA)
  • Human Resources & Organizations (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Machine Translation (AREA)

Abstract

The present disclosure provides an information identification method, apparatus, device, medium, and product, relates to the technical field of artificial intelligence, and can be applied to the field of financial technology. The information identification method includes: acquiring text data to be identified; preprocessing the text data to be identified to obtain a first text feature; and inputting the first text feature into a trained recognition model to recognize target information included in the text data to be identified, wherein the recognition model is obtained by training a spatial transformation network model in a heterogeneous transfer learning manner based on domain adaptation, and the recognition labels of the recognition model are configured according to business attribute information. The information identification method, apparatus, device, medium, and product can accurately identify the target information in the text data to be identified.

Description

Information identification method, device, equipment, medium and product
Technical Field
The present disclosure relates to the technical field of artificial intelligence and can be applied to the field of financial technology, and in particular relates to an information identification method, apparatus, device, medium, and product.
Background
With the development of deep learning, natural language processing has entered a stage of vigorous growth, and various natural language processing techniques are now used in production and daily work to greatly reduce labor costs. With the rapid development of financial services, many derived services have emerged, such as Voice of the Customer analysis, which uses natural language processing to recognize customer appeals so that banks can serve customers according to the recognition results.
However, existing Voice-of-the-Customer systems recognize the text data of customer incoming calls with low accuracy, which hinders an institution's accurate judgment of customer demands, degrades the accuracy of the services the institution provides, and reduces customer experience.
Disclosure of Invention
Accordingly, a primary purpose of the present disclosure is to provide an information identification method, apparatus, device, medium, and product that at least partially solve the technical problem that existing Voice-of-the-Customer systems recognize the text data of customer incoming calls with low accuracy.
To achieve the above object, a first aspect of the embodiments of the present disclosure provides an information identification method, including: acquiring text data to be identified; preprocessing the text data to be identified to obtain a first text feature; and inputting the first text feature into a trained recognition model to recognize target information included in the text data to be identified, wherein the recognition model is obtained by training a spatial transformation network model in a heterogeneous transfer learning manner based on domain adaptation, and the recognition labels of the recognition model are configured according to business attribute information.
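The three claimed steps can be sketched as a minimal pipeline; `preprocess` and `model` below are caller-supplied placeholders for the patent's preprocessing procedure and trained spatial transformation network, not implementations of them:

```python
def identify(text: str, preprocess, model):
    """Acquire text, derive the first text feature, run the trained model.

    `preprocess` and `model` are illustrative stand-ins for the
    components described in the claims.
    """
    first_feature = preprocess(text)   # preprocessing step
    return model(first_feature)        # recognition step
```

Any concrete preprocessing function and trained model with matching interfaces can be dropped into this skeleton.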
According to an embodiment of the present disclosure, the information identification method further includes: sequentially performing dimension reduction and dimension increase on the first text feature to filter invalid information in the first text feature, thereby obtaining a second text feature; and inputting the second text feature into the trained recognition model to recognize the target information included in the text data to be identified.
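The reduce-then-raise filtering step can be illustrated with a PCA round trip, which plays the role of an encoder/decoder pair here: projecting to k dimensions and back discards low-variance components, which this sketch treats as the "invalid information". This is an illustrative stand-in under that assumption, not the patent's exact network:

```python
import numpy as np

def filter_features(first_features: np.ndarray, k: int) -> np.ndarray:
    """Reduce features to k dimensions, then raise them back to the
    original dimension, discarding low-variance components on the way."""
    mean = first_features.mean(axis=0)
    centered = first_features - mean
    # SVD yields principal directions; keep only the top k of them
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:k]                       # (k, d) projection basis
    reduced = centered @ components.T         # dimension reduction: (n, k)
    restored = reduced @ components + mean    # dimension increase: (n, d)
    return restored
```

The output keeps the original shape, so the downstream model's interface is unchanged; only the filtered content differs.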
According to an embodiment of the present disclosure, preprocessing the text data to be identified to obtain a first text feature includes: determining the length of each text in the text data; comparing the length of each text with a preset length threshold to identify the texts whose length exceeds the threshold; and truncating each such text to obtain the first text feature of a preset dimension.
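A minimal sketch of the truncation step; the threshold value of 128 characters is illustrative, not taken from the patent:

```python
MAX_LEN = 128  # preset length threshold; value is illustrative

def truncate_texts(texts):
    """Clip any text longer than MAX_LEN so every sample has a bounded
    length before feature extraction; shorter texts pass through."""
    return [t[:MAX_LEN] if len(t) > MAX_LEN else t for t in texts]
```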
According to an embodiment of the present disclosure, the information identification method further includes training the recognition model, which includes: acquiring historical text data; determining a training data set and a test data set based on the historical text data; labeling the training data set according to the recognition labels, and labeling part of the test data set according to the recognition labels; and, taking the statistical distance between the source domain where the training data set is located and the target domain where the test data set is located as a loss function, inputting the labeled training data set and the labeled part of the test data into the spatial transformation network model for heterogeneous transfer learning to obtain the trained recognition model.
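The "statistical distance as loss function" idea can be sketched with a linear-kernel Maximum Mean Discrepancy (MMD) between source-domain and target-domain feature means, added to an ordinary supervised loss. MMD is one common choice of statistical distance in domain adaptation; the patent does not name a specific one, so the choice here is an assumption:

```python
import numpy as np

def mmd_linear(source: np.ndarray, target: np.ndarray) -> float:
    """Linear-kernel MMD: squared distance between the mean feature
    embeddings of the source and target domains. Zero when the two
    domains have identical feature means."""
    delta = source.mean(axis=0) - target.mean(axis=0)
    return float(delta @ delta)

def total_loss(class_loss: float, source_feats: np.ndarray,
               target_feats: np.ndarray, lam: float = 1.0) -> float:
    # supervised loss on labeled data + lam * domain-adaptation penalty
    return class_loss + lam * mmd_linear(source_feats, target_feats)
```

Minimizing `total_loss` pushes the model both to classify labeled samples correctly and to align the source and target feature distributions.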
According to an embodiment of the present disclosure, the determining a training data set and a test data set based on the historical text data includes: downsampling a part of data in the historical text data to obtain a training data set; and randomly sampling another part of data in the historical text data to obtain a test data set.
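The split can be sketched as follows: per-class downsampling balances the training set, and the test set is drawn at random from the remaining records. Function names and sizes are illustrative, not from the patent:

```python
import random
from collections import defaultdict

def build_datasets(records, train_per_class, test_size, seed=0):
    """records: (text, label) pairs. Downsample each label to at most
    train_per_class items for the training set (class balance), then
    randomly sample test_size items from the remainder as the test set."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for rec in records:
        by_label[rec[1]].append(rec)
    train = []
    for label, items in by_label.items():
        rng.shuffle(items)
        train.extend(items[:train_per_class])  # downsampling per class
    remainder = [r for r in records if r not in train]
    test = rng.sample(remainder, min(test_size, len(remainder)))
    return train, test
```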
According to an embodiment of the present disclosure, before inputting the labeled training data set and the labeled part of the test data into the spatial transformation network model for heterogeneous transfer learning, the method further includes: mapping the training data set and the test data set to the same dimension.
According to an embodiment of the present disclosure, the training data set and the test data set are mapped to a reproducing kernel Hilbert space such that the training data set and the test data set have the same dimension.
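One simple way to land data sets of different original dimensions in a common space is an empirical kernel map: represent each sample by its RBF-kernel similarities to a fixed number of landmark points from its own domain, so both domains end up with dimension equal to the landmark count. This is a simplified sketch of the idea, not the patent's exact reproducing-kernel-Hilbert-space construction:

```python
import numpy as np

def rbf_landmark_map(data: np.ndarray, landmarks: np.ndarray,
                     gamma: float = 1.0) -> np.ndarray:
    """Map samples to their RBF similarities against `landmarks`.

    Output dimension equals len(landmarks) regardless of the input
    dimension, so source and target domains become comparable."""
    sq_dists = ((data[:, None, :] - landmarks[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)
```

For example, a source set of dimension 5 and a target set of dimension 8, each mapped against 10 of its own landmarks, both yield matrices with 10 columns, so a statistical distance between them is well defined.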
According to an embodiment of the present disclosure, the recognition labels include: an accounting query label, an account opening and branch information query label, an account and debit card label, a transfer and remittance label, a personal credit label, a deposit label, a wealth management label, a fund label, a precious metal label, a personal mobile banking label, a personal online banking label, a self-service terminal label, a messenger label, and a comprehensive label, wherein the comprehensive label represents business attribute information other than accounting query, account opening and branch information query, account and debit card, transfer and remittance, personal credit, deposit, wealth management, fund, precious metal, personal mobile banking, personal online banking, self-service terminal, and messenger.
A second aspect of the embodiments of the present disclosure provides an information identification apparatus, including: an acquisition module for acquiring text data to be identified; a preprocessing module for preprocessing the text data to be identified to obtain a first text feature; and an identification module for inputting the first text feature into a trained recognition model to identify target information included in the text data to be identified, wherein the recognition model is obtained by training a spatial transformation network model in a heterogeneous transfer learning manner based on domain adaptation, and the recognition labels of the recognition model are configured according to business attribute information.
A third aspect of an embodiment of the present disclosure provides an electronic device, including: one or more processors; and a storage means for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform a method of identifying information according to the above.
A fourth aspect of the disclosed embodiments provides a computer-readable storage medium having stored thereon executable instructions that, when executed by a processor, cause the processor to perform a method of identifying information according to the above.
A fifth aspect of the disclosed embodiments provides a computer program product comprising a computer program which, when executed by a processor, implements a method of identifying information according to the above.
The information identification method, the device, the equipment, the medium and the product provided by the embodiment of the disclosure have at least the following beneficial effects:
Because the recognition labels are configured according to business attribute information, and the text data to be identified is recognized with a recognition model obtained by training a spatial transformation network model through domain-adaptation-based heterogeneous transfer learning, the recognition model is better suited to information identification in a specific business scenario, improving the accuracy of target information identification.
Because the first text feature undergoes dimension reduction followed by dimension increase, invalid information in the first text feature can be filtered out. On the one hand, this reduces the influence of invalid data on target information identification and improves recognition accuracy; on the other hand, it reduces the amount of data involved in the recognition process and improves recognition efficiency.
Because overlong texts in the first text feature are truncated according to a text length threshold, the amount of data to be computed is reduced while retaining as much valid data as possible, further improving recognition efficiency.
In the training process of the recognition model, the entire training data set and part of the test data set are labeled with the recognition labels, so that labeled data from both the source domain and the target domain serve as training data, and the statistical distance between the source domain (where the training data set is located) and the target domain (where the test data set is located) is used as the loss function. This better reduces the discrepancy between the source domain and the target domain, realizes domain adaptation, avoids model failure caused by the source domain and the target domain not sharing all features during training, and improves the accuracy of target information identification.
When determining the training data set and the test data set, downsampling part of the historical text data balances the amount of training data acquired for each category, avoiding the low recognition accuracy that results when model training and its outputs are skewed toward certain dominant categories. Because another part of the historical text data is randomly sampled, the acquired test data covers the text data related to the current business as fully as possible, making the trained recognition model more comprehensive, widening its application range, and further improving the accuracy of target information identification.
Because the training data set and the test data set are mapped to the same dimension, model failure caused by the source domain and the target domain not sharing all features during training can be avoided, further improving the accuracy of target information identification.
By defining the specific types of the recognition labels, the identification method is better suited to the field of financial services.
Drawings
To more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the drawings required for the embodiments or the description of the prior art are briefly introduced below. The drawings in the following description show only some embodiments of the present disclosure; those of ordinary skill in the art may obtain other drawings from them without inventive effort.
FIG. 1 schematically illustrates a system architecture 100 of an information identification method and apparatus according to an embodiment of the present disclosure;
FIG. 2 schematically illustrates a flow chart of a method of information identification according to an embodiment of the present disclosure;
FIG. 3 schematically illustrates a flow chart of a method of information identification according to another embodiment of the present disclosure;
FIG. 4 schematically illustrates a flowchart of preprocessing the text data to be recognized in operation S202 shown in FIG. 2, according to an embodiment of the present disclosure;
FIG. 5 schematically illustrates a flow chart of training a recognition model according to an embodiment of the present disclosure;
FIG. 6 schematically illustrates a flowchart of determining a training data set and a test data set based on historical text data in operation S502 illustrated in FIG. 5, according to an embodiment of the present disclosure;
FIG. 7A schematically illustrates graphs of AUC's for respective training and test data sets, according to an embodiment of the disclosure;
FIG. 7B schematically illustrates a graph of a loss function according to an embodiment of the disclosure;
FIG. 8 schematically illustrates a confusion matrix of the STN model's results on the labeled test set for a validation set at epoch 40, according to an embodiment of the present disclosure;
FIG. 9 schematically illustrates a diagram of labeling results for the test set data according to an embodiment of the present disclosure;
FIG. 10 schematically illustrates a block diagram of an information recognition device according to an embodiment of the present disclosure;
fig. 11 schematically illustrates a block diagram of an information recognition apparatus according to another embodiment of the present disclosure;
fig. 12 schematically illustrates a block diagram of an information identifying apparatus according to still another embodiment of the present disclosure;
Fig. 13 schematically illustrates a block diagram of an electronic device adapted to implement the above-described method according to an embodiment of the present disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is only exemplary and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the present disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. In addition, in the following description, descriptions of well-known structures and techniques are omitted so as not to unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and/or the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It should be noted that the terms used herein should be construed to have meanings consistent with the context of the present specification and should not be construed in an idealized or overly formal manner.
Where an expression like "at least one of A, B and C" is used, it should generally be interpreted according to its ordinary meaning as understood by those skilled in the art (e.g., "a system having at least one of A, B and C" includes, but is not limited to, systems having A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B and C together). Where an expression like "at least one of A, B or C" is used, it should likewise be interpreted according to its ordinary meaning as understood by those skilled in the art (e.g., "a system having at least one of A, B or C" includes, but is not limited to, systems having A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B and C together).
Some block diagrams and/or flowcharts are shown in the figures. It will be understood that some blocks of the block diagrams and/or flowcharts, or combinations of blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus, such that the instructions, when executed by the processor, create means for implementing the functions/acts specified in the block diagrams and/or flowcharts. The techniques of this disclosure may be implemented in hardware and/or software (including firmware, microcode, etc.). Additionally, the techniques of this disclosure may take the form of a computer program product on a computer-readable storage medium having instructions stored thereon, the computer program product being for use by or in connection with an instruction execution system.
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, disclosure, and application of the information involved all comply with the provisions of relevant laws and regulations, necessary security measures are taken, and public order and good morals are not violated.
In the technical solution of the present disclosure, if a user's personal information needs to be acquired, the user's authorization or consent is obtained before the personal information is collected or used.
In view of the technical problems in the related art, an embodiment of the present disclosure provides an information identification method, including: acquiring text data to be identified; preprocessing the text data to be identified to obtain a first text feature; and inputting the first text feature into a trained recognition model to recognize target information included in the text data to be identified, wherein the recognition model is obtained by training a spatial transformation network model in a heterogeneous transfer learning manner based on domain adaptation, and the recognition labels of the recognition model are configured according to business attribute information.
Fig. 1 schematically illustrates a system architecture 100 of an information identification method and apparatus according to an embodiment of the present disclosure. It should be noted that fig. 1 is only an example of a system architecture to which embodiments of the present disclosure may be applied to assist those skilled in the art in understanding the technical content of the present disclosure, but does not mean that embodiments of the present disclosure may not be used in other devices, systems, environments, or scenarios.
As shown in fig. 1, a system architecture 100 according to this embodiment may include a communication device 101, a storage device 102, a network 103, and a server 104. The network 103 provides a communication link among the communication device 101, the storage device 102, and the server 104.
The communication device 101 may be, for example, an electronic device with a display or touch screen, and may include, for example, a landline, a cell phone, a computer, etc., and the communication device 101 may be configured to receive incoming voice data from a user and convert the voice data into text data.
The storage device 102 may be a hardware storage device or a software storage device, which is not limited in this disclosure. The storage device 102 is used for storing the incoming-call voice data of the user received by the communication device 101 and the text data converted from the incoming-call voice data.
The network 103 may include various connection types, such as wired links, wireless communication links, or fiber optic cables. A wired connection may use, for example, a standard wired interface; a wireless connection may use, for example, any of several wireless technology standards such as Bluetooth, Wi-Fi, infrared, ZigBee, etc.
The server 104 receives, through the network 103, the text data to be identified currently acquired by the communication device 101, preprocesses the text data to obtain a first text feature, inputs the first text feature into a trained recognition model stored on the server 104, and recognizes the target information included in the text data. When training the recognition model, the server 104 acquires historical text data from the storage device 102 through the network 103, determines a training data set and a test data set based on the historical text data, labels the training data set according to the recognition labels, labels part of the test data of the test data set according to the recognition labels, takes the statistical distance between the source domain where the training data set is located and the target domain where the test data set is located as the loss function, and inputs the labeled training data set and the labeled part of the test data into the spatial transformation network model for heterogeneous transfer learning to obtain the trained recognition model.
It should be noted that the information identification method provided by the embodiments of the present disclosure may be executed by the server 104. Accordingly, the information identification apparatus provided by the embodiments of the present disclosure may be disposed in the server 104. Alternatively, the information identification method may be performed by a server or server cluster that is different from the server 104 and is capable of communicating with the communication device 101 and/or the storage device 102 and/or the server 104; accordingly, the information identification apparatus may be disposed in such a server or server cluster. Alternatively, the information identification method may be partially executed by the server 104, partially by the communication device 101, and partially by the storage device 102; correspondingly, the information identification apparatus may be partially disposed in the server 104, partially in the communication device 101, and partially in the storage device 102.
It should be understood that the number of communication devices, storage devices, networks, and servers in fig. 1 are merely illustrative. There may be any number of communication devices, storage devices, networks, and servers, as desired for implementation.
The information identification method provided by the embodiments of the present disclosure can be applied to the field of financial technology. For example, it is important for banks to provide business support for further mining value customers at different levels, to accommodate more premium customers, and to provide accurate service to customers. Banks generally maintain a dedicated customer complaint unit to handle customer calls accurately, but because the customer base is large and the unit's staff is limited, technologies that process customer incoming-call voice data or text data with chat robots (e.g., ChatGPT-style generative pre-trained transformers) have been increasingly adopted to reduce costs and improve the efficiency of customer service consultation. In this scenario, accurately recognizing the target information in incoming-call voice data or text data is the key to providing point-to-point accurate service to each customer. If the target information can be accurately identified, the request can be quickly routed to the relevant department, such as a retail or business department, or even refined to a specific customer manager, so that the bank can serve the calling customer one-to-one. The information identification method provided by the embodiments of the present disclosure can accurately identify the target information in incoming-call voice data or text data.
It should be understood that the information identification method provided by the embodiments of the present disclosure is not limited to application in the field of financial technology, but may be applied to any field other than the financial field. The above description is merely exemplary, and the information recognition method of the embodiments of the present disclosure may be applied to any field related to recognition of target information in incoming call voice data or text data, for example, other technical fields such as electronic commerce, logistics, etc.
Based on the scenario described in fig. 1, the information recognition method of the embodiment of the present disclosure will be described in detail below with reference to fig. 2 to 9.
Fig. 2 schematically illustrates a flowchart of an information identification method according to an embodiment of the present disclosure.
As shown in fig. 2, the information identifying method may include, for example, operations S201 to S203.
In operation S201, text data to be recognized is acquired.
In the embodiment of the disclosure, the user may feed back voice data containing an appeal by telephone, or manually input text data containing an appeal through mobile banking or internet banking. In order to facilitate recognition of the target information using natural language processing technology, the server may convert the voice data into text data.
In operation S202, text data to be recognized is preprocessed to obtain first text features.
In the embodiment of the disclosure, the text lengths of different text data to be recognized are not the same. Considering that, constrained by computing power, a large-scale recognition model cannot be used to recognize the text data, the length of the text data to be recognized can be handled by way of preprocessing.
In operation S203, the first text feature is input into the trained recognition model, and target information included in the text data to be recognized is recognized.
In the embodiment of the disclosure, the identification model is obtained by training the spatial transformation network model in a heterogeneous migration learning mode based on domain adaptation, and the identification label of the identification model is configured according to service attribute information.
It will be appreciated that different text data may contain different target information. For example, if the text data is "I want to query my account-opening branch", the target information may be "account-opening branch query"; if the text data is "I want a loan", the target information may be "personal credit". It follows that, in order to accurately recognize various types of text data, the recognition model is required to have high suitability.
According to the embodiment of the disclosure, the identification tag is configured according to the attribute information of the service, and the text data to be identified is identified by combining the identification model which is obtained by training the space transformation network model based on domain adaptation and adopting the heterogeneous transfer learning mode, so that the identification model can be more suitable for information identification under a specific service scene, and the accuracy of target information identification is improved.
Fig. 3 schematically illustrates a flow chart of an information identification method according to another embodiment of the present disclosure.
As shown in fig. 3, the information identifying method may further include operations S301 to S303, for example.
In operation S301, the first text feature is sequentially subjected to dimension reduction and dimension increase to filter invalid information in the first text feature, so as to obtain a second text feature.
In operation S302, the second text feature is input into a trained recognition model, and target information included in the text data to be recognized is recognized.
For example, the voice data currently received by the server side may include some environmental noise (such as the sound of vehicles driving on a road, wind, or rain); the presence of the environmental noise increases the calculation amount of subsequent information recognition, reduces the recognition efficiency, and may affect the accuracy of target information recognition. As another example, a user's expression in voice or text data may not be concise, containing repeated sentences or information with little relevance to the final appeal; such information does not contribute positively to recognizing the target information but increases the calculation amount and reduces the recognition efficiency.
According to the embodiment of the disclosure, by adopting the processing mode of performing dimension reduction and then dimension increase on the text features, invalid information in the first text feature can be filtered out. On one hand, this reduces the influence of invalid data on target information identification in the identification process and improves the accuracy of target information identification; on the other hand, it reduces the data volume involved in the identification process and improves the identification efficiency.
Fig. 4 schematically illustrates a flowchart of preprocessing text data to be recognized in operation S202 illustrated in fig. 1 according to an embodiment of the present disclosure.
As shown in fig. 4, preprocessing text data to be recognized in operation S202 may include operations S401 to S403.
In operation S401, a length of each text in the text data is determined.
In operation S402, the length of each text is compared with a preset length threshold value, and texts having lengths greater than the preset length threshold value are determined.
In operation S403, a portion is cut from the text with a length greater than a preset length threshold, to obtain a first text feature with a preset dimension.
In the embodiments of the present disclosure, the length of text data may be understood as the number of characters contained in the text data. The preset length threshold may be determined according to the computing power of the recognition model; for example, the preset length threshold may be 10 characters, 15 characters, etc. It should be appreciated that the foregoing preset length threshold values are merely exemplary and are not intended to limit the present disclosure.
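As a minimal sketch, the truncation of operations S401 to S403 can be illustrated as follows; the character-count threshold of 15 is a hypothetical example value, since the disclosure leaves the exact threshold to the recognition model's computing power:

```python
def truncate_texts(texts, max_len=15):
    """Truncate texts longer than a preset character threshold.

    Sketch of operations S401-S403; max_len=15 is an illustrative
    threshold, not a value fixed by the disclosure.
    """
    result = []
    for text in texts:
        length = len(text)        # S401: determine the length of each text
        if length > max_len:      # S402: compare with the preset threshold
            text = text[:max_len] # S403: cut a portion from overly long text
        result.append(text)
    return result
```

Texts at or below the threshold pass through unchanged, so only the overly long tail is discarded.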
According to the embodiment of the disclosure, as the text with the overlong length in the first text feature is intercepted by setting the text length threshold, the calculated data volume is reduced on the premise of ensuring that effective data is intercepted as much as possible, and the recognition efficiency is further improved.
Further, on the basis of the above embodiment, the information identifying method may further include training an identifying model.
FIG. 5 schematically illustrates a flow chart for training a recognition model according to an embodiment of the present disclosure.
As shown in fig. 5, the training recognition model may include operations S501 to S504, for example.
In operation S501, history text data is acquired.
In operation S502, a training data set and a test data set are determined based on historical text data.
In operation S503, the training data set is labeled according to the identification tag, and a part of the test data set is labeled according to the identification tag.
In operation S504, the statistical distance between the source domain where the training data set is located and the target domain where the test data set is located is used as a loss function, and the labeled training data set and the labeled part of the test data are input into the spatial transformation network model to perform heterogeneous migration learning, so as to obtain a trained recognition model.
In an embodiment of the present disclosure, obtaining the historical text data may include: historical work order data is obtained, and historical text data is extracted from the historical work order data.
In the embodiment of the disclosure, a spatial transformation network model (Spatial Transformer Networks, STN) can be constructed based on artificial intelligence techniques such as BERT (Bidirectional Encoder Representations from Transformers) model transfer learning and domain adaptation, and the service types are split again according to the potential value types of incoming customers. They can be divided into 14 categories, such as account inquiry, account opening and website information inquiry, account and debit card, transfer and remittance, personal credit, deposit, financial management, fund, precious metal, personal mobile banking, personal online banking, self-service machines, and messenger, so as to realize the automatic identification of each work order according to service type and provide service support for further mining value customers at different levels subsequently. This will be described in detail later.
BERT is a deep bidirectional Transformer pre-training model that can be trained as unsupervised learning using only a plain text corpus. Like the Transformer, BERT is a deep network model that connects context, but BERT uses only the Encoder part, stacking multiple Encoders together to form its basic network structure. The Transformer is a deep learning neural network model for translation that retains both Encoder and Decoder modules: the Encoder is divided into two sub-layers, namely a self-attention layer and a feed-forward neural network layer, while the Decoder adds an encoder-decoder attention layer after the self-attention layer on the basis of the Encoder.
Compared with the Transformer, BERT changes the input: segment vector features (segment embeddings) are added on the basis of the Transformer's position and word vectors. As mentioned for the Transformer, position information can either be trained or obtained directly using a function, whereas the position information in BERT is obtained directly by training. The segment vector serves to distinguish two texts, as in judging whether two texts are similar; it has only two elements, 0 and 1, setting the word vector part belonging to the first sentence to 0 and the part belonging to the second to 1. Finally, the three embeddings are added to obtain the BERT input.
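The input construction described above (word, position, and segment embeddings summed element-wise) can be sketched as follows. The embedding tables here are randomly initialised toy stand-ins for BERT's learned tables, and the hidden size of 4 is purely illustrative (BERT-base uses 768):

```python
import random

def bert_input_embeddings(token_ids, segment_ids, vocab_size=100,
                          max_pos=16, dim=4, seed=0):
    """Return per-token input vectors: token + position + segment embedding.

    Toy illustration of the BERT input described above; the tables are
    random here, whereas in BERT all three (positions included) are learned.
    """
    rng = random.Random(seed)

    def table(rows):
        return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(rows)]

    tok_tab, pos_tab, seg_tab = table(vocab_size), table(max_pos), table(2)
    out = []
    for pos, (tok, seg) in enumerate(zip(token_ids, segment_ids)):
        # element-wise sum of the three embedding vectors
        out.append([tok_tab[tok][d] + pos_tab[pos][d] + seg_tab[seg][d]
                    for d in range(dim)])
    return out
```

Each output row is one token's input vector; segment id 0 marks the first sentence and 1 the second.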
During training, BERT performs two stages. The first stage is Pre-Training, which is divided into two steps. The first step is masking: a proportion of the vocabulary is masked, and the model predicts the masked words bidirectionally from the context. The second step is next-sentence prediction: tens of thousands of sentence pairs are chosen from the training corpus, some consecutive and some not, and the model's task is a binary classification to determine which sentence pairs are consecutive and which are not. The second stage is Fine-Tuning, which can adapt different neural networks according to different tasks, such as classification models, prediction models, and even unsupervised learning such as clustering.
From the architecture of the model, the advantage over the Transformer is that BERT overcomes the unidirectional limitation: BERT fuses the context on the left and right sides, thus effectively pre-training the Encoder of a bidirectional Transformer. In practical use, BERT exhibits better performance on many data sets than other models such as the Transformer and Bi-LSTM. In the model of the embodiment of the disclosure, the pre-trained model is reused and its parameters are retrained to adapt to the text; the final architecture is multiple fully connected layers and a Decoder layer.
Further, in carrying out the present disclosure, the applicant found that in supervised learning, a model obtained from a training set may not perform well on a test set due to the difference in data distribution between the training set and the test set. For example, in an image classification model, if the training set contains daytime images, the RGB values of the images will be higher because of the lighting, and if the test set contains evening images, the RGB values will be lower, so that many algorithms fail as a result of the lighting. The same situation also occurs in recurrent neural network tasks such as translation models, where differences in language types and usage produce larger generalization errors; the training of the recognition model provided by the embodiment of the disclosure therefore faces a corresponding problem.
Thus, domain adaptation (Domain Adaptation, DA) is introduced in the training of the recognition model. Domain adaptation can be understood as a method for reducing generalization error; its core idea is to reduce, through transformation, the domain distance (discrepancy) between the domain where the training set is located (the source domain) and the domain where the test set is located (the target domain), so that the trained model adapts better to the data of the test set.
Methods for achieving domain adaptation are generally classified into two types: the more common and intuitive one takes reducing the domain distance as an optimization target in the training process, while the other uses adversarial learning to generate data and train toward that target. The domain distance is the statistical distance between the domain where the test set is located and the domain where the training set is located; in domain adaptation it is measured using the maximum mean discrepancy (Maximum Mean Discrepancy, MMD), which can be defined as formula (1):

MMD[F, p, q] = sup_{f in F} ( E_p[f(x)] - E_q[f(y)] )    (1)

where f(·) is a continuous function over the sample space, E_p[f(x)] is the mean of the function values of f(·) over the sample data in the test set, and E_q[f(y)] is the mean of the function values of f(·) over the sample data in the training set.
Because of the arbitrary nature of the continuous function f(·), formula (1) is difficult to calculate directly. Therefore, before inputting the labeled training data set and the labeled part of the test data into the spatial transformation network model for heterogeneous transfer learning in operation S504, the method further includes: mapping the training data and the test data to the same dimension.
Preferably, the training data set and the test data set may be mapped to a reproducing kernel Hilbert space (Reproducing Kernel Hilbert Space, RKHS) so that the training data set and the test data set are in the same dimension. Specifically, the characteristic of the RKHS is used, namely that any function in the space evaluated at a point can be expressed as an inner product with a kernel function. In this way, the MMD may be calculated by equation (2):

MMD[F, p, q]^2 = || E_p[phi(x)] - E_q[phi(y)] ||_H^2    (2)
where phi(·) is the feature map induced by the Gaussian kernel (RBF kernel), namely formula (3):

k(x, y) = exp( -||x - y||^2 / (2*sigma^2) )    (3)
it should be noted that there are two reasons for using the gaussian kernel here, one is that it can compress the norms of the function set (()) to (0, 1) by the range of the exponential function over the positive interval (0, 1), providing for the last step of equation (2), and the other is equivalent to the fourier transform over an infeasible set, and it maps the data to an orthogonal space, making domain adaptation a significant advantage over classification problems.
Therefore, the MMD can actually be calculated as ||mu_p - mu_q||_H, where mu_p and mu_q are the kernel mean embeddings of the two distributions. Considering that equation (2) maps the data into the RKHS, in order to perform this mapping in the actual calculation while ensuring the correctness of the kernel function, the square of this distance is used, namely formula (4):

MMD[F, p, q]^2 = tr(KL)    (4)

where K is the kernel matrix obtained from the kernel function k(·,·) over the combined samples of the two domains, L is the piecewise-constant coefficient matrix, and tr is the trace of the matrix. The entries of L are given by formula (5):

L_ij = 1/m^2 if samples i and j both come from the first domain; 1/n^2 if both come from the second domain; -1/(m*n) otherwise    (5)

where m and n are the numbers of samples in the two domains.
thus, the difference between the two domains can be calculated by equation (4), and the statistical distance between the two domains can be reduced by shortening the obtained MMD.
For the use of MMD, it is placed in the bottleneck layer (Bottleneck) of the fully connected layers (Fully Connected Layers), which is obtained by fine tuning (Fine Tuning); there, the domain distance of the extracted features is measured so as to reach the optimal solution. Numerically, the calculated MMD value is taken as part of the loss function and, after introducing a hyper-parameter, linearly combined with the original loss function value, thereby reducing the domain distance and improving generalization ability.
In embodiments of the present disclosure, the data set currently used, the Voice of the Customer, is not exactly the same as the test set, which particularly highlights the necessity of transfer learning. In addition, different dialects may give rise to different word vectors; for example, the same dialect expression may contain 3 words or 4 words and, due to different pronunciations, produce different words in the converted text. The dialects can therefore be handled through heterogeneous transfer learning (heterogeneous domain adaptation), which addresses the problem that not all source domains have the same characteristics as the target domains and improves generalization capability. Therefore, the present disclosure uses the STN model, which has higher accuracy in transfer learning. In transfer learning, the existing knowledge is called the source domain and the new knowledge to be learned is called the target domain; transfer learning transfers knowledge of the source domain onto the target domain. In the embodiment of the disclosure, the source domain is the domain where the training data set is located, which differs from the test data set but has rich supervision information, and the target domain is the domain where the test data set is located, which has a small number of labels.
In an embodiment of the present disclosure, the loss function of the STN model is composed of two parts, the marginal MMD (M_m) and the conditional MMD (M_c), whose expressions are formula (6) and formula (7), respectively:

M_m = MMD[F, D_s, D_t]^2    (6)

M_c = (1/C) * sum_{k=1}^{C} MMD[F, D_s^(k), D_t^(k)]^2    (7)

where M_m is the MMD defined above, computed between the whole source domain D_s and target domain D_t. In M_c, C is the number of categories, n_s and n_s^(k) are the number of source-domain samples and the number of them under category k, n_t and n_t^(k) are the number of labeled target-domain samples and the number of them under category k, and n_u is the number of unlabeled samples in the target domain. D_t^(k) contains the labeled target samples of category k together with the unlabeled target samples whose pseudo label is k, the latter weighted by alpha, defined as formula (8):

alpha = r / R    (8)

where R is the total number of training cycles, r is the current cycle number, and the pseudo label y is generated by the model using its own parameters. The recognition model thus discards the defect that the earlier MMD is blind to categories, and by adding alpha it can gradually increase the weight of the added pseudo labels, so that a semi-supervised learning effect is approximately achieved.
Therefore, based on this recognition model, the Voice-of-the-Customer data can be used as a training set with part of the data manually labeled, and the originally unlabeled model can then be trained, thereby improving accuracy.
Further, after the recognition model obtains the feature vectors of the text data, it is necessary to analyze them. Features are screened through fully connected layers, similar to the multiple fully connected layers in the computer vision (CV) field. For example, different accents, dialects, and regional dialogue features can have different effects on the information, so this disclosure adds the domain adaptation of the STN and gives its loss function here as formula (9):
ζ_MMD = M_m + M_c    (9)
The best place to add domain adaptation is at the bottleneck (Bottleneck), meaning that the fully connected layers of the disclosed embodiments first reduce the dimension and then increase it, to filter out part of the invalid or minimally influential feature information. After this bottleneck layer, the domain adaptation layer is added; the number of layers and dimensions are then determined by fine tuning, and after domain adaptation is completed, the dimension is reduced to the probability values established previously in the embodiments of the disclosure. Finally, the category is confirmed by the probability value, and the loss function involved is defined as formula (10):
ζ_C = softmax(Z - Z_true)    (10)
where Z is the label output finally obtained in training, and Z_true is the true label.
Thus, the final loss function of the recognition model is expressed as (11):
ζ = ζ_C + λζ_MMD    (11)
wherein λ is the hyper-parameter.
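The composition of the final loss in formulas (9) and (11) can be sketched as below; the value of the hyper-parameter λ is an illustrative assumption, not one stated in the disclosure:

```python
def total_loss(cls_loss, marginal_mmd, conditional_mmd, lam=0.5):
    """Combine the losses of formulas (9) and (11).

    cls_loss is the classification loss zeta_C, and lam is the
    hyper-parameter lambda; 0.5 is a purely illustrative value.
    """
    zeta_mmd = marginal_mmd + conditional_mmd  # formula (9): M_m + M_c
    return cls_loss + lam * zeta_mmd           # formula (11)
```

Larger λ pushes the optimizer harder toward closing the domain gap, at the expense of the pure classification objective.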
According to the embodiment of the disclosure, in the training process of the identification model, the whole training data set and part of the test data set are labeled with the identification tags, the labeled part of the target domain is then used as training data together with the source domain, and the statistical distance between the source domain where the training data set is located and the target domain where the test data set is located is used as part of the loss function for training. This better reduces the difference between the source domain and the target domain and realizes domain adaptation, thereby improving the accuracy of target information identification. Because the training data set and the test data set are mapped to the same dimension, the problem of model failure caused by the fact that not all source domains and target domains have the same characteristics in the training process can also be solved, further improving the accuracy of target information identification.
Fig. 6 schematically illustrates a flowchart of determining a training data set and a test data set based on historical text data in operation S502 illustrated in fig. 5 according to an embodiment of the present disclosure.
As shown in fig. 6, determining the training data set and the test data set based on the history text data in operation S502 may include, for example, operations S601 to S602.
In operation S601, a portion of the data in the history text data is downsampled to obtain a training data set.
In operation S602, another portion of the data in the history text data is randomly sampled to obtain a test data set.
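Operations S601 and S602 can be sketched as follows, assuming the historical text data comes as (label, text) pairs; the per-class quota and the sample size in the usage are hypothetical:

```python
import random
from collections import defaultdict

def balanced_downsample(records, per_class, seed=0):
    """Down-sample so each class keeps at most `per_class` examples (S601).

    `records` are (label, text) pairs; a fixed seed keeps the sketch
    deterministic for illustration.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for label, text in records:
        by_class[label].append((label, text))
    out = []
    for label, items in by_class.items():
        rng.shuffle(items)           # pick a random subset of the class
        out.extend(items[:per_class])
    return out

def random_sample(records, k, seed=0):
    """Random sampling for the test data set (S602)."""
    return random.Random(seed).sample(records, k)
```

The balanced subset becomes the training data set, while an independent random sample forms the test data set.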
In the embodiment of the disclosure, since the provided customer incoming call voice-text data does not carry classification tags, and because of the limitation of hardware and software facilities, it is difficult to perform classification or clustering directly on this data set.
Based on this, the embodiment of the disclosure selects the Voice-of-the-Customer data as the training set of the initial pre-training model, selects work_order_biz_class_cd, work_order_biz_sub_class_cd, and cure_open_idtfy_cd as the training labels, and work_order_cure_process_program_desc as the training data set for training.
However, since the content of these original tags does not correspond, the embodiment of the present disclosure reconfigures the training tags (called identification tags at recognition time) according to the attribute information of the service, and table 6.1 gives the mapping relationship between the Voice-of-the-Customer work order categories and the customer subject tag categories:
TABLE 6.1
As shown in table 6.1, the identification tags include: an account inquiry tag, an account opening and website information inquiry tag, an account and debit card tag, a transfer and remittance tag, a personal credit tag, a deposit tag, a financial management tag, a fund tag, a precious metal tag, a personal mobile banking tag, a personal online banking tag, a self-service machine tag, a messenger tag, and a comprehensive tag, where the comprehensive tag is used for representing business attribute information other than account inquiry, account opening and website information inquiry, account and debit card, transfer and remittance, personal credit, deposit, financial management, fund, precious metal, personal mobile banking, personal online banking, self-service machines, and messenger. By defining the specific types of identification tags, the identification method can be better adapted to the field of financial services.
It should be noted that the amount of data in each category is not the same; for example, account and debit card work orders account for a very large proportion of the Voice-of-the-Customer data, which leads to sample imbalance in the data set, so that the training accuracy and the resulting data distribution tend toward this category.
For example, the Voice-of-the-Customer work order data set is very large, exceeding 13 million records, and time does not allow complete training on all of it; therefore, a downsampling operation is performed on the Voice-of-the-Customer data so that the data amount of each category is balanced. The embodiment of the present disclosure, for example, chooses 140,000 records, i.e., 10,000 records per category, for training. After the initial training set is obtained, the provided customer incoming call text data is randomly sampled, several rows of data are extracted for manual label marking, this small data set is used for testing, and several more rows of data can be extracted and manually labeled as a verification set. For example, the embodiment of the present disclosure randomly samples the roughly 2.69 million customer incoming call text records provided, extracts 510 records for manual tagging, and uses this small data set as the validation set.
According to the embodiment of the disclosure, in the process of acquiring the determined training data set and the test data set, since the data in the historical text data is subjected to downsampling, the acquired training data of each category can be balanced, so that the problem that the accuracy of target information identification is low due to the accuracy of model training and the tendency of data distribution of results to certain categories can be avoided, and the accuracy of target information identification is improved. Because the data in the historical text data are randomly sampled, the acquired test data can cover the text data related to the current service as much as possible, so that the trained recognition model is more comprehensive, the application range of the recognition model is more comprehensive, and the accuracy of target information is further improved.
Further, in the training process of the recognition model, the length of each text is not the same, and since the dimension provided by BERT is high, the MiniBERT model used cannot cover all of the text, so part of any overly long text is cut off. In addition, because of the limitation of graphics processing unit (Graphics Processing Unit, GPU) computing power, the test data cannot be predicted using a larger-scale model. Therefore, the data features stored at the 40th training cycle are used: the last fully connected layer of the model is removed to obtain features of a new dimension (for example, 1024 dimensions) as new pre-training data, on which transfer learning training is performed, thereby reducing the difference between the source domain and the target domain and improving accuracy.
To further demonstrate the advantages of the identification model provided by the presently disclosed embodiments, some data are provided below for support.
Fig. 7A schematically illustrates AUC graphs corresponding to a training data set and a test data set, respectively, according to an embodiment of the present disclosure. Fig. 7B schematically illustrates a graph of a loss function according to an embodiment of the disclosure.
As shown in fig. 7A and 7B, the recognition accuracy of the recognition model provided by the embodiments of the present disclosure is improved by about 10% compared to the model without MMD, reaching 50.39%. Likewise, the effect of overfitting on model accuracy can be seen on the accuracy curve; therefore, the embodiment of the disclosure selects the model at epoch 40 as the final model for predicting the test set.
Fig. 8 schematically illustrates a confusion matrix of the results of the epoch-40 STN model on the labeled validation set according to an embodiment of the present disclosure.
As shown in fig. 8, model training is performed, for example, on a Jupyter platform, and tuning work is carried out on the prediction effect of the verification set. The model is verified and evaluated through indexes such as accuracy, an N-dimensional confusion matrix, and the weighted F1 score, and the recognition model with the best result is selected as the final recognition model for customer incoming call information service recognition. Taking the F1 score of each class and averaging, this model achieves an F1 score of 0.5055 on the validation set.
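The evaluation step of taking the F1 score for each class and averaging can be sketched as below. This is a plain macro-averaged F1; the toy labels in the usage are illustrative and do not reproduce the reported 0.5055:

```python
def macro_f1(y_true, y_pred):
    """Per-class F1 averaged over all classes appearing in either list."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        scores.append(f1)
    return sum(scores) / len(scores)
```

Macro averaging weights each class equally, which is why it is a sensible check on a data set with the class imbalance discussed above.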
The embodiment of the disclosure also labels the data of the test set.
Fig. 9 schematically illustrates a labeling result diagram for labeling data of a test set according to an embodiment of the disclosure.
As shown in fig. 9, account and debit card consultation is the largest category at about 65%, with most of the content being customers consulting on transfer or account questions, which can then be handled by a customer manager. The smallest share is fund-related consultation; no fund-related results appear in the predictions of the embodiment of the present disclosure. This inaccurate result may be due to there being little data about funds in the Voice-of-the-Customer data, resulting in imbalanced data and results. It is worth mentioning that most of the Voice-of-the-Customer data in the training set is also account and debit card consultation. Because the current model of the embodiment of the disclosure has demonstrated convergence and a certain accuracy, if the training set labels of the embodiment of the disclosure were replaced with manually labeled incoming call information, the accuracy would be further improved.
In summary, the identification method of the embodiment of the present disclosure can accurately and comprehensively identify the target information in text data. Applied in the banking field, the identification method enables accurate point-to-point service to customers, improving the efficiency and quality of incoming call service and the customer experience.
The embodiment of the present disclosure further provides an information recognition apparatus based on the information recognition methods shown in fig. 2 to 9, and the information recognition apparatus of the embodiment of the present disclosure will be described below by way of fig. 10 to 12 based on the scenario described in fig. 1.
Fig. 10 schematically illustrates a block diagram of an information recognition apparatus according to an embodiment of the present disclosure.
As shown in fig. 10, the information identifying apparatus 1000 may include an acquisition module 1010, a preprocessing module 1020, and an identification module 1030.
An obtaining module 1010, configured to obtain text data to be identified. The obtaining module 1010 may be configured to perform the operation S201 described above, which is not described herein.
The preprocessing module 1020 is configured to preprocess text data to be recognized to obtain a first text feature. The preprocessing module 1020 may be used to perform the operation S202 described above, which is not described herein.
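As a concrete illustration of the preprocessing step (truncating or padding each text to a preset dimension, as detailed later in the disclosure), the following is a minimal sketch. The character-level tokenization and code-point "vocabulary" are illustrative assumptions, not the patent's actual scheme:

```python
import numpy as np

def preprocess(texts, max_len=128):
    """Map each text to a fixed-dimension feature vector by truncating
    texts longer than max_len and zero-padding shorter ones.
    (Hypothetical sketch; the real preprocessing would use a trained
    tokenizer/embedding rather than raw code points.)"""
    features = []
    for text in texts:
        ids = [ord(ch) for ch in text][:max_len]  # truncate long texts
        ids += [0] * (max_len - len(ids))         # zero-pad short texts
        features.append(ids)
    return np.asarray(features)

first_text_feature = preprocess(["hello", "a" * 300])
```

Every row of the resulting array has the same preset dimension, which is what allows the feature to be fed directly into the recognition model.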
The recognition module 1030 is configured to input the first text feature into a trained recognition model and recognize target information included in the text data to be recognized, where the recognition model is obtained by training a spatial transformation network model in a domain-adaptation-based heterogeneous transfer learning manner, and the recognition tags of the recognition model are configured according to service attribute information. The identification module 1030 may be used to perform operation S203 described above, which will not be repeated here.
Fig. 11 schematically illustrates a block diagram of an information recognition apparatus according to another embodiment of the present disclosure.
As shown in fig. 11, the information identifying apparatus 1000 may further include a filtering module 1040, for example.
And the filtering module 1040 is configured to sequentially perform dimension reduction and dimension increase on the first text feature, so as to filter invalid information in the first text feature, and obtain a second text feature. The filtering module 1040 is used for performing the operation S301 described above, and will not be described herein.
The recognition module 1030 is further configured to input the second text feature into the trained recognition model and recognize the target information included in the text data to be recognized. The identification module 1030 may also be used to perform operation S302 described above, which will not be repeated here.
Fig. 12 schematically illustrates a block diagram of an information identifying apparatus according to still another embodiment of the present disclosure.
As shown in fig. 12, the information identifying apparatus 1000 may further include a training module 1050, for example.
Training module 1050, configured to train the recognition model, including: acquiring historical text data; determining a training data set and a test data set based on the historical text data; labeling the training data set according to the identification tag, and labeling part of the test data set according to the identification tag; and taking the statistical distance between the source domain where the training data set is located and the target domain where the test data set is located as a loss function, and inputting the labeled training data set and the partially labeled test data into a spatial transformation network model for heterogeneous transfer learning, to obtain the trained recognition model. The training module 1050 may be used to perform operations S501 to S504 described above, which will not be repeated here.
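The patent does not name a specific statistical distance; one common choice for a domain-adaptation loss between a source domain and a target domain is the (squared) maximum mean discrepancy (MMD), sketched here purely for illustration:

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel matrix between two feature batches."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def mmd2(source, target, sigma=1.0):
    """Squared maximum mean discrepancy between source-domain (training)
    and target-domain (test) feature batches. Illustrative stand-in for
    the 'statistical distance' loss; the disclosure does not specify
    which distance is used."""
    k_ss = gaussian_kernel(source, source, sigma).mean()
    k_tt = gaussian_kernel(target, target, sigma).mean()
    k_st = gaussian_kernel(source, target, sigma).mean()
    return k_ss + k_tt - 2 * k_st
```

Minimizing such a distance during training pushes the source-domain and target-domain feature distributions together, which is the essence of domain-adaptation-based transfer learning.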
Any number of the modules, sub-modules, units, and sub-units according to embodiments of the present disclosure, or at least part of the functionality of any number of them, may be implemented in one module. Any one or more of the modules, sub-modules, units, and sub-units according to embodiments of the present disclosure may be split into and implemented as multiple modules. Any one or more of the modules, sub-modules, units, and sub-units according to embodiments of the present disclosure may be implemented at least in part as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on chip, a system on substrate, a system in package, or an Application Specific Integrated Circuit (ASIC), or in any other reasonable manner of hardware or firmware that integrates or packages the circuit, or in any one of, or a suitable combination of, software, hardware, and firmware. Alternatively, one or more of the modules, sub-modules, units, and sub-units according to embodiments of the present disclosure may be at least partially implemented as computer program modules which, when executed, may perform the corresponding functions.
For example, any number of the acquisition module 1010, the preprocessing module 1020, the recognition module 1030, the filtering module 1040, and the training module 1050 may be combined in one module/unit/sub-unit, or any one of them may be split into multiple modules/units/sub-units. Alternatively, at least part of the functionality of one or more of these modules/units/sub-units may be combined with at least part of the functionality of other modules/units/sub-units and implemented in one module/unit/sub-unit. According to embodiments of the present disclosure, at least one of the acquisition module 1010, the preprocessing module 1020, the recognition module 1030, the filtering module 1040, and the training module 1050 may be implemented at least in part as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on chip, a system on substrate, a system in package, or an Application Specific Integrated Circuit (ASIC), or in any other reasonable manner of hardware or firmware that integrates or packages the circuit, or in any one of, or a suitable combination of, software, hardware, and firmware. Alternatively, at least one of these modules may be implemented at least in part as a computer program module that, when executed, performs the corresponding function.
It should be noted that, in the embodiment of the present disclosure, the information identifying apparatus portion corresponds to the information identifying method portion in the embodiment of the present disclosure, and specific implementation details and technical effects thereof are the same, which are not described herein again.
Fig. 13 schematically illustrates a block diagram of an electronic device adapted to implement the above-described method according to an embodiment of the present disclosure. The electronic device shown in fig. 13 is merely an example and should not impose any limitations on the functionality and scope of use of embodiments of the present disclosure.
As shown in fig. 13, an electronic device 1300 according to an embodiment of the present disclosure includes a processor 1301 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 1302 or a program loaded from a storage portion 1308 into a Random Access Memory (RAM) 1303. Processor 1301 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or an associated chipset and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), or the like. Processor 1301 may also include on-board memory for caching purposes. Processor 1301 may include a single processing unit or multiple processing units for performing different actions of the method flow according to embodiments of the present disclosure.
In the RAM 1303, various programs and data necessary for the operation of the electronic apparatus 1300 are stored. The processor 1301, the ROM 1302, and the RAM 1303 are connected to each other through a bus 1304. The processor 1301 performs various operations of the method flow according to the embodiments of the present disclosure by executing programs in the ROM 1302 and/or the RAM 1303. Note that the programs may also be stored in one or more memories other than the ROM 1302 and the RAM 1303. The processor 1301 may also perform various operations of the method flow according to embodiments of the present disclosure by executing programs stored in the one or more memories.
According to an embodiment of the disclosure, the electronic device 1300 may also include an input/output (I/O) interface 1305, which is also connected to the bus 1304. The electronic device 1300 may also include one or more of the following components connected to the I/O interface 1305: an input section 1306 including a keyboard, a mouse, and the like; an output portion 1307 including a Cathode Ray Tube (CRT) or Liquid Crystal Display (LCD) display, a speaker, and the like; a storage portion 1308 including a hard disk or the like; and a communication section 1309 including a network interface card such as a LAN card, a modem, or the like. The communication section 1309 performs communication processing via a network such as the internet. A drive 1310 is also connected to the I/O interface 1305 as needed. A removable medium 1311, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 1310 as needed, so that a computer program read therefrom is installed into the storage portion 1308 as needed.
According to embodiments of the present disclosure, the method flow according to embodiments of the present disclosure may be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable storage medium, the computer program comprising program code for performing the method shown in the flowcharts. In such embodiments, the computer program may be downloaded and installed from a network via the communication portion 1309 and/or installed from the removable medium 1311. The above-described functions defined in the system of the embodiments of the present disclosure are performed when the computer program is executed by the processor 1301. The systems, devices, apparatus, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the disclosure.
The present disclosure also provides a computer-readable storage medium that may be embodied in the apparatus/device/system described in the above embodiments; or may exist alone without being assembled into the apparatus/device/system. The computer-readable storage medium carries one or more programs which, when executed, implement methods in accordance with embodiments of the present disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium. Examples may include, but are not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
For example, according to embodiments of the present disclosure, the computer-readable storage medium may include the ROM 1302 and/or the RAM 1303 described above and/or one or more memories other than the ROM 1302 and the RAM 1303.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or by combinations of special purpose hardware and computer instructions.

Those skilled in the art will appreciate that the features recited in the various embodiments of the present disclosure and/or in the claims may be combined in various ways, even if such combinations are not explicitly recited in the present disclosure. In particular, the features recited in the various embodiments of the present disclosure and/or in the claims may be combined in various ways without departing from the spirit and teachings of the present disclosure. All such combinations fall within the scope of the present disclosure.

Claims (12)

1. An information identification method, comprising:
acquiring text data to be identified;
preprocessing the text data to be identified to obtain a first text feature;
and inputting the first text feature into a trained recognition model to recognize target information included in the text data to be identified, wherein the recognition model is obtained by training a spatial transformation network model in a domain-adaptation-based heterogeneous transfer learning manner, and a recognition tag of the recognition model is configured according to service attribute information.
2. The information identification method according to claim 1, further comprising:
sequentially performing dimension reduction and dimension increase on the first text feature to filter invalid information in the first text feature, so as to obtain a second text feature;
and inputting the second text feature into the trained recognition model to recognize the target information included in the text data to be identified.
3. The information identification method according to claim 1 or 2, wherein preprocessing the text data to be identified to obtain the first text feature comprises:
determining the length of each text in the text data;
comparing the length of each text with a preset length threshold, and determining the texts whose length is greater than the preset length threshold;
and intercepting a part of each text whose length is greater than the preset length threshold to obtain the first text feature with a preset dimension.
4. The information identification method according to claim 1, further comprising training the recognition model, which comprises:
acquiring historical text data;
determining a training data set and a test data set based on the historical text data;
labeling the training data set according to the identification tag, and labeling part of the test data set according to the identification tag;
and taking the statistical distance between the source domain where the training data set is located and the target domain where the test data set is located as a loss function, and inputting the labeled training data set and the partially labeled test data into the spatial transformation network model for heterogeneous transfer learning, to obtain the trained recognition model.
5. The information identification method of claim 4, wherein determining the training data set and the test data set based on the historical text data comprises:
downsampling a part of the data in the historical text data to obtain the training data set;
and randomly sampling another part of the data in the historical text data to obtain the test data set.
6. The information identification method according to claim 4 or 5, wherein before inputting the labeled training data set and the labeled portion of the test data into the spatial transformation network model for heterogeneous transfer learning, the method further comprises:
the training data set and the test data set are mapped to the same dimension.
7. The information identification method of claim 6, wherein the training data set and the test data set are mapped to a reproducing kernel Hilbert space such that the training data set and the test data set are in the same dimension.
8. The information identification method according to claim 1, wherein the identification tag comprises: an accounting query tag, an account opening and website information query tag, an account and debit card tag, a transfer money tag, a personal credit tag, a deposit tag, a financial tag, a fund tag, a noble metal tag, a personal mobile phone banking tag, a personal online banking tag, a self-service machine tag, a messenger tag, and a comprehensive tag, wherein the comprehensive tag is used for representing tags corresponding to business attribute information other than accounting query, account opening and website information query, account and debit card, transfer money, personal credit, deposit, financial, fund, noble metal, personal mobile phone banking, personal online banking, self-service machine, and messenger.
9. An information identifying apparatus, comprising:
the acquisition module is used for acquiring text data to be identified;
the preprocessing module is used for preprocessing the text data to be recognized to obtain a first text characteristic;
the identification module is used for inputting the first text feature into a trained recognition model to identify target information included in the text data to be identified, wherein the recognition model is obtained by training a spatial transformation network model in a domain-adaptation-based heterogeneous transfer learning manner, and a recognition tag of the recognition model is configured according to service attribute information.
10. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-8.
11. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method according to any of claims 1-8.
12. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 8.
CN202310796666.1A 2023-06-30 2023-06-30 Information identification method, device, equipment, medium and product Pending CN116738198A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310796666.1A CN116738198A (en) 2023-06-30 2023-06-30 Information identification method, device, equipment, medium and product


Publications (1)

Publication Number Publication Date
CN116738198A true CN116738198A (en) 2023-09-12

Family

ID=87918406

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310796666.1A Pending CN116738198A (en) 2023-06-30 2023-06-30 Information identification method, device, equipment, medium and product

Country Status (1)

Country Link
CN (1) CN116738198A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117391515A (en) * 2023-10-24 2024-01-12 科讯嘉联信息技术有限公司 Service quality management method and system based on general large language model


Similar Documents

Publication Publication Date Title
CN110377911B (en) Method and device for identifying intention under dialog framework
CN110597964B (en) Double-recording quality inspection semantic analysis method and device and double-recording quality inspection system
CN111340616B (en) Method, device, equipment and medium for approving online loan
CN111161740A (en) Intention recognition model training method, intention recognition method and related device
CN111062217A (en) Language information processing method and device, storage medium and electronic equipment
CN114722839B (en) Man-machine cooperative dialogue interaction system and method
CN112463968B (en) Text classification method and device and electronic equipment
CN109582788A (en) Comment spam training, recognition methods, device, equipment and readable storage medium storing program for executing
CN112732871A (en) Multi-label classification method for acquiring client intention label by robot
CN116738198A (en) Information identification method, device, equipment, medium and product
CN110046345A (en) A kind of data extraction method and device
WO2020042164A1 (en) Artificial intelligence systems and methods based on hierarchical clustering
US11481609B2 (en) Computationally efficient expressive output layers for neural networks
US20230070966A1 (en) Method for processing question, electronic device and storage medium
US20220335274A1 (en) Multi-stage computationally efficient neural network inference
CN115730590A (en) Intention recognition method and related equipment
CN110287396A (en) Text matching technique and device
CN115687934A (en) Intention recognition method and device, computer equipment and storage medium
CN115631748A (en) Emotion recognition method and device based on voice conversation, electronic equipment and medium
CN115510188A (en) Text keyword association method, device, equipment and storage medium
CN115357711A (en) Aspect level emotion analysis method and device, electronic equipment and storage medium
CN114942992A (en) Transaction opponent identification method and device based on converged network and electronic equipment
CN114067805A (en) Method and device for training voiceprint recognition model and voiceprint recognition
CN113869068A (en) Scene service recommendation method, device, equipment and storage medium
CN112270179B (en) Entity identification method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination