CN111324706B - Labeling method and device and electronic equipment - Google Patents

Labeling method and device and electronic equipment

Info

Publication number
CN111324706B
Authority
CN
China
Prior art keywords
text
marked
annotated
labeling
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010071319.9A
Other languages
Chinese (zh)
Other versions
CN111324706A (en)
Inventor
柴博
张强
宋博川
贾全烨
戴铁潮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
State Grid Zhejiang Electric Power Co Ltd
Global Energy Interconnection Research Institute
Original Assignee
State Grid Corp of China SGCC
State Grid Zhejiang Electric Power Co Ltd
Global Energy Interconnection Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, State Grid Zhejiang Electric Power Co Ltd, Global Energy Interconnection Research Institute
Priority to CN202010071319.9A
Publication of CN111324706A
Application granted
Publication of CN111324706B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/01 Customer relationship services
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • Human Computer Interaction (AREA)
  • Mathematical Physics (AREA)
  • Acoustics & Sound (AREA)
  • Economics (AREA)
  • Multimedia (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Artificial Intelligence (AREA)
  • Finance (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a labeling method, a labeling device and electronic equipment. The labeling method comprises the following steps: acquiring a question text to be labeled; determining whether the question represented by the question text belongs to a target category; when the question belongs to the target category, determining whether a target database contains the question text; and when the target database contains the question text, labeling the question text according to a preset label. Because the category of the acquired question text is determined first, and a question text in the target category is labeled with the label already preset for it in the target database, the same question text is always labeled consistently. This improves the accuracy with which an artificial intelligence customer service built on the labeled text handles the same question.

Description

Labeling method and device and electronic equipment
Technical Field
The invention relates to the field of data processing, in particular to a labeling method, a labeling device and electronic equipment.
Background
With the development of artificial intelligence, its fields of application are becoming ever wider. Artificial intelligence customer service is one example: it can serve users more quickly and accurately while saving labor costs. The service quality of artificial intelligence customer service depends on how accurately the text data is labeled during the development and design stage. However, because artificial intelligence customer service is deployed in different application scenarios, the types of service and the text data used differ from scenario to scenario.
For example, the Information and Communication Technology (ICT) customer service of the State Grid is the unit that provides information and communication services for users outside the State Grid. ICT customer service personnel explain to users how to use the information and communication systems, and accept users' fault repair requests and feedback on system use. The users involved include both staff inside the company and users outside the company, such as bidding units. Because the State Grid information and communication systems serve many kinds of users and involve complex business logic, a labeling method is needed that improves the accuracy of artificial intelligence customer service in complex application scenarios.
Disclosure of Invention
Therefore, the technical problem to be solved by the invention is to overcome the defect that labeling results obtained by existing labeling methods cannot satisfy complex application scenarios, which impairs the accuracy of artificial intelligence customer service. To this end, the invention provides a labeling method, a labeling device and electronic equipment.
According to a first aspect, an embodiment of the present invention discloses a labeling method, including: acquiring a question text to be labeled; determining whether the question represented by the question text belongs to a target category; when the question represented by the question text belongs to the target category, determining whether a target database contains the question text; and when the target database contains the question text, labeling the question text according to a preset label.
With reference to the first aspect, in a first implementation manner of the first aspect, after determining whether the target database contains the question text when the question represented by the question text belongs to the target category, the method further includes: when the target database does not contain the question text, adding the question text to the target database; obtaining a target label; and labeling the question text according to the target label.
With reference to the first aspect, in a second implementation manner of the first aspect, after determining whether the question represented by the question text belongs to the target category, the method further includes: when the question represented by the question text does not belong to the target category, determining the application field to which the question text belongs; determining whether a labeling standard corresponding to the application field exists; when a labeling standard corresponding to the application field exists, labeling the question represented by the question text and the corresponding solution according to the labeling standard; and establishing an association between the labeled question and the corresponding solution.
With reference to the second implementation manner of the first aspect, in a third implementation manner of the first aspect, after establishing the association between the labeled question and the corresponding solution, the method further includes: determining whether the labeled question and the corresponding solution contain an attribute description; and when they contain an attribute description, performing attribute division and labeling on the labeled question and the corresponding solution according to the attribute description.
With reference to the second implementation manner of the first aspect, in a fourth implementation manner of the first aspect, after determining whether a labeling standard corresponding to the application field exists, the method further includes: when no labeling standard corresponding to the application field exists, determining whether the question text contains an attribute description; and when the question text contains an attribute description, performing attribute division and labeling on the question text according to the attribute description.
According to a second aspect, an embodiment of the invention discloses a labeling device, including: an obtaining module, configured to obtain a question text to be labeled; a first determining module, configured to determine whether the question represented by the question text belongs to a target category; a second determining module, configured to determine whether a target database contains the question text when the question represented by the question text belongs to the target category; and a labeling module, configured to label the question text according to a preset label when the target database contains the question text.
With reference to the second aspect, in a first implementation manner of the second aspect, the second determining module is further configured to: add the question text to the target database when the target database does not contain the question text; obtain a target label; and label the question text according to the target label.
With reference to the second aspect, in a second implementation manner of the second aspect, the first determining module is further configured to: determine the application field to which the question text belongs when the question represented by the question text does not belong to the target category; determine whether a labeling standard corresponding to the application field exists; when a labeling standard corresponding to the application field exists, label the question represented by the question text and the corresponding solution according to the labeling standard; and establish an association between the labeled question and the corresponding solution.
With reference to the second implementation manner of the second aspect, in a third implementation manner of the second aspect, the first determining module is further configured to: determine whether the labeled question and the corresponding solution contain an attribute description; and when they contain an attribute description, perform attribute division and labeling on the labeled question and the corresponding solution according to the attribute description.
With reference to the second implementation manner of the second aspect, in a fourth implementation manner of the second aspect, the first determining module is further configured to: determine, when no labeling standard corresponding to the application field exists, whether the question text contains an attribute description; and when the question text contains an attribute description, perform attribute division and labeling on the question text according to the attribute description.
According to a third aspect, an embodiment of the present invention discloses an electronic device, including: a processor, a memory and a computer program stored on the memory and executable on the processor, the processor implementing, when executing the program, the steps of the labeling method described in the first aspect or in any implementation manner of the first aspect.
According to a fourth aspect, an embodiment of the present invention discloses a computer-readable storage medium on which computer instructions are stored, the instructions, when executed by a processor, implementing the steps of the labeling method described in the first aspect or in any implementation manner of the first aspect.
The technical scheme provided by the embodiment of the invention has the following advantages:
According to the labeling method provided by the embodiment of the invention, a question text to be labeled is acquired; whether the question represented by the question text belongs to a target category is determined; when it does, whether the target database contains the question text is determined; and when the target database contains the question text, the question text is labeled according to a preset label. Because the category of the acquired question text is determined first, and a question text in the target category is labeled with the label already preset for it in the target database, the same question text is always labeled consistently. This improves the accuracy with which an artificial intelligence customer service built on the labeled text handles the same question.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention, and a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flowchart of a labeling method according to an embodiment of the present invention;
FIG. 2 is a block diagram of a labeling device according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions of the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, embodiments of the invention. All other embodiments obtained by those skilled in the art based on these embodiments without inventive effort fall within the scope of protection of the invention.
Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
In the description of the present invention, it should be noted that, unless explicitly specified and limited otherwise, the terms "mounted" and "connected" are to be construed broadly. For example, a connection may be fixed, detachable, or integral; it may be mechanical or electrical; it may be direct or indirect through an intermediate medium; it may be internal communication between two components; and it may be wireless or wired. Those of ordinary skill in the art will understand the specific meaning of these terms in the present invention according to the specific case.
In addition, the technical features of the different embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
The embodiment of the application provides a labeling method, as shown in FIG. 1, which comprises the following steps:
Step 101, acquiring a question text to be labeled.
For example, the question text to be labeled may be obtained as follows. A plurality of audio files are obtained in advance. Each audio file records a human agent answering a user's consultation and providing a solution to the question raised, so each file contains both the audio of the question and the audio of its solution. The audio files are converted into text data by an audio conversion tool. The text data is then reviewed and corrected: the text belonging to the human agent and the text belonging to the user are distinguished, and common text errors are corrected, such as wrongly written characters, unclear semantics, and expressions that do not match the technical terminology of the field. The review may be performed automatically by a pre-trained machine learning model, or the text may be reviewed and corrected manually and the corrected text re-entered into the terminal so that the terminal can label it accurately. The question text to be labeled is obtained through the above audio conversion operation.
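The review step described above can be sketched as follows. This is a minimal illustration, assuming the audio has already been transcribed and each turn already carries a speaker tag; the `Turn` type, the `CORRECTIONS` table, and the function names are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass

# Hypothetical correction table: common transcription errors -> intended terms.
CORRECTIONS = {"log inn": "login", "pass word": "password"}


@dataclass
class Turn:
    speaker: str  # "user" or "agent" (assumed speaker tags)
    text: str


def correct(text: str) -> str:
    """Apply simple dictionary-based corrections to transcribed text."""
    for wrong, right in CORRECTIONS.items():
        text = text.replace(wrong, right)
    return text


def extract_question_and_solution(turns: list[Turn]) -> tuple[str, str]:
    """Join corrected user turns into the question text and agent turns into the solution text."""
    question = " ".join(correct(t.text) for t in turns if t.speaker == "user")
    solution = " ".join(correct(t.text) for t in turns if t.speaker == "agent")
    return question, solution
```

A real system would likely replace the dictionary with a trained correction model or manual review, as the description allows either.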
Step 102, determining whether the question represented by the question text belongs to a target category, and executing step 103 when it does.
Illustratively, the target category may be determined according to how often questions are asked; that is, Frequently Asked Questions (FAQ) are taken as the target category. FAQ questions may include system function queries, system workflow queries, and maintenance telephone inquiries. The present application does not limit the types of FAQ questions or the frequency at which a question counts as frequently asked; those skilled in the art may determine them according to actual needs. To determine whether the question represented by the question text belongs to the target category, an association between the target category and keywords may be established in advance. Keywords are then extracted from the question text, and the pre-established association is used to decide whether the question belongs to the target category.
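As a rough sketch of the keyword association just described, the check can be a lookup from keywords to FAQ types. The keyword table and the plain substring matching are assumptions made for illustration; the patent does not specify how keywords are extracted.

```python
# Hypothetical keyword-to-category associations, established in advance.
FAQ_KEYWORDS = {
    "function": "system_function",
    "workflow": "system_workflow",
    "maintenance phone": "maintenance_phone",
}


def find_target_category(question_text):
    """Return the FAQ category whose keyword appears in the text, or None."""
    lowered = question_text.lower()
    for keyword, category in FAQ_KEYWORDS.items():
        if keyword in lowered:
            return category
    return None
```

A return value of `None` corresponds to the non-FAQ branch of the method.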
Step 103, determining whether the target database contains the question text, and executing step 104 when it does.
Illustratively, the target database stores question texts belonging to the target category, and each stored question text carries a label set according to its category. To determine whether the target database contains the question text, keywords of the question text may be obtained, the target database traversed and queried by those keywords, and the query results compared with the question text using a natural language processing tool. When the overlap rate of the text comparison reaches a target condition, the target database is considered to contain the question text.
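The lookup can be sketched as a keyword pre-filter followed by a similarity test. Here Python's standard `difflib.SequenceMatcher` stands in for the unspecified natural language processing tool, and the 0.9 threshold stands in for the unspecified target condition; both are assumptions, as is the list-of-dicts database layout.

```python
from difflib import SequenceMatcher


def find_match(question_text, database, keywords, threshold=0.9):
    """Return the stored entry that matches the question text, or None.

    database: list of dicts with "text" and "label" keys (assumed layout).
    keywords: lowercase keywords extracted from the question text.
    """
    lowered = question_text.lower()
    for entry in database:
        stored = entry["text"].lower()
        # Keyword pre-filter: only compare entries that share a keyword.
        if not any(k in stored for k in keywords):
            continue
        # Similarity ratio in [0, 1] plays the role of the overlap rate.
        if SequenceMatcher(None, lowered, stored).ratio() >= threshold:
            return entry
    return None
```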
Step 104, labeling the question text according to a preset label.
For example, when the target database contains the question text, a question text identical to it has already been stored in the target database in advance. The label of that stored question text can therefore be used as the preset label, and the question text is labeled according to it.
Because the category of the acquired question text is determined first, and a question text in the target category is labeled with the label already preset for it in the target database, the same question text is always labeled consistently. This improves the accuracy with which an artificial intelligence customer service built on the labeled text handles the same question.
As an optional embodiment of the present application, after step 103, the method further includes: when the target database does not contain the question text, adding the question text to the target database; obtaining a target label; and labeling the question text according to the target label.
In an exemplary embodiment, when the target database does not contain the question text, a question label and a solution corresponding to the question are set for the question text, and the question label is used as the target label. The question text, together with its target label and corresponding solution, is added to the target database, which increases the diversity of the data in the target database.
The target label may be obtained as follows: when the target database does not contain the question text, the question text is displayed to a customer service agent, the label the agent sets for it is received, and the received label is used as the target label for labeling the question text. The label received from the customer service agent may correspond to an FAQ type such as system function, system workflow, or telephone inquiry, which makes it convenient to classify FAQ questions and store them centrally.
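The FAQ branch as a whole (reuse the stored label when the text is present, otherwise ask an agent and store the result) might look like the sketch below. For brevity it uses exact matching on whitespace-normalized, lowercased text rather than a similarity threshold; the dictionary layout and the `ask_agent` callback are hypothetical.

```python
def label_faq_question(question_text, database, ask_agent):
    """Label a target-category question, reusing the stored label when one exists.

    database: dict mapping normalized question text -> {"label": ..., "solution": ...}
    ask_agent: callable that returns (label, solution) for a new question text
    """
    key = " ".join(question_text.lower().split())
    if key in database:
        # The text is already stored: use its preset label.
        return database[key]["label"]
    # The text is new: obtain the target label from an agent and store it.
    label, solution = ask_agent(question_text)
    database[key] = {"label": label, "solution": solution}
    return label
```

Because the second occurrence of a question is served from the database, the agent is consulted only once per distinct question text, which is what keeps labels consistent.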
As an optional embodiment of the present application, after step 102, the method further includes:
First, when the question represented by the question text does not belong to the target category, determining the application field to which the question text belongs.
For example, when the question represented by the question text does not belong to the target category, the question is not an FAQ question. To improve the accuracy of labeling results for non-FAQ question texts, and thereby ensure the service quality of artificial intelligence customer service, the application field to which such a question text belongs is determined from the keywords it contains. For example, questions about an e-commerce platform or a vendor system concern a specific application field and therefore cannot be treated as FAQ questions. Once the application field is determined, it can be used as a label for the question text, which makes it easier to classify questions of related fields.
Second, determining whether a labeling standard corresponding to the application field exists; and when a labeling standard corresponding to the application field exists, labeling the question represented by the question text and the corresponding solution according to the labeling standard.
For example, to determine whether a labeling standard corresponding to the application field exists, a plurality of labeling standards may be stored in a database in advance. The determined application field is used as a keyword for a traversal query of the database to determine whether it contains a corresponding labeling standard. The labeling standard may be a system function description or a system operation flow description of the application field; the embodiment of the present application does not limit the category of the labeling standard. Taking the bidding system of an e-commerce platform as an example, when the database contains the bidding function operation flow for that field, the question text can be labeled according to that flow. Setting labels for non-FAQ questions according to a labeling standard improves the accuracy of the labeling results and of the artificial intelligence customer service.
Third, establishing an association between the labeled question and the corresponding solution. With this association in place, once the artificial intelligence customer service identifies the question a user is asking, it can promptly retrieve the corresponding solution through the association and return it to the user.
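The three steps above can be sketched together. In this illustration a labeling standard is modeled as a list of operation-flow step names for a field, and a question is labeled with the field plus any step names it mentions; that representation, the sample data, and the function name are all assumptions, since the patent leaves the form of a labeling standard open.

```python
# Hypothetical store of labeling standards keyed by application field.
LABELING_STANDARDS = {
    "e-commerce bidding": ["register supplier", "submit bid", "pay deposit"],
}


def label_and_associate(question_text, solution_text, domain):
    """Label a non-FAQ question with its field's standard and link it to its solution.

    Returns None when no labeling standard exists for the field.
    """
    standard = LABELING_STANDARDS.get(domain)
    if standard is None:
        return None
    lowered = question_text.lower()
    labels = [domain] + [step for step in standard if step in lowered]
    # The returned record is the association between question and solution.
    return {"question": question_text, "labels": labels, "solution": solution_text}
```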
As an optional implementation manner of the application, after the association is established between the labeled question and the corresponding solution, the method further includes: determining whether the labeled question and the corresponding solution contain an attribute description; and when they contain an attribute description, performing attribute division and labeling on the labeled question and the corresponding solution according to the attribute description.
For example, the attribute description may include the name of the user to whom the question text belongs or the department within a company to which the question text belongs. For example, when the acquired question text is "the financial system webpage of a certain company cannot be opened", keyword extraction and recognition yield the attribute descriptions "a certain company" and "financial system", and these can be used as labels for the question text "the webpage cannot be opened". Setting different labels for question texts makes it easier to classify them. With accurate question text labels, the feedback of the artificial intelligence customer service becomes more targeted, which improves its service quality.
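The attribute example above can be sketched as a lookup against known attribute values. The vocabularies below are hypothetical placeholders; a real system would presumably draw them from company and system registries or from a trained entity recognizer.

```python
# Hypothetical vocabularies of known attribute values.
KNOWN_COMPANIES = ["company a", "company b"]
KNOWN_SYSTEMS = ["financial system", "bidding system"]


def extract_attributes(question_text):
    """Return the attribute labels (company, system) found in the question text."""
    lowered = question_text.lower()
    attributes = {}
    for company in KNOWN_COMPANIES:
        if company in lowered:
            attributes["company"] = company
            break
    for system in KNOWN_SYSTEMS:
        if system in lowered:
            attributes["system"] = system
            break
    return attributes
```

An empty result means the text contains no recognized attribute description, in which case no attribute labeling is performed.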
As an optional implementation manner of the application, after determining whether a labeling standard corresponding to the application field exists, the method further includes: when no labeling standard corresponding to the application field exists, determining whether the question text contains an attribute description; and when the question text contains an attribute description, performing attribute division and labeling on the question text according to the attribute description.
For example, to ensure the accuracy of labeling results, a question text is not labeled according to a labeling standard when no labeling standard corresponding to its application field exists. Instead, the attribute description contained in such a question text is obtained, and attribute division and labeling are performed on the question text according to it. From the labeling results, the number of question texts in application fields that lack a labeling standard can be counted in time. A large number indicates that users have many questions to consult in that application field, so technicians can obtain and store a labeling standard for the field promptly, allowing the artificial intelligence customer service to answer questions in that field promptly and accurately.
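The counting step can be sketched with a simple tally per application field. The record layout and the `min_count` threshold for what counts as "a large number" are assumptions; the patent does not give a concrete value.

```python
from collections import Counter


def domains_needing_standards(records, known_standards, min_count=2):
    """Return fields without a labeling standard that have at least min_count questions.

    records: iterable of (domain, question_text) pairs
    known_standards: set of fields that already have a labeling standard
    """
    counts = Counter(domain for domain, _ in records if domain not in known_standards)
    return sorted(d for d, n in counts.items() if n >= min_count)
```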
The embodiment of the application also provides a labeling device, as shown in FIG. 2, including:
an obtaining module 201, configured to obtain a question text to be labeled;
a first determining module 202, configured to determine whether the question represented by the question text belongs to a target category;
a second determining module 203, configured to determine whether the target database contains the question text when the question represented by the question text belongs to the target category;
and a labeling module 204, configured to label the question text according to a preset label when the target database contains the question text.
As an optional implementation manner of the present application, the second determining module 203 is further configured to add the to-be-annotated question text to the target database when the to-be-annotated question text is not included in the target database; obtaining a target label; and labeling the to-be-labeled problem text according to the target label.
As an optional implementation of the present application, the first determining module 202 is further configured to: determine, when the question represented by the to-be-annotated question text does not belong to the target category, the application field to which the to-be-annotated question text belongs; determine whether a labeling standard corresponding to the application field is included; when the labeling standard corresponding to the application field is included, label the question represented by the to-be-annotated question text and the corresponding solution according to the labeling standard; and establish an association between the labeled question and the corresponding solution.
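The branch for questions outside the target category might look like the sketch below. Everything concrete in it is an assumption made for illustration: a labeling standard is modeled as a callable that maps a text to a label, the association is modeled as a single record holding both labeled items, and `determine_field` is a hypothetical keyword check.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

LabelingStandard = Callable[[str], str]  # assumption: a standard maps text -> label


@dataclass(frozen=True)
class LabeledPair:
    """A labeled question associated with its labeled solution."""
    question: str
    question_label: str
    solution: str
    solution_label: str


def label_by_field(
    question: str,
    solution: str,
    determine_field: Callable[[str], str],
    standards: Dict[str, LabelingStandard],
) -> Optional[LabeledPair]:
    """Label a question and its solution by the standard of the question's field."""
    field_name = determine_field(question)
    standard = standards.get(field_name)
    if standard is None:
        return None  # no labeling standard for this application field
    # One record holding both items represents the association.
    return LabeledPair(question, standard(question), solution, standard(solution))


def determine_field(text: str) -> str:
    """Hypothetical field detector based on a keyword."""
    return "billing" if "bill" in text.lower() else "other"


standards = {"billing": lambda text: "billing/" + str(len(text.split()))}
pair = label_by_field("Why is my bill high?", "Check peak usage", determine_field, standards)
missing = label_by_field("Where is the office?", "Main street", determine_field, standards)
```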
As an optional implementation of the present application, the first determining module 202 is further configured to: determine whether the labeled question represented by the to-be-annotated question text and the corresponding solution contain an attribute description; and, when they contain an attribute description, perform attribute division and labeling on the labeled question and the corresponding solution according to the attribute description.
As an optional implementation of the present application, the first determining module 202 is further configured to: determine, when the labeling standard corresponding to the application field is not included, whether the to-be-annotated question text contains an attribute description; and, when the to-be-annotated question text contains an attribute description, perform attribute division and labeling on the to-be-annotated question text according to the attribute description.
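Attribute division can be illustrated as grouping question texts by the attribute each one describes. The sketch assumes that an attribute description can be detected by a simple keyword match; the extractor `keyword_attribute` and its keyword list are hypothetical, and the text does not specify how attribute descriptions are recognized.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Optional


def divide_by_attribute(
    texts: Iterable[str],
    extract_attribute: Callable[[str], Optional[str]],
) -> Dict[str, List[str]]:
    """Group question texts by attribute; texts without an attribute are skipped."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for text in texts:
        attribute = extract_attribute(text)
        if attribute is not None:
            groups[attribute].append(text)  # the attribute serves as the label
    return dict(groups)


def keyword_attribute(text: str) -> Optional[str]:
    """Hypothetical extractor: treat a known keyword as the attribute description."""
    for keyword in ("price", "schedule"):
        if keyword in text.lower():
            return keyword
    return None


questions = ["Price of peak power?", "Maintenance schedule?", "Hello"]
divided = divide_by_attribute(questions, keyword_attribute)
```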
The embodiment of the application also provides an electronic device, as shown in fig. 3, including a processor 501, a memory 502, and a computer program stored in the memory 502 and executable on the processor 501; the processor 501 implements the steps of the labeling method described in the above embodiments when executing the program. The processor 501, the memory 502, the image capturing device 503, and the voice device 504 may be connected by a bus or by other means; fig. 3 takes connection by a bus as an example.
The processor 501 may be a central processing unit (Central Processing Unit, CPU). The processor 501 may also be other general purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or a combination thereof.
The memory 502, as a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as the program instructions/modules corresponding to the labeling method in the embodiment of the invention. By running the non-transitory software programs, instructions, and modules stored in the memory 502, the processor 501 executes its various functional applications and data processing, that is, implements the labeling method in the method embodiments described above.
The memory 502 may include a program storage area and a data storage area. The program storage area may store an operating system and an application program required by at least one function; the data storage area may store data created by the processor 501, and the like. In addition, the memory 502 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 502 may optionally include memory located remotely from the processor 501, which may be connected to the processor 501 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
One or more modules are stored in the memory 502 and, when executed by the processor 501, perform the labeling method in the embodiment shown in fig. 1.
The details of the above electronic device may be understood with reference to the corresponding descriptions and effects in the embodiment shown in fig. 1, and are not repeated here.
The embodiment of the invention also provides a computer storage medium storing computer executable instructions that can execute the labeling method in any of the above method embodiments. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a Flash Memory, a Hard Disk Drive (HDD), or a Solid State Drive (SSD); the storage medium may also comprise a combination of memories of the kinds described above.
It is apparent that the above examples are given by way of illustration only and do not limit the embodiments. Other variations or modifications in different forms may be made by those of ordinary skill in the art on the basis of the above description. It is neither necessary nor possible to enumerate all embodiments here. Obvious variations or modifications derived therefrom remain within the scope of protection of the invention.

Claims (8)

1. A labeling method, comprising:
acquiring a to-be-annotated question text;
determining whether the question represented by the to-be-annotated question text belongs to a target category;
when the question represented by the to-be-annotated question text belongs to the target category, determining whether a target database contains the to-be-annotated question text;
when the target database contains the to-be-annotated question text, labeling the to-be-annotated question text according to a preset label; wherein after determining whether the target database contains the to-be-annotated question text when the question represented by the to-be-annotated question text belongs to the target category, the method further comprises:
when the target database does not contain the to-be-annotated question text, adding the to-be-annotated question text to the target database;
obtaining a target label;
labeling the to-be-annotated question text according to the target label; and after determining whether the question represented by the to-be-annotated question text belongs to the target category, the method further comprises:
when the question represented by the to-be-annotated question text does not belong to the target category, determining the application field to which the to-be-annotated question text belongs;
determining whether a labeling standard corresponding to the application field is included;
when the labeling standard corresponding to the application field is included, labeling the question represented by the to-be-annotated question text and the corresponding solution according to the labeling standard;
and establishing an association between the labeled question represented by the to-be-annotated question text and the corresponding solution.
2. The method of claim 1, wherein after establishing the association between the labeled question represented by the to-be-annotated question text and the corresponding solution, the method further comprises:
determining whether the labeled question represented by the to-be-annotated question text and the corresponding solution contain an attribute description;
when the labeled question represented by the to-be-annotated question text and the corresponding solution contain an attribute description, performing attribute division and labeling on the labeled question and the corresponding solution according to the attribute description.
3. The method of claim 1, wherein after determining whether the labeling standard corresponding to the application field is included, the method further comprises:
when the labeling standard corresponding to the application field is not included, determining whether the to-be-annotated question text contains an attribute description;
and when the to-be-annotated question text contains an attribute description, performing attribute division and labeling on the to-be-annotated question text according to the attribute description.
4. A labeling device, comprising:
an acquisition module, configured to acquire a to-be-annotated question text;
a first determining module, configured to determine whether the question represented by the to-be-annotated question text belongs to a target category;
a second determining module, configured to determine whether a target database contains the to-be-annotated question text when the question represented by the to-be-annotated question text belongs to the target category;
a labeling module, configured to label the to-be-annotated question text according to a preset label when the target database contains the to-be-annotated question text;
wherein the second determining module is further configured to: add the to-be-annotated question text to the target database when the target database does not contain the to-be-annotated question text; obtain a target label; and label the to-be-annotated question text according to the target label;
and the first determining module is further configured to: determine, when the question represented by the to-be-annotated question text does not belong to the target category, the application field to which the to-be-annotated question text belongs; determine whether a labeling standard corresponding to the application field is included; when the labeling standard corresponding to the application field is included, label the question represented by the to-be-annotated question text and the corresponding solution according to the labeling standard; and establish an association between the labeled question represented by the to-be-annotated question text and the corresponding solution.
5. The apparatus of claim 4, wherein the first determining module is further configured to: determine whether the labeled question represented by the to-be-annotated question text and the corresponding solution contain an attribute description; and, when the labeled question and the corresponding solution contain an attribute description, perform attribute division and labeling on the labeled question and the corresponding solution according to the attribute description.
6. The apparatus of claim 4, wherein the first determining module is further configured to: determine, when the labeling standard corresponding to the application field is not included, whether the to-be-annotated question text contains an attribute description; and, when the to-be-annotated question text contains an attribute description, perform attribute division and labeling on the to-be-annotated question text according to the attribute description.
7. An electronic device, comprising:
a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the labeling method of any one of claims 1-3 when executing the program.
8. A computer readable storage medium having stored thereon computer instructions which, when executed by a processor, implement the steps of the labeling method of any one of claims 1-3.
CN202010071319.9A 2020-01-21 2020-01-21 Labeling method and device and electronic equipment Active CN111324706B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010071319.9A CN111324706B (en) 2020-01-21 2020-01-21 Labeling method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN111324706A CN111324706A (en) 2020-06-23
CN111324706B CN111324706B (en) 2023-05-26

Family

ID=71167224

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010071319.9A Active CN111324706B (en) 2020-01-21 2020-01-21 Labeling method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN111324706B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018023981A1 (en) * 2016-08-03 2018-02-08 平安科技(深圳)有限公司 Public opinion analysis method, device, apparatus and computer readable storage medium
CN109683773A (en) * 2017-10-19 2019-04-26 北京国双科技有限公司 Corpus labeling method and device
CN110377743A (en) * 2019-07-25 2019-10-25 北京明略软件系统有限公司 A kind of text marking method and device
CN110717312A (en) * 2019-10-10 2020-01-21 北京明略软件系统有限公司 Text labeling method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
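张栋; 李寿山; 周国栋. Semi-supervised question classification method based on answer assistance (基于答案辅助的半监督问题分类方法). 计算机工程与科学 (Computer Engineering & Science). 2015, (12), full text. *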

Also Published As

Publication number Publication date
CN111324706A (en) 2020-06-23

Similar Documents

Publication Publication Date Title
TWI621077B (en) Character recognition method and server for claim documents
CN111210842B (en) Voice quality inspection method, device, terminal and computer readable storage medium
CN110807085B (en) Fault information query method and device, storage medium and electronic device
CN107729251A (en) Testing case management and device
US20120150825A1 (en) Cleansing a Database System to Improve Data Quality
CN112527994A (en) Emotion analysis method, emotion analysis device, emotion analysis equipment and readable storage medium
CN107291775A (en) The reparation language material generation method and device of error sample
CN108074033A (en) Processing method, system, electronic equipment and the storage medium of achievement data
CN112836018A (en) Method and device for processing emergency plan
CN111061696A (en) Method and device for analyzing transaction message log
CN113312260A (en) Interface testing method, device, equipment and storage medium
CN112182233B (en) Knowledge base for storing equipment fault records, and method and system for assisting in positioning equipment faults by using knowledge base
CN112200465B (en) Electric power AI method and system based on multimedia information intelligent analysis
CN111783424B (en) Text sentence dividing method and device
CN117911039A (en) Control method, equipment and storage medium for after-sales service system
WO2021128721A1 (en) Method and device for text classification
CN117472372A (en) Responsive form construction method and system
CN111324706B (en) Labeling method and device and electronic equipment
CN112947959A (en) Updating method and device of AI service platform, server and storage medium
CN110544467A (en) Voice data auditing method, device, equipment and storage medium
CN102882737A (en) Transaction language-1(TL1) command automatically testing method based on extensible markup language (XML) script
CN110727759A (en) Method and device for determining theme of voice information
CN109522210A (en) Interface testing parameters analysis method, device, electronic device and storage medium
CN111488327B (en) Data standard management method and system
CN110471708B (en) Method and device for acquiring configuration items based on reusable components

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant