CN111768767A - User tag extraction method and device, server and computer readable storage medium - Google Patents

User tag extraction method and device, server and computer readable storage medium

Info

Publication number
CN111768767A
CN111768767A (application CN202010440700.8A)
Authority
CN
China
Prior art keywords
logical
objects
logic
expression
user
Prior art date
Legal status
Granted
Application number
CN202010440700.8A
Other languages
Chinese (zh)
Other versions
CN111768767B (en)
Inventor
欧阳湘粤
Current Assignee
Shenzhen Zhuiyi Technology Co Ltd
Original Assignee
Shenzhen Zhuiyi Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Zhuiyi Technology Co Ltd
Priority to CN202010440700.8A
Publication of CN111768767A
Application granted
Publication of CN111768767B

Classifications

    • G10L15/1822 Speech recognition; parsing for meaning understanding
    • G06F16/61 Information retrieval of audio data; indexing; data structures therefor; storage structures
    • G06F16/686 Information retrieval of audio data; retrieval characterised by using manually generated metadata, e.g. tags, keywords, comments, title or artist information, time, location or usage information, user ratings
    • G10L15/22 Speech recognition; procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L25/51 Speech or voice analysis techniques specially adapted for comparison or discrimination

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Signal Processing (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a user tag extraction method and apparatus, a server, and a computer-readable storage medium. In the process of extracting user tags, audio data of conversations between a conversation robot and a user is obtained and input into a preset rule module. The preset rule module comprises a logical expression whose data structure is a prefix expression; the logical expression comprises conditions and the logical relationships among the conditions. The audio data is analyzed through the logical expression to obtain the tag of the user. A logical expression stored as a prefix expression contains only simple operators and operands, whereas the traditional JSON data structure contains complex object and array structures. Compared with the traditional JSON data structure, the prefix expression therefore greatly reduces the space consumed when storing the logic of a usage scenario, and further improves operating efficiency.

Description

User tag extraction method and device, server and computer readable storage medium
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a method and an apparatus for extracting a user tag, a server, and a computer-readable storage medium.
Background
With the continuous development of artificial intelligence and natural language processing technology, conversation robots are widely used in business scenarios such as financial services, home life, and personal assistants, improving the quality and efficiency of service.
However, because the usage scenarios of a conversation robot are complicated, and because of the complexity of natural language itself, storing the logic of a usage scenario in a conventional JSON (JavaScript Object Notation) data structure makes the occupied space in the database too large, consumes a large amount of storage, and reduces operating efficiency.
Disclosure of Invention
The embodiment of the application provides a user tag extraction method, a user tag extraction device, a server and a computer-readable storage medium, which can reduce space consumption when logic in a use scene is stored, and further improve operation efficiency.
A user tag extraction method comprises the following steps:
acquiring audio data of a conversation robot and a user;
inputting the audio data into a preset rule module, wherein the preset rule module comprises a logic formula, the data structure adopted by the logic formula is a prefix expression, and the logic formula comprises conditions and logic relations among the conditions;
and analyzing the audio data through the logic formula to obtain the label of the user.
In one embodiment, the analyzing the audio data by the logic formula to obtain the tag of the user includes:
inquiring the logic formula from a database through the preset rule module;
analyzing the character strings in the logical expression to obtain objects and the logical relation between the objects;
rendering the objects and the logical relationship between the objects into a logical tree;
and analyzing the audio data through the logic tree to obtain the label of the user.
In one embodiment, the analyzing the character string corresponding to the logical expression to obtain an object and a logical relationship between the objects includes:
and analyzing the character strings in the logical expression through a first function to obtain objects and the logical relation between the objects.
In one embodiment, the analyzing the character strings in the logical expression through the first function to obtain the objects and the logical relationship between the objects includes:
sequentially inputting the character strings in the logic formula into a state machine for reading;
respectively executing different commands in sequence according to the read values to obtain return values;
and obtaining the objects and the logical relationship among the objects based on all the return values.
In one embodiment, the method further comprises: abstracting an object in a preset logic tree to obtain the logic formula, and storing the logic formula in a database.
In one embodiment, the method further comprises: abstracting the objects in the preset logic tree into the logic formula through a second function, and storing the logic formula into a database.
In one embodiment, the abstracting, by the second function, the object in the preset logical tree into the logical expression, and storing the logical expression in the database includes:
circularly traversing all objects in a preset logic tree in a recursive mode to obtain values corresponding to the logic relationship among the objects and the values corresponding to the objects;
and converting values corresponding to the logical relationship between the objects and values corresponding to the objects into character strings to obtain a logical expression, and storing the logical expression into a database.
In one embodiment, the logical relationship is represented in the form of a prefix.
In one embodiment, the logical expression includes condition groups and a logical relationship between the condition groups, each condition group includes sub-condition groups and a logical relationship between the sub-condition groups, and each sub-condition group includes at least one condition and a logical relationship between the conditions.
A user tag extraction apparatus, comprising:
the acquisition module is used for acquiring audio data of the conversation robot and a user;
the input module is used for inputting the audio data into a preset rule module, wherein the preset rule module comprises a logic formula, the data structure adopted by the logic formula is a prefix expression, and the logic formula comprises conditions and logical relationships among the conditions;
and the analysis module is used for analyzing the audio data through the logic formula to obtain the label of the user.
In one embodiment, the analysis module includes:
the logic formula query unit is used for querying the logic formula from the database through the preset rule module;
the character string analysis unit is used for analyzing the character strings corresponding to the logical expression to obtain objects and the logical relation between the objects;
the logical tree generating unit is used for rendering the objects and the logical relations among the objects into a logical tree;
and the analysis unit is used for analyzing the audio data through the logic tree to obtain the label of the user.
In one embodiment, the character string parsing unit is further configured to parse the character string in the logical expression through a first function to obtain objects and a logical relationship between the objects.
In one embodiment, the character string parsing unit is further configured to sequentially input the character strings in the logical expression into a state machine for reading; respectively executing different commands in sequence according to the read values to obtain return values; and obtaining the objects and the logical relationship among the objects based on all the return values.
In one embodiment, the apparatus further comprises: and the logical formula storage module is used for abstracting the objects in the preset logical tree to obtain the logical formula and storing the logical formula in a database.
In one embodiment, the logical formula storage module is further configured to abstract an object in a preset logical tree into the logical formula through a second function, and store the logical formula in a database.
In one embodiment, the logical formula storage module is further configured to cycle through all objects in a preset logic tree in a recursive manner, obtain values corresponding to the logical relationships between the objects and values corresponding to the objects, convert these values into character strings to obtain a logical expression, and store the logical expression in a database.
In one embodiment, the logical relationship is represented in the form of a prefix.
In one embodiment, the logical expression includes condition groups and a logical relationship between the condition groups, each condition group includes sub-condition groups and a logical relationship between the sub-condition groups, and each sub-condition group includes at least one condition and a logical relationship between the conditions.
A server comprising a memory and a processor, the memory having stored therein a computer program which, when executed by the processor, causes the processor to carry out the steps of the above method.
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method as above.
According to the user tag extraction method and apparatus, the server, and the computer-readable storage medium, in the process of extracting user tags, the audio data of conversations between the conversation robot and the user is obtained and input into the preset rule module. The preset rule module comprises a logic formula whose data structure is a prefix expression; the logic formula comprises conditions and the logical relationships among the conditions. The audio data is analyzed through the logic formula to obtain the tag of the user. A logical expression stored as a prefix expression contains only simple operators and operands, whereas the traditional JSON data structure contains complex object and array structures. Compared with the traditional JSON data structure, the prefix expression therefore greatly reduces the space consumed when storing the logic of a usage scenario, and further improves operating efficiency.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a diagram of an application environment of a user tag extraction method in one embodiment;
FIG. 2 is a flow diagram of a method for user tag extraction in one embodiment;
FIG. 3 is a flowchart of the method for analyzing audio data to obtain a user tag in FIG. 2;
FIG. 4 is a diagram illustrating the structure of a logical tree in one embodiment;
FIG. 5 is a diagram illustrating a structure of a logical tree rendered according to a prefix expression in one embodiment;
FIG. 6 is a flowchart illustrating a method for analyzing a string in a logical expression through a first function to obtain objects and logical relationships between the objects according to an embodiment;
FIG. 7 is a flowchart illustrating a method for abstracting an object in a preset logical tree into a logical expression according to a second function and storing the logical expression in a database according to an embodiment;
FIG. 8 is a block diagram showing the structure of a user tag extracting apparatus according to an embodiment;
FIG. 9 is a block diagram of the structure of the analysis module of FIG. 8;
FIG. 10 is a schematic diagram of the internal configuration of a server in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
It will be understood that, as used herein, the terms "first," "second," and the like may be used herein to describe various elements, but these elements are not limited by these terms. These terms are only used to distinguish one element from another.
Fig. 1 is an application scenario diagram of the user tag extraction method in an embodiment. As shown in fig. 1, the application environment includes a conversation robot 120 and a server 140. The server 140 acquires audio data of the conversation robot and the user from the conversation robot 120. The server 140 inputs the audio data into a preset rule module, where the preset rule module includes a logic formula, the data structure adopted by the logic formula is a prefix expression, and the logic formula includes conditions and logical relationships among the conditions. The audio data is analyzed through the logic formula to obtain the tag of the user.
Fig. 2 is a flowchart of a user tag extraction method in an embodiment, and as shown in fig. 2, a user tag extraction method is provided, which is applied to a server and includes steps 220 to 260.
Step 220, audio data of the conversation robot and the user are obtained.
A conversation robot is a speech robot that performs a function by conversing with the user in place of a human. Assistants such as Microsoft's Cortana and XiaoIce, Apple's Siri, Google Now, Alibaba's AliMe, Baidu's Duer, Turing Robot, Laiwang Assistant, and Chumen Wenwen are all conversation robots. A conversation robot may also be a voice robot that replaces a traditional manual call center, for example in industries such as banking and telecom operators. The conversation robot communicates with the user and records the communication, thereby generating audio data; semantic recognition is then performed on the audio data to obtain semantically recognized audio data.
Step 240, inputting the audio data into a preset rule module, where the preset rule module includes a logic formula, the data structure adopted by the logic formula is a prefix expression, and the logic formula includes conditions and logic relationships among the conditions.
After the audio data generated during communication between the conversation robot and the user is obtained and semantically recognized, the semantically recognized audio data is input into a preset rule module on the server. The preset rule module analyzes the input audio data according to preset rules to obtain the tag of the user. The preset rules can be expressed in the form of a logical expression, which comprises conditions and the logical relationships among the conditions. A condition is a judgment condition used when analyzing the audio data to obtain the user tag. The logical relationships between conditions include "or" and "and"; of course, other types of logical relationships may also be included, which is not limited in this application.
Here, the data structure adopted by the logical expression in the preset rule module is a prefix expression, that is, an expression in which the operator is placed in front of its operands; infix and suffix expressions exist correspondingly. Because the logical expression uses a prefix expression, the logical relationship among a group of conditions is written before those conditions.
Traditionally, the logic in the preset rule module adopts a JSON data structure, which comprises complex object and array structures. The resulting logical expression is therefore bulky, occupies more storage space, is slow to parse, and reduces operating speed. For example, the following is a logical expression in a JSON data structure: {"logic": "|", "values": ["A", {"logic": "&", "values": ["B", "C", {"logic": "|", "values": ["E", "F", "G"]}]}, {"logic": "|", "values": ["D", {"logic": "&", "values": ["H", "I"]}]}]}.
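For a concrete sense of the size difference, the same rule can be written out in both encodings and measured. The TypeScript snippet below is an illustrative sketch, not code from the patent; the character counts apply to the compact forms shown:

    // Illustrative comparison of two encodings of the same rule.
    // Neither literal is taken from the patent's source code.
    const jsonRule = JSON.stringify({
      logic: "|",
      values: [
        "A",
        { logic: "&", values: ["B", "C", { logic: "|", values: ["E", "F", "G"] }] },
        { logic: "|", values: ["D", { logic: "&", values: ["H", "I"] }] },
      ],
    });
    const prefixRule = "(|A,(&B,C,(|E,F,G)),(|D,(&H,I)))";

    console.log(jsonRule.length);   // 160 characters
    console.log(prefixRule.length); // 32 characters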
Step 260, analyzing the audio data through a logic formula to obtain the label of the user.
After audio data generated in the communication process between the conversation robot and the user are obtained, the audio data are input into a preset rule module on the server. And analyzing the audio data through a logic formula in a preset rule module to obtain the label of the user. Because the logical expression comprises the conditions and the logical relations among the conditions, whether the audio data meet the conditions or not is sequentially judged through the logical expression, and the label of the user is finally obtained.
In the embodiment of the application, in the process of extracting user tags, the audio data of the conversation robot and the user is obtained and input into the preset rule module. The preset rule module comprises a logic formula whose data structure is a prefix expression; the logic formula comprises conditions and the logical relationships among the conditions. The audio data is analyzed through the logic formula to obtain the tag of the user. A logical expression stored as a prefix expression contains only simple operators and operands, whereas the traditional JSON data structure contains complex object and array structures. Compared with the traditional JSON data structure, the prefix expression therefore greatly reduces the space consumed when storing the logic of a usage scenario, and further improves operating efficiency.
In one embodiment, as shown in fig. 3, step 260, analyzing the audio data by a logic formula to obtain the user's tag, includes:
Step 262, querying the logic formula from the database through the preset rule module.
When a service is created, a corresponding logic tree is sorted out according to the actual service scenario, the objects in the logic tree are abstracted to obtain a logical expression, and the logical expression is stored in a database. Logical expressions in this application adopt the data structure of a prefix expression, e.g., (| A, (& B, C, (| E, F, G)), (| D, (& H, I))). The structure of the logic tree expressed by this logical expression is shown in fig. 4.
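Read back from the database, such an expression parses into objects and logical relationships that can be pictured as a small recursive tree type. The following TypeScript sketch is illustrative only; the field names logic and values are assumptions, since the patent does not name them:

    // Hypothetical in-memory shape of a logic tree node.
    type LogicOperator = "&" | "|"; // "and" / "or" relationship between conditions

    interface LogicNode {
      logic: LogicOperator;           // logical relationship among the children
      values: (string | LogicNode)[]; // leaf conditions or nested condition groups
    }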
And then, when the user label extraction is needed in the actual service scene, acquiring audio data of the conversation robot and the user, and inputting the audio data into a preset rule module. And querying the logic formula from the database through a preset rule module.
Step 264, analyzing the character strings in the logical expression to obtain the objects and the logical relationship between the objects.
After the logical expression is acquired from the database, the character string in the logical expression is analyzed. Analyzing the character string of a logical expression and abstracting the objects of a logic tree into a logical expression are two corresponding, opposite processes. Analyzing the character string in the logical expression yields the objects and the logical relationships among the objects.
Step 266, render the objects and the logical relationships between the objects as a logical tree.
After the character string in the logical expression is analyzed to obtain the objects and the logical relationships between the objects, the objects and their logical relationships can be rendered into a logic tree. Fig. 5 shows the logic tree rendered from the prefix expression in the bank collection business scenario. The logic tree contains the conditions and the logical relationships between the conditions.
Step 268, analyzing the audio data through the logic tree to obtain the label of the user.
The audio data is analyzed through the conditions and the logical relationships between the conditions in the logic tree. For example, condition 1 in fig. 5 includes condition 1.1, condition 1.2, condition 1.3, condition 1.4, and condition 1.5, and the logical relationship among these conditions is "and".
Further, condition 1.5 includes condition 1.5.1, condition 1.5.2, condition 1.5.3, condition 1.5.4, and condition 1.5.5, and the logical relationship among these conditions is "and".
Still further, condition 1.5.5 includes condition 1.5.5.1, condition 1.5.5.2, condition 1.5.5.3, and condition 1.5.5.4, and the logical relationship among these conditions is "and". The contents of the above conditions include the ID in the process trigger, the word-slot value, the hang-up node, the data threshold, the connection condition, the play state, the hang-up party, and the like, which is not limited in this application.
The audio data of the conversation robot and the user is input into the logic tree for analysis to obtain an analysis result, from which the tag of the user is obtained. The tag of the user is a classification of the user in the service scenario. For example, in a bank collection business scenario, the user may be given tags such as "call connected, collection succeeded", "call connected, collection failed", or "call not connected"; of course, other types of tags may also be included. The user can then be marked with any one or more tags according to the analysis result.
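As an illustration of how the analysis over the logic tree might proceed, the sketch below evaluates a tree of the hypothetical LogicNode shape introduced above. The conditionHolds callback is an assumed stand-in for the patent's per-condition checks (word-slot values, hang-up node, thresholds, and so on), not an API from the patent:

    // Hypothetical evaluation of a logic tree against recognized audio data.
    function evaluate(
      node: string | LogicNode,
      conditionHolds: (conditionId: string) => boolean
    ): boolean {
      if (typeof node === "string") {
        return conditionHolds(node); // leaf: a single condition
      }
      // internal node: combine the children with the node's logical relationship
      return node.logic === "&"
        ? node.values.every((child) => evaluate(child, conditionHolds))
        : node.values.some((child) => evaluate(child, conditionHolds));
    }

    // When evaluate(...) returns true for the tree guarding a tag, the user
    // is marked with that tag, e.g. "call connected, collection succeeded".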
In the embodiment of the application, the logical expression is queried from the database through the preset rule module, and the character string in the logical expression is analyzed to obtain the objects and the logical relationships between the objects. The objects and their logical relationships are rendered into a logic tree, and the audio data is analyzed through the logic tree to obtain the tag of the user. Thus, the logic tree of the actual service scenario is finally rendered by parsing a logical expression with a simple structure, namely a prefix expression, and the user's tag can then be analyzed from the input audio data. Compared with the traditional JSON data structure, the logical expression adopting the prefix expression has a greatly reduced data volume. Storing such logical expressions therefore consumes far less database storage, and the small data volume improves the operational efficiency of both abstracting objects into a logical expression and parsing a logical expression back into objects. Finally, the efficiency of extracting the user tag from the audio data of the conversation robot and the user is improved.
In one embodiment, analyzing the character strings corresponding to the logical expression to obtain the objects and the logical relationship between the objects includes:
and analyzing the character strings in the logic formula through the first function to obtain the objects and the logic relation among the objects.
The first function is logicExpression2Object; the character string in the logical expression can be analyzed through this function to obtain the objects and the logical relationships between the objects. For example, the logical expression (| A, (& B, C, (| E, F, G)), (| D, (& H, I))) can be analyzed to obtain the objects and their logical relationships. Specifically, the characters in the logical expression are sequentially input into a state machine for reading; different commands are executed in turn according to the read values to obtain return values, and the objects and the logical relationships among them are obtained from all the return values.
In the embodiment of the application, the simple logical expression obtained from the database is analyzed through the first function to obtain the information represented in the logical expression, that is, the character string in the logical expression is analyzed through the first function to obtain the object and the logical relationship between the objects. Thus, objects and logical relationships between objects may be rendered as a logical tree. And analyzing the audio data through the logic tree to obtain the label of the user.
In one embodiment, as shown in fig. 6, analyzing the character strings in the logical expression through the first function to obtain the objects and the logical relationship between the objects includes:
step 620, inputting the character strings in the logic formula into a state machine in sequence for reading;
step 640, sequentially and respectively executing different commands according to the read values to obtain return values;
step 660, based on all the return values, the objects and the logical relationship between the objects are obtained.
The source code of the logicExpression2Object function is as follows:
[The source code listing for logicExpression2Object is reproduced only as images in the original publication.]
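Since the listing survives only as images, the following TypeScript sketch is a hedged reconstruction of what a logicExpression2Object-style parser could look like, reading the character string one character at a time in the spirit of the state machine described above. It is not the patent's actual code and assumes well-formed input:

    // Hedged reconstruction (not the original source): parse a prefix
    // expression such as "(|A,(&B,C,(|E,F,G)),(|D,(&H,I)))" into the
    // hypothetical LogicNode tree sketched earlier.
    function logicExpression2Object(expr: string): string | LogicNode {
      let pos = 0; // read head of the state machine

      function parseValue(): string | LogicNode {
        if (expr[pos] === "(") return parseGroup();
        const start = pos; // leaf condition: read until a delimiter
        while (pos < expr.length && expr[pos] !== "," && expr[pos] !== ")") pos++;
        return expr.slice(start, pos).trim();
      }

      function parseGroup(): LogicNode {
        pos++;                                    // consume "("
        const logic = expr[pos] as LogicOperator; // prefix: the operator comes first
        pos++;                                    // consume the operator
        const values: (string | LogicNode)[] = [];
        while (expr[pos] !== ")") {
          if (expr[pos] === "," || expr[pos] === " ") { pos++; continue; }
          values.push(parseValue());
        }
        pos++;                                    // consume ")"
        return { logic, values };
      }

      return parseValue();
    }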
in the embodiment of the application, the simple logical expression obtained from the database is analyzed through the first function to obtain the information represented in the logical expression, that is, the character string in the logical expression is analyzed through the first function to obtain the object and the logical relationship between the objects. Thus, objects and logical relationships between objects may be rendered as a logical tree. And analyzing the audio data through the logic tree to obtain the label of the user.
In one embodiment, a user tag extraction method is provided, which further includes: abstracting the objects in the preset logic tree to obtain a logic formula, and storing the logic formula in a database.
When a service is created, a corresponding logic tree is sorted out according to the actual service scenario. Fig. 5 shows the logic tree rendered in the bank collection business scenario; the logic tree contains the conditions and the logical relationships between the conditions. The objects in the logic tree are abstracted to obtain a logical expression, and the logical expression is stored in a database. The logical expression in the present application uses the data structure of a prefix expression; for example, the logical expression obtained by abstracting the objects in the logic tree shown in fig. 5 is: (&1.1, 1.2, 1.3, 1.4, (&1.5.1, 1.5.2, 1.5.3, 1.5.4, (&1.5.5.1, 1.5.5.2, 1.5.5.3, 1.5.5.4))). In the embodiment of the application, the objects in the preset logic tree are abstracted to obtain the logical expression, whose data structure is a prefix expression. The traditional JSON data structure comprises complex object and array structures, so compared with it the prefix expression greatly reduces the space consumed when storing the logical expression of a usage scenario, and further improves operating efficiency.
In one embodiment, a user tag extraction method is provided, which further includes: abstracting the objects in the preset logic tree into a logic formula through a second function, and storing the logic formula into a database.
The second function is object2LogicExpression; the objects in the preset logic tree can be abstracted into a logical expression through this function, and the logical expression is stored in the database. Specifically, all objects in the preset logic tree are traversed cyclically in a recursive manner, and the values corresponding to the logical relationships between the objects and the values corresponding to the objects are obtained. These values are then converted into character strings to obtain the logical expression, which is stored in the database. For example, the logical expression obtained by abstracting the objects in the logic tree shown in fig. 5 is: (&1.1, 1.2, 1.3, 1.4, (&1.5.1, 1.5.2, 1.5.3, 1.5.4, (&1.5.5.1, 1.5.5.2, 1.5.5.3, 1.5.5.4))).
In the embodiment of the application, when a service is created, the corresponding logic tree is combed out according to an actual service scene, an object in the logic tree is abstracted to obtain a logic formula, and the logic formula is stored in the database. And then, in the service scene, the corresponding logical formula is directly acquired from the database, and the character strings in the logical formula are analyzed to obtain the objects and the logical relationship between the objects. And rendering the objects and the logical relations among the objects into a logical tree. The audio data can be analyzed through the logic tree to obtain the label of the user. And the logical expression adopts a data structure of the prefix expression, so that the space consumption for storing the logical expression in the use scene is greatly reduced, and the operation efficiency is further improved.
In one embodiment, as shown in fig. 7, abstracting an object in the preset logical tree into a logical expression through a second function, and storing the logical expression in the database includes:
step 720, circularly traversing all objects in the preset logic tree in a recursive manner, and acquiring values corresponding to the logic relationship between the objects and values corresponding to the objects;
step 740, converting the values corresponding to the logical relationship between the objects and the values corresponding to the objects into character strings to obtain a logical expression, and storing the logical expression in the database.
The source code of the second function object2LogicExpression is as follows:
[The source code listing for object2LogicExpression is reproduced only as images in the original publication.]
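Likewise only available as images, the second function can be pictured with the following hedged TypeScript sketch, which recursively traverses the hypothetical LogicNode tree and emits the prefix-expression character string stored in the database; it is a reconstruction, not the patent's source:

    // Hedged reconstruction (not the original source): operator first
    // (prefix form), then the comma-separated children.
    function object2LogicExpression(node: string | LogicNode): string {
      if (typeof node === "string") return node; // leaf condition
      return "(" + node.logic + node.values.map(object2LogicExpression).join(",") + ")";
    }

    // Round trip: object2LogicExpression(logicExpression2Object(
    //   "(|A,(&B,C,(|E,F,G)),(|D,(&H,I)))")) reproduces the same string.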
in the embodiment of the application, the objects in the preset logic tree are abstracted into the logic formula through the second function, and the logic formula is stored in the database. The logic formula adopts a data structure of the prefix expression, and the data structure of the prefix expression greatly reduces the space consumption when the logic formula in the use scene is stored, thereby improving the operation efficiency.
In one embodiment, the logical relationships are represented in the form of prefixes.
A prefix expression is an expression in which the operator is placed in front of its operands; infix and suffix expressions exist correspondingly. Because the logical expression in the preset rule module adopts a prefix expression, the logical relationship among a group of conditions is written before those conditions. For example, in (| A, (& B, C, (| E, F, G)), (| D, (& H, I))), the operator "|" comes first and is followed by three operands: A, (& B, C, (| E, F, G)), and (| D, (& H, I)).
The logical relationship among the three condition groups A, (& B, C, (| E, F, G)), and (| D, (& H, I)) is an "or" relationship. Further, the second condition group (& B, C, (| E, F, G)) includes three sub-condition groups B, C, and (| E, F, G), and the logical relationship among these three sub-condition groups is an "and" relationship. Still further, the logical relationship among the three conditions E, F, and G in the sub-condition group (| E, F, G) is an "or" relationship.
Traditionally, the logical formula in the preset rule module employs a JSON data structure, which includes complex object and array structures. The resulting logical expression is therefore bulky, occupies more storage space, is slow to parse, and reduces operating speed. For example, the following is a logical formula in a JSON data structure: {"logic": "|", "values": ["A", {"logic": "&", "values": ["B", "C", {"logic": "|", "values": ["E", "F", "G"]}]}, {"logic": "|", "values": ["D", {"logic": "&", "values": ["H", "I"]}]}]}.
In the embodiment of the present application, the logical expression of the complex JSON data structure is expressed in the form of a prefix expression, that is, the logical relationships are expressed as prefixes within the logical expression. The logical expression of the conventional JSON data structure above converts into the prefix expression (| A, (& B, C, (| E, F, G)), (| D, (& H, I))). Obviously, compared with the traditional JSON data structure, the logical expression adopting the prefix expression has a greatly reduced data volume. Therefore, storing logical expressions in the database consumes far less storage space, and the small data volume improves the operational efficiency of both abstracting objects into a logical expression and parsing a logical expression back into objects. Finally, the efficiency of extracting the user tag from the audio data of the conversation robot and the user is improved.
In one embodiment, the logical expression includes condition groups and a logical relationship between the condition groups, each condition group includes sub-condition groups and a logical relationship between the sub-condition groups, and each sub-condition group includes at least one condition and a logical relationship between the conditions.
Specifically, take the logical expression (| A, (& B, C, (| E, F, G)), (| D, (& H, I))) as an example. The logical expression comprises condition groups and a logical relationship between the condition groups; here there are three condition groups: A, (& B, C, (| E, F, G)), and (| D, (& H, I)), and the logical relationship among these three condition groups is an "or" relationship.
Further, a condition group comprises sub-condition groups and a logical relationship between the sub-condition groups. The second condition group (& B, C, (| E, F, G)) comprises three sub-condition groups: B, C, and (| E, F, G), and the logical relationship among them is an "and" relationship.
Still further, a sub-condition group comprises at least one condition and a logical relationship between the conditions. The sub-condition groups B and C each contain a single condition. The sub-condition group (| E, F, G) comprises three conditions E, F, and G, and the logical relationship among them is an "or" relationship.
In the embodiment of the application, a logical expression is divided, from the outside in, into condition groups, which contain sub-condition groups, which in turn contain conditions. A logical relationship exists between the condition groups and is placed, in prefix form, at the very front of all the condition groups. Likewise, the logical relationship between sub-condition groups is placed at the front of all the sub-condition groups, and the logical relationship between conditions is placed at the front of all the conditions. The logical relationships in the logical expression are thus clear and definite, which facilitates subsequent parsing or abstraction.
Compared with the traditional JSON data structure, the logical expression adopting the prefix expression has a greatly reduced data volume. Therefore, storing the logical expression in the database greatly reduces storage space, and the small data volume improves the operational efficiency of both abstracting objects into a logical expression and parsing a logical expression back into objects. Finally, the efficiency of extracting the user tag from the audio data of the conversation robot and the user is improved.
In one embodiment, as shown in fig. 8, there is provided a user tag extraction apparatus 800 including:
an obtaining module 820, configured to obtain audio data of the conversation robot and the user;
the input module 840 is used for inputting audio data into a preset rule module, wherein the preset rule module comprises a logic formula, the data structure adopted by the logic formula is a prefix expression, and the logic formula comprises conditions and logical relationships among the conditions;
and the analysis module 860 is used for analyzing the audio data through a logic formula to obtain the label of the user.
In one embodiment, as shown in FIG. 9, the analysis module 860 includes:
a logical formula query unit 862 configured to query a logical formula from the database through a preset rule module;
a character string analyzing unit 864, configured to analyze the character string corresponding to the logical expression to obtain the objects and the logical relationship between the objects;
a logical tree generating unit 866, configured to render the objects and the logical relationships between the objects into a logical tree;
the analyzing unit 868 is configured to analyze the audio data through the logic tree to obtain a tag of the user.
In an embodiment, the character string parsing unit is further configured to parse the character string in the logical expression through the first function to obtain the objects and the logical relationship between the objects.
In one embodiment, the character string parsing unit is further configured to sequentially input the character strings in the logical expression into the state machine for reading; respectively executing different commands in sequence according to the read values to obtain return values; and obtaining the objects and the logical relationship among the objects based on all the return values.
In one embodiment, there is provided a user tag extracting apparatus further comprising: and the logical formula storage module is used for abstracting the objects in the preset logical tree to obtain a logical formula and storing the logical formula into the database.
In an embodiment, the logical formula storage module is further configured to abstract the object in the preset logical tree into a logical formula through a second function, and store the logical formula in the database.
In one embodiment, the logical formula storage module is further configured to cycle through all objects in the preset logic tree in a recursive manner, obtain values corresponding to the logical relationships between the objects and values corresponding to the objects, convert these values into character strings to obtain a logical expression, and store the logical expression in the database.
In one embodiment, the logical relationships are represented in the form of prefixes.
In one embodiment, the logical expression includes condition groups and a logical relationship between the condition groups, each condition group includes sub-condition groups and a logical relationship between the sub-condition groups, and each sub-condition group includes at least one condition and a logical relationship between the conditions.
The division of each module in the user tag extraction apparatus is only used for illustration, and in other embodiments, the user tag extraction apparatus may be divided into different modules as needed to complete all or part of the functions of the user tag extraction apparatus.
Fig. 10 is a schematic diagram of the internal configuration of a server in one embodiment. As shown in fig. 10, the server includes a processor and a memory connected by a system bus. The processor provides computing and control capabilities and supports the operation of the whole server. The memory may include a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The computer program can be executed by the processor to implement the user tag extraction method provided in the embodiments. The internal memory provides a cached execution environment for the operating system and the computer program in the non-volatile storage medium. The server may be a mobile phone, a tablet computer, a personal digital assistant, a wearable device, or the like.
The implementation of each module in the user tag extraction apparatus provided in the embodiment of the present application may be in the form of a computer program. The computer program may be run on a terminal or a server. The program modules constituted by the computer program may be stored on the memory of the terminal or the server. Which when executed by a processor, performs the steps of the method described in the embodiments of the present application.
The embodiment of the application also provides a computer readable storage medium. One or more non-transitory computer-readable storage media containing computer-executable instructions that, when executed by one or more processors, cause the processors to perform the steps of the user tag extraction method.
A computer program product containing instructions which, when run on a computer, cause the computer to perform a user tag extraction method.
Any reference to memory, storage, a database, or other medium used by embodiments of the present application may include non-volatile and/or volatile memory. Suitable non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM), which acts as external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus Direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The above examples only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the present application. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (15)

1. A user tag extraction method is characterized by comprising the following steps:
acquiring audio data of a conversation robot and a user;
inputting the audio data into a preset rule module, wherein the preset rule module comprises a logic formula, the data structure adopted by the logic formula is a prefix expression, and the logic formula comprises conditions and logic relations among the conditions;
and analyzing the audio data through the logic formula to obtain the label of the user.
2. The method of claim 1, wherein the analyzing the audio data by the logic to obtain the user's tag comprises:
inquiring the logic formula from a database through the preset rule module;
analyzing the character strings in the logical expression to obtain objects and the logical relation between the objects;
rendering the objects and the logical relationship between the objects into a logical tree;
and analyzing the audio data through the logic tree to obtain the label of the user.
3. The method according to claim 2, wherein the analyzing the character string corresponding to the logical expression to obtain an object and a logical relationship between the objects comprises:
and analyzing the character strings in the logical expression through a first function to obtain objects and the logical relation between the objects.
4. The method according to claim 3, wherein the analyzing the character strings in the logical expression through the first function to obtain the objects and the logical relationship between the objects comprises:
sequentially inputting the character strings in the logic formula into a state machine for reading;
respectively executing different commands in sequence according to the read values to obtain return values;
and obtaining the objects and the logical relationship among the objects based on all the return values.
5. The method of claim 1, further comprising: abstracting an object in a preset logic tree to obtain the logic formula, and storing the logic formula in a database.
6. The method of claim 1, further comprising: abstracting the objects in the preset logic tree into the logic formula through a second function, and storing the logic formula into a database.
7. The method according to claim 6, wherein abstracting the object in the preset logical tree into the logical expression through the second function, and storing the logical expression in a database comprises:
circularly traversing all objects in a preset logic tree in a recursive mode to obtain values corresponding to the logic relationship among the objects and the values corresponding to the objects;
and converting values corresponding to the logical relationship between the objects and values corresponding to the objects into character strings to obtain a logical expression, and storing the logical expression into a database.
8. The method of any of claims 1-7, wherein the logical relationship is represented in the form of a prefix.
9. The method of claim 8, wherein the logical expression comprises condition groups and a logical relationship between the condition groups, each condition group comprises sub-condition groups and a logical relationship between the sub-condition groups, and each sub-condition group comprises at least one condition and a logical relationship between the conditions.
10. A user tag extraction apparatus, comprising:
the acquisition module is used for acquiring audio data of the conversation robot and a user;
the input module is used for inputting the audio data into a preset rule module, the preset rule module comprises a logic formula, the data structure adopted by the logic formula is a prefix expression, and the logic formula comprises conditions and logical relationships among the conditions;
and the analysis module is used for analyzing the audio data through the logic formula to obtain the label of the user.
11. The apparatus of claim 10, wherein the analysis module comprises:
the logic formula query unit is used for querying the logic formula from the database through the preset rule module;
the character string analysis unit is used for analyzing the character strings corresponding to the logical expression to obtain objects and the logical relation between the objects;
the logical tree generating unit is used for rendering the objects and the logical relations among the objects into a logical tree;
and the analysis unit is used for analyzing the audio data through the logic tree to obtain the label of the user.
12. The apparatus according to claim 11, wherein the character string parsing unit is further configured to parse the character string in the logical expression through a first function to obtain the objects and the logical relationship between the objects.
13. The apparatus according to claim 12, wherein the string parsing unit is further configured to sequentially input the strings in the logical expression into a state machine for reading; respectively executing different commands in sequence according to the read values to obtain return values; and obtaining the objects and the logical relationship among the objects based on all the return values.
14. A server comprising a memory and a processor, the memory having stored thereon a computer program, wherein the computer program, when executed by the processor, causes the processor to perform the steps of the user tag extraction method according to any one of claims 1 to 9.
15. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the user tag extraction method according to any one of claims 1 to 9.
CN202010440700.8A 2020-05-22 2020-05-22 User tag extraction method and device, server and computer readable storage medium Active CN111768767B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010440700.8A CN111768767B (en) 2020-05-22 2020-05-22 User tag extraction method and device, server and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010440700.8A CN111768767B (en) 2020-05-22 2020-05-22 User tag extraction method and device, server and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN111768767A (en) 2020-10-13
CN111768767B CN111768767B (en) 2023-08-15

Family

ID=72719676

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010440700.8A Active CN111768767B (en) 2020-05-22 2020-05-22 User tag extraction method and device, server and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN111768767B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113222459A (en) * 2021-05-31 2021-08-06 中国测试技术研究院 System and method for dynamically constructing food uncertainty evaluation model by expression tree

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020116196A1 (en) * 1998-11-12 2002-08-22 Tran Bao Q. Speech recognizer
CN107193843A (en) * 2016-03-15 2017-09-22 阿里巴巴集团控股有限公司 A kind of character string selection method and device based on AC automatic machines and postfix expression
CN108521525A (en) * 2018-04-03 2018-09-11 南京甄视智能科技有限公司 Intelligent robot customer service marketing method and system based on user tag system
CN109710811A (en) * 2018-11-28 2019-05-03 北京摩拜科技有限公司 Detection method, equipment and the application system of user's portrait
CN110019725A (en) * 2017-12-22 2019-07-16 科沃斯商用机器人有限公司 Man-machine interaction method, system and its electronic equipment
CN110532487A (en) * 2019-09-11 2019-12-03 北京百度网讯科技有限公司 The generation method and device of label
CN111081233A (en) * 2019-12-31 2020-04-28 联想(北京)有限公司 Audio processing method and electronic equipment


Also Published As

Publication number Publication date
CN111768767B (en) 2023-08-15


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant