WO2021180062A1 - Intention recognition method and electronic device - Google Patents

Intention recognition method and electronic device

Info

Publication number
WO2021180062A1
Authority
WO
WIPO (PCT)
Prior art keywords
data, electronic device, sequence, entity, intention
Application number
PCT/CN2021/079723
Other languages
English (en)
French (fr)
Inventor
朱越
赵忠祥
李临
涂凌志
杨悦
张宝峰
崔倚瑞
李育儒
于超
宋子亮
李樱霞
唐鹏程
何诚慷
Original Assignee
华为技术有限公司
Application filed by 华为技术有限公司
Publication of WO2021180062A1

Classifications

    • G (Physics) > G06 (Computing; Calculating or Counting) > G06F (Electric digital data processing):
    • G06F16/33 Querying (information retrieval of unstructured textual data)
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G06F16/3344 Query execution using natural language analysis
    • G06F16/35 Clustering; Classification
    • G06F40/295 Named entity recognition
    • G06F40/30 Semantic analysis

Definitions

  • This application relates to the decision-making and reasoning sub-fields of Artificial Intelligence (AI), and in particular to an intention recognition method and an electronic device.
  • Increasingly, each user or household owns multiple smart devices, and users need electronic devices that can respond intelligently to their requests.
  • Figure 1 shows an intention recognition scenario in the prior art: the electronic device recognizes the user's possible intentions from the user's input, presents them as candidate intentions, and displays a search result for the intention the user selects.
  • This application provides an intention recognition method and an electronic device that predict the user's intention from the entity sequence identified in data acquired within a period of time, improving the accuracy of intention recognition.
  • In a first aspect, the present application provides an intention recognition method.
  • The method includes: a first electronic device determines a first trigger; in response to the first trigger, the first electronic device acquires a first data sequence within a first time period, where the first data sequence includes multiple pieces of data, at least two of which have different input methods; the first electronic device determines the user's first intention according to the first data sequence; and the first electronic device determines a first action to be performed according to the first intention.
  • In this way, based on the environment perception of multiple devices and the user's multi-modal input, the electronic device can obtain a complete description of the environment and combine user input, environment perception, and contextual information within a period of time into an intention system that is complete, unbiased, responsive to changes over time, and expandable as the environment changes. Decisions are then made on this basis, such as inferring the actions the user wants to perform or the services needed in the next period of time, deciding which device should respond, and providing the user with the precise response or service they need.
  • The first electronic device determining the first intention of the user according to the first data sequence includes: the first electronic device determines a first entity sequence according to the first data sequence, where the first entity sequence includes at least one entity, an entity being an object, thing, or action that exists objectively in the real world and can be distinguished from others; the first electronic device then determines the first intention according to the first entity sequence, where the first intention is used to determine an action sequence. In this way, the electronic device can determine the user's intention from the data sequence.
  • The first electronic device determining the first action to be performed according to the first intention includes: the first electronic device determines a first action sequence according to the first entity sequence and the first intention, where the first action sequence includes the first action to be performed; after determining the first action to be performed, the first electronic device executes it.
  • In this way, the electronic device can determine the action that needs to be performed from the entities and the intention, and then perform that action. A minimal sketch of this end-to-end flow follows.
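To make the flow concrete, here is a minimal Python sketch of the pipeline described above. All names (the Action class, the callables passed in) are hypothetical stand-ins for the trained models, not APIs from the application.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Action:
    device_id: str   # identifier of the device that should execute the action
    operation: str   # e.g. "start_app:music"

def handle_trigger(data_sequence: List[dict],
                   recognize_entities: Callable[[List[dict]], List[str]],
                   recognize_intent: Callable[[List[str]], str],
                   predict_actions: Callable[[List[str], str], List[Action]],
                   execute: Callable[[Action], None]) -> None:
    """Run the pipeline once a trigger fires: data -> entities -> intent -> actions."""
    entities = recognize_entities(data_sequence)       # first entity sequence
    intent = recognize_intent(entities)                # first intention
    for action in predict_actions(entities, intent):   # first action sequence
        execute(action)                                # first action to be performed
```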
  • The first action to be executed includes a device identifier and an action to be executed, and the first electronic device executing the first action to be executed specifically includes: the first electronic device determines whether the device identifier in the first action to be executed is the device identifier of the first electronic device; when it is, the first electronic device executes the first action to be executed; otherwise, the first electronic device sends a first instruction to the second electronic device corresponding to the device identifier in the first action to be executed, where the first instruction instructs the second electronic device to execute the first action to be executed.
  • Thus the execution device for the first action to be executed may be the first electronic device or another electronic device. From the device identifier in the first action to be executed, the first electronic device can determine whether to execute the action itself or to send an instruction to the corresponding second electronic device to execute it. In this way, in a distributed scenario, the first electronic device can conveniently control other electronic devices to respond to user needs, as in the sketch below.
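A sketch of that local-versus-remote dispatch, reusing the hypothetical Action class from the previous sketch; run_locally and send_instruction are assumed callbacks, not interfaces named in the application.

```python
from typing import Callable

def dispatch(self_device_id: str, action: "Action",
             run_locally: Callable[[str], None],
             send_instruction: Callable[[str, str], None]) -> None:
    """Execute the action on this device, or forward a 'first instruction'."""
    if action.device_id == self_device_id:
        run_locally(action.operation)
    else:
        # Instruct the second electronic device to execute the action.
        send_instruction(action.device_id, action.operation)
```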
  • The method further includes: the first electronic device determines an abnormal feature vector set whose frequency of appearance exceeds a preset first frequency threshold to be a new entity, where an abnormal feature vector set is a feature vector set that, during entity recognition, cannot be identified as any entity because its degree of discrimination from the feature vector sets that can be identified as entities exceeds a preset discrimination threshold.
  • Thus the first electronic device can expand its own entity repository, dynamically extending the range of entities it can identify, which further improves the accuracy of intention recognition. The sketch below illustrates the frequency test.
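A minimal sketch of the frequency test, assuming each unrecognizable ("abnormal") feature-vector set has already been reduced to a hashable key (for example a quantized hash); the key scheme and the threshold value are placeholders.

```python
from collections import Counter
from typing import Hashable, List

def discover_new_entities(abnormal_set_keys: List[Hashable],
                          first_frequency_threshold: int) -> List[Hashable]:
    """Promote abnormal feature-vector sets that recur often enough to new entities."""
    counts = Counter(abnormal_set_keys)
    return [key for key, n in counts.items() if n > first_frequency_threshold]

# discover_new_entities(["h1", "h2", "h1", "h1"], 2) returns ["h1"]
```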
  • The method further includes: the first electronic device determines an abnormal action whose frequency of occurrence exceeds a preset second frequency threshold to be a new intention, where an abnormal action is an action that has not occurred before and is not in the action sequence corresponding to any existing intention; the first electronic device then establishes a correspondence between the new intention and the entity sequence recognized before the abnormal action occurred.
  • Thus the first electronic device can expand its own intention repository and establish new correspondences between intentions and action sequences. In this way, more personalized user intentions can be identified and decisions that more closely match the user's needs can be made, enhancing the user experience.
  • The first electronic device determining the first entity sequence according to the first data sequence specifically includes: the first electronic device extracts feature vectors from the first data sequence to obtain a first feature vector set, which includes all feature vectors extracted from the first data sequence, a feature vector representing features of the first data sequence; the first electronic device then inputs the first feature vector set into an entity recognition model to obtain the first entity sequence, where the entity recognition model is the correspondence between feature vectors and entities obtained by training on the entity data stored in the first electronic device.
  • The entity data is the storage form of an entity; it includes at least the entity's number and the feature vector set representing the entity. A nearest-neighbour stand-in for this matching step is sketched below.
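A nearest-neighbour stand-in for the entity recognition model, with illustrative entity numbers, reference vectors, and distance threshold; the trained model in the application could use any architecture.

```python
import numpy as np

# Hypothetical entity store: entity number -> feature vectors representing the entity.
ENTITY_DATA = {
    101: np.array([[0.9, 0.1], [0.8, 0.2]]),   # e.g. "running"
    102: np.array([[0.1, 0.9]]),               # e.g. "home Wi-Fi"
}

def recognize_entities(feature_vectors: np.ndarray, max_dist: float = 0.3) -> list:
    """Map each extracted feature vector to the closest stored entity, if close enough."""
    sequence = []
    for v in feature_vectors:
        best_id, best_d = None, max_dist
        for entity_id, refs in ENTITY_DATA.items():
            d = np.linalg.norm(refs - v, axis=1).min()
            if d < best_d:
                best_id, best_d = entity_id, d
        if best_id is not None:
            sequence.append(best_id)   # unmatched vectors become "abnormal" candidates
    return sequence
```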
  • The first electronic device determining the first intention according to the first entity sequence specifically includes: the first electronic device determines multiple candidate intentions according to the first entity sequence and a stored knowledge graph; the first electronic device then uses a preset reinforcement learning algorithm to determine the first intention from the multiple candidate intentions. The first intention is thus identified based on the knowledge graph and reinforcement learning, which improves the accuracy of intention recognition.
  • The first electronic device determining the multiple candidate intentions according to the first entity sequence and the stored knowledge graph specifically includes: determining the user's state information and scene information according to the first entity sequence and the knowledge graph, where the state information indicates the user's current state and the scene information indicates the environment the user is currently in, and then determining the multiple candidate intentions corresponding to that state information and scene information.
  • Using the preset reinforcement learning algorithm to determine the first intention from the multiple candidate intentions includes: determining intention arms ("rocker arms" in the multi-armed-bandit sense) in one-to-one correspondence with the multiple candidate intentions, and then determining the first intention from the multiple candidate intentions according to the first entity sequence, the state information, the scene information, the intention arms, and the reinforcement learning algorithm; see the bandit sketch below.
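The application does not name the reinforcement learning algorithm; below is a minimal epsilon-greedy multi-armed-bandit sketch in which each candidate intention is one arm. In practice the value estimates would also condition on the entity sequence, state information, and scene information.

```python
import random

class IntentBandit:
    """Epsilon-greedy selection over intention arms, one arm per candidate intention."""

    def __init__(self, epsilon: float = 0.1):
        self.epsilon = epsilon
        self.value = {}   # intention -> estimated reward
        self.count = {}   # intention -> number of selections

    def select(self, candidates: list) -> str:
        for c in candidates:                  # the arm set follows the candidates
            self.value.setdefault(c, 0.0)
            self.count.setdefault(c, 0)
        if random.random() < self.epsilon:
            return random.choice(candidates)  # explore
        return max(candidates, key=lambda c: self.value[c])  # exploit

    def update(self, intention: str, reward: float) -> None:
        """Incremental-mean update of the selected arm's value."""
        self.count[intention] += 1
        self.value[intention] += (reward - self.value[intention]) / self.count[intention]
```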
  • Alternatively, the first electronic device determining the first intention according to the first entity sequence specifically includes: the first electronic device inputs the first entity sequence into an intention recognition model to obtain the first intention, where the intention recognition model is the correspondence between entity sequences and intentions obtained by training on corresponding entity-sequence and intention data.
  • Before the first electronic device inputs the first entity sequence into the intention recognition model, the method further includes: the first electronic device inputs test data into a first generator and obtains first simulation data after processing by the first generator; the first electronic device inputs the test data and the first simulation data into a first discriminator and obtains a first discrimination result after processing by the first discriminator, where the first discrimination result indicates how well the first simulation data can be distinguished from the test data; the first electronic device updates the weight coefficients of the first generator according to the first discrimination result to obtain a second generator; the first electronic device generates second simulation data with the second generator; and the first electronic device inputs first target simulation data, which includes the second simulation data, into a preset training network to train the intention recognition model. A compact sketch of this generator/discriminator loop is given below.
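A compact PyTorch sketch of that generator/discriminator loop; the layer sizes, noise dimension, and optimizers are placeholders, not details from the application.

```python
import torch
import torch.nn as nn

generator = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))
discriminator = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
bce = nn.BCELoss()

def train_step(test_batch: torch.Tensor) -> torch.Tensor:
    noise = torch.randn(test_batch.size(0), 16)
    fake = generator(noise)                       # first simulation data

    # First discrimination result: tell test data from simulation data.
    d_opt.zero_grad()
    d_loss = (bce(discriminator(test_batch), torch.ones(test_batch.size(0), 1)) +
              bce(discriminator(fake.detach()), torch.zeros(fake.size(0), 1)))
    d_loss.backward()
    d_opt.step()

    # Update the generator's weights from the discrimination result ("second generator").
    g_opt.zero_grad()
    g_loss = bce(discriminator(fake), torch.ones(fake.size(0), 1))
    g_loss.backward()
    g_opt.step()

    # Second simulation data, to be included in the first target simulation data.
    return generator(torch.randn(64, 16)).detach()
```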
  • The first electronic device is configured with a group coarse-grained model and a fine-grained model. Before the first electronic device inputs the first entity sequence into the intention recognition model, the method further includes: the first electronic device obtains the mapping relationship between fine-grained labels and coarse-grained labels; the first electronic device maps the fine-grained data in the training data set to coarse-grained data according to this mapping relationship; the first electronic device inputs the coarse-grained data into the group coarse-grained model for training, updating the group coarse-grained model through joint learning across multiple node devices (which include the first electronic device), and inputs the fine-grained data into the fine-grained model for training; the first electronic device then combines the group coarse-grained model and the fine-grained model to obtain the intention recognition model, whose label space is mapped to fine-grained labels and whose output is used to update the fine-grained model.
  • The first electronic device may also be configured with an individual coarse-grained model whose label space is mapped to coarse-grained labels; in that case, combining the group coarse-grained model and the fine-grained model to obtain the intention recognition model includes combining the group coarse-grained model, the individual coarse-grained model, and the fine-grained model.
  • The method further includes: the first electronic device determines a logged-event ("dot data") sequence to be recognized, composed of dot data, i.e. the user-operation data recorded by the first electronic device and/or the first electronic device's response data to the user's operations; the first electronic device inputs the dot data sequence into a multi-instance learning model, trained on the dot data sequences in the first electronic device, to obtain multiple subsequences; the first electronic device determines the intention of a first subsequence (one of the multiple subsequences) according to preset intention rules, which derive a sequence's intention from the dot data within it; and the first electronic device updates the intention recognition model based on the determined intentions of the multiple subsequences. A sketch of this labeling step follows.
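A sketch of the rule-based labeling step, assuming the trained multi-instance learning model has already produced the split points; the event format and the intent_rules mapping are purely illustrative.

```python
from typing import List, Optional, Tuple

def label_subsequences(dot_sequence: List[dict],
                       split_points: List[int],
                       intent_rules: dict) -> List[Tuple[List[dict], Optional[str]]]:
    """Split a logged-event sequence and assign each subsequence an intention.

    intent_rules maps a characteristic event name to an intention label
    (a stand-in for the 'preset intention rules').
    """
    bounds = [0, *split_points, len(dot_sequence)]
    labeled = []
    for a, b in zip(bounds, bounds[1:]):
        sub = dot_sequence[a:b]
        intention = next((label for event, label in intent_rules.items()
                          if any(e.get("name") == event for e in sub)), None)
        labeled.append((sub, intention))   # feeds the intention-model update
    return labeled
```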
  • The first electronic device determining the first action sequence according to the first entity sequence and the first intention specifically includes: the first electronic device inputs the first entity sequence and the first intention into an action prediction model to obtain the first action sequence, where the action prediction model is the correspondence between entity sequences plus intentions and action sequences, obtained by training on corresponding entity-sequence, intention, and action-sequence data.
  • Alternatively, the first electronic device inputs the first entity sequence and the first intention into a rule engine to obtain the first action sequence, where the rule engine contains correspondences between entity sequences plus intentions and action sequences, set according to the user's usage habits or usage scenarios.
  • The rule engine includes a first node, which comprises at least a first-type node and a second-type node. The first-type node is used to, according to a first attribute of a first entity input into the rule engine, obtain a first semantic object from memory and match it against the first entity to obtain a first matching result, where the first attribute characterizes the change frequency of the first entity. The second-type node is used to, according to a second attribute of a second entity input into the rule engine, obtain a second semantic object from a file and match it against the second entity to obtain a second matching result, where the second attribute characterizes the change frequency of the second entity and differs from the first attribute. The first matching result and the second matching result are used together to determine whether to perform the first action to be performed; a sketch of the two node types follows.
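One way to read the two node types is that semantic objects for frequently changing entities are kept in memory, while those for rarely changing entities are loaded from a file on demand. The sketch below follows that reading; all class and field names are hypothetical.

```python
import json

class MemoryNode:
    """First-type node: semantic objects for frequently changing entities stay in memory."""
    def __init__(self, semantic_objects: dict):
        self.semantic_objects = semantic_objects   # entity name -> expected value

    def match(self, entity: str, value) -> bool:
        return self.semantic_objects.get(entity) == value

class FileNode:
    """Second-type node: semantic objects for rarely changing entities live in a file."""
    def __init__(self, path: str):
        self.path = path

    def match(self, entity: str, value) -> bool:
        with open(self.path) as f:
            return json.load(f).get(entity) == value

def should_execute(node_a: MemoryNode, node_b: FileNode,
                   entity1: str, value1, entity2: str, value2) -> bool:
    # The two matching results together decide whether to perform the action.
    return node_a.match(entity1, value1) and node_b.match(entity2, value2)
```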
  • The first time period has a corresponding relationship with the first trigger.
  • The first data sequence is obtained by the first electronic device through at least two of the following input methods: touch operation input, sensor data input, text data input, voice data input, video data input, and input of transmission data from smart devices interconnected with the first electronic device. The first action to be executed includes an action or service such as starting a target application, starting a target service, loading a target application in the background, wirelessly connecting to a target device, or sending a notification message.
  • The embodiments of the present application also provide an electronic device, which includes at least one memory for storing a program and at least one processor for executing the program stored in the memory; when the program stored in the memory is executed, the processor executes the method provided in the first aspect.
  • The embodiments of the present application also provide a computer storage medium storing instructions; when the instructions run on a computer, the computer executes the method provided in the first aspect.
  • The embodiments of the present application also provide a computer program product containing instructions which, when run on a computer, cause the computer to execute the method provided in the first aspect.
  • An embodiment of the present application also provides a rule engine execution device that runs computer program instructions to execute the method provided in the first aspect.
  • In a second aspect, the present application provides an intention recognition method. The method includes: a first electronic device determines a first trigger; in response to the first trigger, the first electronic device acquires first data within a first time period, where the first data is used to determine entities, an entity being an object, thing, or action that exists objectively in the real world and can be distinguished from others; the first electronic device determines, according to the first data, a first entity sequence including at least one entity; the first electronic device determines, according to the first entity sequence, a first intention used to determine an action sequence; the first electronic device determines, according to the first entity sequence and the first intention, a first action sequence including a first action to be executed; and the first electronic device executes the first action to be executed.
  • In this way, based on the environment perception of multiple devices and the user's multi-modal input, the electronic device can obtain a complete description of the environment and combine user input, environment perception, and contextual information within a period of time into an intention system that is complete, unbiased, responsive to changes over time, and expandable as the environment changes. Decisions are then made on this basis, such as inferring the actions the user wants to perform or the services needed in the next period of time, deciding which device should respond, and providing the user with the precise response or service they need.
  • The first action to be executed includes a device identifier and an action to be executed, and the first electronic device executing the first action to be executed specifically includes: the first electronic device determines whether the device identifier in the first action to be executed is the device identifier of the first electronic device; when it is, the first electronic device executes the first action to be executed; otherwise, the first electronic device sends a first instruction to the second electronic device corresponding to the device identifier in the first action to be executed, where the first instruction instructs the second electronic device to execute the first action to be executed.
  • Thus the execution device for the first action to be executed may be the first electronic device or another electronic device. From the device identifier in the first action to be executed, the first electronic device can determine whether to execute the action itself or to send an instruction to the corresponding second electronic device to execute it. In this way, in a distributed scenario, the first electronic device can conveniently control other electronic devices to respond to user needs.
  • The method further includes: the first electronic device determines an abnormal feature vector set whose frequency of appearance exceeds a preset first frequency threshold to be a new entity, where an abnormal feature vector set is a feature vector set that, during entity recognition, cannot be identified as any entity because its degree of discrimination from the feature vector sets that can be identified as entities exceeds a preset discrimination threshold.
  • Thus the first electronic device can expand its own entity repository, dynamically extending the range of entities it can identify, which further improves the accuracy of intention recognition.
  • The method further includes: the first electronic device determines an abnormal action whose frequency of occurrence exceeds a preset second frequency threshold to be a new intention, where an abnormal action is an action that has not occurred before and is not in the action sequence corresponding to any existing intention; the first electronic device establishes the correspondence between the new intention and the entity sequence recognized before the abnormal action occurred.
  • Thus the first electronic device can expand its own intention repository and establish new correspondences between intentions and action sequences, identifying more personalized user intentions and making decisions that more closely match the user's needs, enhancing the user experience.
  • The first electronic device determining the first entity sequence according to the first data specifically includes: the first electronic device extracts feature vectors from the first data to obtain a first feature vector set, which includes all feature vectors extracted from the first data, a feature vector representing features of the first data; the first electronic device then inputs the first feature vector set into an entity recognition model to obtain the first entity sequence, where the entity recognition model is the correspondence between feature vectors and entities obtained by training on the entity data stored in the first electronic device, the entity data being the storage form of an entity and including at least the entity's number and the feature vector set representing the entity.
  • When the first electronic device inputs the first feature vector set into the entity recognition model and entities are recognized, it may compose the first entity sequence from the newly recognized entities alone, or from the entities historically output by the entity recognition model together with the entities recognized this time; this is not limited here.
  • The entity recognition model can be stored in different locations: it may be preset and stored in the first electronic device, or stored in a cloud server accessible to the first electronic device; this is not limited here.
  • The entity recognition model can also be generated in different ways: it may be pre-trained by the manufacturer, or trained by the first electronic device on the entity data stored in the first electronic device; this is not limited here.
  • The first electronic device determining the first intention according to the first entity sequence specifically includes: the first electronic device inputs the first entity sequence into an intention recognition model to obtain the first intention, where the intention recognition model is the correspondence between entity sequences and intentions obtained by training on corresponding entity-sequence and intention data.
  • The intention recognition model can be stored in different locations: it may be preset and stored in the first electronic device, or stored in a cloud server accessible to the first electronic device; this is not limited here.
  • The intention recognition model can also be generated in different ways: it may be pre-trained by the manufacturer, trained by the first electronic device on the corresponding entity-sequence and intention data stored in the first electronic device, or trained by the first electronic device on corresponding entity-sequence and intention data shared by other users; this is not limited here.
  • The first electronic device determining the first action sequence according to the first entity sequence and the first intention specifically includes: the first electronic device inputs the first entity sequence and the first intention into an action prediction model to obtain the first action sequence, where the action prediction model is the correspondence between entity sequences plus intentions and action sequences, obtained by training on corresponding entity-sequence, intention, and action-sequence data.
  • Thus the first electronic device can input the first entity sequence and the first intention into the action prediction model, predict the first action sequence, and uncover the user's latent needs to aid decision-making.
  • Alternatively, the first electronic device determines, according to a decision rule, the first action sequence corresponding to the first entity sequence and the first intention, where the decision rule is a correspondence between entity sequences plus intentions and action sequences, set according to the user's usage habits or usage scenarios.
  • Thus the first electronic device can determine the actions that may need to be performed directly from pre-stored decision rules, without using the action prediction model, and can meet user needs faster and more accurately.
  • The action prediction model can be stored in different locations: it may be preset and stored in the first electronic device, or stored in a cloud server accessible to the first electronic device; this is not limited here.
  • The action prediction model can also be generated in different ways: it may be pre-trained by the manufacturer, trained by the first electronic device on the corresponding entity-sequence, intention, and action-sequence data stored in the first electronic device, or trained by the first electronic device on corresponding entity-sequence, intention, and action-sequence data shared by other users; this is not limited here.
  • The decision rule may likewise be stored in different locations: it may be preset and stored in the first electronic device, or stored in a cloud server accessible to the first electronic device; this is not limited here.
  • The decision rule may be preset by the manufacturer, set by the first electronic device according to the user's usage habits or usage scenarios, shared by other users, or obtained by the user from a third-party data service provider; this is not limited here.
  • The first time period has a corresponding relationship with the first trigger; when the first electronic device determines the first trigger, it can determine the first time period corresponding to the first trigger.
  • The first data is obtained by the first electronic device through at least two of the following input methods: touch operation input, sensor data input, text data input, voice data input, video data input, and input of transmission data from smart devices interconnected with the first electronic device. It is understandable that, in some embodiments, the first data can also be obtained from other data input methods, which is not limited here.
  • The first action to be executed includes an action or service such as starting a target application, starting a target service, loading a target application in the background, wirelessly connecting to a target device, or sending a notification message. It can be understood that, in some embodiments, the first action to be executed may also be another action or service, which is not limited here.
  • An embodiment of the present application also provides an electronic device as the first electronic device. The first electronic device includes one or more processors and a memory coupled to them; the memory stores computer program code comprising computer instructions, and the one or more processors invoke the computer instructions to cause the first electronic device to: determine a first trigger; in response to the first trigger, acquire first data within a first time period, where the first data is used to determine entities, an entity being an object, thing, or action that exists objectively in the real world and can be distinguished from others; determine, according to the first data, a first entity sequence including at least one entity; determine, according to the first entity sequence, a first intention used to determine an action sequence; determine, according to the first entity sequence and the first intention, a first action sequence including a first action to be executed; and execute the first action to be executed.
  • In this way, based on the environment perception of multiple devices and the user's multi-modal input, the electronic device can obtain a complete description of the environment and combine user input, environment perception, and contextual information within a period of time into an intention system that is complete, unbiased, responsive to changes over time, and expandable as the environment changes. Decisions are then made on this basis, such as inferring the actions the user wants to perform or the services needed in the next period of time, deciding which device should respond, and providing the user with the precise response or service they need.
  • The first action to be executed includes a device identifier and an action to be executed, and the one or more processors are specifically configured to invoke the computer instructions to cause the first electronic device to: determine whether the device identifier in the first action to be executed is the device identifier of the first electronic device; when it is, execute the first action to be executed; otherwise, send a first instruction to the second electronic device corresponding to the device identifier in the first action to be executed, where the first instruction instructs the second electronic device to execute the first action to be executed.
  • The one or more processors are further configured to invoke the computer instructions to cause the first electronic device to: determine an abnormal feature vector set whose frequency of appearance exceeds a preset first frequency threshold to be a new entity, where an abnormal feature vector set is a feature vector set that, during entity recognition, cannot be recognized as any entity because its degree of discrimination from the feature vector sets that can be recognized as entities exceeds a preset discrimination threshold.
  • The one or more processors are also configured to invoke the computer instructions to cause the first electronic device to: determine an abnormal action whose frequency of occurrence exceeds a preset second frequency threshold to be a new intention, where an abnormal action is an action that has never occurred before and is not in the action sequence corresponding to any existing intention, and establish the correspondence between the new intention and the entity sequence recognized before the abnormal action occurred.
  • The one or more processors are specifically configured to invoke the computer instructions to cause the first electronic device to: extract feature vectors from the first data to obtain a first feature vector set, which includes all feature vectors extracted from the first data, a feature vector representing features of the first data; and input the first feature vector set into the entity recognition model to obtain the first entity sequence, where the entity recognition model is the correspondence between feature vectors and entities obtained by training on the entity data stored in the memory, the entity data being the storage form of an entity and including at least the entity's number and the feature vector set representing the entity.
  • When the first feature vector set is input into the entity recognition model and entities are recognized, the first entity sequence may be composed of the newly recognized entities alone, or of the entities historically output by the entity recognition model together with the entities recognized this time; this is not limited here.
  • The entity recognition model can be stored in different locations: it may be preset and stored in the memory, or stored in a cloud server accessible to the first electronic device; this is not limited here.
  • The entity recognition model can also be generated in different ways: it may be pre-trained by the manufacturer, or trained by the first electronic device on the entity data stored in the memory; this is not limited here.
  • The one or more processors are specifically configured to invoke the computer instructions to cause the first electronic device to: input the first entity sequence into the intention recognition model to obtain the first intention, where the intention recognition model is the correspondence between entity sequences and intentions obtained by training on corresponding entity-sequence and intention data.
  • The intention recognition model can be stored in different locations: it may be preset and stored in the memory, or stored in a cloud server accessible to the first electronic device; this is not limited here.
  • The intention recognition model can also be generated in different ways: it may be pre-trained by the manufacturer, trained by the first electronic device on the corresponding entity-sequence and intention data stored in the memory, or trained by the first electronic device on corresponding entity-sequence and intention data shared by other users; this is not limited here.
  • The one or more processors are specifically configured to invoke the computer instructions to cause the first electronic device to: input the first entity sequence and the first intention into the action prediction model to obtain the first action sequence, where the action prediction model is the correspondence between entity sequences plus intentions and action sequences, obtained by training on corresponding entity-sequence, intention, and action-sequence data.
  • Alternatively, the one or more processors are specifically configured to invoke the computer instructions to cause the first electronic device to: determine, according to a decision rule, the first action sequence corresponding to the first entity sequence and the first intention, where the decision rule is the correspondence between entity sequences plus intentions and action sequences, set according to the user's usage habits or usage scenarios.
  • The action prediction model can be stored in different locations: it may be preset and stored in the memory, or stored in a cloud server accessible to the first electronic device; this is not limited here.
  • The action prediction model can also be generated in different ways: it may be pre-trained by the manufacturer, trained by the first electronic device on the corresponding entity-sequence, intention, and action-sequence data stored in the memory, or trained by the first electronic device on corresponding entity-sequence, intention, and action-sequence data shared by other users; this is not limited here.
  • The decision rule can likewise be stored in different locations: it may be preset and stored in the memory, or stored in a cloud server accessible to the first electronic device; this is not limited here.
  • The decision rule may be preset by the manufacturer, set by the first electronic device according to the user's usage habits or usage scenarios, shared by other users, or obtained by the user from a third-party data service provider; this is not limited here.
  • The first time period has a corresponding relationship with the first trigger; when the first trigger is determined, the first time period corresponding to the first trigger can be determined.
  • The first data is obtained through at least two of the following input methods: touch operation input, sensor data input, text data input, voice data input, video data input, and input of transmission data from smart devices interconnected with the first electronic device. It is understandable that, in some embodiments, the first data can also be obtained from other data input methods, which is not limited here.
  • The first action to be executed includes an action or service such as starting a target application, starting a target service, loading a target application in the background, wirelessly connecting to a target device, or sending a notification message. It can be understood that, in some embodiments, the first action to be executed may also be another action or service, which is not limited here.
  • The embodiments of the present application also provide a chip applied to an electronic device; the chip includes one or more processors, which invoke computer instructions to cause the electronic device to execute the method described in the second aspect or any possible implementation thereof.
  • The embodiments of the present application also provide a computer program product containing instructions; when the computer program product runs on an electronic device, the electronic device executes the method described in the second aspect or any possible implementation thereof.
  • An embodiment of the present application further provides a computer-readable storage medium including instructions; when the instructions are executed on an electronic device, the electronic device executes the method described in the second aspect or any possible implementation thereof.
  • In a third aspect, the embodiments of the present application provide an intention recognition method that obtains user perception data, determines multiple candidate intentions based on the user perception data and a stored knowledge graph, and then uses a preset reinforcement learning algorithm to determine a target intention from the multiple candidate intentions.
  • The user perception data represents the user's behavior information; it may include multiple pieces of data, at least two of which have different input modes.
  • Because the user perception data only represents the user's behavior information and does not directly express an intention, the user's intention can be identified proactively without the user expressing it, thereby improving the user experience.
  • The above "determining multiple candidate intentions based on the user perception data and the stored knowledge graph" may include: the intention recognition device determines the entities in the user perception data and their description data, and determines the user's state information and scene information from the entities, the entity description data, and the knowledge graph; the intention recognition device then determines the multiple candidate intentions corresponding to the state information and the scene information according to the correspondence between state information, scene information, and candidate intentions. The state information indicates the user's current state, and the scene information indicates the environment the user is currently in.
  • The above "using a preset reinforcement learning algorithm to determine a target intention from multiple candidate intentions" may include: the intention recognition device determines intention arms in one-to-one correspondence with the multiple candidate intentions, and determines the target intention from the multiple candidate intentions based on the user perception data, the state information, the scene information, the intention arms, and the reinforcement learning algorithm.
  • The intention recognition method provided in the embodiments of the present application may further include: the intention recognition device determines an intention confidence corresponding to the target intention according to the user perception data, the state information, the scene information, and the intention arm corresponding to the target intention; determines, according to the intention confidence, the target interaction mode used to display the target intention; and then displays the content of the target intention in the target interaction mode. The intention confidence indicates the degree of agreement between the target intention and the user's real intention.
  • In this way, the present application can select the interaction mode that displays the target intention according to the confidence interval and the interaction modes corresponding to that interval, alleviating the degraded user experience caused by displaying low-confidence intentions.
  • The above "determining, according to the intention confidence, the target interaction mode used to display the target intention" may include: the intention recognition device determines, among multiple pre-stored confidence intervals, the target confidence interval to which the intention confidence belongs, and determines the target interaction mode from the level of interaction modes corresponding to the target confidence interval according to the business to which the target intention belongs. Each confidence interval corresponds to one level of interaction modes, and each level includes one or more interaction modes; a sketch of this selection follows.
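A sketch of the interval lookup, with placeholder boundaries, levels, and mode names; how the business narrows a level down to a single mode is simplified here to taking the first entry.

```python
import bisect

BOUNDARIES = [0.4, 0.7]   # three intervals: below 0.4, 0.4 to 0.7, 0.7 and above
LEVELS = [
    ["silent_log"],                        # low confidence: do not disturb the user
    ["notification_card"],                 # medium confidence: unobtrusive display
    ["voice_prompt", "fullscreen_card"],   # high confidence: proactive display
]

def pick_interaction_mode(intention_confidence: float, business: str) -> str:
    """Map an intention confidence to a target interaction mode."""
    level = LEVELS[bisect.bisect_right(BOUNDARIES, intention_confidence)]
    # The business the target intention belongs to would select among the
    # modes of this level; simplified to the first mode here.
    return level[0]
```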
  • The intention recognition method provided in the embodiments of the present application may further include: the intention recognition device displays the content of the target intention in the target interaction mode, recognizes the target operation on the target intention within a preset period of time, and determines the target value corresponding to the target operation according to the target operation and preset rules; the intention recognition device then updates the multiple candidate intentions according to the target value and updates the parameters used to determine the target intention in the reinforcement learning algorithm. The target value indicates the actual degree of conformity between the target intention and the real intention.
  • In the prior art, after displaying the intention the mobile phone only considers whether the user clicks on it; in actual applications, however, the user's feedback may include operations other than clicking, so the feedback obtained by such analysis is inaccurate. Recognizing feedback operations within a preset time period supports many types of feedback operations, and different feedback operations yield different target values, which increases the accuracy of the feedback information.
  • The above "updating the multiple candidate intentions according to the target value" may include: when the intention recognition device determines that the target value is less than a preset threshold, or that the number of times the target value has been less than the preset threshold equals a preset count, it deletes the target intention from the multiple candidate intentions.
  • Since the arm set in the prior art is fixed, it includes all the intention arms pre-stored in the mobile phone; in this application, the arm set changes as the candidate intentions change, thereby quickly following shifts in the user's interests and intentions and improving the user experience. A sketch of this feedback-driven update is given below.
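A sketch of the feedback-driven update, reusing the hypothetical IntentBandit from the earlier sketch; the threshold and the preset count are placeholders.

```python
class CandidatePruner:
    """Delete intentions whose target value stays below a preset threshold."""

    def __init__(self, threshold: float = 0.2, preset_count: int = 3):
        self.threshold = threshold
        self.preset_count = preset_count
        self.low_hits = {}   # intention -> times its target value fell below threshold

    def feedback(self, bandit: "IntentBandit", candidates: list,
                 intention: str, target_value: float) -> None:
        bandit.update(intention, target_value)        # update the arm's parameters
        if target_value < self.threshold:
            self.low_hits[intention] = self.low_hits.get(intention, 0) + 1
            if self.low_hits[intention] >= self.preset_count:
                candidates.remove(intention)          # the arm set shrinks with it
                self.low_hits.pop(intention)
```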
  • An embodiment of the present application also provides an intention recognition device, which includes modules for executing the intention recognition method of the third aspect or any one of its possible implementations.
  • An embodiment of the present application also provides an intention recognition device that includes a memory and a processor, the two being coupled; the memory stores computer program code comprising computer instructions, and when the processor executes the computer instructions, the intention recognition device executes the intention recognition method of the third aspect or any one of its possible implementations.
  • The embodiments of the present application also provide a chip system applied to the intention recognition device of the third aspect. The chip system includes one or more interface circuits and one or more processors, interconnected by wires; the interface circuits receive signals from the memory of the intention recognition device, the signals including the computer instructions stored in the memory, and send them to the processors. When the processors execute the computer instructions, the intention recognition device executes the intention recognition method of the third aspect or any one of its possible implementations.
  • The embodiments of the present application also provide a computer-readable storage medium including computer instructions; when the computer instructions run on the intention recognition device, it executes the intention recognition method of the third aspect or any one of its possible implementations.
  • The embodiments of the present application also provide a computer program product including computer instructions; when the computer instructions run on the intention recognition device, it executes the intention recognition method of the third aspect or any one of its possible implementations.
  • An embodiment of the present application also provides a model training method applied to any node device among multiple node devices, each node device being configured with a group coarse-grained model and a fine-grained model. The method includes the following.
  • The node device obtains the mapping relationship between fine-grained labels and coarse-grained labels and maps the fine-grained data in the training data set to coarse-grained data according to this mapping relationship; it then inputs the coarse-grained data into the group coarse-grained model for training and the fine-grained data into the fine-grained model for training. The group coarse-grained model and the fine-grained model each have their own update timing; the group coarse-grained model is updated through the joint learning of multiple node devices. The node device combines the group coarse-grained model and the fine-grained model to obtain a joint model, whose label space is mapped to fine-grained labels and whose output is used to update the fine-grained model.
  • The label space of the sample data in each node device's training data set is mapped to fine-grained labels, and coarse-grained labels are introduced to unify the label spaces of the node devices; this ensures that the node devices agree on the coarse-grained task, so that multiple node devices can train jointly.
  • The node device obtains the mapping relationship between fine-grained and coarse-grained labels and maps the fine-grained data in the training data set to coarse-grained data accordingly. The node device uses the coarse-grained data to train the group coarse-grained model locally, updating it through the joint learning of multiple node devices until the coarse-grained labels converge, so that the coarse-grained model acquires group characteristics. The node device also inputs the fine-grained data into the fine-grained model for training; the fine-grained labels output by the joint model, via the loss function, are used to update the fine-grained model in reverse until the fine-grained labels converge.
  • The joint model in this application therefore takes the group characteristics into account, while the fine-grained model of each node device matches the group coarse-grained model to the device's specific fine-grained labels, so that the label space of the joint model is the device-side fine-grained label space; the joint model thus also takes the individual characteristics of each node device into account.
  • inputting coarse-grained data to the group coarse-grained model for training may specifically include: the node device inputs the coarse-grained data to the group coarse-grained model for training, and determines the first corresponding to the group coarse-grained model.
  • Information the first information may be gradients, model parameters (such as weight values), or models (network architecture and model parameters);
  • the update process of the group coarse-grained model may be: the node device sends the first information to the central control device; Then the node device receives the second information, the second information is used to update the group coarse-grained model, and the second information is obtained after the central control device integrates the received first information uploaded by multiple node devices.
  • each node device trains the group coarse-grained model through local data.
  • each node device transmits only its first information (such as parameter values) to the central control device.
  • the central control device integrates the received parameter values, that is, it integrates the characteristics of the local data of each of the multiple node devices, and delivers the integrated parameter values to the node devices.
  • each node device can then update its local group coarse-grained model according to the parameter values issued by the central control device, thereby completing one update, so that the group coarse-grained model acquires group characteristics.
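To make the update loop concrete, here is a minimal Python sketch of one joint-learning round, assuming the first information consists of parameter values and that the central control device's integration step is a plain element-wise average; the application leaves the exact integration operator open, so this is only one plausible choice.

```python
import numpy as np

def integrate_first_information(uploads):
    """Central control device: integrate the parameter values ("first information")
    uploaded by the node devices into the "second information" by averaging.
    `uploads` is a list of {parameter_name: ndarray} dicts, one per node device."""
    names = uploads[0].keys()
    return {n: np.mean([u[n] for u in uploads], axis=0) for n in names}

def update_local_model(local_params, second_information):
    """Node device: overwrite the local group coarse-grained model with the
    integrated parameters issued by the central control device."""
    local_params.update(second_information)
    return local_params

# One joint-learning round: each node device trains locally, uploads only its
# parameter values, and applies the integrated result it receives back.
node_uploads = [
    {"w": np.array([0.9, 1.1]), "b": np.array([0.1])},   # node device 1
    {"w": np.array([1.1, 0.9]), "b": np.array([-0.1])},  # node device 2
]
second_info = integrate_first_information(node_uploads)
local = update_local_model({"w": None, "b": None}, second_info)
print(local)  # {'w': array([1., 1.]), 'b': array([0.])}
```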
  • the node device is also configured with an individual coarse-grained model; combining the group coarse-grained model and the fine-grained model to obtain a joint model may specifically include: combining the group coarse-grained model, the individual coarse-grained model, and the fine-grained model to obtain the joint model. The node device uploads the individual coarse-grained model to the central control device and can then receive an updated individual coarse-grained model sent back by the central control device, where the updated individual coarse-grained model is obtained by the central control device selecting, from the individual coarse-grained models uploaded by multiple node devices, at least two whose correlation is higher than a threshold and integrating them.
  • the group coarse-grained model, the individual coarse-grained model, and the fine-grained model are combined into an overall model.
  • the group coarse-grained model can mine group-level patterns and provide a good starting point for the fine-grained model in the node device.
  • the combination of the coarse-grained model and the fine-grained model of the group includes:
  • the coarse-grained model and the fine-grained model are combined based on the weights of the group coarse-grained model and the weights of the fine-grained model.
  • the combination of the group coarse-grained model and the fine-grained model based on their weights may include: in the output layer of the joint model, according to the mapping relationship between fine-grained labels and coarse-grained labels, merging the weight value of each coarse-grained label in the label space of the coarse-grained model into the weight value of the corresponding fine-grained label in the label space of the fine-grained model.
  • the two models can be combined based on the weight of the group coarse-grained model and the weight of the fine-grained model, and the weight of the group coarse-grained model and the weight of the fine-grained model are added to obtain the weight of the overall model.
  • the weight of a fine-grained label is based on the weight of the coarse-grained label to which that fine-grained label maps.
  • the weight of the fine-grained label is equivalent to an offset maintained by the fine-grained model, and the output of the overall model (joint model) is mapped to the individual fine-grained labels, which personalizes the output of the joint model on the device side.
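A minimal sketch of this output-layer combination, with hypothetical label spaces (app names mapped to app categories) and random weights; taking the weight of each fine-grained label as its coarse-grained label's weight plus a per-label offset is one plausible reading of the merge described above, not the only one.

```python
import numpy as np

# Hypothetical label spaces: coarse categories and the fine-grained
# applications that map onto them.
mapping = {"WeChat": "social", "QQ": "social", "Maps": "travel"}
coarse_labels = ["social", "travel"]
fine_labels = ["WeChat", "QQ", "Maps"]

d = 8  # feature dimension of the shared representation
rng = np.random.default_rng(0)
W_coarse = {c: rng.normal(size=d) for c in coarse_labels}                 # group model weights
W_fine_offset = {f: rng.normal(scale=0.01, size=d) for f in fine_labels}  # per-label offsets

def joint_output_weights():
    """Output layer of the joint model: the weight of each fine-grained label is
    the weight of its coarse-grained label plus the offset maintained by the
    fine-grained model, so the joint model's label space is fine-grained."""
    return {f: W_coarse[mapping[f]] + W_fine_offset[f] for f in fine_labels}

x = rng.normal(size=d)                      # shared features for one sample
logits = {f: w @ x for f, w in joint_output_weights().items()}
print(max(logits, key=logits.get))          # predicted fine-grained label
```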
  • the node device mapping the fine-grained data in the training data set to coarse-grained data according to the mapping relationship may specifically include: the node device obtains the training data set, in which the label space of the sample data consists of fine-grained labels; the node device then replaces the labels of the sample data with coarse-grained labels according to the mapping relationship between fine-grained labels and coarse-grained labels to obtain the coarse-grained data.
  • the coarse-grained data is used to train the group coarse-grained model.
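For illustration, a small sketch of the mapping step with a hypothetical mapping table and samples; the node device simply replaces each sample's fine-grained label with its coarse-grained label.

```python
# Hypothetical mapping from fine-grained labels (app names) to
# coarse-grained labels (app categories).
mapping = {"WeChat": "social", "QQ": "social", "Maps": "travel"}

# Sample data: (time information, fine-grained label).
fine_data = [("08:30", "WeChat"), ("12:10", "Maps"), ("21:45", "QQ")]

# Replace each label according to the mapping to obtain the coarse-grained data.
coarse_data = [(features, mapping[label]) for features, label in fine_data]
print(coarse_data)  # [('08:30', 'social'), ('12:10', 'travel'), ('21:45', 'social')]
```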
  • the joint model is an application prediction model
  • the coarse-grained label is the category label obtained after classification according to the function of the application
  • the fine-grained label is the name of the application
  • the sample data in the training data set is: time information and the name of the corresponding application.
  • the method further includes: the node device obtains the current time information; the time information is input to the trained joint model, the joint model outputs a prediction result, the prediction result is used to indicate a target application, and the target application is preloaded.
  • the joint model may be an application prediction model.
  • the node device predicts, through the application prediction model, which application the user is likely to use next and preloads that target application, which saves the start-up response time of the target application and improves the user experience.
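A toy usage sketch of this preloading flow; `JointModel.predict` and `preload` are hypothetical stand-ins for the trained joint model and the platform's preloading hook, neither of which is specified by the application.

```python
class JointModel:
    """Hypothetical stand-in for the trained joint (application prediction) model."""
    def predict(self, time_features):
        return "WeChat", 0.9  # (target application, confidence) — dummy output

def preload(app):
    """Hypothetical platform hook that warms the target application's process/cache."""
    print(f"preloading {app}")

def preload_target_app(joint_model, time_features, threshold=0.8):
    """Feed the current time information to the trained joint model and preload
    the indicated target application when the model is sufficiently confident."""
    app, confidence = joint_model.predict(time_features)
    if confidence >= threshold:
        preload(app)

preload_target_app(JointModel(), {"hour": 8, "weekday": 1})
```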
  • the embodiment of the present application also provides another model training method, which is applied to a joint learning system.
  • the joint learning system includes multiple node devices and a central control device.
  • the node devices are configured with a group coarse-grained model and a fine-grained model.
  • the method is applied to the central control device: the central control device obtains the fine-grained labels of multiple node devices, classifies these fine-grained labels into multiple categories, uses the categories as coarse-grained labels, and determines the mapping relationship between the fine-grained labels and the coarse-grained labels; the mapping relationship is provided to the node devices so that each node device maps the fine-grained data in its training data set to coarse-grained data. The coarse-grained data is input to the group coarse-grained model for training, and the group coarse-grained model is updated through the joint learning of the multiple node devices; the fine-grained data is input to the fine-grained model for training; the group coarse-grained model and the fine-grained model are combined to obtain a joint model whose label space consists of fine-grained labels, and the output of the joint model is used to update the fine-grained model.
  • the method further includes: the central control device receives the first information sent by multiple node devices, integrates the received first information to obtain second information, and then sends the second information to the multiple node devices, where the second information is used to update the group coarse-grained model.
  • each node device trains the group coarse-grained model through local data.
  • each node device transmits only its first information (such as parameter values) to the central control device.
  • the central control device integrates the received parameter values, that is, it integrates the characteristics of the local data of each of the multiple node devices, and delivers the integrated parameter values to the node devices.
  • each node device can then update its local group coarse-grained model according to the parameter values issued by the central control device, thereby completing one update, so that the local group coarse-grained model acquires group characteristics.
  • the node device is also configured with an individual coarse-grained model;
  • the central control device receives the individual coarse-grained models sent by multiple node devices and determines the correlation between them; it then selects, from the individual coarse-grained models uploaded by the multiple node devices, at least two target individual coarse-grained models whose correlation is higher than a threshold and integrates them to obtain an updated individual coarse-grained model; finally, the updated individual coarse-grained model is sent to the node devices corresponding to the target individual coarse-grained models.
  • the group coarse-grained model, the individual coarse-grained model, and the fine-grained model are combined into an overall model.
  • the group coarse-grained model can mine group-level patterns and provide a good starting point for the fine-grained model in the node device.
  • the individual coarse-grained model can bridge the gap between the group and the individual for minority cases.
  • determining the correlation between the individual coarse-grained models uploaded by multiple node devices may include: the central control device determines the user portrait of the user to whom each node device belongs, and then determines the correlation between the individual coarse-grained models of the node devices according to the similarity of the user portraits.
  • individual coarse-grained models corresponding to users with the same or similar characteristics can thus be integrated according to the user portraits, so that the individual coarse-grained model can bridge the gap between the group and the individual for minority cases.
  • determining the correlation between the individual coarse-grained models uploaded by multiple node devices may further include: the central control device determines the distribution information of the multiple coarse-grained labels output by each individual coarse-grained model, and then determines the correlation between the individual coarse-grained models based on the distribution information.
  • the central control device does not need to obtain user-related data; it determines the correlation between individual coarse-grained models according to the distribution information of the coarse-grained labels they output, thereby protecting user privacy.
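A small sketch of such distribution-based correlation, assuming cosine similarity over the coarse-grained label distributions (the application does not fix the similarity measure) and hypothetical per-device distributions.

```python
import numpy as np

def label_distribution_correlation(p, q):
    """Correlation between two individual coarse-grained models, computed only
    from the distributions of coarse-grained labels they output, so no raw
    user data has to leave the device. Cosine similarity is one plausible
    choice of measure."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(p @ q / (np.linalg.norm(p) * np.linalg.norm(q)))

# Hypothetical label distributions over [social, travel, video] for three devices.
dists = {"dev1": [0.7, 0.2, 0.1], "dev2": [0.65, 0.25, 0.1], "dev3": [0.1, 0.1, 0.8]}

threshold = 0.95
pairs = [(a, b) for a in dists for b in dists
         if a < b and label_distribution_correlation(dists[a], dists[b]) > threshold]
print(pairs)  # [('dev1', 'dev2')] -> integrate these two individual models
```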
  • the embodiments of the present application also provide a node device, the node device is configured with a group coarse-grained model and a fine-grained model, and the node device includes a transceiver module and a processing module;
  • the transceiver module is used to obtain the mapping relationship between fine-grained labels and coarse-grained labels
  • the processing module is used to map the fine-grained data in the training data set to coarse-grained data according to the mapping relationship obtained by the transceiver module;
  • the processing module is also used to input coarse-grained data into the group coarse-grained model for training;
  • the transceiver module is used to update the coarse-grained group model through the joint learning of multiple node devices
  • the processing module is also used to input fine-grained data into the fine-grained model for training; combine the group coarse-grained model and the fine-grained model to obtain a joint model.
  • the label space of the joint model is mapped to fine-grained labels, and the output result of the joint model is used to update the fine-grained model.
  • the processing module is also used to input coarse-grained data into the group coarse-grained model for training, and to determine the first information corresponding to the group coarse-grained model;
  • the transceiver module is also used to send the first information to the central control device and to receive the second information, the second information being obtained after the central control device integrates the first information uploaded by multiple node devices and being used to update the group coarse-grained model;
  • the node device also includes an individual coarse-grained model
  • the processing module is also used to combine the group coarse-grained model, individual coarse-grained model and fine-grained model to obtain a joint model.
  • the transceiver module is also used to upload the individual coarse-grained model to the central control device and to receive the updated individual coarse-grained model sent by the central control device, where the updated individual coarse-grained model is obtained by the central control device selecting, from the individual coarse-grained models uploaded by multiple node devices, at least two whose correlation is higher than the threshold and integrating them.
  • the processing module is also used to combine the coarse-grained model and the fine-grained model based on the weight value of the group coarse-grained model and the weight value of the fine-grained model.
  • the processing module is also used to merge, in the output layer of the joint model and according to the mapping relationship between fine-grained labels and coarse-grained labels, the weight value of each coarse-grained label in the label space of the coarse-grained model into the weight value of the corresponding fine-grained label in the label space of the fine-grained model.
  • the processing module is also used to obtain a training data set in which the label space of the sample data consists of fine-grained labels, and to replace the labels of the sample data with coarse-grained labels according to the mapping relationship between fine-grained labels and coarse-grained labels to obtain coarse-grained data.
  • the joint model is an application prediction model
  • the coarse-grained label is the category label obtained after classification according to the function of the application
  • the fine-grained label is the name of the application.
  • the processing module is also used to obtain the current time information, input the time information to the trained joint model, obtain from the joint model a prediction result indicating a target application, and preload the target application.
  • the embodiments of the present application also provide a central control device, which is applied to a joint learning system.
  • the joint learning system includes multiple node devices and a central control device.
  • the node devices are configured with a group coarse-grained model and a fine-grained model.
  • the central control device includes a processing module and a transceiver module;
  • the transceiver module is used to obtain fine-grained labels of multiple node devices
  • the processing module is used to classify multiple fine-grained labels, determine multiple categories, and use the categories as coarse-grained labels; and determine the mapping relationship between fine-grained labels and coarse-grained labels;
  • the transceiver module is also used to send the mapping relationship to multiple node devices, so that each node device maps the fine-grained data in its training data set to coarse-grained data according to the mapping relationship, inputs the coarse-grained data to the group coarse-grained model for training, updates the group coarse-grained model through the joint learning of multiple node devices, inputs the fine-grained data to the fine-grained model for training, and combines the group coarse-grained model and the fine-grained model to obtain a joint model whose label space consists of fine-grained labels, the output result of the joint model being used to update the fine-grained model.
  • the transceiver module is configured to receive first information sent by multiple node devices
  • the processing module is also used to integrate the received first information uploaded by multiple node devices to obtain second information; the transceiver module is also used to send the second information to the multiple node devices, where the second information is used to update the group coarse-grained model.
  • the node device is also configured with an individual coarse-grained model
  • the transceiver module is also used to receive individual coarse-grained models sent by multiple node devices;
  • the processing module is also used to determine the correlation between the individual coarse-grained models uploaded by multiple node devices, and to select from them at least two target individual coarse-grained models whose correlation is higher than the threshold and integrate them to obtain the updated individual coarse-grained model;
  • the transceiver module is also used to send the updated individual coarse-grained model to the node device corresponding to the target individual coarse-grained model.
  • the processing module is also used to determine the user portrait of the user to which each node device belongs;
  • the processing module is also used to determine the correlation between the individual coarse-grained models of the node device according to the similarity of the user portrait.
  • the processing module is also used to determine the distribution information of multiple coarse-grained labels output by each individual coarse-grained model; determine the correlation between individual coarse-grained models based on the distribution information.
  • the embodiments of the present application also provide a node device, including a processor coupled with a memory, the memory storing program instructions; when the program instructions stored in the memory are executed by the processor, any one of the methods of the foregoing fourth aspect is implemented.
  • an embodiment of the present application also provides a central control device, including a processor coupled with a memory, the memory storing program instructions; when the program instructions stored in the memory are executed by the processor, the method of the foregoing fourth aspect is implemented.
  • the embodiments of the present application also provide a computer-readable storage medium, including a program which, when run on a computer, causes the computer to execute the method in any one of the implementations of the foregoing fourth aspect.
  • an embodiment of the present application also provides a chip system, the chip system includes a processor, and is configured to support node devices to implement the functions involved in the fourth aspect.
  • the chip system further includes a memory, and the memory is used to store necessary program instructions and data of the node device, or used to store necessary program instructions and data of the central control device.
  • the chip system can be composed of chips, and can also include chips and other discrete devices.
  • the embodiments of the present application provide a neural network-based data processing method, which can be applied to a server in the process of generating simulation data, or a component of the server (such as a processor, a chip, or a chip system, etc.)
  • the server first inputs the test data to the first generator, which processes it to obtain first simulation data; the server then inputs the test data and the first simulation data to the first discriminator, which processes them to obtain a first discrimination result used to indicate the difference between the test data and the first simulation data; the server then updates the weight coefficients of the first generator according to the first discrimination result to obtain a second generator; finally, the server generates second simulation data with the second generator.
  • the server thus updates and optimizes the weight coefficients of the first generator through the processing of the first generator and the first discriminator in a generative adversarial network to obtain the second generator, and uses the characteristics of the generative adversarial network to reduce the deviation between the simulation data generated by the generator and the originally input test data, thereby improving the data quality of the simulation data generated by the neural network.
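A minimal PyTorch sketch of the described generator/discriminator loop, assuming small MLPs, a binary cross-entropy objective, and 4-dimensional test data; none of these choices are fixed by the application, and the generator here consumes the test data directly, as the text describes.

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 4))  # first generator
D = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))  # first discriminator
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

test_data = torch.randn(32, 4)  # stands in for the server's test data

for _ in range(100):
    sim = G(test_data)  # first simulation data
    # First discrimination result: how well D separates test data from simulation data.
    d_loss = bce(D(test_data), torch.ones(32, 1)) + bce(D(sim.detach()), torch.zeros(32, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Update the generator's weight coefficients from the discrimination result;
    # after this loop, G plays the role of the "second generator".
    g_loss = bce(D(G(test_data)), torch.ones(32, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

second_simulation_data = G(test_data).detach()
```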
  • the method further includes: the server inputs first target simulation data into a preset training network, and the prediction model is obtained through training.
  • the first target simulation data includes the second simulation data.
  • the server can use the second simulation data generated by the second generator, itself obtained through the generative adversarial network, as part of the input data of the preset training network to train the prediction model. Because the deviation between the second simulation data and the originally input test data is small, having the second simulation data participate in the training improves the prediction effect of the resulting prediction model, so that training in the simulation environment yields a better model.
  • the method further includes: the server inputs second target simulation data into the prediction model, and a target prediction result is obtained through processing by the prediction model, where the second target simulation data includes the second simulation data.
  • the server can use the second simulation data generated by the second generator obtained by the generative countermeasure network as part of the input data of the prediction model, that is, obtain the target prediction corresponding to the generated simulation data in the prediction model. As a result, the problem of too little training data in the prediction model is solved.
  • the method further includes: the server sends the prediction model to the client; the server then receives the initial prediction result sent by the client, the initial prediction result being obtained by running the prediction model on the user operation data.
  • the server inputs the target prediction result and the initial prediction result to the second discriminator for training and outputs a second discrimination result, which is used to indicate the difference between the target prediction result and the initial prediction result;
  • the server updates the weight coefficient of the second generator according to the second discrimination result to obtain a third generator; finally, the server generates third simulation data in the third generator.
  • the server may send the prediction model to the client and receive the initial prediction result the client obtains by running user operation data through the prediction model; the target prediction result obtained from the simulation data and the initial prediction result are used as the input of the second discriminator to obtain the information used to update the weight coefficients of the second generator, the second generator is updated to obtain the third generator, and third simulation data is generated by the third generator.
  • because the third simulation data is obtained after the server uses the second discriminator to update the weight coefficients of the second generator, the characteristics of the generative adversarial network are exploited a second time, further reducing the deviation between the third simulation data generated by the third generator and the originally input test data, and thereby further improving the data quality of the simulation data generated by the neural network.
  • the server updating the weight coefficient of the second generator according to the second discrimination result to obtain the third generator includes: if a first condition is satisfied, updating the weight coefficient of the second generator according to the second discrimination result to obtain the third generator; the first condition includes: the empirical distribution metric between the target prediction result and the initial prediction result is less than a first preset value; and/or the value of the loss function corresponding to the second discriminator is greater than a second preset value; and/or the loss function of the prediction model is less than a third preset value.
  • the server performs the update of the weight coefficient of the second generator according to the second discrimination result only when the above first condition is satisfied; that is, through the restriction of the first condition, the update is executed only once the model effect of the second discriminator and/or the prediction model reaches a certain level, which further improves the data quality of the third simulation data generated by the updated third generator.
  • the first target simulation data, which the server inputs into the preset training network to train the prediction model, may further include the test data; this further enriches the input of the training network, so that the training network is trained on more data features, improving the prediction effect of the prediction model when it subsequently performs prediction.
  • the server updating the weight coefficient of the first generator according to the first discrimination result to obtain the second generator includes: if a second condition is met, updating the weight coefficient of the first generator according to the first discrimination result to obtain the second generator; the second condition includes: the empirical distribution metric between the test data and the first simulation data is less than a fourth preset value; and/or the value of the loss function corresponding to the first discriminator is greater than a fifth preset value.
  • the server performs the update of the weight coefficient of the first generator according to the first discrimination result only when the above second condition is satisfied; that is, through the restriction of the second condition, the update is executed only once the model effect of the first discriminator reaches a certain level, which further improves the data quality of the second simulation data generated by the updated second generator.
  • before the second simulation data is generated by the second generator, if the second condition is not met, the method further includes: inputting the test data to the second generator, which processes it to obtain fourth simulation data; inputting the test data and the fourth simulation data to the first discriminator, which processes them to obtain a third discrimination result used to indicate the difference between the test data and the fourth simulation data; and updating the weight coefficient of the second generator according to the third discrimination result.
  • the server may thus, when the above second condition is not met, input the test data to the second generator and obtain, through further processing by the first discriminator, the third discrimination result used to update the second generator; that is, the characteristics of the generative adversarial network can be further used to optimize the weight coefficients of the second generator.
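A small sketch of gating the generator update on the second condition, with placeholder preset values; the application leaves the thresholds and the exact empirical distribution metric open.

```python
def second_condition_met(emp_metric, d_loss, fourth_preset=0.1, fifth_preset=0.6):
    """Gate for deriving the second generator: the empirical distribution metric
    between test data and first simulation data must be small, and/or the first
    discriminator's loss large (it can no longer tell the two apart). The preset
    values here are placeholders."""
    return emp_metric < fourth_preset or d_loss > fifth_preset

# Only perform the weight-coefficient update when the condition holds; otherwise
# keep refining the current generator against the first discriminator.
if second_condition_met(emp_metric=0.05, d_loss=0.7):
    print("update generator weight coefficients")  # i.e. derive the next generator
```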
  • the prediction model is an intention decision model.
  • the method can be applied in the process of intention decision-making, and the prediction model can be an intention decision model in that process, which provides a specific implementation of the prediction model and improves the practicability of the scheme.
  • the embodiment of the present application also provides another neural network-based data processing method, which can be applied to the client in the process of generating simulation data, or to a component of the client (such as a processor, a chip, or a chip system). In this method, the client receives the prediction model from the server; the client then obtains user operation data; the client inputs the user operation data to the prediction model and obtains an initial prediction result through training; the client sends the initial prediction result to the server, where the initial prediction result is used as the input of the discriminator, and the discrimination result used to update the weight coefficients of the generator is obtained after processing by the discriminator.
  • the client can use the user operation data as the input of the prediction model sent by the server and, after obtaining the initial prediction result, send the initial prediction result to the server, where it serves as the input of the discriminator; the discrimination result used to update the weight coefficients of the generator is obtained after processing by the discriminator, so that the server can use the characteristics of the generative adversarial network to reduce the deviation between the simulation data generated by the generator and the originally input test data, improving the data quality of the simulation data generated by the neural network. In addition, since the client only needs to send the server the initial prediction result corresponding to the user operation data rather than the user operation data itself, leakage of the user's privacy is avoided and the user experience is improved.
  • the process for the client to obtain the user operation data specifically includes: in response to a user operation, the client obtains the initial operation data corresponding to the user operation; the client then extracts the data characteristics of the initial operation data to obtain the user operation data.
  • the client obtains the user operation data input into the prediction model by obtaining the initial operation data corresponding to the user operation and performing feature extraction, which provides a specific way for the client to obtain user operation data and improves the feasibility of the solution.
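A minimal client-side sketch of this flow; `extract_features`, the count-based featurization, and the server endpoint are all hypothetical, and the point illustrated is that only the initial prediction result leaves the device.

```python
import json
from urllib import request

def extract_features(initial_operation_data):
    """Hypothetical feature extraction: reduce raw operation records to the
    features the prediction model expects (here, counts per operation type)."""
    feats = {}
    for record in initial_operation_data:
        feats[record["type"]] = feats.get(record["type"], 0) + 1
    return feats

def run_and_report(prediction_model, initial_operation_data, server_url):
    """Client side: featurize locally, run the server-issued prediction model,
    and upload only the initial prediction result, never the raw operation data."""
    user_operation_data = extract_features(initial_operation_data)
    initial_prediction = prediction_model(user_operation_data)
    body = json.dumps({"initial_prediction": initial_prediction}).encode()
    request.urlopen(request.Request(server_url, data=body))  # hypothetical endpoint
```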
  • an embodiment of the present application also provides a neural network-based data processing device, which includes:
  • the first processing unit is configured to input the test data to the first generator, and obtain the first simulation data after being processed by the first generator;
  • the second processing unit is used to input the test data and the first simulation data to the first discriminator and to obtain, after processing by the first discriminator, a first discrimination result used to indicate the difference between the test data and the first simulation data;
  • the first update unit is configured to update the weight coefficient of the first generator according to the first discrimination result to obtain the second generator
  • the first generating unit is used to generate second simulation data in the second generator.
  • through the processing of the first generator and the first discriminator in the generative adversarial network by the first processing unit and the second processing unit, the first update unit updates and optimizes the weight coefficients of the first generator to obtain the second generator, and the first generation unit generates the second simulation data with the second generator; that is, the characteristics of the generative adversarial network are used to reduce the deviation between the simulation data generated by the generator and the originally input test data, thereby improving the data quality of the simulation data generated by the neural network.
  • the device further includes:
  • the first training unit is configured to use the first target simulation data to input a preset training network to train to obtain a prediction model, and the first target simulation data includes the second simulation data.
  • the device further includes:
  • the third processing unit is configured to input the second target simulation data into the prediction model, and obtain a target prediction result through the prediction model processing, and the second target simulation data includes the second simulation data.
  • the device further includes:
  • the sending unit is used to send the prediction model to the client
  • the receiving unit is configured to receive an initial prediction result sent by the client, where the initial prediction result is obtained by training the prediction model on user operation data;
  • the second training unit is used to input the target prediction result and the initial prediction result to a second discriminator for training and output a second discrimination result, which is used to indicate the difference between the target prediction result and the initial prediction result;
  • a second update unit configured to update the weight coefficient of the second generator according to the second discrimination result to obtain a third generator
  • the second generating unit is used to generate third simulation data in the third generator.
  • the second update unit is specifically configured to:
  • if the first condition is satisfied, the weight coefficient of the second generator is updated according to the second discrimination result to obtain the third generator; the first condition is as described above (the empirical distribution metric between the target prediction result and the initial prediction result is less than the first preset value, and/or the loss function of the second discriminator is greater than the second preset value, and/or the loss function of the prediction model is less than the third preset value).
  • the first target simulation data further includes the test data.
  • the first update unit is specifically configured to:
  • if the second condition is met, the weight coefficient of the first generator is updated according to the first discrimination result to obtain the second generator; the second condition is as described above (the empirical distribution metric between the test data and the first simulation data is less than the fourth preset value, and/or the loss function of the first discriminator is greater than the fifth preset value).
  • the device further includes:
  • a fourth processing unit configured to input the test data to the second generator, and obtain fourth simulation data after being processed by the second generator
  • the fifth processing unit is used to input the test data and the fourth simulation data to the first discriminator and to obtain, after processing by the first discriminator, a third discrimination result used to indicate the difference between the test data and the fourth simulation data;
  • the third update unit is configured to update the weight coefficient of the second generator according to the third discrimination result.
  • the prediction model is an intention decision model.
  • an embodiment of the present application also provides a neural network-based data processing device, which includes:
  • the transceiver unit is used to receive the prediction model from the server;
  • the transceiver unit is used to obtain user operation data
  • the training unit is used to input the user operation data into the prediction model, and obtain an initial prediction result after training;
  • the transceiver unit is configured to send the initial prediction result to the server.
  • the initial prediction result is used as the input of the discriminator, and the discrimination result for updating the weight coefficient of the generator is obtained after processing by the discriminator.
  • the training unit can use the user operation data as the input of the prediction model sent by the server and, after the initial prediction result is obtained through training, the transceiver unit sends the initial prediction result to the server, where it serves as the input of the discriminator; the discrimination result used to update the weight coefficients of the generator is obtained through the processing of the discriminator, so that the server can use the characteristics of the generative adversarial network to reduce the deviation between the simulation data generated by the generator and the originally input test data, thereby improving the data quality of the simulation data generated by the neural network. In addition, since the client only needs to send the server the initial prediction result corresponding to the user operation data rather than the user operation data itself, leakage of the user's privacy is avoided and the user experience is improved.
  • the transceiver unit is specifically configured to: in response to a user operation, obtain the initial operation data corresponding to the user operation; and extract the data characteristics of the initial operation data to obtain the user operation data.
  • the embodiments of the present application also provide a server, including a processor coupled with a memory, the memory storing program instructions; when the program instructions stored in the memory are executed by the processor, the device implements the neural network-based data processing method in the foregoing fifth aspect and any one of its implementations.
  • the device can be an electronic device (such as a terminal device or a server device); or can be a component of the electronic device, such as a chip.
  • the embodiments of the present application also provide a client, including a processor coupled with a memory, the memory storing program instructions; when the program instructions stored in the memory are executed by the processor, the device implements the neural network-based data processing method in the foregoing fifth aspect and any one of its implementations.
  • the device can be an electronic device (such as a terminal device or a server device); or can be a component of the electronic device, such as a chip.
  • the embodiments of the present application also provide a computer-readable storage medium.
  • the computer-readable storage medium stores a computer program which, when run on a computer, causes the computer to execute the neural network-based data processing method in the foregoing fifth aspect and any one of its implementations.
  • an embodiment of the present application further provides a circuit system, the circuit system includes a processing circuit, and the processing circuit is configured to execute the neural network-based data processing method in the fifth aspect and any one of its implementation manners.
  • the embodiments of the present application also provide a computer program that, when running on a computer, causes the computer to execute the neural network-based data processing method in the fifth aspect and any one of its implementations.
  • the embodiments of the present application also provide a chip system, which includes a processor and is used to support the server in implementing the functions involved in the fifth aspect and any one of its implementations, for example, sending or processing the data and/or information involved in the above methods.
  • the chip system also includes a memory, and the memory is used to store the program instructions and data necessary for the data processing device or the communication device.
  • the chip system can be composed of chips, and can also include chips and other discrete devices.
  • an embodiment of the present application provides an intention recognition method, including: an electronic device determines a dot data sequence to be recognized, the dot data sequence to be recognized being composed of dot data, where the dot data includes the user's operation data recorded by the electronic device and/or the response data of the electronic device to the user's operations; the electronic device inputs the to-be-recognized dot data sequence into a multi-instance learning model to obtain multiple subsequences, the multi-instance learning model being a multi-instance learning model trained with the dot data sequences in the electronic device; the electronic device determines the intent of a first subsequence according to a preset intent rule, the first subsequence being one of the multiple subsequences, and the preset intent rule being used to determine the intent of a sequence based on the dot data in the sequence.
  • the electronic device may use the trained multi-instance learning model to divide the dot data sequence generated by the user's operations, as the dot data sequence to be recognized, into multiple subsequences of smaller granularity, and then use the preset intent rule to determine the intent of each subsequence. Since the multi-instance learning model is trained with the user's own dot data, the subsequences it produces better match the user's personal usage habits, so the intents identified for the subsequences are more accurate.
  • the electronic device determining the dot data sequence to be recognized specifically includes: in response to continuous operations of the user, the electronic device generates a plurality of dot data; the electronic device determines the plurality of dot data as the dot data sequence to be recognized.
  • the dot data of the dot data sequence to be recognized may consist of dot data generated by continuous operations of the user. For such data, it is very difficult to determine the intent of each piece of dot data using other intention recognition methods; however, after the sequence is input into the multi-instance learning model of the embodiments of the present application, it can be split into multiple subsequences whose intents are then determined separately, so that the recognized intents are more accurate.
  • the dot data sequence to be recognized may also include dot data generated by discontinuous operations, which is not limited here.
  • the electronic device may compose the dot data generated within a preset time period into the dot data sequence to be recognized;
  • the electronic device may, when the unrecognized dot data accumulates to a preset cumulative number, combine all the unrecognized dot data up to that number into the dot data sequence to be recognized.
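A minimal sketch of the count-threshold variant, assuming dot data arrives as simple records; a timer-based flush would implement the preset-time-period variant instead.

```python
class DotDataBuffer:
    """Accumulate unrecognized dot data and emit a to-be-recognized sequence
    once the preset cumulative number is reached."""
    def __init__(self, preset_count=50):
        self.preset_count = preset_count
        self.pending = []

    def add(self, dot):
        """`dot` is one operation/response record; returns a full sequence
        ready for the multi-instance learning model, or None if still filling."""
        self.pending.append(dot)
        if len(self.pending) >= self.preset_count:
            sequence, self.pending = self.pending, []
            return sequence
        return None

buf = DotDataBuffer(preset_count=3)
for event in ["open_app", "search", "tap"]:
    seq = buf.add(event)
print(seq)  # ['open_app', 'search', 'tap'] -> hand to the multi-instance model
```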
  • before the step in which the electronic device determines the dot data sequence to be recognized, the method further includes: the electronic device uses an initial dot data sequence to train a preset multi-instance learning model to obtain the multi-instance learning model;
  • the initial dot data sequence includes dot data generated by the user using the electronic device, and/or factory-preset dot data.
  • the electronic device using the initial dot data sequence to train a preset multi-instance learning model to obtain the multi-instance learning model specifically includes: the electronic device splits the initial dot data sequence into multiple split sequences according to a preset splitting rule, the preset splitting rule being used to divide the dot data sequence into different split sequences such that at least one clear intention can be determined for each split sequence according to the preset intent rule; the electronic device uses the multiple split sequences as multiple to-be-processed sequences and extracts training data from them; the electronic device uses the training data to train the preset multi-instance learning model to obtain the multi-instance learning model.
  • the electronic device can thus use the initial dot data sequence to train the preset multi-instance learning model and obtain a usable multi-instance learning model without manually labelling the dot data, which improves the efficiency and coverage of dot data labelling and saves time and cost.
  • the method further includes: the electronic device uses the to-be-recognized dot data sequence to train the multi-instance learning model, and updates the multi-instance learning model.
  • the electronic device may use the to-be-recognized dot data sequence to train the multi-instance learning model and update it through incremental training, which improves the accuracy with which the multi-instance learning model splits subsequences.
  • an embodiment of the present application also provides an electronic device, the electronic device including one or more processors and a memory; the memory is coupled with the one or more processors and is used to store computer program code, the computer program code including computer instructions; the one or more processors call the computer instructions to cause the electronic device to: determine a dot data sequence to be recognized, the dot data sequence to be recognized being composed of dot data, where the dot data includes the user's operation data recorded by the electronic device and/or the response data of the electronic device to the user's operations; input the to-be-recognized dot data sequence into a multi-instance learning model to obtain multiple subsequences, the multi-instance learning model being a multi-instance learning model trained with the dot data sequences in the electronic device; and determine the intent of a first subsequence according to a preset intent rule, the first subsequence being one of the multiple subsequences, and the preset intent rule being used to determine the intent of a sequence based on the dot data in the sequence.
  • the electronic device may use the trained multi-instance learning model to divide the dot data sequence generated by the user's operations, as the dot data sequence to be recognized, into multiple subsequences of smaller granularity, and then use the preset intent rule to determine the intent of each subsequence. Since the multi-instance learning model is trained with the user's own dot data, the subsequences it produces better match the user's personal usage habits, so the intents identified for the subsequences are more accurate.
  • the one or more processors are specifically configured to invoke the computer instructions to cause the electronic device to: in response to continuous operations of the user, generate a plurality of dot data; and determine the plurality of dot data as the dot data sequence to be recognized.
  • the dot data sequence to be recognized may also include dot data generated by discontinuous operations, which is not limited here.
  • the electronic device may compose the dot data generated within a preset time period into the dot data sequence to be recognized;
  • the electronic device may, when the unrecognized dot data accumulates to a preset cumulative number, combine all the unrecognized dot data up to that number into the dot data sequence to be recognized.
  • the one or more processors are also used to call the computer instructions to make the electronic device execute: use the initial dot data sequence to train a preset multi-instance learning model to obtain the multi-instance learning model;
  • the initial dot data sequence includes dot data generated by the user using the electronic device, and/or factory-preset dot data.
  • the one or more processors are specifically configured to call the computer instructions to cause the electronic device to: split the initial dot data sequence into multiple split sequences according to a preset splitting rule, the preset splitting rule being used to divide the dot data sequence into different split sequences such that at least one clear intention can be determined for each split sequence according to the preset intent rule; use the multiple split sequences as multiple to-be-processed sequences and extract training data from them; and use the training data to train the preset multi-instance learning model to obtain the multi-instance learning model.
  • the one or more processors are further configured to call the computer instructions to cause the electronic device to: use the to-be-recognized dot data sequence to train the multi-instance learning model, and update the multi-instance learning model.
  • the embodiments of the present application also provide a chip system, the chip system is applied to an electronic device, the chip system includes one or more processors, the processor is used to call computer instructions to make the electronic device execute The method described in the sixth aspect and any possible implementation manner of the sixth aspect.
  • the embodiments of the present application also provide a computer program product containing instructions.
  • when the computer program product is run on an electronic device, the electronic device can execute the method described in the sixth aspect and any one of its possible implementations.
  • an embodiment of the present application further provides a computer-readable storage medium, including instructions which, when run on an electronic device, cause the electronic device to execute the method described in the sixth aspect and any one of its possible implementations.
  • the embodiment of the present application also provides a multi-instance learning model training method, including: using multiple split sequences or multiple subsequences as multiple to-be-processed sequences, and extracting training data from the multiple to-be-processed sequences; the multiple split sequences are obtained by the electronic device dividing the initial dot data sequence according to a first preset rule, and the multiple subsequences are output by the electronic device after inputting the dot data sequence into the multi-instance learning model; the first preset rule is used to divide the dot data sequence into different split sequences such that at least one clear intention can be determined for each split sequence according to a preset intent rule, the preset intent rule being used to determine the intent of a sequence based on the dot data in the sequence;
  • the dot data includes the user's operation data recorded by the electronic device and/or the response data of the electronic device to the user's operations; the training data includes bag labels and feature vectors.
  • the training device can extract the training data directly from the to-be-processed sequences to train the multi-instance learning model, without manually labelling the dot data as training data, which saves training-data labelling time and improves the training efficiency of the training device.
  • the method further includes: inputting the multiple to-be-processed sequences into the multi-instance learning model to obtain multiple subsequences; determining the value of the loss function of the multi-instance learning model after this round of training; determining whether the reduction of the value of the loss function obtained after this round of training, compared with the value obtained after the previous round of training, is less than a preset reduction; when the reduction is not less than the preset reduction, using the multiple subsequences as the multiple to-be-processed sequences and performing again the step of extracting training data from the multiple to-be-processed sequences; and when the reduction is less than the preset reduction, determining that the multi-instance learning model obtained in this round of training is the trained multi-instance learning model.
  • iterative training may thus be used to train the multi-instance learning model and obtain a more accurate multi-instance learning model.
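A compact sketch of that stopping criterion, where `train_round` is a hypothetical callable that performs one round of splitting and training and returns the new subsequences together with the loss value.

```python
def train_until_converged(train_round, sequences, preset_reduction=1e-3):
    """Iterative multi-instance training as described: keep re-splitting and
    retraining until the loss stops decreasing by more than the preset
    reduction, at which point the model from this round is the trained model."""
    prev_loss = float("inf")
    while True:
        sequences, loss = train_round(sequences)   # one round: extract data, train, re-split
        if prev_loss - loss < preset_reduction:    # reduction smaller than preset -> stop
            return sequences
        prev_loss = loss
```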
  • the method further includes: inputting a newly added dot data sequence into the multi-instance learning model to obtain multiple subsequences, the newly added dot data sequence being a dot data sequence composed of dot data newly added in the electronic device; using the multiple subsequences as multiple to-be-processed sequences and extracting training data from them; and using the training data to train the multi-instance learning model and update it.
  • the electronic device can thus use newly added dot data to train the multi-instance learning model and update it through incremental training, which improves the accuracy with which the multi-instance learning model splits subsequences.
  • the method further includes: determining the value of the loss function of the multi-instance learning model after this round of training; determining whether the reduction of the value of the loss function obtained after this round of training, compared with the value obtained after the previous round of training, is less than the preset reduction; when the reduction is not less than the preset reduction, using the multiple subsequences as the multiple to-be-processed sequences and performing the step of extracting training data from the multiple to-be-processed sequences; and when the reduction is less than the preset reduction, determining that the multi-instance learning model obtained in this round of training is the trained multi-instance learning model and updating the multi-instance learning model.
  • iterative training can be used to perform incremental training on the multi-instance learning model to obtain a more accurate multi-instance learning model.
  • extracting the training data from the multiple to-be-processed sequences specifically includes: determining the examples and example labels in the multiple to-be-processed sequences, where an example is composed of two adjacent pieces of dot data and an example label indicates whether the example is a positive example or a negative example; determining the bags and bag labels according to the multiple to-be-processed sequences, the examples, and the example labels, where a bag label indicates whether a bag is a positive bag or a negative bag, a positive bag contains examples composed of dot data within the same to-be-processed sequence, and a negative bag contains the example composed of the last piece of dot data in one to-be-processed sequence and the first piece of dot data in the next, consecutive to-be-processed sequence; and extracting the feature vector matrix of each bag and using the feature vector matrix of each bag together with the corresponding bag label as the training data.
  • self-labelling of the training data can thus be realized by determining the examples and example labels, determining the bags and bag labels, and extracting the feature vector matrix of each bag together with the corresponding bag label as the training data, which improves labelling efficiency.
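A minimal sketch of the bag construction described above, with feature extraction reduced to the raw adjacent pair for brevity (the application extracts a feature vector matrix per bag).

```python
def build_bags(to_be_processed):
    """Self-labelling: an example is a pair of adjacent dot data. Pairs inside
    one to-be-processed sequence form a positive bag (label 1); the boundary
    pair (last dot of one sequence, first dot of the next consecutive
    sequence) forms a negative bag (label 0)."""
    bags = []
    for i, seq in enumerate(to_be_processed):
        positives = [(seq[j], seq[j + 1]) for j in range(len(seq) - 1)]
        if positives:
            bags.append((positives, 1))                               # positive bag
        if i + 1 < len(to_be_processed):
            bags.append(([(seq[-1], to_be_processed[i + 1][0])], 0))  # negative bag
    return bags

print(build_bags([["open_app", "search", "tap"], ["back", "close"]]))
# [([('open_app', 'search'), ('search', 'tap')], 1),
#  ([('tap', 'back')], 0),
#  ([('back', 'close')], 1)]
```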
  • an embodiment of the present application also provides a training device, the training device including one or more processors and a memory; the memory is coupled with the one or more processors and is used to store computer program code, the computer program code including computer instructions; the one or more processors call the computer instructions to cause the training device to: use multiple split sequences or multiple subsequences as multiple to-be-processed sequences, and extract training data from the multiple to-be-processed sequences; the multiple split sequences are obtained by the electronic device dividing the initial dot data sequence according to the first preset rule, and the multiple subsequences are output by the electronic device after inputting the dot data sequence into the multi-instance learning model; the first preset rule is used to divide the dot data sequence into different split sequences such that at least one clear intention can be determined for each split sequence according to the preset intent rule, the preset intent rule being used to determine the intent of a sequence based on the dot data in the sequence.
  • the training device can extract the training data directly from the to-be-processed sequences to train the multi-instance learning model, without manually labelling the dot data as training data, which saves training-data labelling time and improves the training efficiency of the training device.
  • the one or more processors are also used to call the computer instructions to cause the training device to: input the multiple to-be-processed sequences into the multi-instance learning model to obtain multiple subsequences; determine the value of the loss function of the multi-instance learning model after this round of training; determine whether the reduction of the value of the loss function obtained after this round of training, compared with the value obtained after the previous round of training, is less than the preset reduction; when the reduction is not less than the preset reduction, use the multiple subsequences as the multiple to-be-processed sequences and perform again the step of extracting training data from the multiple to-be-processed sequences; and when the reduction is less than the preset reduction, determine that the multi-instance learning model obtained in this round of training is the trained multi-instance learning model.
• The one or more processors are also used to call the computer instructions to make the training device execute: inputting a newly added dot data sequence into the multi-instance learning model to obtain multiple sub-sequences, where the newly added dot data sequence is a dot data sequence composed of newly added dot data in the electronic device; using the multiple sub-sequences as multiple to-be-processed sequences and extracting training data from the multiple to-be-processed sequences; and using the training data to train the multi-instance learning model, so that the multi-instance learning model is updated.
• The one or more processors are also used to call the computer instructions to make the training device execute: determining the value of the loss function of the multi-instance learning model after the current round of training; determining whether the reduction of the value of the loss function obtained after the current round of training, compared with the value of the loss function obtained after the previous round of training, is smaller than the preset reduction range; when it is determined that the reduction is not smaller than the preset reduction range, using the multiple sub-sequences as the multiple to-be-processed sequences and performing the step of extracting training data from the multiple to-be-processed sequences; and when it is determined that the reduction is smaller than the preset reduction range, determining that the multi-instance learning model obtained in this round of training is the updated multi-instance learning model.
• The one or more processors are specifically configured to call the computer instructions to make the training device execute: determining examples and example labels in the multiple to-be-processed sequences, where an example is composed of two adjacent dot data, and an example label is used to indicate whether the example is a positive example or a negative example; determining packages and package labels according to the multiple to-be-processed sequences, the examples, and the example labels, where a package label is used to indicate whether the package is a positive package or a negative package; a positive package includes examples composed of dot data in the same to-be-processed sequence, and a negative package includes an example composed of the last dot data in one to-be-processed sequence and the first dot data in the next consecutive to-be-processed sequence; and extracting the feature vector matrix of each package, and using the feature vector matrix of each package and the corresponding package label as the training data.
• An embodiment of the present application also provides a method for generating training data, including: determining examples and example labels in multiple to-be-processed sequences, where the multiple to-be-processed sequences are multiple first sub-sequences or multiple second sub-sequences; the multiple first sub-sequences are obtained by the electronic device dividing the initial dot data sequence according to the first preset rule, and the multiple second sub-sequences are obtained by the electronic device inputting the dot data sequence into the multi-instance learning model and taking the output; the first preset rule is used to divide the dot data sequence into different sub-sequences, and a sub-sequence can determine at least one clear intention according to the second preset rule; the second preset rule is used to determine the intent of a sequence according to the dot data in the sequence; an example is composed of two adjacent dot data, where the dot data includes the user's operation data recorded by the electronic device and/or the response data of the electronic device to the user's operation; and an example label is used to indicate whether the example is a positive example or a negative example.
• In this way, the training device can determine the packages and package labels by extracting the examples and example labels from the sequences to be processed, then extract the feature vector matrix of each package, and use the feature vector matrix of each package together with the corresponding package label as the training data. Self-labeling of the training data is thus realized, and the labeling efficiency of the training data is improved.
• Extracting the feature vector matrix of each package and using the feature vector matrix of each package and the corresponding package label as the training data specifically includes: separately extracting the J-dimensional feature vector of each example in each package, where J is a positive integer; the J-dimensional feature vectors of the K examples in a package form the feature vector matrix of the package, and the feature vector matrix of the package and the package label of the package are used as one piece of the training data, where K is a positive integer.
• The J-dimensional feature vector is used to represent: the text features of the example, and/or the context features of the example, and/or the unique features of each dot data in the example, and/or the statistical features of the dot data in the example.
• In this way, the J-dimensional feature vector of an example can include features of various aspects of the example, so that the training data contains more information, which improves the training effect when the training data is used to train the multi-instance learning model.
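• For illustration, assuming a placeholder extract_features function for the text, context, unique, and statistical features mentioned above, one piece of training data could be assembled as follows:

```python
import numpy as np

def package_to_training_datum(package, package_label, extract_features):
    """Map each of the K examples in a package to a J-dimensional feature
    vector and stack them into a (K, J) feature vector matrix; the matrix
    plus the package label form one piece of training data."""
    vectors = [extract_features(example) for example in package]  # K vectors of length J
    feature_matrix = np.stack(vectors)                            # shape (K, J)
    return feature_matrix, package_label
```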
• An embodiment of the present application also provides a training device. The training device includes: one or more processors and a memory; the memory is coupled with the one or more processors and is used to store computer program code, and the computer program code includes computer instructions; the one or more processors call the computer instructions to make the training device execute: determining examples and example labels in multiple to-be-processed sequences, where the multiple to-be-processed sequences are multiple first sub-sequences or multiple second sub-sequences; the multiple first sub-sequences are obtained by the electronic device dividing the initial dot data sequence according to the first preset rule, and the multiple second sub-sequences are obtained by the electronic device inputting the dot data sequence into the multi-instance learning model and taking the output; the first preset rule is used to divide the dot data sequence into different sub-sequences, and a sub-sequence can determine at least one clear intention according to the second preset rule; the second preset rule is used to determine the intent of a sequence according to the dot data in the sequence.
• In this way, the training device can determine the packages and package labels by extracting the examples and example labels from the sequences to be processed, then extract the feature vector matrix of each package, and use the feature vector matrix of each package together with the corresponding package label as the training data. Self-labeling of the training data is thus realized, and the labeling efficiency of the training data is improved.
• The one or more processors are specifically configured to call the computer instructions to make the training device execute: extracting the J-dimensional feature vector of each example in each package, where J is a positive integer; the J-dimensional feature vectors of the K examples in a package constitute the feature vector matrix of the package, and the feature vector matrix of the package and the package label of the package are used as one piece of the training data, where K is a positive integer.
• The J-dimensional feature vector is used to represent: the text features of the example, and/or the context features of the example, and/or the unique features of each dot data in the example, and/or the statistical features of the dot data in the example.
  • an embodiment of the present application provides a method for executing a rule engine.
• The method may include: determining the first fact data input into the rule engine; obtaining, according to a first attribute of the first fact data, a first semantic object from memory to match the first fact data, where the first attribute is used to characterize the change frequency of the first fact data; determining the second fact data input into the rule engine; obtaining, according to a second attribute of the second fact data, a second semantic object from a file to match the second fact data, where the second attribute is used to characterize the change frequency of the second fact data and is different from the first attribute; and determining, according to a first matching result corresponding to the first fact data and a second matching result corresponding to the second fact data, whether to perform a first operation.
• In this way, based on the attribute of the fact data, it is determined whether to load the semantic object from memory or from a file, and the fact data is matched based on the determined semantic object. Thus, in the rule engine, one part of the semantic objects used to match fact data can be stored in memory while the other part is stored in files, which can release some redundant memory, reduce the memory overhead during the operation of the rule engine, and improve the capability of the rule engine.
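• As a rough sketch of this memory/file split (the attribute values, index structures, and helper callables are assumptions for illustration):

```python
from dataclasses import dataclass

@dataclass
class Fact:
    kind: str        # e.g. "time", "location", "age", "season"
    attribute: str   # "high" = changes frequently, "low" = changes rarely
    value: object

def match_fact(fact, memory_index, file_index, load_from_file):
    """Load the semantic object from memory for frequently changing fact
    data, and from a file for rarely changing fact data, then match."""
    if fact.attribute == "high":
        semantic_object = memory_index[fact.kind]                # kept in memory
    else:
        semantic_object = load_from_file(file_index[fact.kind])  # loaded on demand
    return semantic_object.matches(fact)
```

• In this sketch, the first operation would be performed only when the matching results for both pieces of fact data indicate success, mirroring the decision step described above.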
• The rule engine includes a first node, and the first node includes at least a first type node and a second type node, where the first type node is related to the first attribute and the second type node is related to the second attribute.
• Obtaining the first semantic object from the memory to match the first fact data specifically includes: obtaining the first semantic object from the memory indicated by the first semantic index of the first type node corresponding to the first attribute, and matching the first fact data based on the first semantic object. Obtaining the second semantic object from the file to match the second fact data according to the second attribute of the second fact data specifically includes: obtaining the second semantic object from the file indicated by the second semantic index of the second type node corresponding to the second attribute, and matching the second fact data based on the second semantic object.
• Before the first semantic object is obtained from the memory indicated by the first semantic index according to the first semantic index of the first type node corresponding to the first attribute, the method further includes: determining that the number of changes of the first fact data recorded in the first type node is different from the number of changes of the first fact data input to the rule engine.
• In this way, the semantic object is loaded from memory for matching only when the fact data has changed, which avoids frequently loading semantic objects and improves matching efficiency.
• Before the second semantic object is obtained from the file indicated by the second semantic index according to the second semantic index of the second type node corresponding to the second attribute, the method further includes: determining that the number of changes of the second fact data recorded in the second type node is different from the number of changes of the second fact data input to the rule engine.
• The method further includes one or more of the following: when it is determined that the number of changes of the first fact data recorded in the first type node is the same as the number of changes of the first fact data input to the rule engine, using the previous matching result recorded by the first type node as the first matching result; and when it is determined that the number of changes of the second fact data recorded in the second type node is the same as the number of changes of the second fact data input to the rule engine, using the previous matching result recorded by the second type node as the second matching result.
• The method further includes one or more of the following: when reconstructing the rules in the rule engine, determining the first number of changes of the first fact data recorded in the first type node, and if the first number of changes is less than a preset number threshold, switching the first type node to a second type node; and, when reconstructing the rules in the rule engine, determining the second number of changes of the second fact data recorded in the second type node, and if the second number of changes is greater than the preset number threshold, switching the second type node to a first type node.
• In this way, the node type is switched so that a semantic object corresponding to fact data with a low change frequency does not persistently occupy memory, and the problem of slow loading when a semantic object corresponding to fact data with a high change frequency is loaded from a file is also avoided.
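• The type switching on rule reconstruction can be sketched as follows; the threshold value and node fields are illustrative assumptions:

```python
PRESET_THRESHOLD = 10  # illustrative preset number threshold

def switch_node_type_on_rebuild(node):
    """Called when the rules in the rule engine are reconstructed. A
    first-type node loads its semantic object from memory; a second-type
    node loads it from a file."""
    if node.node_type == "first" and node.change_count < PRESET_THRESHOLD:
        node.node_type = "second"  # rarely changing: stop occupying memory
    elif node.node_type == "second" and node.change_count > PRESET_THRESHOLD:
        node.node_type = "first"   # frequently changing: avoid slow file loads
```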
• The rule engine includes a second node. Determining whether to perform the first operation according to the first matching result corresponding to the first fact data and the second matching result corresponding to the second fact data specifically includes: when the first matching result indicates that the matching is successful and the second matching result indicates that the matching is successful, obtaining a third semantic object from the file indicated by the semantic index of the second node, and performing the first operation corresponding to the third semantic object.
• In this way, the semantic object required to execute the corresponding rule can be persisted in a file, which prevents the semantic object from occupying memory for a long time and can release some redundant memory.
  • the first fact data includes at least one of time and location; the second fact data includes at least one of age and season.
• The first operation includes one or more of the following: reminding about the weather, reminding about road conditions, reminding the user to rest, entertain, or work, recommending a manual, and preloading actions or services.
• An embodiment of the present application also provides a rule engine. The rule engine includes a first node, and the first node includes at least a first type node and a second type node. The first type node is used to obtain, according to the first attribute of the first fact data input into the rule engine, a first semantic object from memory to match the first fact data and obtain a first matching result, where the first attribute is used to characterize the change frequency of the first fact data. The second type node is used to obtain, according to the second attribute of the second fact data input into the rule engine, a second semantic object from a file to match the second fact data and obtain a second matching result, where the second attribute is used to characterize the change frequency of the second fact data.
  • the rule engine may be an artificial intelligence (Artificial Intelligence, AI) model.
• In this way, in the rule engine, the semantic objects of some nodes are stored in memory and the semantic objects of the other nodes are stored in files, thereby releasing some redundant memory, reducing the memory overhead during the operation of the rule engine, and improving the capability of the rule engine.
• The first type node is specifically used to obtain, according to the first semantic index corresponding to the first attribute, the first semantic object from the memory indicated by the first semantic index, and to match the first fact data based on the first semantic object; the second type node is specifically used to obtain, according to the second semantic index corresponding to the second attribute, the second semantic object from the file indicated by the second semantic index, and to match the second fact data based on the second semantic object.
• Before obtaining the first semantic object from the memory to match the first fact data, the first type node is also used to determine that the number of changes of the first fact data recorded in the first type node is different from the number of changes of the first fact data input to the rule engine.
• Before obtaining the second semantic object from the file to match the second fact data, the second type node is also used to determine that the number of changes of the second fact data recorded in the second type node is different from the number of changes of the second fact data input to the rule engine.
• The first type node is also used to, when the number of changes of the first fact data recorded in the first type node is the same as the number of changes of the first fact data input to the rule engine, use the previous matching result recorded by the first type node as the first matching result.
• The second type node is also used to, when the number of changes of the second fact data recorded in the second type node is the same as the number of changes of the second fact data input to the rule engine, use the previous matching result recorded by the second type node as the second matching result.
• The rule engine further includes a second node; the second node is used to, when the first matching result indicates that the matching is successful and the second matching result indicates that the matching is successful, obtain a third semantic object from the file indicated by the semantic index of the second node and execute the first operation corresponding to the third semantic object.
  • the first fact data includes at least one of time and location; the second fact data includes at least one of age and season.
• The first operation includes one or more of the following: reminding about the weather, reminding about road conditions, reminding the user to rest, entertain, or work, recommending a manual, and preloading actions or services.
• An embodiment of the present application also provides a device for executing a rule engine, including: at least one memory, used to store a program; and at least one processor, used to execute the program stored in the memory. When the program stored in the memory is executed, the processor is used to execute the method provided in the seventh aspect.
• The embodiments of the present application also provide a computer storage medium. The computer storage medium stores instructions which, when executed on a computer, cause the computer to execute the method provided in the seventh aspect.
• The embodiments of the present application also provide a computer program product containing instructions which, when run on a computer, cause the computer to execute the method provided in the seventh aspect.
  • an embodiment of the present application also provides a rule engine execution device, which runs computer program instructions to execute the method provided in the seventh aspect.
  • the device may be a chip or a processor.
  • the device may include a processor, which may be coupled with a memory, read instructions in the memory and execute the method as provided in the seventh aspect according to the instructions.
  • the memory may be integrated in the chip or the processor, or may be independent of the chip or the processor.
  • Fig. 1 is a schematic diagram of a scene of intention recognition in the prior art
  • Figure 2 is a schematic diagram of an entity recognition scenario in an embodiment of the present application.
  • FIG. 3 is a schematic diagram of a relationship between an intention and a slot in an embodiment of the present application
  • FIG. 4 is a schematic diagram of a scenario in which dot data is generated in an embodiment of the present application.
  • FIG. 5 is a schematic diagram of another scenario for generating dot data in an embodiment of the present application.
  • Fig. 6 is an exemplary schematic diagram of a dot data sequence in an embodiment of the present application.
  • FIG. 7 is an exemplary schematic diagram of dividing the dot data sequence into sub-sequences in an embodiment of the present application.
  • FIG. 8 is another exemplary schematic diagram of dividing the dot data sequence into sub-sequences in an embodiment of the present application.
  • Fig. 9 is an exemplary schematic diagram of using a multi-instance learning model in an embodiment of the present application.
  • FIG. 10 is an exemplary schematic diagram of dot data in an embodiment of the present application.
  • FIG. 11 is a schematic diagram of the basic structure of a knowledge graph provided by an embodiment of the present application.
  • FIG. 12 is a formal schematic diagram of the model learning target on the node device side in an embodiment of the present application.
  • FIG. 13 is a schematic diagram of an exemplary structure of an electronic device in an embodiment of the present application.
  • Fig. 14 is a block diagram of an exemplary software structure of an electronic device in an embodiment of the present application.
  • FIG. 15 is a block diagram of an exemplary software structure of an intention recognition decision-making system in an embodiment of the present application.
  • FIG. 16 is a schematic diagram of an intention recognition scene in an embodiment of the present application.
  • FIG. 17 is a schematic diagram of a rule topology diagram in a rule engine provided by an embodiment of the present application.
  • FIG. 18 is a schematic diagram of the structure of a mode node in the rule topology diagram shown in FIG. 17;
  • FIG. 19 is a schematic diagram of type switching between mode nodes and result nodes in the rule topology diagram shown in FIG. 17;
• FIG. 20 is a schematic diagram of another rule topology diagram in the rule engine provided by an embodiment of the present application.
  • FIG. 21 is a schematic flowchart of a method for executing a rule engine according to an embodiment of the present application.
  • FIG. 22 is a schematic structural diagram of a rule engine provided by an embodiment of the present application.
  • FIG. 23 is a schematic diagram of a data flow in the training method of a multi-example learning model in an embodiment of the present application.
  • FIG. 24 is a schematic flowchart of a training method of a multi-example learning model in an embodiment of the present application.
  • FIG. 25 is an exemplary schematic diagram of determining an example and an example label in an embodiment of the present application.
  • FIG. 26 is an exemplary schematic diagram of determining a package and a package label in an embodiment of the present application.
  • FIG. 27 is an exemplary schematic diagram of extracting a feature vector matrix of a packet in an embodiment of the present application.
  • FIG. 28 is an exemplary schematic diagram of training a multi-example learning model in an embodiment of the present application.
  • FIG. 29 is an exemplary schematic diagram of a multi-example learning model dividing a sequence to be processed into sub-sequences in an embodiment of the present application.
  • FIG. 30 is an exemplary schematic diagram of iterative training of a multi-example learning model in an embodiment of the present application.
  • FIG. 31 is an exemplary schematic diagram of iteratively generating sub-sequences of a multi-example learning model in an embodiment of the present application
  • FIG. 32 is a schematic diagram of a data flow in the update process of a multi-example learning model in an embodiment of the present application.
  • FIG. 33 is a schematic flowchart of an update process of a multi-example learning model in an embodiment of the present application.
  • FIG. 34 is an interactive schematic diagram of a training method of a multi-example learning model in an embodiment of the present application.
  • FIG. 35 is an interactive schematic diagram of the update training process of the multi-example learning model in the embodiment of the present application.
  • FIG. 36 is a schematic diagram of an artificial intelligence main body framework provided by an embodiment of the present application.
  • FIG. 37 is a schematic diagram of an application environment provided by an embodiment of the present application.
  • FIG. 38 is a schematic diagram of another application environment provided by an embodiment of the present application.
  • FIG. 39 is a schematic diagram of a neural network-based data processing method provided by an embodiment of the present application.
  • FIG. 40 is another schematic diagram of a neural network-based data processing method provided by an embodiment of the present application.
  • Figure 41a is another schematic diagram of a neural network-based data processing method provided by an embodiment of the present application.
  • Figure 41b is another schematic diagram of a neural network-based data processing method provided by an embodiment of the present application.
  • FIG. 42 is a schematic diagram of an architecture of a joint learning system in an embodiment of the present application.
  • FIG. 43 is a schematic flowchart of steps of an embodiment of a model training method in an embodiment of the present application.
  • FIG. 44a is a schematic diagram of a group coarse-grained model and a coarse-grained label mapping in an embodiment of the present application
  • FIG. 44b is a schematic diagram of the joint model of the group coarse-grained model and the fine-grained model and the fine-grained label mapping in an embodiment of the present application;
  • FIG. 45 is a schematic diagram of the end-cloud collaboratively updating the group coarse-grained model and the individual coarse-grained model in an embodiment of the present application;
  • FIG. 46a is a schematic diagram of individual coarse-grained model and coarse-grained label mapping in an embodiment of the present application.
  • FIG. 46b is a schematic diagram of a joint model of a group coarse-grained model, an individual coarse-grained model, and a fine-grained model and a fine-grained label mapping in an embodiment of the present application;
  • FIG. 47 is a schematic diagram of data flow of the intention recognition method in an embodiment of the present application.
  • FIG. 48 is a schematic flowchart of an intention recognition method in an embodiment of the present application.
  • FIG. 49 is an exemplary schematic diagram of a multi-example learning model dividing an input sequence into sub-sequences in an embodiment of the present application
  • FIG. 50 is one of the schematic flowcharts of an intention recognition method provided by an embodiment of the present application.
  • FIG. 51 is a second schematic flowchart of an intention recognition method provided by an embodiment of the present application.
  • FIG. 52 is one of the schematic diagrams showing the content of the target intention provided by an embodiment of the present application.
  • FIG. 53 is the second schematic diagram showing the content of the target intention provided by the embodiment of the present application.
  • FIG. 54 is the third schematic flowchart of an intention recognition method provided by an embodiment of the present application.
  • FIG. 55 is one of the schematic diagrams of the target operation provided by the embodiment of the present application.
  • FIG. 56 is the second schematic diagram of the target operation provided by the embodiment of the present application.
  • FIG. 57 is the third schematic diagram of the target operation provided by the embodiment of the present application.
  • FIG. 58 is a schematic diagram of a scene in which candidate intentions change according to an embodiment of the present application.
  • FIG. 59 is a schematic flowchart of an intention recognition method in an embodiment of the present application.
  • FIG. 60 is a schematic diagram of an example of a distributed scenario in which multiple devices are interconnected in an embodiment of the present application.
  • FIG. 61 is a schematic diagram of an information flow of entity extension in an embodiment of the present application.
  • Fig. 62 is a schematic diagram of an information flow intended to be expanded in an embodiment of the present application.
  • FIG. 63 is a schematic diagram of an exemplary structure of another electronic device in an embodiment of the present application.
• The terms "first" and "second" are used for descriptive purposes only, and cannot be understood as indicating or implying relative importance or implicitly specifying the number of indicated technical features. Therefore, a feature defined with "first" or "second" may explicitly or implicitly include one or more such features. In the description of the embodiments of the present application, unless otherwise specified, "multiple" means two or more.
• Single-modal input refers to using data from only a single input method; for example, only data detected by a sensor is used, or only data input by the user is used. Multi-modal input means that data from multiple input methods can be used.
  • electronic devices generally have multiple data input methods such as user operation input, environment perception input, text input, voice input, and visual input.
  • the multi-modal input may also include data input obtained from other smart devices interconnected with the electronic device.
  • the specific interconnection method is not limited, and it may be a direct point-to-point connection, such as a Bluetooth connection, a local area network connection, or an Internet connection.
• For example, the electronic device can obtain the user's voice control commands from a connected smart speaker as one input method, obtain the user's song playlist from the connected smart speaker as another input method, obtain corresponding data from a connected TV, obtain the user's most frequently used temperature from a connected air conditioner as an input method, and obtain recognized person information from a connected camera as an input method, and so on, which is not limited here.
• Multi-modal input refers to being able to use data from these different input methods. In some cases, the multi-modal input can use all of the input data; in some cases, the multi-modal input includes at least two kinds of input data; and in some cases, data may be obtainable for only one input method, which is specifically determined according to the current input environment and requirements. Multi-modal input is therefore not necessarily limited to using data from two or more input methods.
• Multi-modal input is used in the embodiments of this application because the entity learning framework (including entity recognition and context) requires a sufficiently accurate description of the state of the environment, whereas some devices are limited by objective factors such as hardware performance and available resources: their ability to perceive and describe the environment is weak (for example, low accuracy or high noise), or they can only observe and describe certain specific environments. Therefore, it is necessary to integrate the information obtained by these devices to provide a complete description of the environment.
• Context, in programming languages, generally refers to the environment related to the current job, for example, the previous state and the next state related to the current operation.
• In the embodiments of the present application, the context information generally refers to the data in the electronic device at the current moment, together with the data in the electronic device within a time pane before the current moment. The time pane refers to a period of time.
  • entities refer to objects, things, or actions that exist objectively in the real world and can be distinguished from each other.
  • an entity can be considered an instance of a certain concept.
• For example, "person name" is a concept, or entity type, and "Xiao Ming" is an entity of the "person name" type; "time" is an entity type, and "Mid-Autumn Festival" is an entity of the "time" type.
• FIG. 2 is a schematic diagram of an entity recognition scenario. As shown in FIG. 2, captured photos are mapped to different object entities, such as students, hats, and coats, through an object recognition algorithm; the applications the user has opened in the history can be mapped to entities such as games, entertainment, videos, and food through application market classification; and dialogue recognized from voice or input text can be mapped to action and location entities such as booking air tickets, Nanjing, and Shanghai.
• Let Ωm denote the entity space corresponding to the m-th modal input, and let φm denote the mapping function from the m-th modal input to that entity space: φm: Xm → Ωm (in some scenarios, other Xm can be used as augmentation). The mapping φm can be obtained by collecting annotated data and learning it with learning algorithms, or it can be obtained using manually preset rules, such as the manual classification and labeling of applications in an application market mentioned above. Similarly, φ: X → Ω denotes the mapping function from the input X to the unified feature space Ω.
• The entity can be stored in the electronic device in the form of [entity identifier (id), entity name, entity representation]. The entity id is used to uniquely identify an entity; the entity name corresponds to the noun for the object, thing, or action in the real world, and the entity name may or may not be present; the entity representation is composed of some feature (embedding) vectors and is used to represent the characteristics of the entity. The entity representation may also be composed of feature vectors in other forms, such as a text form, which is not limited here.
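• For illustration, the [entity id, entity name, entity representation] triple could be held in a structure like the following (the field types are assumptions):

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Entity:
    entity_id: str               # uniquely identifies the entity
    name: Optional[str]          # real-world noun; may be absent
    representation: List[float]  # feature (embedding) vector
```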
• Entity recognition is the process of identifying the desired types of entities from the obtained input data.
• Entity recognition can be performed through entity learning or through preset rules, which is not limited here. There are many ways to implement entity recognition, and different entity recognition methods can be used for different input types. For example, word segmentation and deep conditional random fields can be used for entity recognition on text input data; the fast target detection algorithm (FastRCNN) can be used for entity recognition on visual input data; profiling data can be extracted for entity recognition on user operations; sensor application programming interfaces (APIs) can be called to perform entity recognition on environment perception data; and named entity recognition (NER) can be used to perform entity recognition on voice input data. It can be understood that, for each input type, many different machine learning techniques can be used for entity recognition, for example, logistic regression, which is not limited here.
• An entity sequence refers to a collection of entities identified within a period of time, and it contains at least one entity.
• For example, suppose entity recognition is triggered at this moment, and the length of the time pane for this entity recognition is 30 seconds. If the entities identified in these 30 seconds are entering the garage and approaching the vehicle at 8 o'clock in the morning, the content of this entity recognition can form an entity sequence [enter the garage; approach the vehicle; time is 8 a.m.]. If the entity sequence formed after the previous entity recognition was triggered is [open Alipay; make a payment; receive a shopping message], the two can form a longer entity sequence: [open Alipay; make a payment; receive a shopping message; enter the garage; approach the vehicle; time is 8 a.m.].
• The entities in an entity sequence may or may not have sequential characteristics.
• In an entity sequence without sequential characteristics, the entities can exchange storage locations at will without affecting the sequence being recognized as the same entity sequence. For example, the entity sequence [enter the garage; approach the vehicle; time is 8 a.m.] and the entity sequence [time is 8 a.m.; enter the garage; approach the vehicle] can be regarded as the same entity sequence.
• In an entity sequence with sequential characteristics, the entity sequence [enter the garage; approach the vehicle; time is 8 a.m.] and the entity sequence [time is 8 a.m.; enter the garage; approach the vehicle] are regarded as different entity sequences.
• For an entity sequence with sequential characteristics, there are many ways to determine the order of the entities. The entities can be sorted according to the time at which they were identified; for example, if the identified entities are, in time order, entering the garage, approaching the vehicle, and the time being 8 a.m., a time-sorted entity sequence [enter the garage; approach the vehicle; time is 8 a.m.] can be formed. Alternatively, the electronic device can store an entity priority list and, according to it, sort the identified entities from high to low or from low to high priority, with entities of the same priority sorted in a pre-stored default order, to form an entity sequence.
• For example, a priority-sorted entity sequence [time is 8 a.m.; enter the garage; approach the vehicle] can be formed. For entity sequences with sequential characteristics, there can be many ways to determine the order of the entities, which are not limited here.
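• A minimal sketch of the priority-based ordering, assuming illustrative priority values and a placeholder type_of classifier:

```python
PRIORITY = {"time": 0, "location": 1, "action": 2}  # lower value = higher priority

def sort_by_priority(entities, type_of):
    """Stable sort: entities sharing a priority keep the pre-stored default
    order in which they were supplied."""
    return sorted(entities, key=lambda e: PRIORITY.get(type_of(e), len(PRIORITY)))

# e.g. with type_of("time is 8 a.m.") == "time", the sequence
# ["enter the garage", "approach the vehicle", "time is 8 a.m."]
# sorts to ["time is 8 a.m.", "enter the garage", "approach the vehicle"].
```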
• Intent refers to the electronic device's recognition of what the user's actual or potential needs are.
• In some cases, intent recognition is a classifier that assigns user needs to a certain type; or, intent recognition is a ranker, which sorts the set of potential user needs by likelihood.
• Intent recognition, also known as spoken utterance classification (SUC), classifies, as the name suggests, the natural language utterances input by the user, and each category corresponds to a user intention. For example, for "How is the weather today", the intent is "ask about the weather".
  • intent recognition can be regarded as a typical classification problem.
• The classification and definition of intents can refer to the ISO-24617-2 standard, which contains 56 detailed definitions. The definition of intent is closely related to the positioning of the system itself and the knowledge base it possesses; that is, the definition of intent has very strong domain relevance. It can be understood that, in the embodiments of the present application, the classification and definition of intents are not limited to the ISO-24617-2 standard.
• A slot is a parameter of an intention. An intent may correspond to several slots; for example, when asking about a bus route, necessary parameters such as the departure place, destination, and time need to be provided. These parameters are the slots corresponding to the "ask about the bus route" intention.
• The main goal of the semantic slot filling task is to extract the values of the semantic slots pre-defined in the semantic frame from the input sentence, on the premise that the semantic frame of a specific domain or specific intention is known.
• The semantic slot filling task can be transformed into a sequence labeling task, that is, using the classic IOB notation to mark each word as the beginning (B), the continuation (I, inside), or not part (O, outside) of a certain semantic slot.
• The intent and slots let the system know which specific task to perform and give the types of the parameters needed to perform the task. For example, two slots may be defined as: slot 1: time, of type Date; slot 2: location, of type Location.
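• As an illustration of the IOB labeling described above (the sentence and tags are a made-up example, not from the source):

```python
tokens = ["How", "is", "the", "weather", "in", "Shanghai", "today"]
iob    = ["O",   "O",  "O",  "O",       "O",  "B-location", "B-time"]

def extract_slots(tokens, iob):
    """Collect slot values from IOB tags: B- starts a slot, I- continues
    it, O is outside any slot."""
    slots, current = {}, None
    for tok, tag in zip(tokens, iob):
        if tag.startswith("B-"):
            current = tag[2:]
            slots[current] = tok
        elif tag.startswith("I-") and current:
            slots[current] += " " + tok
        else:
            current = None
    return slots

# extract_slots(tokens, iob) -> {"location": "Shanghai", "time": "today"}
```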
  • Figure 3 is a schematic diagram of a relationship between an intention and a slot in an embodiment of the application.
  • two necessary slots are defined for the "Ask the weather” task, which are "time” and "location".
• For a single task, the above definition can meet the task requirements.
  • a system often needs to be able to handle several tasks at the same time.
• For example, a weather station should be able to answer both "inquire about the weather" questions and "inquire about the temperature" questions. In this case, an optimization strategy is to define a higher-level domain, for example, placing the "inquire about the weather" intent and the "inquire about the temperature" intent in the "weather" domain.
  • the domain can be simply understood as a collection of intents.
  • the advantage of defining the domain and performing domain recognition first is that it can constrain the scope of domain knowledge and reduce the search space for subsequent intent recognition and slot filling.
• Through natural language understanding (NLU), the user's intent and the corresponding slot values can be identified from the user input.
  • the goal of intent recognition is to identify user intent from the input.
• A single task can simply be modeled as a binary classification problem; for example, the "ask about the weather" intent can be modeled during intent recognition as the binary classification "ask about the weather" versus "not ask about the weather". When the system needs to handle multiple tasks, it must be able to distinguish each intent; in this case, the binary classification problem becomes a multi-class classification problem.
  • the task of slot filling is to extract information from the data and fill it into the pre-defined slots.
• Assuming the intent and the corresponding slots have been defined, for a user input containing "today" and "Shanghai", the system should be able to extract "today" and "Shanghai" and fill them into the "time" and "location" slots, respectively.
  • Traditional machine learning models based on feature extraction have been widely used in slot filling tasks.
• In recent years, methods based on deep learning have gradually been applied to slot filling tasks; deep learning models can automatically learn hidden features of the input data. For example, the maximum entropy Markov model, which can utilize more contextual features, has been introduced into the slot filling process.
  • An action sequence can contain at least one action to be executed.
  • an action to be performed is an action or service that the device needs to perform.
  • a to-be-executed action may include at least a device ID and an action/service ID.
• The expression form of a to-be-executed action may be [serial number, device identifier, action/service], where the serial number can indicate the number of the to-be-executed action or its order in the action sequence, the device identifier indicates which device needs to execute the action, and the action/service indicates what action or service is to be executed.
• An action sequence can contain only one action to be executed, or it can contain multiple actions to be executed. The device identifiers in these actions can be the electronic device itself that determines the action sequence, or other electronic devices, which is not limited here.
• Most of the actions to be executed in an action sequence are preloaded actions/services, such as applications preloaded in the background; in actual applications, they can also be directly executed actions/services, such as connecting to Bluetooth, which is not limited here.
• For example, suppose the action sequence contains only one action to be executed, and the device identifier in this action is mobile phone A itself. The expression of the action may have a serial number, such as [1, mobile phone A, turn on Bluetooth], or no serial number, such as [mobile phone A, turn on Bluetooth]. Since there is only one to-be-executed action in the determined action sequence and its device identifier corresponds to mobile phone A itself, mobile phone A directly executes the action and turns on Bluetooth.
• For another example, suppose the action sequence contains multiple actions to be executed, and the device identifiers in these actions are all mobile phone A itself.
• Case 1: there is no serial number in the expression of these actions, or there is a serial number but it is only the number of the action and is not set as the execution order. For example, the 2 actions to be executed are [mobile phone A, turn on Bluetooth] [mobile phone A, turn on WIFI], or [1, mobile phone A, turn on Bluetooth] [2, mobile phone A, turn on WIFI]. Since the device identifiers in the two actions are both mobile phone A itself, mobile phone A executes the two actions, turning on Bluetooth and WIFI, without a strict restriction on the execution order.
• Case 2: there are serial numbers in the expression of these actions, and the serial numbers are set as the execution order. For example, the 2 actions to be executed are [1, mobile phone A, turn on Bluetooth] [2, mobile phone A, turn on WIFI]. Since the device identifiers of the two actions are both mobile phone A itself and the actions are numbered in execution order, mobile phone A turns on Bluetooth first and then turns on WIFI.
• For another example, suppose the action sequence contains multiple actions to be executed, and the device identifiers in these actions are all smart device B.
• Case 1: there is no serial number, or the serial number is only the number of the action and is not set as the execution order. For example, the 2 actions to be executed are [smart device B, switch to low temperature mode] [smart device B, dehumidify], or [1, smart device B, switch to low temperature mode] [2, smart device B, dehumidify]. Since the device identifiers in both actions are smart device B, mobile phone A can send two instructions to smart device B, or only one instruction, instructing the smart device to switch to low temperature mode and dehumidify, without restricting the execution order.
• Case 2: there are serial numbers, and the serial numbers are set as the execution order. For example, the 2 actions to be executed are [1, smart device B, wake up] [2, smart device B, dehumidify]. Since the device identifiers of both actions are smart device B and the actions are numbered in execution order, mobile phone A can send two instructions to smart device B, or only one; after receiving the instruction(s), smart device B first wakes up and then dehumidifies, in serial-number order.
• For another example, suppose the action sequence contains multiple actions to be executed whose device identifiers correspond to different devices.
• Case 1: there is no serial number, or the serial number is only the number of the action and is not set as the execution order. For example, the 3 actions to be executed are [smart device B, switch to low temperature mode] [mobile phone A, turn on Bluetooth] [smart device C, switch to eye protection mode], or [1, smart device B, switch to low temperature mode] [2, mobile phone A, turn on Bluetooth] [3, smart device C, switch to eye protection mode]. According to the devices corresponding to the device identifiers in the three actions, mobile phone A sends an instruction to smart device B, which switches to low temperature mode, performs the Bluetooth-on operation itself, and sends an instruction to smart device C, which switches to eye protection mode; the execution order of the three actions is not restricted.
• Case 2: there are serial numbers, and the serial numbers are set as the execution order. For example, the 3 actions to be executed are [1, smart device B, switch to low temperature mode] [2, mobile phone A, turn on Bluetooth] [3, smart device C, switch to eye protection mode]. According to the devices corresponding to the device identifiers and the serial numbers indicating the execution order, mobile phone A first sends an instruction to smart device B, which switches to low temperature mode, then performs the Bluetooth-on operation itself, and finally sends an instruction to smart device C, which switches to eye protection mode.
• For another example, suppose the action sequence contains multiple actions to be executed, where two of the device identifiers are smart device B and one is smart device C.
• Case 1: there is no serial number, or the serial number is only the number of the action and is not set as the execution order. For example, the 3 actions to be executed are [smart device B, switch to low temperature mode] [smart device B, ventilate] [smart device C, switch to eye protection mode], or [1, smart device B, switch to low temperature mode] [2, smart device B, ventilate] [3, smart device C, switch to eye protection mode]. According to the devices corresponding to the device identifiers in the three actions, mobile phone A can send one or two instructions to smart device B, which switches to low temperature mode and ventilates, and sends an instruction to smart device C, which switches to eye protection mode; the execution order of the three actions is not restricted.
• Case 2: there are serial numbers, and the serial numbers are set as the execution order. For example, the 3 actions to be executed are [1, smart device B, switch to low temperature mode] [2, smart device B, ventilate] [3, smart device C, switch to eye protection mode]. According to the devices corresponding to the device identifiers and the serial numbers indicating the execution order, mobile phone A first sends one or two instructions to smart device B, which first switches to low temperature mode and then ventilates, and finally sends an instruction to smart device C, which switches to eye protection mode.
  • the actions to be executed in the embodiments of the present application may be any of the above situations, which are not limited here.
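• The dispatch behavior in the cases above can be sketched as follows; the device name and the two callables are placeholders for illustration:

```python
SELF_DEVICE = "mobile phone A"  # the device that determined the action sequence

def execute_action_sequence(actions, perform_locally, send_instruction,
                            ordered=True):
    """actions: list of (serial number, device identifier, action/service)
    tuples. When ordered is True, serial numbers encode the execution
    order; otherwise the order is not restricted."""
    if ordered:
        actions = sorted(actions, key=lambda a: a[0])
    for _num, device, action in actions:
        if device == SELF_DEVICE:
            perform_locally(action)            # e.g. turn on Bluetooth
        else:
            send_instruction(device, action)   # e.g. smart device B: dehumidify
```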
• In the embodiments of the present application, one entity sequence can correspond to one intention or to multiple intentions. For example, if the intention corresponding to an entity sequence is playing a game, the intention corresponding to that entity sequence may also be entertainment.
  • two different entity sequences may correspond to two different intentions, or they may correspond to the same intention, which is not limited here.
• For example, an entity sequence [play, Doraemon, episode 4, turn on TV] can correspond to the intention "play video", with slots "device: TV", "content: Doraemon", and "episode: 4"; another, different entity sequence [8 a.m., turn on the light] can correspond to the intention "increase the ambient brightness", with slots "time: 8 a.m." and "device: lamp". Here, two different entity sequences correspond to two different intents and slots.
• Alternatively, an entity sequence [play, Doraemon, episode 4, turn on TV] can correspond to the intention "play video", with slots "device: TV", "content: Doraemon", and "episode: 4"; another, different entity sequence [play, Doraemon, episode 4, turn on the projector] can also correspond to the intention "play video", with slots "device: projector", "content: Doraemon", and "episode: 4". Two different entity sequences can thus correspond to the same intention.
• Regarding the correspondence between intents and action sequences: a group consisting of an entity sequence and an intent corresponds to an action sequence. For example, the group of the entity sequence [play, Doraemon, episode 4, turn on TV] and the intention to play a video can correspond to the action sequence [1, TV, player preloads Doraemon episode 4]; another group of the entity sequence [8 a.m., turn on the light] and the intention to increase the ambient brightness can correspond to the action sequence [1, smart curtain, open curtain]. Each group of an entity sequence and an intent can correspond to an action sequence.
  • the dot data is the user's daily operation data recorded locally by the electronic device and/or the response data of the electronic device to the user's operation.
  • the dot data may be user operation data and/or response data to the user operation recorded after the electronic device executes the determined action to be performed.
• For example, if the action to be executed is to open application A, the electronic device can open application A; if the user does not use application A but closes it, the user's operation of closing application A is recorded; if the user uses application A, the user's operations using application A are recorded.
  • the input mode of the dot data can also be multi-modal input.
• When the user performs operations in the electronic device, such as inputting content, clicking a button, entering a page, opening a pop-up box, or opening an application, the electronic device records the user's operations and responds based on these operations. The user operations and response actions of the electronic device recorded by the electronic device are pieces of dot data.
  • Fig. 4 is a schematic diagram of a scenario in which dot data is generated in an embodiment of the application.
  • the process may be:
  • step 1 The user wakes up the voice assistant and tells the voice assistant to open the video application A;
  • step 2 the voice assistant opens the video application A according to the user's expression.
  • At least two dot data can be generated:
• Dot data 1: dot data generated when the voice assistant receives the user's statement that video application A is to be opened;
• Dot data 2: dot data generated when the electronic device opens video application A.
  • the process can be:
• step 1: the user operates the electronic device to return to the main interface;
• step 2: in response to the user's click, the music application is opened.
  • At least two more dot data can be generated:
• Dot data 4: dot data generated when the electronic device opens the music application.
• The dot data can be saved in a data exchange format, such as JS object notation (JSON), or saved in forms, databases, and the like; the dot data can also be saved in other ways, which is not limited here.
• The electronic device can also add tags to each piece of dot data to indicate its generation method and function. For example, the number of the dot data, the generation time, the source application, the intention, and so on can be marked, which is not limited here. Moreover, due to factors such as different applications or different operating environments, the tags added to each piece of dot data are often incomplete.
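• For illustration, one piece of dot data serialized with the JSON option mentioned above might look like this; the field names are assumptions, and in practice some tags may be missing:

```python
import json
import time

dot_datum = {
    "id": 1,                              # number of the dot data
    "timestamp": time.time(),             # generation time
    "source_app": "voice assistant",      # source application
    "event": "open video application A",  # recorded operation/response
    "intention": None,                    # tags are often incomplete
}
record = json.dumps(dot_datum)
```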
• Besides being generated when the user uses the voice assistant or directly opens an application, dot data can also be generated when the user performs other operations on the electronic device.
• FIG. 5 is a schematic diagram of another scenario in which dot data is generated in an embodiment of this application.
  • the process can be:
  • step 1 the user opens the browser
  • step 2 The user searches for keyword 1 in the default search engine that appears in the browser;
  • step 3 the user selects the desired search result 3 from multiple search results
  • step 4 the user views the content of the search result 3.
  • the electronic device can generate the following dot data:
• Dot data 5: the browser of the electronic device is opened;
• Dot data 6: keyword 1 is received in the default search engine;
• Dot data 7: search result 3 is selected among the multiple search results found for keyword 1;
• Dot data 8: the electronic device displays the content of search result 3.
  • the continuous multiple dot data stored in the electronic device forms a dot data sequence.
  • a dot data sequence such as [Dot Data 1] [Dot Data 2] [Dot Data 3] [Dot Data 4] is generated.
  • the dot data generated in the scene shown in Figure 4 can be saved continuously with the dot data generated in the scene shown in Figure 5 to generate [Dot data 1] [Dot data 2] [Dot data 3] [Dot data 4] [Dot data 5] [Dot data 6] [Dot data 7] [Dot data 8] Such a dot data sequence.
  • dot data sequence can be represented in the form of a list, an array, a matrix, etc., which is not limited here.
  • the dot data sequence generated by the continuous operation of the user often corresponds to the same intention.
• (a) and (b) in FIG. 4 indicate that the user's intention is to open the video application A.
• (c) and (d) in FIG. 4 indicate that the user's intention is to open the application Music.
• (a), (b), (c), and (d) in FIG. 5 indicate that the user's intention is to obtain the content of the search result 3.
  • the dot data sequence generated may contain multiple intents. It is difficult to use existing models or rules to predict which continuous dot data corresponds to which intent. However, by using the method in the embodiment of the present application, each intention in the dot data sequence can be more accurately identified.
  • the continuous operation of the user can be specifically understood as: the user has performed multiple operations and the time interval between the multiple operations is less than the first preset time interval.
• For example, the user may perform operation (c) in FIG. 4 within 2 seconds after performing operation (a) in FIG. 4, and then perform operation (a) in FIG. 5 within 2 seconds after performing operation (c) in FIG. 4.
  • the operation (a) in FIG. 4, the operation (c) in FIG. 4, and the operation (a) in FIG. 5 performed by the user can be referred to as the continuous operation of the user.
  • the embodiment of the application does not limit the dot data sequence to be generated by the user's continuous operation.
• That is, the dot data generated by the user's continuous operations can form a dot data sequence, and the dot data generated by the user's discontinuous operations can also form a dot data sequence. It is only that, for the dot data sequence composed of dot data generated by the user's continuous operations, it is difficult to predict which continuous dot data corresponds to which intention according to the conventional method using existing models or rules.
  • FIG. 6 is an exemplary schematic diagram of the dot data sequence in the embodiment of the application.
• The most common user operations are opening an application and returning to the main interface, and sometimes the voice assistant is used to perform some actions.
• FIG. 6 shows part of the dot data obtained from a real scene of a user operating an electronic device. For ease of viewing, the dot data of the voice assistant is marked as V, the dot data of an operation performed by the electronic device is marked as A, and the dot data of the electronic device returning to the desktop is marked as L.
  • FIG. 6 is an exemplary schematic diagram showing the relationship between the dot data sequence and the dot data, and does not mean that it is the storage and display mode of the dot data and the dot data sequence in practical applications.
  • the dot data and dot data sequence can be stored and displayed in the form of tables, arrays, matrices, databases, etc., which are not limited here.
  • the second preset rule is used to determine the intention of each sequence according to the dot data in each sequence.
  • the first preset rule is used to divide the dot data sequence into different sub-sequences, and a sub-sequence can at least determine a clear intention according to the second preset rule.
  • the first preset rule may also be referred to as a preset split rule
  • the second preset rule may also be referred to as a preset intention rule
  • the first preset rule and the second preset rule may be combined into one rule or rule set, or two rules or rule sets that run separately, which are not limited here.
  • the first preset rule and the second preset rule can be preset at the factory, or can be downloaded or updated from the server, which is not limited here.
• FIG. 7 is an exemplary schematic diagram of dividing the dot data sequence into sub-sequences in an embodiment of this application.
• the first preset rule is: the dot data generated by each series of continuous operations by the user from screen-on to screen-off is divided into one sub-sequence.
• the second preset rule is: the last application used before the screen turns off is the user's intention.
• the dot data of the sequence B1 segment is generated by a series of continuous operations from one screen-on to the subsequent screen-off;
• the dot data of the sequence B2 segment is generated by a series of continuous operations from another screen-on to the subsequent screen-off;
• the dot data of the sequence B3 segment is generated by a series of continuous operations from yet another screen-on to the subsequent screen-off.
  • the electronic device can divide the dot data sequence A1 into three sub-sequences: sub-sequence B1, sub-sequence B2, and sub-sequence B3.
  • each sub-sequence can at least determine a clear intention according to the second preset rule.
• the intent of sub-sequence B1 is determined by the last application used before the screen turns off: open the video application A.
• the intent of sub-sequence B2 is determined by the last application used before the screen turns off: open the recorder.
• the intent of sub-sequence B3 is determined by the last application used before the screen turns off: open the weather.
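• As a minimal sketch (not the embodiment's implementation), the two rules of FIG. 7 could be expressed as follows; the event type strings ("screen_on", "screen_off", "A") and the "pkg" field follow the hypothetical schema sketched earlier.

    # First preset rule of FIG. 7: the dot data generated between one
    # screen-on and the subsequent screen-off form one sub-sequence.
    def split_by_screen_session(dot_sequence):
        subsequences, current = [], []
        for dot in dot_sequence:
            if dot["type"] == "screen_off":
                if current:
                    subsequences.append(current)
                    current = []
            elif dot["type"] != "screen_on":
                current.append(dot)
        if current:
            subsequences.append(current)
        return subsequences

    # Second preset rule of FIG. 7: the last application used before the
    # screen turns off is taken as the user's intention.
    def intent_of(subsequence):
        apps = [dot["pkg"] for dot in subsequence if dot["type"] == "A"]
        return apps[-1] if apps else None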
• FIG. 8 is another exemplary schematic diagram of dividing the dot data sequence into sub-sequences in an embodiment of this application.
• the first preset rule is: adjacent dot data whose generation time interval is less than the preset dot time interval are divided into the same sub-sequence.
  • the second preset rule is: the last application opened in each sub-sequence is the user's intention.
• The time interval between each pair of adjacent dot data within the sequence C1 segment is less than the preset dot time interval, and the same holds within the sequence C2 segment and within the sequence C3 segment. The time interval between the last dot data of the sequence C1 segment and the first dot data of the sequence C2 segment is not less than the preset dot time interval, and the time interval between the last dot data of the sequence C2 segment and the first dot data of the sequence C3 segment is not less than the preset dot time interval.
  • the electronic device can divide the dot data sequence A2 into three sub-sequences: sub-sequence C1, sub-sequence C2, and sub-sequence C3.
  • each sub-sequence can at least determine a clear intention according to the second preset rule.
  • the intent of sub-sequence C1 is the last open application in the sub-sequence: open map navigation.
  • the intent of the sub-sequence C2 is the last open application in the sub-sequence: turn on the recorder.
  • the intent of the sub-sequence C3 is the last open application in the sub-sequence: open the weather.
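• A minimal sketch of the first preset rule of FIG. 8, assuming each piece of dot data carries a millisecond timestamp "tm" as in the hypothetical schema above; the default interval value is illustrative only.

    # Adjacent dot data generated less than the preset dot time interval
    # apart stay in the same sub-sequence; a larger gap starts a new one.
    def split_by_time_gap(dot_sequence, preset_interval_ms=60_000):
        subsequences, current = [], []
        for dot in dot_sequence:
            if current and dot["tm"] - current[-1]["tm"] >= preset_interval_ms:
                subsequences.append(current)
                current = []
            current.append(dot)
        if current:
            subsequences.append(current)
        return subsequences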
  • FIGS. 7 and 8 are two exemplary schematic diagrams of dividing the dot data into sub-sequences according to the first preset rule and the second preset rule in an embodiment of the present application.
• In actual applications, the first preset rule and the second preset rule can be set as needed, as long as the first preset rule divides the dot data sequence into different sub-sequences and at least one clear intention can be determined for each sub-sequence according to the second preset rule; this is not limited here.
• It can be understood that the second preset rule is only used to determine the intention of a sequence; the intention determined by the second preset rule may be one of multiple intentions of the sequence or the only intention of the sequence, which is not limited here.
  • the second preset rule may be to extract the intent information and slot information of the dot data from the sequence according to the deep learning model, so as to determine the intent of the sequence, which is not limited here.
• The multi-instance learning model is used to divide the continuous dot data in each sequence to be processed that may not belong to the same intention into different sub-sequences of smaller granularity, according to the possibility that the continuous dot data in each sequence to be processed belongs to the same intention, thereby obtaining multiple sub-sequences.
• The sequence to be processed may be a sub-sequence obtained by dividing the dot data sequence using the first preset rule, or may be a smaller-granularity sub-sequence obtained by dividing such a sub-sequence using the multi-instance learning model.
  • the sequence to be processed can also be understood as the dot data sequence input to the multi-example learning model.
  • the multi-instance learning model used in the embodiments of this application can be any multi-instance learning model, such as ORLR model, Citation-kNN model, MI-SVM model, C4.5-MI model, BP-MIP model, Ensemble Learning- MIP models, etc., are not limited here.
• Multi-instance learning was originally used for classifying drug molecular shapes and drug activity in the pharmaceutical field. Multi-instance learning takes a bag as the training unit, and a bag is a collection of instances (Instance, or Pair).
  • two adjacent pieces of dot data can form an example.
• Each example can have a label; example labels include positive (Positive) and negative (Negative).
  • An example with a positive example label can be called a positive example, and an example with a negative example label can be called a negative example.
  • the example composed of two adjacent dot data located in the same sequence to be processed is a positive example, and the example composed of two adjacent dot data located in different sequences to be processed is a negative example.
  • Two adjacent dot data may mean that the start times of the two dot data are adjacent.
• The purpose of an example is to determine whether continuous dot data corresponds to the same intention.
• The dot data in the same sequence to be processed correspond to the same intention, so the example they compose is marked as a positive example, which means that the two dot data are continuous.
• The dot data in different sequences to be processed correspond to different intentions, so the example they compose is marked as a negative example, which means that the two dot data are not continuous.
  • the training set is composed of a set of bags, each bag has a bag label, and the bag label includes positive and negative.
  • a package with a positive package label may be called a positive package, and a package with a negative package label may be called a negative package.
  • the package label can indicate whether the package is a positive package or a negative package, which is not limited here.
  • the multi-instance learning model can train the model using the features of the examples in the package and the package label, and finally use the trained model to predict the sample label of the unknown example.
  • the examples composed of the dot data in the same sequence to be processed can be collectively used as a positive packet, and the positive packet contains at least one positive example.
• An example consisting of the last dot data in a sequence to be processed and the first dot data in the next sequence to be processed that is continuous with it can be used as a negative packet, and the examples in the negative packet are all negative examples.
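• A minimal sketch of this bag construction, where each sequence to be processed is a list of dot data records:

    # One positive bag per sequence to be processed, one negative bag per
    # boundary between two consecutive sequences to be processed.
    def build_bags(sequences):
        bags = []  # list of (examples, bag_label) pairs
        for i, seq in enumerate(sequences):
            # adjacent dot data inside one sequence -> positive examples
            positives = [(seq[j], seq[j + 1]) for j in range(len(seq) - 1)]
            if positives:
                bags.append((positives, +1))
            # last dot data of this sequence + first of the next -> negative bag
            if i + 1 < len(sequences):
                bags.append(([(seq[-1], sequences[i + 1][0])], -1))
        return bags  # N sequences (each with >= 2 dot data) give 2N-1 bags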
• For example, for a dot data sequence [A][B][C][D][E], two adjacent dot data form an example, so 4 examples can be obtained: example [A, B], example [B, C], example [C, D], and example [D, E].
• Example [A, B] and example [B, C] are composed of two adjacent dot data in the same sequence to be processed (sub-sequence 1); therefore, example [A, B] and example [B, C] are both positive examples;
• Example [C, D] is composed of two adjacent dot data in different sequences to be processed (sub-sequence 1 and sub-sequence 2), so example [C, D] is a negative example;
• Example [D, E] is composed of two adjacent dot data in the same sequence to be processed (sub-sequence 2), so example [D, E] is a positive example.
• Example [A, B] and example [B, C], composed of the dot data [A][B][C] in the same sub-sequence 1, are regarded as one positive packet;
• Example [C, D], composed of the last dot data [C] in sub-sequence 1 and the first dot data [D] in sub-sequence 2 that is continuous with sub-sequence 1, is regarded as one negative packet;
• Example [D, E], composed of the dot data [D][E] in the same sub-sequence 2, is regarded as one positive packet.
• If the number of dot data in the dot data sequence is M, M-1 examples can be formed; if the number of sequences to be processed is N, 2N-1 packets can be obtained. Both M and N are positive integers.
  • this is an exemplary schematic diagram of using a multi-instance learning model to divide each sequence to be processed into smaller-granularity sequences in an embodiment of this application.
  • the two obtained to-be-processed sequences are:
• Sequence to be processed I1: 1V, 2A, 3L, 4A, 5V, 6A, 7L, 8A, 9L, 10A, 11L;
• Sequence to be processed I2: 12V, 13A, 14L, 15V, 16A, 17L, 18V, 19A, 20L, 21A.
  • the two to-be-processed sequences I1 and I2 can generate 3 packages, respectively:
• B1 Positive package, including 10 positive examples: [1V, 2A] [2A, 3L] [3L, 4A] [4A, 5V] [5V, 6A] [6A, 7L] [7L, 8A] [8A, 9L] [9L, 10A] [10A, 11L];
• B2 Negative package, including 1 negative example: [11L, 12V];
• B3 Positive package, including 9 positive examples: [12V, 13A] [13A, 14L] [14L, 15V] [15V, 16A] [16A, 17L] [17L, 18V] [18V, 19A] [19A, 20L] [20L, 21A].
• The feature extraction method in the embodiment of the present application can be used to extract the features of each example in each of the packages B1, B2, and B3 to obtain the feature vector of each example.
• The dimension of the feature vector of each example is J, and if there are K examples in a package, the features extracted from the package can form a J×K feature vector matrix.
• For the specific process of extracting the features of the examples and composing the feature vector matrix, refer to the description of the term "(10) dot data sequence package and the feature vector matrix of the package" below, which will not be repeated here.
• During training, one package can be used as one training unit: the feature vector matrix of a package and the package label of that package are input into the multi-instance learning model for training. For example, first input the feature vector matrix of B1 and the bag label of B1, then input the feature vector matrix of B2 and the bag label of B2, then input the feature vector matrix of B3 and the bag label of B3, and so on.
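• A minimal sketch of this bag-by-bag training loop; "model" stands for any multi-instance learner, its partial_fit interface is an assumption rather than a real library API, and extract_features stands for the feature extraction described under the terms below.

    import numpy as np

    # One bag is one training unit: the J x K feature vector matrix of the
    # bag and the bag label are fed to the multi-instance learner together.
    def train_on_bags(model, bags, extract_features):
        for examples, bag_label in bags:
            # J features per example, K examples per bag -> J x K matrix
            feature_matrix = np.stack(
                [extract_features(x, y) for x, y in examples], axis=1)
            model.partial_fit(feature_matrix, bag_label)
        return model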
  • the multi-example model obtained by training can be used to divide the to-be-processed sequences I1 and I2 into smaller-granularity sub-sequences.
• The trained model can directly predict the sample label of an example. Therefore, the sequence to be processed can be directly input into the multi-instance learning model to predict the sample label of each example in the sequence to be processed. According to the sample labels, the sequence to be processed can be divided into smaller-granularity sequences, and each such sequence corresponds to an independent intent.
  • the to-be-processed sequences I1 and I2 are input to the trained multi-instance learning model and then divided into smaller-granularity sub-sequences:
  • Subsequence i1 1V, 2A, 3L, 4A;
  • Subsequence i2 5V, 6A, 7L;
  • Subsequence i6 15V, 16A, 17L;
  • the second preset rule can also be used to determine the intent of each subsequence.
  • the loss function is a measure of how well the predictive model performs in terms of predicting the expected result.
  • Each machine learning model has its corresponding loss function. The better the prediction result of the model, the smaller the value of the loss function.
• After the multi-instance learning model has been trained and the sequence to be processed has been divided into smaller-granularity sequences, the electronic device may also continue to use the smaller-granularity sequences obtained by the division as the sequences to be processed, and iteratively train the multi-instance learning model, thereby dividing the current sequences to be processed into sequences of still smaller granularity.
• During iterative training, the electronic device can obtain the value of the loss function of the multi-instance learning model.
• Based on the value of the loss function, the electronic device can determine when continuing to train the multi-instance model with the existing dot data sequence no longer brings much gain, and can then take the finally obtained multi-instance learning model as the trained multi-instance learning model.
  • the electronic device can use the trained multi-example learning model to perform sequence division on the new dot data sequence.
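• A minimal sketch of this iterative refinement, assuming stand-in callables: train_round retrains the model on bags built from the current sequences (for example with the bag-construction and training-loop sketches above), loss_of evaluates the loss function, and model.split re-divides the sequences by predicted sample labels; none of these is a real library API.

    def refine_until_converged(model, sequences, train_round, loss_of,
                               tol=1e-3, max_rounds=10):
        prev_loss = float("inf")
        for _ in range(max_rounds):
            model = train_round(model, sequences)
            loss = loss_of(model, sequences)
            if prev_loss - loss < tol:
                break                    # little further gain from training
            prev_loss = loss
            sequences = model.split(sequences)  # finer-granularity sequences
        return model, sequences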
  • the example is composed of two adjacent dot data in the dot data sequence.
  • the electronic device can extract the features of the example from the two dot data of the example to form a feature vector of the example.
  • the feature of an example can contain multiple dimensions. Since the example contains two adjacent dot data, the characteristics of the example are closely related to the characteristics of the dot data.
• FIG. 10 is an exemplary schematic diagram of the dot data in an embodiment of this application.
  • the dot data is saved in the format of a JSON structure. In actual applications, the dot data can also be saved in other ways, which is not limited here.
• (a), (b), and (c) in FIG. 10 are three adjacent dot data in the dot data sequence.
• (a) in FIG. 10 is an example of the voice assistant dot data V;
• (b) in FIG. 10 is an example of the action dot data A;
• (c) in FIG. 10 is an example of the return-to-desktop dot data L.
  • X is the first dot data in the example
  • Y is the second dot data in the example.
• The features of an example can be described by type:
• The dot data generated by some user operations contains a lot of content (such as the dot data of the voice assistant), while the dot data generated by other user operations contains less content (such as the dot data of opening an application). The text features of the example can reflect how much content the dot data in the example contains.
  • the text characteristics of the example may include the total number of keywords in the dotted data in the example, and the total length of the dotted data string in the example.
  • the text characteristics of the example can include:
  • text features can also be extracted from the dot data as example text features, such as word2vec features, word segmentation features, etc., which are not limited here.
• Take the voice assistant dot data V shown in (a) of FIG. 10 and the action dot data A shown in (b) of FIG. 10 composing an example as an example: if the string of the first dot data in the example is very long and the string of the second dot data is very short, the two dot data corresponding to this example are likely to be continuous and correspond to the same intent.
• Total string length of the example = length of the JSON string of dot data X + length of the JSON string of dot data Y.
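• A minimal sketch of these text features; the keyword list is an illustrative assumption based on the hypothetical schema above.

    import json

    # Total keyword count of the two dot data plus total JSON string length.
    def text_features(dot_x, dot_y, keywords=("tm", "pkg", "scenes", "intent")):
        total_keywords = sum(k in dot_x for k in keywords) \
                       + sum(k in dot_y for k in keywords)
        total_length = len(json.dumps(dot_x)) + len(json.dumps(dot_y))
        return [float(total_keywords), float(total_length)]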
• For example, the user's current operation is "open the address book" and the next operation is "make a call". If the contact clicked when opening the address book is the same as the contact called, the two adjacent pieces of dot data are likely to correspond to the same intent. There can be many similar context features.
  • the context features of the example may include:
• When the dot data is saved in the format of a JSON structure, whether the values of some JSON keywords are the same, for example, whether the scene information of dot data X and dot data Y is the same.
• Take the voice assistant dot data V shown in (a) of FIG. 10 and the action dot data A shown in (b) of FIG. 10 composing an example as an example.
• The application package name of dot data X (voice assistant dot data V) is "com.huawei.hivoice", indicating that the dot data was generated by the voice assistant.
• The application package name of dot data Y (action dot data A) is "com.ali.pay", which means "open a shopping application". A whitelist can be maintained to map the application package name to a one-hot vector, or the word2vec method can be used to convert it into a feature vector.
• The timestamp difference is the difference between tm in dot data X and tm in dot data Y. In addition, whether the information contained in the scenes fields of the two dot data is the same can also be compared.
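• A minimal sketch of these context features, following the hypothetical JSON schema above:

    # Whether the scene information of the two dot data is the same, the
    # timestamp difference, and a whitelist one-hot of the package name.
    def context_features(dot_x, dot_y, package_whitelist):
        same_scene = float(dot_x.get("scenes") == dot_y.get("scenes"))
        tm_diff = float(dot_y["tm"] - dot_x["tm"])
        one_hot = [float(dot_y.get("pkg") == p) for p in package_whitelist]
        return [same_scene, tm_diff] + one_hot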
  • An example consists of two pieces of dotted data.
  • the text features of the above examples and the contextual features of the examples are the common features of dotted data X and dotted data Y in the example.
  • unique features of dotted data X or dotted data Y can be extracted.
  • the unique characteristics of each dot data in the example can include:
• The statistical features of the dot data, that is, features derived from the statistical information of the dot data.
• Statistics can reflect the differences between users. For example, the average time that user 1 uses an application daily is t1, and the average time that user 2 uses the same application daily is t2; a duration within t1 may represent a complete intent for user 1, but may not for user 2.
  • the statistical characteristics of each dot data in the example can include:
• Each type of feature may also include other similar features; the features above are only exemplary and are not limited here.
  • J features can be determined as exemplary features.
  • a different feature of the example can be used as a dimension of the example feature, and the J features of the example can constitute the J-dimensional feature vector of the example.
• x^(i) is used to represent the feature vector of the i-th example: x^(i)_1 represents the first feature extracted from the i-th example, x^(i)_2 represents the second feature extracted from the i-th example, and so on, x^(i)_c represents the c-th feature extracted from the i-th example, up to the J-th feature x^(i)_J. The feature vector of the i-th example is then x^(i) = (x^(i)_1, x^(i)_2, ..., x^(i)_J).
  • One package contains one or more examples, and one example contains a multi-dimensional feature vector. Therefore, the features of the examples in a package can form a feature vector matrix. If the eigenvector of an example is a J-dimensional eigenvector and the package contains K examples, the eigenvector matrix of the package is a J ⁇ K eigenvector matrix.
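• A minimal sketch of composing the feature vector of one example and the J×K feature vector matrix of one package, reusing the text_features and context_features sketches above (statistical features omitted for brevity):

    import numpy as np

    # J-dimensional feature vector x^(i) of one example: one dimension per
    # extracted feature.
    def example_feature_vector(dot_x, dot_y, package_whitelist):
        return np.array(text_features(dot_x, dot_y)
                        + context_features(dot_x, dot_y, package_whitelist))

    # K examples of one package, each with J features -> J x K matrix.
    def package_feature_matrix(examples, package_whitelist):
        columns = [example_feature_vector(x, y, package_whitelist)
                   for x, y in examples]
        return np.stack(columns, axis=1)   # shape (J, K)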
  • the knowledge graph is a structured semantic knowledge base, and its basic unit is the "entity, relationship, entity” triplet, or the "entity, attribute, attribute value” triplet. Generally, attribute value can also be understood as a constant entity.
  • the knowledge graph usually consists of two parts: general knowledge and personal knowledge. Among them, general knowledge may include: group behavior, psychology, sociology, behavior, user tags, user survey results, etc. Personal knowledge can include: data mining of user behavior, interpersonal networks, property information, interests, hobbies, habits, etc. Personal knowledge can be updated in real time. The embodiments of the present application do not specifically limit what content is specifically included in general knowledge or personal knowledge.
  • the knowledge graph is usually composed of nodes and edges. Nodes represent entities or attribute values, and edges represent attributes or relationships. In the knowledge graph, edges connect various nodes to form a network structure. Among them, each node corresponds to a unique identity (identity, ID), and each edge corresponds to a unique identity.
  • the knowledge graph can be applied to related scenarios such as knowledge reasoning, search, natural language understanding, e-commerce, question and answer, and can make precise and refined answers.
  • FIG. 11 shows the basic structure of the knowledge graph.
  • the knowledge graph includes node 11, node 13, and node 14.
  • Node 11 and node 13 are connected by edge 12, and node 11 and node 14 are connected by edge 15.
  • node 11 represents entity A
  • edge 12 represents relationship F
  • node 13 represents entity B
  • node 14 represents attribute value C
  • edge 15 represents attribute J.
• Node 11, edge 12, and node 13 form a triple of "entity, relationship, entity", which is specifically used to indicate that "there is a relationship F between entity A and entity B".
  • the node 11, the node 14 and the edge 15 form a triple of "entity, attribute, attribute value", which is specifically used to indicate "the attribute value of the attribute J of the entity A is the attribute value C".
  • the entity in the embodiment of the present application may be a person's name, an object's name, a place name, an occupation, and so on.
  • the attributes can be name, age, height, weight, longitude, latitude, brand, fuel consumption, etc.
  • the relationship can be father-child, mother-child, spouse, geographic area affiliation, affiliation, etc.
• For example, the two entities "user A" and "car" can be node 11 and node 13, respectively, and edge 12 indicates the relationship that "user A" "owns" the "car".
  • the attribute can be age (edge 15), and the attribute value can be 20 years old (node 14). It is easy to know that the age of user A is 20 years old.
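• A minimal sketch of these triples stored as plain tuples, using the entities and attributes of the example above:

    # "entity, relationship, entity" and "entity, attribute, attribute value"
    # triples; an attribute value can be treated as a constant entity.
    knowledge_graph = [
        ("user A", "owns", "car"),   # entity - relationship - entity
        ("user A", "age", "20"),     # entity - attribute - attribute value
    ]

    def tails_of(graph, entity, edge):
        """Return entities or attribute values reached from entity via edge."""
        return [t for h, e, t in graph if h == entity and e == edge]

    print(tails_of(knowledge_graph, "user A", "age"))  # -> ['20']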
• the rate of return r_i of each rocker arm is unknown and not all the same.
  • the player's goal is to obtain the greatest return q with a limited number of opportunities to press the rocker arm.
• One solution is to try each rocker arm enough times, obtain the average return of each rocker arm by statistics, use the average return of each rocker arm to estimate its true rate of return r_i, and then select the rocker arm with the largest return rate for the remaining steps. In the above process, the more exploration is performed, the more accurate the estimated average return of each rocker arm.
• In intent recognition, the electronic device recognizes the user's intention, displays relevant content of the recognized intention to the user, and expects a positive feedback operation from the user.
• Each intention can be regarded as a rocker arm, and displaying the relevant content of an intention can be regarded as pressing the rocker arm once. Only by exploring each intention multiple times can the correct probability of each intention be accurately assessed.
  • bandit algorithms can be divided into “context-free bandit algorithms (context-free bandit)" and “contextual bandit algorithms (contextual bandit) using context information".
• The bandit algorithm can compromise between exploration and utilization of the rocker arms, taking both the exploration process and the utilization process into account, so that not only the rocker arms with high return rates (high confidence) are displayed, but also the rocker arms with low confidence that have been explored fewer times.
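• The embodiments do not prescribe a particular bandit algorithm; as one simple named illustration, an epsilon-greedy strategy compromises between exploration and utilization like this:

    import random

    # With probability epsilon, explore a random rocker arm (intention);
    # otherwise utilize the arm with the best average return so far.
    def choose_arm(avg_returns, epsilon=0.1):
        if random.random() < epsilon:
            return random.randrange(len(avg_returns))       # exploration
        return max(range(len(avg_returns)),
                   key=lambda i: avg_returns[i])            # utilization

    # Incrementally update the observed average return of the pressed arm.
    def update_arm(avg_returns, counts, arm, reward):
        counts[arm] += 1
        avg_returns[arm] += (reward - avg_returns[arm]) / counts[arm]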
  • each specific input is an instance, usually represented by feature vectors.
• Let X ⊆ R^n denote the feature space.
• (X^(1), Y^(1)), (X^(2), Y^(2)), ..., (X^(m), Y^(m)) represent the private data sets of m node devices.
• In (X^(1), Y^(1)), X^(1) represents the feature space of the first node device and Y^(1) the label space of the first node device; in (X^(2), Y^(2)), X^(2) represents the feature space of the second node device and Y^(2) the label space of the second node device; and so on, in (X^(i), Y^(i)), X^(i) represents the feature space of the i-th node device and Y^(i) the label space of the i-th node device.
• The feature space can be understood as a collection of input data.
• The label space can be understood as a collection of output data.
• x^(i)_j ∈ X^(i) represents the j-th example in X^(i), that is, an input feature in the input data set of the i-th node device.
• y^(i)_j represents the label vector corresponding to x^(i)_j.
• (x^(i)_j, y^(i)_j) is a combination that actually exists, namely the j-th sample data in the i-th node device.
• The label can be a label vector in the label space, or can also be understood as an output vector in the label space, such as y^(i)_j.
• The label can be a single tag or a collection of multiple tags.
  • “coarse-grained” and “fine-grained” actually provide two levels.
  • the first level is coarse-grained labels
  • the second level is fine-grained labels.
  • a level of label is added.
  • the coarse-grained label is the output of the first level
• the fine-grained label is a further subdivided label under the coarse-grained label.
  • the coarse-grained tags are "music" applications and "video” applications.
  • the fine-grained labels are "Kugou Music”, “QQ Music”, “Netease Music”, “Tencent Video”, “iqiyi Video”, “Watermelon Video” and so on.
• A coarse-grained label can be understood as the intention to which an action belongs; a fine-grained label can be understood as the service to which an action belongs, or the action to be executed, etc.
  • coarse-grained tags correspond to intents, and fine-grained tags correspond to services or actions to be executed.
  • the coarse-grained label is "Music” applications
  • the fine-grained label is "Kugou Music”
• the service that needs to be executed at this time is to open Kugou Music;
  • the fine-grained label is "display a reminder card”
  • the action to be performed at this time is to display a reminder card.
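• A minimal sketch of such a two-level label mapping, built from the example applications above; the coarse-grained label unifies the label spaces of different node devices:

    # Fine-grained label (application / service) -> coarse-grained label (intent).
    FINE_TO_COARSE = {
        "Kugou Music": "music", "QQ Music": "music", "NetEase Music": "music",
        "Tencent Video": "video", "iQiyi Video": "video", "Watermelon Video": "video",
    }

    def coarse_label(fine_label):
        return FINE_TO_COARSE.get(fine_label)

    print(coarse_label("Kugou Music"))   # -> 'music' (the intent level)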
  • the node device may be a terminal device (or also referred to as user equipment).
  • the terminal device can represent any computing device.
• The terminal device can be a smart phone, a tablet computer, a wearable device (such as glasses, a watch, earphones, etc.), a personal computer, a computer workstation, a vehicle-mounted terminal, a terminal in driverless driving, a terminal in assisted driving, or a smart home terminal such as a speaker, a smart screen, a sweeping robot, an air conditioner, etc.
  • multiple node devices may all take a mobile phone as an example.
  • the node device can also be referred to as "end side” for short.
  • the central control device may be a cloud server or a server.
  • the central control device uses a cloud server as an example.
  • This central control device can also be referred to as "cloud side” for short.
  • the APP recommendation refers to recommending applications for users according to the operating habits of the end-side users on the APP, thereby providing services of pre-loading the applications, improving the response speed of the applications, and improving the user experience.
  • the number of node devices is not limited.
  • the number of node devices is described by taking three as an example. The three node devices are node device 1, node device 2, and node device 3, respectively.
• Node device 1: QQ Music, NetEase Music, Tencent Video, Today's Headlines, Taobao, Gaode Map;
• Node device 2: Kugou Music, Migu Music;
• Node device 3: Kuwo Music, Youku Video, Bilibili, Taobao, Jingdong, Baidu Map.
• For example, the first data sample in node device 1 is: open QQ Music at 8:00; in (x^(1)_1, y^(1)_1), x^(1)_1 corresponds to "8:00" and y^(1)_1 corresponds to "QQ Music".
• The first data sample in node device 2 is: open Kugou Music at 8:10; in (x^(2)_1, y^(2)_1), x^(2)_1 corresponds to "8:10" and y^(2)_1 corresponds to "Kugou Music".
• The first data sample in node device 3 is: open Baidu Map at 7:30; in (x^(3)_1, y^(3)_1), x^(3)_1 corresponds to "7:30" and y^(3)_1 corresponds to "Baidu Map".
  • the input feature is not limited in this solution.
• In addition to time information, the input feature can also include user scene information, user status information, etc. For example, the user scene information can be whether the user is indoors or outdoors, etc.; the user status information may include whether the user is walking, sitting or lying down, and the user's mood (which can be obtained from sensory information such as heart rate).
  • tags may include: QQ Music, NetEase Music, Tencent Video, etc.
  • tags can include: Kugou Music, Migu Music, iQiyi, NetEase News, etc.
  • tags can include: Kuwo Music, Youku Video, Bilibili, Taobao, etc.
• In this application scenario, the label space in each node device is different. In this case, if joint training is to be performed on the end-side data, the end-side tasks need to be unified, that is, the end-side label spaces (or "mark spaces") need to be unified.
• In this solution, the original label is used as the fine-grained label, a label one level above the fine-grained label is introduced, and the non-uniform tasks of the end sides are unified through this upper-level label.
• the first-level label is also called the "coarse-grained label";
• the second-level label is also called the "fine-grained label";
• coarse-grained labels are used to unify the label space (also called the mark space) of each node device;
  • fine-grained tags can be QQ music, Kugou music, Migu music, iQiyi, Netease News and other applications.
• the category to which an application belongs can be regarded as the coarse-grained label.
  • coarse-grained tags include "music" tags, "videos” tags, “online shopping” tags, and "maps" tags. Please refer to the description of the following embodiment for the method for joint training of multiple node devices. It should be noted that the application scenarios are not limited in this solution, and the foregoing application scenarios are only exemplary descriptions.
  • each node device is loaded with a "group coarse-grained model” and a "fine-grained model”.
  • group coarse-grained model” and “fine-grained model” can be trained using different training data sets according to different application scenarios, and the application scenarios are not limited.
  • the label space of the group coarse-grained model is mapped to coarse-grained labels
• the label space of the fine-grained model is mapped to fine-grained labels.
  • the group coarse-grained model in each node device is jointly trained by multiple node devices in the system, and the fine-grained label is trained and updated locally on the node device.
  • Rules are inference sentences composed of conditions and conclusions. When there are facts that satisfy the conditions, the corresponding conclusions can be activated.
  • the rule can include a condition part (left hand side, LHS) and a conclusion part (right hand side, RHS).
  • LHS left hand side
  • RHS right hand side
  • the condition part of the rule can be called the if part
  • the conclusion part of the rule can be called the then part.
• A pattern is the smallest condition into which the condition part of a rule can be divided, and multiple patterns can form the condition part of a rule. For example, if the condition part of a rule is "age is greater than 20 and age is less than 30", the rule has two patterns: one is "age is greater than 20" and the other is "age is less than 30".
  • a fact object is an object that bears real things or facts, which can be understood as input parameters required by the rule engine.
  • the login fact object may contain the following facts: login name, login device, number of successful logins in the past hour, and number of failed logins in the past hour.
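• A minimal sketch of a rule whose condition part (LHS) consists of two patterns and whose conclusion part (RHS) is activated only when a fact object satisfies every pattern; the fact fields and conclusion string are hypothetical.

    # The LHS is a list of patterns (smallest conditions); the RHS is the
    # conclusion activated when all patterns are satisfied by the fact.
    rule = {
        "patterns": [lambda fact: fact["age"] > 20,
                     lambda fact: fact["age"] < 30],
        "conclusion": "age-in-20s branch activated",
    }

    def evaluate(rule, fact):
        if all(pattern(fact) for pattern in rule["patterns"]):
            return rule["conclusion"]
        return None

    login_fact = {"age": 25}            # hypothetical fact object
    print(evaluate(rule, login_fact))   # -> 'age-in-20s branch activated'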
• In the prior art, the electronic device predicts the user's intention only from the information obtained through the user's single-modal input at the current moment, but the user data and device information of the current moment alone cannot accurately predict the intention at the current moment. The user's continuous behavior and the changes of the device status over a period of time reflect the underlying logic of events and provide a basis for predicting intentions; if this contextual information is ignored, an accidental signal at a certain moment may be unrelated to the user's real intention, so the recognition of the user's intention in the prior art has great limitations and poor accuracy.
• In the embodiments of this application, the electronic device can accurately and unbiasedly identify the user's intention based on a complete environment description and multi-modal user input, combined with domain knowledge and existing rules, make intention decisions for the user, and respond to the user's needs or provide appropriate services on an appropriate device.
• FIG. 16 is a schematic diagram of a scene of intention recognition in an embodiment of this application.
  • the electronic device can predict the user's intention through the information obtained by the multi-mode input such as operation input, environmental perception, text input, voice input and visual input.
• For example, when an electronic device is connected to Wi-Fi, entity recognition within a 30-minute window can be triggered. The currently connected Wi-Fi information, opening Alipay for mobile payment, and receiving a shopping text message are three independent events that occur successively;
• based on this contextual entity sequence, it can be determined that the user may be shopping in a mall.
• In a distributed scenario, the electronic device can obtain a complete description of the environment based on the environment perception of multiple devices and the multi-modal input of the user, and, by combining the user input, environment perception and context information within a certain time pane, obtain a complete and unbiased intention system that responds to changes over time and expands with changes in the environment, and make decisions based on it, such as inferring the actions the user wants to perform or the services needed in the next period of time, and deciding which device should respond to the user's needs.
• The solution provided by the embodiments of this application is suitable for accurately deciding to provide the user with the response or service needed in a distributed scenario where the information input is multi-source, complex, and time-dependent.
  • the electronic device 100 may be the electronic device, node device, etc. described above.
  • FIG. 13 is a schematic structural diagram of an electronic device 100 provided by an embodiment of the present application.
  • the electronic device 100 may have more or fewer components than shown in the figure, may combine two or more components, or may have different component configurations.
  • the various components shown in the figure may be implemented in hardware, software, or a combination of hardware and software including one or more signal processing and/or application specific integrated circuits.
• the electronic device 100 may include: a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone jack 170D, a sensor module 180, buttons 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, a positioning device (not shown in the figure), and so on.
• the sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, etc.
  • the structure illustrated in the embodiment of the present invention does not constitute a specific limitation on the electronic device 100.
  • the electronic device 100 may include more or fewer components than those shown in the figure, or combine certain components, or split certain components, or arrange different components.
  • the illustrated components can be implemented in hardware, software, or a combination of software and hardware.
  • the processor 110 may include one or more processing units.
• the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc.
  • the different processing units may be independent devices or integrated in one or more processors.
• In an example, the processor 110 may obtain a semantic object from the memory to match fact data, or obtain a semantic object from a file to match fact data, and may also determine whether to perform a corresponding operation based on the matching result, that is, execute the steps described in FIG. 21 below. In addition, the processor 110 may also be used to construct a rule topology map in the rule engine. In an example, the processor 110 may train an intent recognition model, an action prediction model, a multi-instance learning model, etc., or update parameters in the models. In an example, the processor 110 may be used to execute the intention recognition method provided in this solution.
  • the controller may be the nerve center and command center of the electronic device 100.
  • the controller can generate operation control signals according to the instruction operation code and timing signals to complete the control of fetching instructions and executing instructions.
  • a memory may also be provided in the processor 110 to store instructions and data.
  • the memory in the processor 110 is a cache memory.
  • the memory can store instructions or data that have just been used or recycled by the processor 110. If the processor 110 needs to use the instruction or data again, it can be directly called from the memory. Repeated accesses are avoided, the waiting time of the processor 110 is reduced, and the efficiency of the system is improved.
  • the memory may store a group coarse-grained model, an individual coarse-grained model, a fine-grained model, etc.
  • the processor 110 may include one or more interfaces.
• the interface may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface, etc.
  • the interface connection relationship between the modules illustrated in the embodiment of the present invention is merely a schematic illustration, and does not constitute a structural limitation of the electronic device 100.
  • the electronic device 100 may also adopt different interface connection modes in the foregoing embodiments, or a combination of multiple interface connection modes.
  • the charging management module 140 is used to receive charging input from the charger.
  • the charger can be a wireless charger or a wired charger.
  • the power management module 141 is used to connect the battery 142, the charging management module 140 and the processor 110.
  • the power management module 141 receives input from the battery 142 and/or the charging management module 140, and supplies power to the processor 110, the internal memory 121, the external memory, the display screen 194, the camera 193, and the wireless communication module 160.
  • the wireless communication function of the electronic device 100 can be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor, and the baseband processor.
  • the antenna 1 and the antenna 2 are used to transmit and receive electromagnetic wave signals.
  • Each antenna in the electronic device 100 can be used to cover a single or multiple communication frequency bands. Different antennas can also be reused to improve antenna utilization.
  • Antenna 1 can be multiplexed as a diversity antenna of a wireless local area network.
  • the antenna can be used in combination with a tuning switch.
  • the mobile communication module 150 can provide a wireless communication solution including 2G/3G/4G/5G and the like applied to the electronic device 100.
  • the mobile communication module 150 may include at least one filter, a switch, a power amplifier, a low noise amplifier (LNA), and the like.
  • the mobile communication module 150 can receive electromagnetic waves by the antenna 1, and perform processing such as filtering, amplifying and transmitting the received electromagnetic waves to the modem processor for demodulation.
  • the mobile communication module 150 can also amplify the signal modulated by the modem processor, and convert it into electromagnetic waves for radiation via the antenna 1.
  • at least part of the functional modules of the mobile communication module 150 may be provided in the processor 110.
  • at least part of the functional modules of the mobile communication module 150 and at least part of the modules of the processor 110 may be provided in the same device.
  • the modem processor may include a modulator and a demodulator.
  • the modulator is used to modulate the low frequency baseband signal to be sent into a medium and high frequency signal.
  • the demodulator is used to demodulate the received electromagnetic wave signal into a low-frequency baseband signal.
  • the demodulator then transmits the demodulated low-frequency baseband signal to the baseband processor for processing.
  • the application processor outputs a sound signal through an audio device (not limited to the speaker 170A, the receiver 170B, etc.), or displays an image or video through the display screen 194.
  • the modem processor may be an independent device.
  • the modem processor may be independent of the processor 110 and be provided in the same device as the mobile communication module 150 or other functional modules.
• the wireless communication module 160 can provide wireless communication solutions applied to the electronic device 100, including wireless local area network (WLAN) (such as a wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR) technology, etc.
  • the wireless communication module 160 may be one or more devices integrating at least one communication processing module.
  • the wireless communication module 160 receives electromagnetic waves via the antenna 2, frequency modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110.
  • the wireless communication module 160 may also receive a signal to be sent from the processor 110, perform frequency modulation, amplify, and convert it into electromagnetic waves to radiate through the antenna 2.
  • Bluetooth can be used to implement data exchange between the electronic device 100 and other short-distance devices (such as mobile phones, smart watches, etc.).
  • the Bluetooth in the embodiments of the present application may be an integrated circuit or a Bluetooth chip.
  • the antenna 1 of the electronic device 100 is coupled with the mobile communication module 150, and the antenna 2 is coupled with the wireless communication module 160, so that the electronic device 100 can communicate with the network and other devices through wireless communication technology.
  • the electronic device 100 implements a display function through a GPU, a display screen 194, an application processor, and the like.
  • the GPU is an image processing microprocessor, which is connected to the display screen 194 and the application processor.
  • the GPU is used to perform mathematical and geometric calculations and is used for graphics rendering.
  • the processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
  • the display screen 194 is used to display images, videos, and the like.
  • the display screen 194 may be a touch screen, and the touch screen may specifically include a touch panel and a display.
• The touch panel can collect touch events of the user of the electronic device 100 on or near it (for example, operations performed by the user on or near the touch panel using a finger, a stylus, or any other suitable object), and send the collected touch information to another device (for example, the processor 110).
  • the display may be used to display information input by the user or information provided to the user and various menus of the electronic device 100.
  • the display can be configured in the form of a liquid crystal display, an organic light emitting diode, etc.
  • the electronic device 100 can implement a shooting function through an ISP, a camera 193, a video codec, a GPU, a display screen 194, and an application processor.
  • the camera 193 is used to capture still images or videos.
  • the object generates an optical image through the lens and is projected to the photosensitive element.
  • the photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor.
  • the photosensitive element converts the optical signal into an electrical signal, and then transfers the electrical signal to the ISP to convert it into a digital image signal.
  • ISP outputs digital image signals to DSP for processing.
  • DSP converts digital image signals into standard RGB, YUV and other formats of image signals.
  • the electronic device 100 may include one or N cameras 193, and N is a positive integer greater than one.
  • Digital signal processors are used to process digital signals. In addition to digital image signals, they can also process other digital signals. For example, when the electronic device 100 selects the frequency point, the digital signal processor is used to perform Fourier transform on the energy of the frequency point.
  • Video codecs are used to compress or decompress digital video.
  • the electronic device 100 may support one or more video codecs. In this way, the electronic device 100 can play or record videos in multiple encoding formats, such as: moving picture experts group (MPEG) 1, MPEG2, MPEG3, MPEG4, and so on.
  • NPU is a neural-network (NN) computing processor.
• Through the NPU, applications such as intelligent cognition of the electronic device 100 can be realized, for example, image recognition, face recognition, speech recognition, and text understanding.
  • the NPU may be used to generate dot data for speech recognition, image recognition, or text understanding. In some embodiments of the present application, the NPU may be used to extract training data from the dot data sequence to train the multi-instance learning model. In some embodiments of the present application, the NPU may be used to determine the intent of the subsequence according to a preset intent rule. There is no limitation here. In some embodiments of the present application, applications such as intelligent cognition of the rule engine can be realized through the NPU, such as text understanding, decision reasoning, etc.
  • the external memory interface 120 may be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device 100.
  • the external memory card communicates with the processor 110 through the external memory interface 120 to realize the data storage function. For example, save music, video and other files in an external memory card.
  • the internal memory 121 may be used to store computer executable program code, where the executable program code includes instructions.
  • the processor 110 executes various functional applications and data processing of the electronic device 100 by running instructions stored in the internal memory 121.
  • the internal memory 121 may include a storage program area and a storage data area.
  • the storage program area can store an operating system, at least one application required for a function (such as a face recognition function, a fingerprint recognition function, a mobile payment function, etc.) and so on.
  • the storage data area can store data created during the use of the electronic device 100 (such as face information template data, fingerprint information template, etc.) and the like.
  • the internal memory 121 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, a universal flash storage (UFS), and the like.
  • UFS universal flash storage
  • the electronic device 100 can implement audio functions through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the earphone interface 170D, and the application processor. For example, music playback, recording, etc.
  • the audio module 170 is used to convert digital audio information into an analog audio signal for output, and is also used to convert an analog audio input into a digital audio signal.
  • the audio module 170 can also be used to encode and decode audio signals.
  • the audio module 170 may be provided in the processor 110, or part of the functional modules of the audio module 170 may be provided in the processor 110.
  • the speaker 170A also called “speaker” is used to convert audio electrical signals into sound signals.
  • the electronic device 100 can listen to music through the speaker 170A, or listen to a hands-free call.
  • the receiver 170B also called “earpiece” is used to convert audio electrical signals into sound signals.
  • the electronic device 100 answers a call or voice message, it can receive the voice by bringing the receiver 170B close to the human ear.
• The microphone 170C, also referred to as a "mike" or a "mic", is used to convert sound signals into electrical signals.
• When making a call or sending a voice message, the user can speak with the mouth close to the microphone 170C to input a sound signal into the microphone 170C.
  • the electronic device 100 may be provided with at least one microphone 170C. In other embodiments, the electronic device 100 may be provided with two microphones 170C, which can implement noise reduction functions in addition to collecting sound signals. In other embodiments, the electronic device 100 may also be provided with three, four or more microphones 170C to collect sound signals, reduce noise, identify sound sources, and realize directional recording functions.
  • the pressure sensor 180A is used to sense the pressure signal and can convert the pressure signal into an electrical signal.
  • the pressure sensor 180A may be provided on the display screen 194.
  • a capacitive pressure sensor may include at least two parallel plates made of conductive material; when force is applied to the pressure sensor 180A, the capacitance between the plates changes.
  • the electronic device 100 determines the strength of the pressure based on the change in capacitance.
  • the electronic device 100 detects the intensity of the touch operation according to the pressure sensor 180A.
  • the electronic device 100 may also calculate the touched position according to the detection signal of the pressure sensor 180A.
  • touch operations that act on the same touch position but have different intensities may correspond to different operation instructions. For example, when a touch operation whose intensity is less than a first pressure threshold acts on the short message application icon, an instruction to view the short message is executed; when a touch operation whose intensity is greater than or equal to the first pressure threshold acts on the short message application icon, an instruction to create a new short message is executed.
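  • as an illustrative aside, this threshold-based dispatch might be sketched as follows (a minimal Java sketch; the threshold value, class name, and instruction strings are assumptions for illustration, not part of this application):

        // Illustrative sketch: dispatching different instructions for touch
        // operations at the same position but with different intensities.
        public class PressureDispatcher {
            // Hypothetical threshold; a real device would calibrate this value.
            private static final float FIRST_PRESSURE_THRESHOLD = 0.5f;

            /** Returns the instruction chosen for a touch on the SMS app icon. */
            public String onSmsIconTouch(float touchIntensity) {
                if (touchIntensity < FIRST_PRESSURE_THRESHOLD) {
                    return "VIEW_SHORT_MESSAGE";       // light press: view messages
                } else {
                    return "CREATE_NEW_SHORT_MESSAGE"; // firm press: compose new
                }
            }
        }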
  • the gyro sensor 180B may be used to determine the movement posture of the electronic device 100.
  • in some embodiments, the angular velocity of the electronic device 100 around three axes (i.e., the x, y, and z axes) may be determined by the gyro sensor 180B.
  • the gyro sensor 180B can be used for image stabilization.
  • the gyro sensor 180B detects the angle at which the electronic device 100 shakes, calculates the distance that the lens module needs to compensate based on the angle, and allows the lens to counteract the shake of the electronic device 100 through reverse movement to implement image stabilization.
  • the gyro sensor 180B can also be used for navigation and somatosensory game scenes.
  • the air pressure sensor 180C is used to measure air pressure.
  • the electronic device 100 calculates the altitude based on the air pressure value measured by the air pressure sensor 180C to assist positioning and navigation.
  • the magnetic sensor 180D includes a Hall sensor.
  • the electronic device 100 may use the magnetic sensor 180D to detect the opening and closing of the flip holster.
  • further, based on the open or closed state of the flip detected by the magnetic sensor 180D, features such as automatic unlocking upon opening the flip cover can be set.
  • the acceleration sensor 180E can detect the magnitude of acceleration of the electronic device 100 in various directions (generally along three axes). When the electronic device 100 is stationary, the magnitude and direction of gravity can be detected. The acceleration sensor 180E can also be used to identify the posture of the electronic device, and is applied in applications such as switching between landscape and portrait modes and pedometers.
  • the distance sensor 180F is used to measure distance; the electronic device 100 can measure distance by infrared or laser. In some embodiments, when shooting a scene, the electronic device 100 may use the distance sensor 180F to measure distance to achieve fast focusing.
  • the proximity light sensor 180G may include, for example, a light emitting diode (LED) and a light detector such as a photodiode.
  • the light emitting diode may be an infrared light emitting diode.
  • the electronic device 100 emits infrared light to the outside through the light emitting diode.
  • the electronic device 100 uses a photodiode to detect infrared reflected light from nearby objects. When sufficient reflected light is detected, it can be determined that there is an object near the electronic device 100. When insufficient reflected light is detected, the electronic device 100 can determine that there is no object near the electronic device 100.
  • the electronic device 100 can use the proximity light sensor 180G to detect that the user holds the electronic device 100 close to the ear to talk, so as to automatically turn off the screen to save power.
  • the proximity light sensor 180G can also be used in a holster mode and a pocket mode to automatically unlock and lock the screen.
  • the ambient light sensor 180L is used to sense the brightness of the ambient light.
  • the electronic device 100 can adaptively adjust the brightness of the display screen 194 according to the perceived brightness of the ambient light.
  • the ambient light sensor 180L can also be used to automatically adjust the white balance when taking pictures.
  • the ambient light sensor 180L can also cooperate with the proximity light sensor 180G to detect whether the electronic device 100 is in the pocket to prevent accidental touch.
  • the fingerprint sensor 180H is used to collect fingerprints.
  • the electronic device 100 can use the collected fingerprint characteristics to implement fingerprint unlocking, access application locks, fingerprint photographs, fingerprint answering calls, and so on.
  • the temperature sensor 180J is used to detect temperature.
  • the electronic device 100 uses the temperature detected by the temperature sensor 180J to execute a temperature processing strategy. For example, when the temperature reported by the temperature sensor 180J exceeds a threshold value, the electronic device 100 reduces the performance of the processor located near the temperature sensor 180J, so as to reduce power consumption and implement thermal protection.
  • in other embodiments, when the temperature is lower than another threshold, the electronic device 100 heats the battery 142 to avoid an abnormal shutdown of the electronic device 100 caused by low temperature.
  • in still other embodiments, when the temperature is lower than yet another threshold, the electronic device 100 boosts the output voltage of the battery 142 to avoid an abnormal shutdown caused by low temperature.
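  • the temperature processing strategy described above amounts to a simple threshold policy; the Java sketch below is illustrative only, and the threshold values and method names are assumptions, not values from this application:

        public class ThermalPolicy {
            // Hypothetical thresholds for illustration only.
            private static final float HIGH_TEMP_THRESHOLD = 45.0f;
            private static final float LOW_TEMP_THRESHOLD = 0.0f;
            private static final float VERY_LOW_TEMP_THRESHOLD = -10.0f;

            public void onTemperatureReported(float celsius) {
                if (celsius > HIGH_TEMP_THRESHOLD) {
                    reduceNearbyProcessorPerformance(); // thermal protection, lower power use
                } else if (celsius < VERY_LOW_TEMP_THRESHOLD) {
                    boostBatteryOutputVoltage();        // avoid low-temperature shutdown
                } else if (celsius < LOW_TEMP_THRESHOLD) {
                    heatBattery();                      // avoid abnormal shutdown
                }
            }

            private void reduceNearbyProcessorPerformance() { /* device-specific */ }
            private void heatBattery() { /* device-specific */ }
            private void boostBatteryOutputVoltage() { /* device-specific */ }
        }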
  • the touch sensor 180K is also called a "touch panel".
  • the touch sensor 180K may be disposed on the display screen 194; the touch sensor 180K and the display screen 194 together form what is also called a "touch screen".
  • the touch sensor 180K is used to detect touch operations acting on or near it.
  • the touch sensor can pass the detected touch operation to the application processor to determine the type of touch event.
  • the visual output related to the touch operation can be provided through the display screen 194.
  • the touch sensor 180K may also be disposed on the surface of the electronic device 100, which is different from the position of the display screen 194.
  • the button 190 includes a power-on button, a volume button, and so on.
  • the button 190 may be a mechanical button. It can also be a touch button.
  • the electronic device 100 may receive key input, and generate key signal input related to user settings and function control of the electronic device 100.
  • the motor 191 can generate vibration prompts.
  • the motor 191 can be used for incoming call vibration notification, and can also be used for touch vibration feedback.
  • touch operations applied to different applications can correspond to different vibration feedback effects.
  • touch operations acting on different areas of the display screen 194 can also correspond to different vibration feedback effects of the motor 191.
  • different application scenarios (for example, time reminders, receiving messages, alarm clocks, and games) can also correspond to different vibration feedback effects.
  • the touch vibration feedback effect can also support customization.
  • the indicator 192 may be an indicator light, which may be used to indicate the charging status, power change, or to indicate messages, missed calls, notifications, and so on.
  • the SIM card interface 195 is used to connect to the SIM card.
  • the SIM card can be inserted into the SIM card interface 195 or pulled out from the SIM card interface 195 to achieve contact and separation with the electronic device 100.
  • the electronic device 100 may support 1 or N SIM card interfaces, and N is a positive integer greater than 1.
  • the SIM card interface 195 can support Nano SIM cards, Micro SIM cards, SIM cards, etc.
  • multiple cards can be inserted into the same SIM card interface 195 at the same time; the types of the multiple cards can be the same or different.
  • the SIM card interface 195 can also be compatible with different types of SIM cards.
  • the SIM card interface 195 may also be compatible with external memory cards.
  • the electronic device 100 interacts with the network through the SIM card to implement functions such as call and data communication.
  • the positioning device can provide a geographic location for the electronic device 100. It is understandable that the positioning device may specifically be a receiver of a positioning system such as a global positioning system (GPS), Beidou satellite navigation system, and Russian GLONASS. After receiving the geographic location sent by the above-mentioned positioning system, the positioning device sends the information to the processor 110 for processing, or sends the information to the memory for storage.
  • the electronic device 100 can obtain user operations through the various sensors in the sensor module 180, the buttons 190, the camera 193, the earphone interface 170D, the microphone 170C, and other components.
  • the processor 110 responds to the user operations and executes corresponding instructions; dotting (event tracking) data is generated in the process, and the generated dotting data can be stored in the internal memory 121.
  • the processor 110 can train a multi-instance learning model according to the multi-instance learning model training method and the training data generation method in the embodiments of the present application, and can use the multi-instance learning model, according to the intention recognition method in the embodiments of the present application, to divide a data sequence into fine-grained sub-sequences whose data intents are consistent and to determine the intent of each sub-sequence.
  • the steps in each method may be completed by the application processor in the processor 110 alone, by the NPU in the processor 110 alone, or by the application processor and the NPU in cooperation; they may also be completed by other processors in the processor 110 in cooperation, which is not limited here.
  • FIG. 14 is a block diagram of the software structure of the electronic device 100 according to an embodiment of the present invention.
  • the layered architecture divides the software into several layers, and each layer has a clear role and division of labor. The layers communicate with each other through software interfaces.
  • the Android system is divided into four layers, from top to bottom, the application layer, the application framework layer, the Android runtime and system library, and the kernel layer.
  • the application layer can include a series of application packages.
  • the application package can include applications (application, App) such as camera, gallery, calendar, call, map, navigation, WLAN, Bluetooth, music, video, short message, and the intention recognition decision system 501 shown in the schematic diagram in Figure 15.
  • the intent recognition decision system 501 may include an intent recognition module 605, and the intent recognition module 605 may be used to recognize, store, and manage intents.
  • the intent recognition decision-making system 501 may include an action feedback module 608.
  • the action feedback module 608 may include the multi-instance learning model described above.
  • the multi-instance learning model may be obtained by training with a multi-instance learning model training module, where the multi-instance learning model training module may be used to execute the multi-instance learning model training method in the embodiments of the present application.
  • the multi-instance learning model training module may be configured in the action feedback module 608, and may also be configured on the end side or the cloud side, which is not limited here.
  • the multi-instance learning model training module may include a training data generation module, and the training data generation module is used to execute the training data generation method in the embodiments of the present application.
  • the multi-instance learning model training module may also be a separate module independent of the action feedback module 608, which is not limited here.
  • the training data generation module in the multi-instance learning model training module may also be a separate module independent of the action feedback module 608 and the multi-instance learning model training module, which is not limited here.
  • the intention recognition module 605, the action feedback module 608, the multi-instance learning model training module, and the training data generation module can also be located at other levels of the software architecture, such as the application framework layer, the system library, or the kernel layer, which is not limited here.
  • the application framework layer provides an application programming interface (application programming interface, API) and a programming framework for applications in the application layer.
  • the application framework layer includes some predefined functions.
  • the application framework layer may include a window manager, a content provider, a view system, a phone manager, a resource manager, a notification manager, a local profile assistant (LPA), etc.
  • the window manager is used to manage window programs.
  • the window manager can obtain the size of the display screen, determine whether there is a status bar, lock the screen, take a screenshot, etc.
  • the content provider is used to store and retrieve data and make these data accessible to applications.
  • the data may include videos, images, audios, phone calls made and received, browsing history and bookmarks, phone book, etc.
  • the view system includes visual controls, such as controls that display text, controls that display pictures, and so on.
  • the view system can be used to build applications.
  • the display interface can be composed of one or more views.
  • a display interface that includes a short message notification icon may include a view that displays text and a view that displays pictures.
  • the phone manager is used to provide the communication functions of the electronic device 100, for example, management of call statuses (including connected, hung up, and the like).
  • the resource manager provides various resources for the application, such as localized strings, icons, pictures, layout files, video files, and so on.
  • the notification manager enables an application to display notification information in the status bar and can be used to convey notification-type messages; the notification can automatically disappear after a short stay without user interaction.
  • the notification manager is used to notify download completion, message reminders, and so on.
  • the notification manager can also present a notification in the status bar at the top of the system in the form of a chart or scroll bar text, such as a notification of an application running in the background, or present a notification on the screen in the form of a dialog interface. For example, text message prompts appear in the status bar, a prompt sound is played, the electronic device vibrates, and the indicator light flashes.
  • Android Runtime includes core libraries and virtual machines. Android runtime is responsible for the scheduling and management of the Android system.
  • the core library consists of two parts: one part is the functions that the Java language needs to call, and the other part is the core library of Android.
  • the application layer and application framework layer run in a virtual machine.
  • the virtual machine converts the Java files of the application layer and the application framework layer into binary files and executes them.
  • the virtual machine is used to perform functions such as object life cycle management, stack management, thread management, security and exception management, and garbage collection.
  • the system library can include multiple functional modules. For example: surface manager (surface manager), media library (Media Libraries), three-dimensional graphics processing library (for example: OpenGL ES), two-dimensional graphics engine (for example: SGL), etc.
  • the surface manager is used to manage the display subsystem, and provides a combination of two-dimensional (2-Dimensional, 2D) and three-dimensional (3-Dimensional, 3D) layers for multiple applications.
  • the media library supports playback and recording of a variety of commonly used audio and video formats, as well as still image files.
  • the media library can support multiple audio and video encoding formats, such as: MPEG4, H.264, MP3, AAC, AMR, JPG, PNG, etc.
  • the 3D graphics processing library is used to implement 3D graphics drawing, image rendering, synthesis, and layer processing.
  • the 2D graphics engine is a drawing engine for 2D drawing.
  • the kernel layer is the layer between hardware and software.
  • the kernel layer contains at least display driver, camera driver, audio driver, sensor driver, and virtual card driver.
  • when the touch sensor 180K receives a touch operation, a corresponding hardware interrupt is sent to the kernel layer.
  • the kernel layer processes the touch operation into an original input event (including touch coordinates, the time stamp of the touch operation, and the like).
  • the original input event is stored at the kernel layer.
  • the application framework layer obtains the original input event from the kernel layer and identifies the control corresponding to the input event. Taking an example in which the touch operation is a touch click operation and the control corresponding to the click operation is the control of the camera application icon: the camera application calls the interface of the application framework layer to start the camera application, and then starts the camera driver by calling the kernel layer.
  • the camera 193 captures still images or videos.
  • FIG. 15 is a block diagram of an exemplary software structure of the above-mentioned intention recognition decision system 501.
  • the intention recognition decision system 501 is used to map external multi-modal inputs, such as user operations, environment perception, text input, voice input, and visual input, to high-level entities, and combine them with contextual high-level entities within a certain time period to form an entity sequence.
  • the entity sequence is mapped to an extensible intention system to obtain the user's current intention; then, combining the existing domain knowledge, rules, and the extensible entity sequence, reasoning and decision-making based on statistics and logic determine how the device should respond to the user, that is, this intention is mapped to an action sequence and service chain, the result is fed back to the intention system, and alignment corrections are made accordingly.
  • the intention recognition decision system 501 includes a multimodal input module 601, a knowledge base 602, an entity recognition module 603, a context module 604, an intention recognition module 605, a rule engine 606, a decision reasoning module 607, and an action feedback module 608.
  • the multi-modal input module 601 is used to obtain input data of various different input types.
  • user operation data such as the user's touches, presses, and slides on the electronic device 100 can be obtained; environment perception data collected by various sensors in the electronic device 100 can be obtained; text input data entered when the user searches for text on the electronic device 100 can be obtained;
  • voice input data detected by the microphone of the electronic device 100 can be acquired; visual input data such as pictures, videos, gestures, and facial expressions recognized by the camera of the electronic device 100 can be acquired;
  • other types of input obtainable by the electronic device 100 can also be obtained, which is not limited here.
  • the data acquired by the multi-modal input module 601 may include dot data, user perception data, and so on.
  • the knowledge base 602 contains the existing domain knowledge, which can specifically include: the various trigger points for the entity recognition module 603 to initiate entity recognition, the length of the entity recognition time pane corresponding to each trigger point, the correspondence between each trigger point and the input method types of the multi-modal input, the saved user habit rules, the entity recognition model trained based on the entities in the entity warehouse unit 6033, and the association relationships between entities.
  • the knowledge base 602 may include a knowledge graph.
  • the entity identification module 603 is used to identify, store and manage entities.
  • the entity recognition module 603 includes an entity extraction unit 6031, an entity management unit 6032, and an entity warehouse unit 6033.
  • the entity extraction unit 6031 is used to identify entities with specific meanings from the data acquired by the multi-modal input module 601 according to the entity recognition model stored in the knowledge base 602; the entity warehouse unit 6033 is used to store entities; the entity management unit 6032 is used to regularly update and dynamically expand the entity warehouse unit 6033.
  • the entity recognition module 603 can extract feature vectors from the multi-modal input data to obtain a feature vector set.
  • the feature vector set may include all the feature vectors extracted from the multi-modal input data, and the feature vector may be used to represent the characteristics of each data of the multi-modal input.
  • the entity recognition module 603 can input the obtained feature vector set into the entity recognition model to obtain the entity sequence.
  • the entity recognition model may be a correspondence between feature vectors and entities obtained by training based on the entity data stored in the electronic device, where the entity data is the storage form of an entity, and the entity data includes at least the number of the entity and a set of feature vectors representing the entity.
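  • as a rough illustration of such a correspondence between feature vectors and entities, the nearest-match lookup below is only a sketch under assumed data structures; the actual entity recognition model may be any trained correspondence:

        import java.util.ArrayList;
        import java.util.List;
        import java.util.Map;

        // Sketch: map each input feature vector to the stored entity whose
        // representative vectors are closest (nearest-match over entity data).
        public class EntityRecognitionSketch {
            /** entityData: entity number -> representative feature vectors. */
            public List<Integer> recognize(List<float[]> inputVectors,
                                           Map<Integer, List<float[]>> entityData) {
                List<Integer> entitySequence = new ArrayList<>();
                for (float[] v : inputVectors) {
                    int best = -1;
                    double bestDist = Double.MAX_VALUE;
                    for (Map.Entry<Integer, List<float[]>> e : entityData.entrySet()) {
                        for (float[] ref : e.getValue()) {
                            double d = squaredDistance(v, ref);
                            if (d < bestDist) { bestDist = d; best = e.getKey(); }
                        }
                    }
                    entitySequence.add(best); // entity number of the best match
                }
                return entitySequence;
            }

            // Assumes vectors of equal length.
            private double squaredDistance(float[] a, float[] b) {
                double s = 0;
                for (int i = 0; i < a.length; i++) { double d = a[i] - b[i]; s += d * d; }
                return s;
            }
        }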
  • the context module 604 is used to store context entities.
  • a contextual entity refers to an entity sequence, recognized by the electronic device, within the time pane of a recent period.
  • the number of entity sequences stored in the context module 604 can be preset, or can be controlled in real time according to the storage capacity of the electronic device, which is not limited here.
  • the intention recognition module 605 is used to recognize, store, and manage intentions.
  • the intention recognition module includes an intention mapping unit 6051, an intention management unit 6052, and an intention warehouse unit 6053.
  • the intention mapping unit 6051 is used to predict user intentions according to the entity sequence; its input is the entity sequence, and its output is the intention.
  • the intention warehouse unit 6053 is used to store intentions.
  • the intention management unit 6052 is used to periodically update and dynamically expand the intention warehouse unit 6053: newly appearing intentions are added to the intention warehouse unit 6053, and intentions that have not appeared for a long time are removed from the intention warehouse unit 6053.
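  • the periodic update policy of the intention management unit 6052 (add newly appearing intentions, remove long-unseen ones) might be sketched as follows; the eviction window here is an assumed parameter, not a value from this application:

        import java.util.HashMap;
        import java.util.Map;

        public class IntentionWarehouseSketch {
            // intention name -> timestamp (ms) of its most recent appearance
            private final Map<String, Long> lastSeen = new HashMap<>();
            // Assumed eviction window: 30 days.
            private static final long EVICTION_WINDOW_MS = 30L * 24 * 60 * 60 * 1000;

            /** Adds a new intention or refreshes an existing one. */
            public void recordIntention(String intention, long nowMs) {
                lastSeen.put(intention, nowMs);
            }

            /** Periodically called: removes intentions not seen within the window. */
            public void evictStale(long nowMs) {
                lastSeen.entrySet().removeIf(e -> nowMs - e.getValue() > EVICTION_WINDOW_MS);
            }
        }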
  • the intent recognition module 605 may determine multiple candidate intents based on the pre-stored knowledge graph, and determine the target intent from the multiple candidate intents, as described below for details.
  • the intent recognition module 605 may have an intent recognition model, and the intent recognition model may be used to recognize the intent.
  • in some embodiments, the characteristics of a generative adversarial network can be used to reduce the deviation between the simulated data generated by the generator and the original input test data, so as to improve the data quality of the simulated data generated by the neural network. The simulated data obtained by the generative adversarial network is then used as part of the input data of a preset training network to train a prediction model, for example, an intention recognition model.
  • since the deviation between the simulated data and the original input test data is small, letting the simulated data participate in the training process of the training network can improve the prediction effect of the subsequent prediction model, so that the prediction model trained in the simulated environment is closer to the optimal intention recognition model.
  • the intention recognition model can be obtained based on a joint learning system.
  • the joint learning system may include multiple node devices, and each node device may be configured with a group coarse-grained model and a fine-grained model.
  • coarse-grained data is input to the group coarse-grained model for training, and the group coarse-grained model is updated through the joint learning of the multiple node devices; fine-grained data is input to the fine-grained model for training; finally, the group coarse-grained model and the fine-grained model are combined to obtain a joint model.
  • the label space of the joint model is mapped to fine-grained labels, and the output result of the joint model can be used to update the fine-grained model.
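  • as a rough illustration, combining a group coarse-grained model with a local fine-grained model over a mapped label space might look as follows; the additive score fusion here is an assumption for illustration, not the application's method:

        import java.util.Map;

        // Sketch: fuse coarse-grained and fine-grained scores, producing a
        // prediction in the fine-grained label space of the joint model.
        public class JointModelSketch {
            public String predict(Map<String, Double> coarseScores,   // coarse label -> score
                                  Map<String, Double> fineScores,     // fine label -> score
                                  Map<String, String> fineToCoarse) { // label-space mapping
                String best = null;
                double bestScore = Double.NEGATIVE_INFINITY;
                for (Map.Entry<String, Double> f : fineScores.entrySet()) {
                    String coarse = fineToCoarse.get(f.getKey());
                    double c = coarseScores.getOrDefault(coarse, 0.0);
                    double s = f.getValue() + c;  // assumed additive fusion
                    if (s > bestScore) { bestScore = s; best = f.getKey(); }
                }
                return best; // fine-grained label; can also drive fine-model updates
            }
        }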
  • the rule engine 606 is used to provide rules for reasoning and decision-making. In some simple scenarios, there is no need to use data to predict user intentions and make decisions for the user; it is sufficient to decide, according to the rules, which actions to perform in the scenario.
  • the rule engine 606 can pre-store commonly used existing rules, and can also update the rules according to user custom rules stored in the knowledge base 602.
  • the rule engine 606 can obtain a knowledge graph from the knowledge base 602, and then predict the user's intention or actions to be performed in the scenario based on the knowledge graph.
  • the rule engine 606 may have one or more rules.
  • the rule engine 606 may include a rule topology map.
  • the rule topology graph can include a root node, type nodes, pattern nodes, merge nodes, result nodes, and active nodes. Each node is introduced separately below.
  • the root node is the input starting node and can be the entrance of the rule engine; all fact objects enter the rule engine through the root node.
  • a rule engine can contain one root node.
  • the type node can define the type of fact data. After each fact in a fact object enters from the root node, it can enter a type node; the type node performs type checking, and only facts matching its type can reach the node.
  • the number of type nodes can be determined by the number of fact types included in the condition parts of the rules. Exemplarily, when the rule topology includes one rule whose condition part contains two types of facts, there are two type nodes; when the rule topology includes multiple rules whose condition parts together contain three types of facts, there are three type nodes.
  • for example, the condition part of one rule is "the age is greater than 20, and the location is outdoors", and the condition part of another rule is "the time is 8 a.m., and the location is at home"; there are three types of facts at this time, namely "time", "age", and "location". Therefore, the rule topology graph can contain three type nodes.
  • the root node can determine the type of each fact in the fact object, for example, based on the class type; the root node then inputs each fact to the corresponding type node.
  • exemplarily, a fact object includes the following facts: the date is December, the time is 8 a.m., and the location is outdoors; this fact object then includes two types of facts, namely time and location, where the two facts "December" and "8 a.m." enter the time type node, and "outdoors" enters the location type node.
  • the fact data can be entities, intentions, and so on.
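  • as an illustrative aside, the routing of facts from the root node to type nodes described above might be sketched as follows (a minimal Java sketch; the class and field names are assumptions):

        import java.util.ArrayList;
        import java.util.HashMap;
        import java.util.List;
        import java.util.Map;

        // Sketch: a root node dispatching facts to type nodes by fact type.
        public class RootNodeSketch {
            public static class Fact {
                final String type;   // e.g. "time", "location", "age"
                final Object value;
                Fact(String type, Object value) { this.type = type; this.value = value; }
            }

            /** Groups incoming facts by type, one bucket per type node. */
            public Map<String, List<Fact>> route(List<Fact> factObject) {
                Map<String, List<Fact>> typeNodes = new HashMap<>();
                for (Fact f : factObject) {
                    typeNodes.computeIfAbsent(f.type, t -> new ArrayList<>()).add(f);
                }
                // e.g. "December" and "8 a.m." land under "time"; "outdoors" under "location"
                return typeNodes;
            }
        }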
  • a pattern node can store the semantic object of a pattern in a rule and determine the facts that conform to the pattern corresponding to the pattern node.
  • a pattern node can express a condition in a rule, where the expressed condition is a computer-understandable conditional expression; in addition, the pattern node can also express the matching result of the condition, compute the conditional expression, and store the computation result.
  • each pattern node corresponds to one pattern of a rule.
  • the condition part of the rule "Age is greater than 20 years old and the location is outdoor” then the rule topology graph can contain two mode nodes, and one mode node corresponds to The "age is greater than 20" in the condition part of the rule, and another mode node corresponds to the "location is outdoor” in the condition part of the rule.
  • the semantic object of the pattern in the rule is stored in the pattern node. It can be understood that the pattern node stores the calculation statement behind the pattern in the rule corresponding to the pattern node.
  • the fact that the pattern node determines the pattern corresponding to the pattern node can be understood as the pattern node can load its stored semantic object to judge the fact of entering the pattern node to determine the fact of entering the pattern node Whether it meets the facts of the mode corresponding to the mode node, for example, if the mode corresponding to the mode node is "age greater than 20 years old", it stores a calculation sentence for judging whether the age is greater than 20 years old. When entering the mode node, the fact is "age When it is 19 years old", the mode node can load the corresponding calculation sentence to judge the fact that "age is 19 years old".
  • the types of pattern nodes can include two types: transient pattern nodes and persistent pattern nodes.
  • the semantic objects of transient pattern nodes can be stored in memory, and the semantic objects of persistent pattern nodes can be persisted in files.
  • the data change frequency of the facts of the pattern corresponding to a transient pattern node is higher than that of the facts of the pattern corresponding to a persistent pattern node.
  • transient pattern nodes are suitable for patterns that depend on frequently changing data, such as changes in time and geographic location.
  • persistent pattern nodes are suitable for patterns that depend on slowly changing data, such as changes in age and season.
  • the pattern node selectively persists its semantic object to a file or loads it into memory to be resident, so that the redundant memory occupied by infrequently accessed pattern nodes can be released without affecting the matching efficiency of frequently accessed nodes, thereby reducing memory usage.
  • the data structure of the pattern node can be represented by the state table and the pattern semantic index.
  • the state table can be used to cache the historical matching information of the pattern corresponding to the pattern node
  • the pattern semantic index can be used to index and obtain the semantic object of the pattern node.
  • the historical matching information may include: the identity of the pattern corresponding to the pattern node (i.e., ID in Fig. 18), the previous matching result of the pattern corresponding to the pattern node (i.e., isMatched in Fig. 18), and the number of data changes of the corresponding facts (i.e., modCount in Fig. 18).
  • the pattern semantic index may point to memory or to a file; when the pattern semantic index points to memory, the pattern node is a transient pattern node, and when the pattern semantic index points to a file, the pattern node is a persistent pattern node.
  • that is, the pattern semantic index of a transient pattern node obtains the semantic object from an index into memory,
  • and the pattern semantic index of a persistent pattern node obtains the semantic object from an index into a file.
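  • the two kinds of pattern semantic index might be sketched as follows (an illustrative Java sketch assuming Java object serialization for the file-backed case; the class name and structure are assumptions, not the application's implementation):

        import java.io.*;

        // Sketch: a pattern semantic index that resolves a semantic object either
        // from memory (transient pattern node) or from a file (persistent pattern node).
        public class SemanticIndexSketch {
            private final Object inMemoryObject;  // non-null for transient nodes
            private final File persistedFile;     // non-null for persistent nodes

            public SemanticIndexSketch(Object inMemoryObject, File persistedFile) {
                this.inMemoryObject = inMemoryObject;
                this.persistedFile = persistedFile;
            }

            public Object load() throws IOException, ClassNotFoundException {
                if (inMemoryObject != null) {
                    return inMemoryObject;  // transient: resident in memory
                }
                try (ObjectInputStream in =
                         new ObjectInputStream(new FileInputStream(persistedFile))) {
                    return in.readObject(); // persistent: deserialized on demand
                }
            }
        }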
  • the previous matching result (i.e., isMatched in Figure 18) can be represented by a flag bit.
  • for example, 1 indicates that the pattern corresponding to the pattern node matched (true), and 0 indicates that it did not match (false). For example, if the pattern corresponding to the pattern node is "the age is greater than 20": when the fact input last time was "the age is 19", the previous matching result is indicated by flag bit 0; when the fact input last time was "the age is 30", the previous matching result is indicated by flag bit 1.
  • the number of data changes of the facts corresponding to the pattern node can be understood as the number of data changes of the facts in the historical matching information of the pattern corresponding to the pattern node.
  • exemplarily, if the pattern node has loaded its semantic object 4 times in total, the number of data changes of the facts in the historical matching information of the corresponding pattern is 4.
  • after loading the semantic object to judge a fact, the pattern node updates the number of data changes of the facts recorded in its state table.
  • exemplarily, if the number of fact data changes recorded in the state table of a pattern node is 2 and the number of changes of the fact data input into the rule engine is 3, the two do not match, so the pattern node loads the semantic object to judge the currently input fact; the pattern node can then update its recorded number of data changes to 3.
  • if the two match, the previous matching result can be used directly and does not need to be updated, that is, isMatched in Figure 18 does not need to be updated; otherwise, the matching result needs to be updated and then used, that is, isMatched in Figure 18 is updated.
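  • the state-table caching logic described above (reuse the previous result when the change counts match, otherwise reload the semantic object and update modCount and isMatched) might be sketched as follows; the class and method names are illustrative assumptions:

        // Sketch: a pattern node state table with modCount / isMatched caching;
        // field names mirror the Figure 18 labels described above.
        public class PatternNodeSketch {
            private final String patternId; // ID in Figure 18
            private int modCount;           // data changes seen so far for this fact
            private boolean isMatched;      // previous matching result

            public PatternNodeSketch(String patternId) { this.patternId = patternId; }

            /**
             * Returns the matching result, loading the semantic object only when
             * the fact has changed since the last evaluation.
             */
            public boolean match(Object fact, int engineModCount) {
                if (engineModCount == modCount) {
                    return isMatched;                          // cache hit: reuse result
                }
                isMatched = loadSemanticObjectAndJudge(fact);  // cache miss: re-evaluate
                modCount = engineModCount;                     // update the state table
                return isMatched;
            }

            private boolean loadSemanticObjectAndJudge(Object fact) {
                // Placeholder for loading the condition's semantic object
                // (from memory or file) and evaluating it against the fact.
                return false;
            }
        }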
  • the number of data changes of the facts recorded in the state table of the model node can be used to determine whether to adjust the type of the model node when reconstructing the rule topology.
  • when the number of data changes of the facts recorded in the state table of a pattern node is greater than a preset threshold, it indicates that the corresponding facts change relatively quickly.
  • when the rule topology is reconstructed, if the type of the pattern node before the reconstruction was a transient pattern node, the pattern node remains a transient pattern node during this reconstruction; if the type of the pattern node before the reconstruction was a persistent pattern node, it is switched to a transient pattern node during this reconstruction.
  • exemplarily, as shown in Figure 19, when the comparison between the number of data changes recorded in the state table of pattern node 7 and the preset threshold indicates that the change frequency of the corresponding facts has changed, the type of pattern node 7 can be changed when the rule topology graph is reconstructed.
  • the climate in most parts of China has four distinct seasons, and the period of seasonal changes is often 3 months, that is, the frequency of quarterly changes is low.
  • the temperature difference between day and night in China's Xinjiang region is often large. Sometimes the temperature at noon during the day is equivalent to summer, and the temperature at night is equivalent to winter. Therefore, it can be understood that the seasonal changes in this region are more frequent.
  • therefore, if by default the semantic object of the pattern node corresponding to "season" in the rule engine is stored in a file, the rule engine can meet the requirements when used in most areas of China.
  • however, if the rule engine is used in the Xinjiang region of China, semantic objects will frequently be loaded from files, resulting in lower execution efficiency of the rule engine.
  • therefore, when the rule engine reconstructs its rule topology graph in Xinjiang, China, the semantic object of the pattern node corresponding to "season" can be switched from being stored in a file to being stored in memory, that is, the type of the pattern node corresponding to "season" is switched.
  • when the rule topology is constructed for the first time, the type of each pattern node can be determined based on empirical values. For example, when the fact corresponding to a pattern node is "age", since age changes slowly, the type of the pattern node corresponding to the "age" fact can be determined as a persistent pattern node, and its semantic object is stored in a file; when the fact corresponding to a pattern node is "time", since time changes quickly, the type of the pattern node corresponding to the "time" fact can be determined as a transient pattern node, and its semantic object is stored in memory.
  • a merge node can combine the matching results of each pattern node corresponding to a rule and determine whether to trigger the rule.
  • there is at least one merge node, and each merge node corresponds to one rule.
  • the merge node comprehensively expresses the semantic information and logical result of the combined patterns.
  • patterns of different data types can be combined by the merge node into the condition of a rule. For example, combining "22 < age < 30" and "the location is outdoors" forms the condition part "22 < age < 30, and the location is outdoors".
  • when the matching results of all the pattern nodes corresponding to a rule indicate successful matches, the merge node can determine to trigger the rule.
  • when the matching result of any one of the pattern nodes corresponding to a rule indicates a failed match, the merge node can determine to restrict triggering the rule, that is, not to trigger the rule.
  • the merge node corresponding to a rule may correspond to the last pattern node among the pattern nodes combined through the chain.
  • when a rule needs to be deleted, the rule topology does not need to be modified directly; instead, the merge node corresponding to the rule is marked as invalid, and the rule is then deleted the next time the rule topology is reconstructed.
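  • the merge node's triggering logic and the lazy deletion via an invalid mark might be sketched as follows (illustrative Java; the class and method names are assumptions):

        import java.util.List;

        // Sketch: a merge node triggers its rule only when every corresponding
        // pattern node matched; marking it invalid defers deletion to the next
        // topology reconstruction, as described above.
        public class MergeNodeSketch {
            private boolean invalid; // set to true to delete the rule lazily

            public boolean shouldTrigger(List<Boolean> patternResults) {
                if (invalid) {
                    return false;        // rule is marked for deletion
                }
                for (boolean matched : patternResults) {
                    if (!matched) {
                        return false;    // any failed pattern restricts triggering
                    }
                }
                return true;             // all patterns matched: trigger the rule
            }

            public void markInvalid() { this.invalid = true; }
        }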
  • the result node can store the semantic object of the action required by the rule, and load that semantic object when the merge node determines to trigger the rule.
  • each rule has a result node
  • the number of result nodes in the rule topology graph in the rule engine is at least one
  • each result node corresponds to one merge node.
  • the result node expresses the specific execution statement of a certain action in the rule. When the rule meets all the conditions, the corresponding action is triggered.
  • the types of result nodes can include two types: transient result nodes and persistent result nodes.
  • the semantic object of the transient result node can be stored in memory, and the semantic object of the persistent result node can be persisted in a file.
  • the type of the result node depends on the types of the pattern nodes: when the type of every pattern node corresponding to the patterns in a rule is a transient pattern node, the type of the rule's result node is a transient result node; when any of the pattern nodes of a rule is a persistent pattern node, the type of the rule's result node is a persistent result node.
  • exemplarily, if a rule includes two patterns and the pattern nodes corresponding to both patterns are transient pattern nodes, the type of the result node corresponding to the rule is a transient result node; if a rule includes two patterns, where the pattern node corresponding to one pattern is a transient pattern node and the pattern node corresponding to the other pattern is a persistent pattern node, the type of the result node corresponding to the rule is a persistent result node; if a rule includes two patterns and the pattern nodes corresponding to both patterns are persistent pattern nodes, the type of the result node corresponding to the rule is a persistent result node.
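  • the result-node type rule just stated can be expressed compactly; the sketch below is illustrative only:

        import java.util.List;

        // Sketch of the rule above: transient only when every pattern node of
        // the rule is transient, persistent otherwise.
        public class ResultNodeTypeSketch {
            enum NodeType { TRANSIENT, PERSISTENT }

            static NodeType resultTypeFor(List<NodeType> patternNodeTypes) {
                for (NodeType t : patternNodeTypes) {
                    if (t == NodeType.PERSISTENT) {
                        return NodeType.PERSISTENT; // one persistent pattern suffices
                    }
                }
                return NodeType.TRANSIENT;          // all patterns are transient
            }
        }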
  • the data structure of the result node may include a pattern semantic index, and the pattern semantic index may be used to index the semantic object of the result node.
  • the pattern semantic index of a transient result node obtains the semantic object from an index into memory,
  • and the pattern semantic index of a persistent result node obtains the semantic object from an index into a file.
  • the rule corresponding to the persistent result node is triggered at a lower frequency, and the rule corresponding to the transient result node is triggered at a higher frequency.
  • exemplarily, when the rule is a weather reminder rule, the rule is triggered frequently, so it can be inferred that the type of the result node corresponding to the rule is a transient result node; when the rule is an annual reminder rule, the rule is triggered infrequently, so it can be inferred that the type of the result node corresponding to the rule is a persistent result node.
  • in addition, the type of the result node corresponding to a rule can also be adaptively switched.
  • when switching the type of the result node, reference can be made to the relationship between the result node and the pattern nodes described above. For example, as shown in Figure 19, when the rule topology is reconstructed and the type of pattern node 7 has changed, since the rule corresponding to pattern node 7 has only one pattern node and there is no influence from other pattern nodes, the type of the result node corresponding to the rule can be switched as well.
  • the active node can execute the action corresponding to the rule after loading the semantic object of the action required by the rule in the result node. For example, when the rule is a weather reminder rule, after the rule is triggered, the activated node can perform a weather reminder.
  • for example, when pattern a is a pattern whose data changes slowly, such as "age", the type of the pattern node corresponding to pattern a can be defined as a persistent pattern node; when pattern a is a pattern with frequent changes in geographic location, such as "whether the user is at home" or "whether the user is away from home", the type of the pattern node corresponding to pattern a can be defined as a transient pattern node.
  • then, the state table and the corresponding semantic index can be generated according to the type of the pattern node.
  • for the creation process of the rule topology map, reference can be made to the introduction of the rule topology map in the rule engine above (for example, how to determine the type of a pattern node), which is not repeated here.
  • take as an example a rule that a year-end summary card pops up on the negative screen: first, a pattern node (that is, "Age>20" in Figure 20) is created, and the type of the pattern node is defined.
  • because the change frequency of the age fact data is low, the type of this pattern node is a persistent pattern node.
  • then, the state table and semantic index of the pattern node can be generated.
  • next, a merge node and a result node can be created. After each rule is compiled randomly or sequentially in this way, the rule topology shown in FIG. 20 can be constructed.
  • after the rule topology map is constructed, the rule topology map can be used. The following describes the application process of the rule topology map in conjunction with FIG. 20.
  • FIG. 21 is a schematic flowchart of a method for executing a rule engine according to an embodiment of the present application. It can be understood that the method can be executed by any apparatus, device, platform, or device cluster with computing and processing capabilities. As shown in Figure 21, the execution method of the rule engine includes:
  • Step S101: Determine the first fact data input into the rule engine; according to a first attribute of the first fact data, obtain a first semantic object from memory to match the first fact data, where the first attribute is used to characterize the change frequency of the first fact data.
  • fact data can be input into the rule engine.
  • the first fact data can be determined.
  • fact data can be entered into the rule engine from the root node shown in FIG. 17.
  • the first fact data can be entities, intentions, and so on.
  • specifically, the first semantic object can be obtained from memory to match the first fact data according to the first attribute of the first fact data, where the first attribute is used to characterize the change frequency of the first fact data.
  • the first fact data can be time or location.
  • the first attribute may be a type. For example, when the first attribute is a time type, it indicates that the first fact data changes frequently. Exemplarily, this step may be performed by the transient pattern node shown in FIG. 17.
  • Step S102: Determine the second fact data input into the rule engine; according to a second attribute of the second fact data, obtain a second semantic object from a file to match the second fact data, where the second attribute is used to characterize the change frequency of the second fact data, and the second attribute is different from the first attribute.
  • fact data can be input into the rule engine.
  • the second fact data can be determined.
  • fact data can be entered into the rule engine from the root node shown in FIG. 17.
  • the second fact data can be entities, intentions, and so on.
  • the second semantic object can be obtained from the file to match the second fact data according to the second attribute of the second fact data.
  • the second attribute is used to characterize the change frequency of the second fact data.
  • the second fact data can be age or season.
  • the second attribute may be a type.
  • when the second attribute is an age type, it indicates that the change frequency of the second fact data is relatively slow.
  • the second attribute is different from the first attribute.
  • the first attribute is a time type
  • for example, the second attribute may be an age type. Exemplarily, this step may be performed by the persistent pattern node shown in FIG. 17.
  • Step S103 Determine whether to perform the first operation according to the first matching result corresponding to the first fact data and the second matching result corresponding to the second fact data.
  • the first operation may be: reminding the user of the weather, reminding the user of road conditions, reminding the user to rest, play, or work, recommending a manual, or preloading actions or services.
  • exemplarily, this step may be performed by the merge node shown in FIG. 17.
  • optionally, the rule engine involved in the method may include a second node.
  • in this case, step S103 may specifically be: when the first matching result indicates that the matching is successful and the second matching result indicates that the matching is successful, the second node obtains a third semantic object from the file indicated by the semantic index of the second node, and executes the first operation corresponding to the third semantic object.
  • exemplarily, the second node may be the persistent result node shown in FIG. 17.
  • performing the first operation corresponding to the third semantic object may be performed by the activation node shown in FIG. 17.
  • it should be noted that the execution order of step S101 and step S102 can be changed, which is not limited in this solution.
  • for example, step S102 may be performed first and then step S101; or step S101 and step S102 may be performed simultaneously, and so on.
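  • putting steps S101 to S103 together, an illustrative Java sketch of the execution flow might look as follows (the matching methods are placeholders standing in for the memory-backed and file-backed pattern nodes of Figure 17; all names are assumptions):

        public class RuleEngineFlowSketch {
            public void execute(Object firstFact, Object secondFact) {
                // S101: frequently changing fact (e.g. time/location) -> memory-backed match
                boolean firstMatch = matchFromMemory(firstFact);
                // S102: slowly changing fact (e.g. age/season) -> file-backed match;
                // as noted above, the order of S101 and S102 is interchangeable
                boolean secondMatch = matchFromFile(secondFact);
                // S103: combine both results to decide whether to perform the operation
                if (firstMatch && secondMatch) {
                    performFirstOperation(); // e.g. a weather reminder or preloaded service
                }
            }

            private boolean matchFromMemory(Object fact) { return true; } // placeholder
            private boolean matchFromFile(Object fact) { return true; }   // placeholder
            private void performFirstOperation() { /* device-specific action */ }
        }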
  • in this solution, based on the attributes of the fact data, it is determined whether to load a semantic object from memory or from a file, and the fact data is matched based on the determined semantic object. In this way, part of the semantic objects used by the rule engine to match fact data are stored in memory, and another part are stored in files, which can release some redundant memory, reduce the memory overhead during the operation of the rule engine, and improve the capability of the rule engine.
  • the method of this solution can greatly reduce the memory overhead of the end-side platform, which greatly improves the running capability of the rule engine on the end-side platform.
  • the execution method of the rule engine mentioned in this solution can also be applied to the cloud side.
  • the overhead of cloud-side server resources can be greatly reduced.
  • after the capability of the rule engine is improved, when the rule engine is used for intention recognition and action decision-making, the execution efficiency of intention recognition and action decision-making can be significantly improved.
  • in addition, when the input to the rule engine comes from multi-modal input, the amount of input data is large and the data types are diverse; for example, some data changes frequently while other data changes slowly.
  • in this case, the rule engine in this solution can load semantic objects from memory to match frequently changing data, and load semantic objects from files to match slowly changing data, thereby preventing the semantic objects corresponding to slowly changing data from continuously occupying memory.
  • as a result, the memory overhead during the operation of the rule engine is reduced, the capability of the rule engine is improved, and the execution efficiency of the rule engine is improved.
  • optionally, the rule engine involved in the method may include a first node, where the first node includes at least a first type node and a second type node; the first type node is related to the first attribute, and the second type node is related to the second attribute.
  • in this case, the first semantic object may be obtained from the memory indicated by the first semantic index according to the first semantic index of the first type node corresponding to the first attribute, and the first fact data is matched based on the first semantic object.
  • exemplarily, the first node may be the pattern node shown in FIG. 17, and the first type node may be the transient pattern node shown in FIG. 17.
  • the second semantic object may be obtained from the file indicated by the second semantic index according to the second semantic index of the second type node corresponding to the second attribute, and the second fact data is matched based on the second semantic object.
  • exemplarily, the second type node may be the persistent pattern node shown in FIG. 17.
  • optionally, before the first semantic object is obtained from memory to match the first fact data, it may also be determined that the number of changes of the first fact data recorded in the first type node is different from the number of changes of the first fact data input to the rule engine.
  • the number of changes of the first fact data recorded in the first type node can be understood as the value of modCount in the state table of the pattern node shown in FIG. 18.
  • optionally, when the number of changes of the first fact data recorded in the first type node is the same as the number of changes of the first fact data input to the rule engine, the previous matching result recorded by the first type node can be used as the first matching result.
  • the previous matching result recorded by the first type node can be understood as isMatched in the state table of the pattern node shown in FIG. 18.
  • optionally, before the second semantic object is obtained from the file to match the second fact data, it may also be determined that the number of changes of the second fact data recorded in the second type node is different from the number of changes of the second fact data input to the rule engine.
  • the number of changes of the second fact data recorded in the second type node can be understood as the value of modCount in the state table of the pattern node shown in FIG. 18.
  • optionally, when the number of changes of the second fact data recorded in the second type node is the same as the number of changes of the second fact data input to the rule engine, the previous matching result recorded by the second type node can be used as the second matching result.
  • the previous matching result recorded by the second type node can be understood as isMatched in the state table of the pattern node shown in FIG. 18.
  • optionally, when the rules in the rule engine are reconstructed, whether to switch the first type node to the second type node may be determined based on the number of changes of the first fact data recorded in the first type node. Specifically, when the number of changes of the first fact data recorded in the first type node is less than a preset threshold, it indicates that the change frequency of the first fact data is low; if its semantic object were kept in memory, the memory might be occupied for a long time, so the first type node can be switched to the second type node at this time.
  • optionally, when the rules in the rule engine are reconstructed, whether to switch the second type node to the first type node may be determined based on the number of changes of the second fact data recorded in the second type node. Specifically, when the number of changes of the second fact data recorded in the second type node is greater than the preset threshold, it indicates that the change frequency of the second fact data is high; if the semantic object in the second type node were kept in a file, loading the semantic object would be slow, so the second type node can be switched to the first type node at this time.
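  • the reconstruction-time type switching described in the two paragraphs above might be sketched as follows (illustrative Java; the threshold value is an assumed parameter):

        // Sketch of the reconstruction-time switching rule described above.
        public class NodeTypeSwitchSketch {
            enum NodeType { TRANSIENT, PERSISTENT }
            private static final int CHANGE_COUNT_THRESHOLD = 10; // assumed value

            static NodeType typeAfterReconstruction(NodeType current, int recordedChangeCount) {
                if (current == NodeType.TRANSIENT && recordedChangeCount < CHANGE_COUNT_THRESHOLD) {
                    return NodeType.PERSISTENT; // rarely changing fact: move object to file
                }
                if (current == NodeType.PERSISTENT && recordedChangeCount > CHANGE_COUNT_THRESHOLD) {
                    return NodeType.TRANSIENT;  // frequently changing fact: keep object in memory
                }
                return current;                 // frequency unchanged: keep the node type
            }
        }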
  • FIG. 22 is a schematic structural diagram of a rule engine provided by an embodiment of the present application.
  • the rule engine includes: a first node 61.
  • the first node 61 includes at least a first type node 611 and a second type node 612.
  • the first type node 611 can be used to obtain the first semantic object from memory according to the first attribute of the first fact data input to the rule engine, to match the first fact data and obtain the first matching result; the first attribute is used to characterize the change frequency of the first fact data.
  • the second type node 612 can be used to obtain the second semantic object from the file according to the second attribute of the second fact data input into the rule engine, to match the second fact data and obtain the second matching result; the second attribute is used to characterize the change frequency of the second fact data, and the second attribute is different from the first attribute.
  • the first matching result and the second matching result are used together to determine whether to perform the first operation.
  • exemplarily, the first type node 611 may be the transient pattern node shown in FIG. 17, and the second type node 612 may be the persistent pattern node shown in FIG. 17.
  • the first fact data includes at least one of time and location; the second fact data includes at least one of age and season.
  • the first operation includes one or more of the following: reminding the weather, reminding the road conditions, reminding the user to rest, play or work, recommend the manual, preload actions or services.
  • the first type node 611 may be specifically used to obtain the first semantic object from the memory indicated by the first semantic index according to the first semantic index corresponding to the first attribute, and to match the first fact data based on the first semantic object.
  • the second type node 612 may be specifically used to obtain the second semantic object from the file indicated by the second semantic index according to the second semantic index corresponding to the second attribute, and to match the second fact data based on the second semantic object.
  • Before obtaining the first semantic object from memory to match the first fact data, the first type node 611 can also be used to determine that the number of changes of the first fact data recorded in the first type node 611 differs from the number of changes of the first fact data input into the rule engine.
  • Likewise, before obtaining the second semantic object from the file to match the second fact data, the second type node 612 can also be used to determine that the number of changes of the second fact data recorded in the second type node 612 differs from the number of changes of the second fact data input into the rule engine.
  • When the number of changes of the first fact data recorded in the first type node 611 is the same as the number of changes of the first fact data input into the rule engine, the first type node 611 can take the previous matching result it recorded as the first matching result.
  • When the number of changes of the second fact data recorded in the second type node 612 is the same as the number of changes of the second fact data input into the rule engine, the second type node 612 can take the previous matching result it recorded as the second matching result. A sketch of this change-count caching follows.
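  • The change-count check above can be read as a simple form of memoization. The sketch below extends the PatternNode sketch from earlier; the counter field name is an illustrative assumption.

```python
class CachingPatternNode(PatternNode):
    """Re-runs matching only when the fact's change counter has advanced."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._recorded_changes = -1   # number of changes recorded in the node
        self._last_result = None      # previous matching result

    def match(self, fact: dict) -> bool:
        changes = fact.get("change_count", 0)
        if changes == self._recorded_changes:
            return self._last_result          # same count: reuse previous result
        self._recorded_changes = changes      # different count: match again
        self._last_result = super().match(fact)
        return self._last_result
```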
  • the rule engine may further include a second node 62.
  • the second node 62 can be used to obtain, when both the first matching result and the second matching result indicate a successful match, the third semantic object from the file indicated by the semantic index of the second node, and to execute the first operation corresponding to the third semantic object.
  • the second node 62 may be the result node shown in FIG. 17.
  • the rule engine may also include a third node, a fourth node, a fifth node, and a sixth node.
  • the third node may be the root node shown in FIG. 17, the fourth node may be the type node shown in FIG. 17, the fifth node may be the combined node shown in FIG. 17, and the sixth node may be the active node shown in FIG. 17.
  • the first node may be the mode node shown in FIG. 17, and the second node may be the result node shown in FIG. 17.
  • the rule engine can be configured in any device, apparatus, platform, or device cluster with computing and processing capabilities.
  • the rule engine may be configured in a device including a processor and a memory, where the device may be a terminal or a server.
  • the decision reasoning module 607 in the intention recognition decision system 501 is used to make decisions for the user, that is, to decide which action to perform on which device; most of the actions to be performed are preloaded actions or services.
  • the decision reasoning module 607 may maintain an action sequence library, which may also contain the correspondence between entity sequences, intents, and action sequences.
  • the decision reasoning module 607 can call the rules in the rule engine 606 to determine which action to perform.
  • Alternatively, the decision reasoning module 607 determines, according to the correspondence between entity sequences, intents, and action sequences, which action to perform on which device.
  • the decision reasoning module 607 may have an action prediction model, which can make a decision for the user.
  • the action prediction model may be obtained based on the above-mentioned method of obtaining the intention recognition model in the intention recognition module 605.
  • the action feedback module 608 is used to compare the predicted action sequence with the action sequence actually performed by the user, to give feedback on whether the prediction is correct.
  • the input of the action feedback module 608 is the action sequence predicted by the decision reasoning module 607, and the output is a comparison between the predicted result and the real result: if the two are the same, the feedback is that the prediction is correct, and vice versa.
  • the result of the action feedback can be used to update the correspondence between entity sequences and intents, as well as the correspondence between entity sequences, intents, and action sequences. For example, if it is predicted that the user's intent is to open the music player, the decision executed is to preload QQ Music in the background, and the user does open the music player, the action feedback module records this to update the correspondence between entity sequences, intents, and action sequences. If it is predicted that the user's intent is to open the music player and the decision executed is to preload QQ Music in the background, but the user actually opens JD, the action feedback module records this to update the correspondence between entity sequences and intents, as well as the correspondence between entity sequences, intents, and action sequences.
  • the action feedback module 608 may include a multi-instance learning model (not shown in the figure).
  • the multi-instance learning model can be used to divide each to-be-processed sequence into finer-grained sequences according to the possibility that the consecutive dot data in the sequence belong to the same intent, separating consecutive dot data that may not belong to the same intent.
  • the action feedback module 608 can determine the intent of each of the multiple subsequences according to the preset intent rule, where the preset intent rule can be used to determine the intent of the sequence according to the dot data in the sequence.
  • After the action feedback module 608 determines the intent of each subsequence, it learns the action sequence actually performed by the user, compares it with the predicted action sequence, and gives feedback on whether the prediction is correct.
  • the action feedback module 608 may also include a multi-example learning model training module (not shown in the figure).
  • the multi-instance learning model training module can execute the multi-instance learning model training method in this solution.
  • For the training method of the multi-instance learning model in this solution, please refer to the following description. It should be understood that the multi-instance learning model training module can also be configured on the end side or the cloud side, which is not limited here.
  • the multi-modal input module 601 obtains data in a variety of different input modes, and sends the obtained data to the entity recognition module 603.
  • the entity extraction unit 6031 in the entity recognition module 603 extracts feature vectors from these data, inputs them to the entity recognition model obtained from the knowledge base 602, and outputs the recognized entities.
  • the entity extraction unit 6031 can identify, from these data, the entities stored in the entity warehouse unit 6033 according to the entity recognition model in the knowledge base 602.
  • After the entity extraction unit 6031 obtains the recognized entities, it sends them to the context module 604 in the order they were recognized, and the context module 604 saves them as an entity sequence in the order received.
  • the entity sequence in which all historically received entities are saved in the order they were received can be referred to as the context entities.
  • the context module 604 sends the latest part of the entity sequence in the context entities (at least the entity sequence composed of the entities recognized in the most recent entity recognition time pane) to the intent recognition module 605.
  • the intent mapping unit 6051 in the intent recognition module 605 determines the intent corresponding to the entity sequence according to the correspondence between entity sequences and intents stored in the intent warehouse unit 6053, and sends the entity sequence sent by the context module 604, together with the intent determined by the intent mapping unit 6051, to the decision reasoning module 607.
  • After the decision reasoning module 607 obtains the intent and entity sequence sent by the intent recognition module 605, it determines the action sequence according to the stored correspondence between entity sequences, intents and action sequences, or the rules obtained from the rule engine 606, and sends it to the action feedback module 608.
  • After the action feedback module 608 obtains the action sequence determined by the decision reasoning module 607, it compares it with the action sequence actually performed by the user and sends the comparison result to the intent recognition module 605 and the decision reasoning module 607.
  • the intent recognition module 605 updates the correspondence between entity sequences and intents stored in the intent warehouse unit 6053 according to the comparison result.
  • the decision reasoning module 607 updates the stored correspondence between entity sequences, intents and action sequences according to the comparison result.
  • FIG. 23 is a schematic diagram of a data flow in the training method of a multi-example learning model in an embodiment of the application.
  • FIG. 24 is a schematic flowchart of a training method for a multi-example learning model in an embodiment of the application. The following describes the training method of the multi-example learning model in the embodiment of the present application with reference to the schematic diagram of the data flow shown in FIG. 23 and the schematic diagram of the process shown in FIG. 24:
  • the electronic device determines the initial dot data sequence
  • the dot data is the daily operation data of the user recorded locally by the electronic device.
  • the initial dot data sequence may include dot data preset in the factory of the electronic device and/or dot data generated by the user using the electronic device.
  • the dot data in the initial dot data sequence does not need to be manually labeled, and can be used as training data to train a multi-example learning model.
  • the dot data sequence shown in FIG. 6 may be used as an initial dot data sequence.
  • the electronic device divides the initial dot data sequence into multiple sub-sequences according to the first preset rule.
  • the first preset rule is used to divide the dot data sequence into different sub-sequences, and one sub-sequence can determine at least one clear intention according to the second preset rule, and the second preset rule is used to determine the intent of the sequence.
  • For the first preset rule and the second preset rule, please refer to (13) The first preset rule, the second preset rule, and the sub-sequence in the term introduction above, which will not be repeated here.
  • For example, the first preset rule is: the dot data generated by a series of continuous operations, from the screen turning on until the user rests, are divided into one sub-sequence.
  • the second preset rule is: the application last used and closed before the user turns the screen off is the user's intent.
  • the dot data sequence shown in FIG. 6 can then be divided into the multiple sub-sequences shown in FIG. 7: B1, B2, B3.
  • the electronic device can use the multiple sub-sequences obtained in S1302, or the multiple sub-sequences obtained in S1307, as the multiple to-be-processed sequences, perform feature extraction on the to-be-processed sequences to train a multi-instance learning model, and use the trained multi-instance learning model to divide the to-be-processed sequences into smaller-granularity sequences. Specifically, the following steps can be performed:
  • the electronic device determines examples and example tags in the multiple to-be-processed sequences.
  • the electronic device composes an example from every two adjacent dot data in the multiple to-be-processed sequences.
  • the example label of an example composed of two dot data located in the same to-be-processed sequence is determined to be positive, and the example label of an example composed of two dot data located in different to-be-processed sequences is determined to be negative. A sketch of this construction follows.
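  • A minimal sketch of this example construction, assuming each to-be-processed sequence is a Python list of dot data; names are illustrative only.

```python
def build_examples(sequences):
    """Pair every two adjacent dot data across the concatenated sequences.
    Label 1 (positive): both dot data in the same to-be-processed sequence.
    Label 0 (negative): the pair straddles two to-be-processed sequences."""
    flat = [(d, i) for i, seq in enumerate(sequences) for d in seq]
    examples, labels = [], []
    for (d1, i1), (d2, i2) in zip(flat, flat[1:]):
        examples.append((d1, d2))
        labels.append(1 if i1 == i2 else 0)
    return examples, labels

# With B1, B2, B3 holding 8, 2 and 2 dot data (the split of FIG. 25), this
# yields 11 examples S1..S11, with the boundary examples S8 and S10 negative.
```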
  • examples and example tags please refer to the description of examples and example tags in (14) Multi-instance learning model, examples and example tags, packages and package tags in the above term description, which will not be repeated here.
  • FIG. 25 is an exemplary schematic diagram of determining an example and an example label in an embodiment of the application.
  • the dot data sequence A1, composed of 12 dot data, is divided into the to-be-processed sequences B1, B2, and B3.
  • the electronic device can determine a total of 11 examples in the to-be-processed sequences: S1, S2, S3, S4, S5, S6, S7, S8, S9, S10, S11.
  • the electronic device can determine:
  • the example label of example S9, composed of two dot data both located in the to-be-processed sequence B2, is positive;
  • the example label of example S11, composed of two dot data both located in the to-be-processed sequence B3, is positive;
  • the example label of example S8, composed of dot data located respectively in the to-be-processed sequences B1 and B2, is negative;
  • the example label of example S10, composed of dot data located respectively in the to-be-processed sequences B2 and B3, is negative.
  • the electronic device determines the package and the package label according to multiple to-be-processed sequences, examples, and example labels;
  • After the electronic device determines the examples and example labels, it can determine the packages and package labels according to the relationship between the examples, the example labels, and the multiple to-be-processed sequences. Consecutive examples composed of dot data within the same to-be-processed sequence are taken as one package, whose package label is determined to be positive; an example composed of the last dot data of one to-be-processed sequence and the first dot data of the next to-be-processed sequence is taken as a package on its own, whose package label is determined to be negative. Specifically, for the description of packages and package labels, please refer to the description of packages and package labels in (14) Multi-instance learning model, examples and example labels, packages and package labels in the term description above, which will not be repeated here. A sketch of package construction follows.
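  • Continuing the sketch above, packages can be formed from the example stream: each run of positive examples becomes one positive package, and each negative (boundary) example becomes its own negative package.

```python
def build_packages(examples, labels):
    """Group examples into packages with package labels (1 positive, 0 negative)."""
    packages, package_labels, current = [], [], []
    for example, label in zip(examples, labels):
        if label == 1:
            current.append(example)           # still inside one sequence
        else:
            if current:                       # close the within-sequence package
                packages.append(current); package_labels.append(1)
                current = []
            packages.append([example]); package_labels.append(0)  # boundary package
    if current:
        packages.append(current); package_labels.append(1)
    return packages, package_labels

# For the 11 examples of FIG. 25 this returns the 5 packages of FIG. 26:
# L1 = S1..S7 (positive), L2 = {S8} (negative), L3 = {S9} (positive),
# L4 = {S10} (negative), L5 = {S11} (positive).
```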
  • FIG. 26 is an exemplary schematic diagram of determining a package and a package label in an embodiment of the application.
  • the 11 examples in the 3 to-be-processed sequences B1, B2, and B3 constitute a total of 5 packages:
  • the examples S1 to S7, each composed of dot data in the to-be-processed sequence B1, constitute a package L1, and its package label is positive;
  • the example S9, composed of dot data in the to-be-processed sequence B2, constitutes a package L3, and its package label is positive;
  • the example S11, composed of dot data in the to-be-processed sequence B3, constitutes a package L5, and its package label is positive;
  • the example S8, composed of the last dot data of the to-be-processed sequence B1 and the first dot data of the to-be-processed sequence B2, forms a package L2, and its package label is negative;
  • the example S10, composed of the last dot data of the to-be-processed sequence B2 and the first dot data of the to-be-processed sequence B3, constitutes a package L4, and its package label is negative.
  • the electronic device extracts the feature vector matrix of each package from the package.
  • the electronic device can extract features from each example in the package to obtain the feature vector of each example, and then compose the feature vectors of all examples in the package into the feature vector matrix of the package.
  • For feature vectors and feature vector matrices, please refer to the description of the feature vector of an example and the feature vector matrix of a package in (16) Dot data sequence package in the term description above, which will not be repeated here.
  • FIG. 27 is an exemplary schematic diagram of extracting a feature vector matrix of a packet in an embodiment of the application.
  • Package L1 contains examples S1, S2, S3, S4, S5, S6, S7.
  • the dot data in each example is a JSON structure, from which a 9-dimensional feature vector can be extracted for each example;
  • the 9-dimensional feature vectors of the 7 examples in package L1 can then be composed into a 7*9 feature vector matrix of the package, giving the feature vector matrix N1 of package L1 (shown in FIG. 27).
  • the extracted features of each dimension can also be of other types, which are not limited here.
  • the expression and storage modes of the eigenvectors of the example and the eigenvector matrix of the package may also adopt other expressions and storage modes, which are not limited here.
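  • As one possible concrete form, the sketch below turns a package into its feature vector matrix. The JSON field names and the choice of 9 dimensions are assumptions for illustration; this application does not fix them.

```python
import numpy as np

def example_to_vector(example):
    d1, d2 = example                      # two adjacent dot data (parsed JSON dicts)
    return np.array([
        d1.get("hour", 0), d1.get("app_id", 0), d1.get("event_type", 0),
        d2.get("hour", 0), d2.get("app_id", 0), d2.get("event_type", 0),
        d2.get("hour", 0) - d1.get("hour", 0),            # time gap
        int(d1.get("app_id", 0) == d2.get("app_id", 0)),  # same application?
        int(d1.get("screen_on", 0)),                      # screen state
    ], dtype=float)

def package_to_matrix(package):
    # A package with k examples becomes a k x 9 matrix, e.g. 7 x 9 for L1.
    return np.stack([example_to_vector(ex) for ex in package])
```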
  • the electronic device inputs the feature vector matrix and the package label of each package into the multi-instance learning model to obtain a trained multi-instance learning model;
  • the multi-instance learning model is a deep learning model. After the electronic device obtains the feature vector matrix of each package, it sequentially inputs the feature vector matrix and package label of each package into the multi-instance learning model to obtain a trained multi-instance learning model.
  • the multi-instance learning model that has not been trained in the embodiments of the present application may be referred to as a preset multi-instance learning model.
  • Before the training data extracted from the initial dot data sequence is input into the multi-instance learning model for training, the multi-instance learning model may be a preset multi-instance learning model.
  • the preset multi-instance learning model can be any untrained multi-instance learning model, such as the ORLR model, Citation-kNN model, MI-SVM model, C4.5-MI model, BP-MIP model, Ensemble Learning-MIP model, etc., which is not limited here.
  • FIG. 28 is an exemplary schematic diagram of training a multi-instance learning model in an embodiment of the application.
  • the electronic device first inputs the feature vector matrix N1 extracted from package L1 and the package label "positive" of package L1 into the multi-instance learning model, then inputs the feature vector matrix N2 extracted from package L2 and the package label "negative" of package L2 into the multi-instance learning model;
  • it then inputs the feature vector matrix N3 extracted from package L3 and the package label "positive" of package L3, then the feature vector matrix N4 extracted from package L4 and the package label "negative" of package L4, and finally the feature vector matrix N5 extracted from package L5 and the package label "positive" into the multi-instance learning model, after which the trained multi-instance learning model is obtained. A sketch of this training pass follows.
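  • A sketch of this training pass, with `model` standing in for any of the multi-instance learners mentioned above (MI-SVM, Citation-kNN, ...) behind an assumed incremental-update interface:

```python
def train_mil_model(model, packages, package_labels):
    """Feed each (feature vector matrix, package label) pair to the learner."""
    for package, label in zip(packages, package_labels):
        X = package_to_matrix(package)   # k x 9 matrix of the package
        model.partial_fit(X, label)      # assumed one-package update method
    return model
```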
  • the electronic device inputs the multiple to-be-processed sequences into the trained multi-example learning model to obtain multiple sub-sequences;
  • the multi-instance learning model is used to divide each sequence to be processed into smaller-granularity sequences.
  • the sequence to be processed may be a sub-sequence obtained by dividing the dot data sequence using the first preset rule.
  • the electronic device can input the multiple to-be-processed sequences into the trained multi-instance learning model to obtain multiple sub-sequences, and the number of the multiple sub-sequences is greater than or equal to the number of the multiple to-be-processed sequences.
  • FIG. 29 is an exemplary schematic diagram of the multi-example learning model in an embodiment of the application dividing multiple sequences to be processed into multiple smaller-granularity sub-sequences.
  • the trained multi-instance learning model can generate sub-sequences Z1, Z2, Z3, Z4, where the to-be-processed sequence B1 is divided into the smaller-granularity subsequences Z1 and Z2.
  • the electronic device determines the value of the loss function of the multi-example learning model after the training.
  • the loss function is a measure of how well the predictive model performs in terms of predicting the expected result.
  • Each machine learning model has its corresponding loss function. The better the prediction result of the model, the smaller the value of the loss function.
  • After the electronic device obtains the trained multi-instance learning model and uses it to divide the multiple to-be-processed sequences into multiple sub-sequences, the value of the loss function of the trained multi-instance learning model can be obtained.
  • For example, the electronic device calculates the loss function corresponding to the adopted multi-instance learning model and determines that the value of the loss function of the multi-instance learning model after training is 10%.
  • the electronic device determines whether the reduction range of the value of the loss function is less than a preset reduction range
  • After the electronic device obtains the value of the loss function of the trained multi-instance learning model, it can determine whether the reduction of that value is less than the preset reduction range.
  • Since the electronic device has not determined a value of the loss function of a trained multi-instance learning model before the first run, after it obtains this value for the first time it can directly determine, by default, that the reduction of the value of the loss function is not less than the preset reduction range.
  • If the reduction of the value of the loss function is not less than the preset reduction range, the electronic device may use the multiple sub-sequences as the multiple to-be-processed sequences and perform steps S1303 to S1309 again.
  • If the reduction of the value of the loss function is less than the preset reduction range, the electronic device may perform step S1310. A sketch of this iteration loop follows.
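  • Putting steps S1303 to S1310 together, the iteration can be sketched as below; `model.split` and `model.loss` are assumed helper methods standing in for steps S1307 and S1308.

```python
def iterate_training(model, sequences, min_drop=0.05):
    """Retrain and re-split until the loss stops dropping by at least min_drop."""
    prev_loss = float("inf")    # first round: reduction counts as "not less than"
    while True:
        examples, labels = build_examples(sequences)                  # S1303
        packages, package_labels = build_packages(examples, labels)   # S1304
        model = train_mil_model(model, packages, package_labels)      # S1305-S1306
        sequences = model.split(sequences)   # S1307: finer-grained subsequences
        loss = model.loss(sequences)         # S1308: e.g. cross entropy
        if prev_loss - loss < min_drop:      # S1309: reduction below preset range
            return model, sequences          # S1310: training completed
        prev_loss = loss
```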
  • FIG. 30 is an exemplary schematic diagram of iterative training of a multi-example learning model in an embodiment of the application.
  • the electronic device may use the sub-sequences Z1, Z2, Z3, and Z4, obtained by using the trained multi-instance learning model to divide the to-be-processed sequences B1, B2, B3, as the new to-be-processed sequences, and perform steps S1303 to S1309:
  • FIG. 31 is an exemplary schematic diagram of iteratively generating sub-sequences of a multi-example learning model in an embodiment of the application.
  • the electronic device can input the sub-sequences obtained in the previous round, that is, the to-be-processed sequences of this round: Z1, Z2, Z3, Z4, into the updated multi-instance learning model obtained after training, and again obtain the sub-sequences Z1, Z2, Z3, Z4.
  • the electronic device determines that the value of the loss function of the updated multi-instance learning model after training is still 10%. Compared with the previous round, the reduction of the value of the loss function is 0, which is less than the preset reduction range of 5%, so step S1310 is executed.
  • the above loss function may be a cross-entropy loss function.
  • the cross-entropy loss function uses the cross entropy calculated from the multi-instance learning model as the value of the loss function. When it is determined that the cross entropy calculated by the multi-instance learning model obtained in a certain round of training has decreased, compared with the cross entropy calculated by the multi-instance learning model obtained in the previous round of training, by less than the preset reduction range, the multi-instance learning model whose training has been completed is obtained.
  • the electronic device determines that the multi-instance learning model after this round of training is the multi-instance learning model whose training has been completed.
  • When the electronic device determines that the reduction of the value of the loss function is less than the preset reduction range, the multi-instance learning model after these rounds of training is the multi-instance learning model trained using the initial dot data sequence.
  • In this way, the initial dot data sequence, without manual labeling, can be used directly to train the multi-instance learning model, obtaining a trained multi-instance learning model that can divide a dot data sequence into multiple smaller-granularity sub-sequences and realizing self-annotation of the user's dot data. While greatly reducing the labor cost of labeling data for training the intent recognition model, the data labeling is also more accurate, improving the accuracy of intent recognition.
  • When the input mode of the dot data is multi-modal, the composition of the dot data is diverse and the time needed to label the training data manually increases significantly; the model training method in the embodiments of this application can markedly reduce the labor cost of labeling data for training the intent recognition model and increase the accuracy of the data labeling, thereby improving the accuracy of intent recognition.
  • FIG. 32 is a schematic diagram of a data flow in the update process of the multi-example learning model in an embodiment of the application.
  • FIG. 33 is a schematic flowchart of the update process of the multi-example learning model in an embodiment of the application. The following describes the update process of the multi-instance learning model in the embodiment of the present application in conjunction with the schematic diagram of the data flow shown in FIG. 32 and the schematic flowchart shown in FIG. 33:
  • the electronic device determines the newly added dot data sequence.
  • While the user uses the electronic device, the electronic device may record the user's operation data as dot data.
  • When the newly generated dot data that has not been used as training data for the multi-instance learning model accumulates to a preset number threshold, the electronic device can compose these dot data into a newly added dot data sequence; it can also compose the newly generated dot data within a preset period (for example, every day or every week) that has not been used as training data for multi-instance learning into a newly added dot data sequence, which is not limited here.
  • the electronic device inputs the newly added dot data sequence into the multi-example learning model to obtain multiple sub-sequences;
  • the electronic device may input the newly-added dot data sequence into a multi-example learning model that has been trained so far to obtain multiple sub-sequences. For details, refer to step S2202, which will not be repeated here.
  • the electronic device can use the multiple subsequences obtained in S2502, or the multiple subsequences obtained in S2507, as the multiple to-be-processed sequences, and perform feature extraction on the to-be-processed sequences to train the multi-instance learning model, obtaining the updated multi-instance learning model. Specifically, the following steps can be performed:
  • the electronic device determines examples and example tags in the multiple to-be-processed sequences.
  • the electronic device determines the package and the package label according to multiple to-be-processed sequences, examples, and example labels;
  • the electronic device extracts the feature vector matrix of the packet from the packet
  • the electronic device inputs the feature vector matrix and the package label of each package into the multi-instance learning model to obtain a trained multi-instance learning model;
  • the electronic device inputs the multiple to-be-processed sequences into the trained multi-example learning model to obtain multiple sub-sequences;
  • the electronic device determines the value of the loss function of the multi-example learning model after the training.
  • the electronic device determines whether the reduction range of the value of the loss function is less than a preset reduction range
  • Steps S2503 to S2509 are similar to steps S1303 to S1309, and reference may be made to the description of steps S1303 to S1309, which will not be repeated here.
  • the electronic device determines that the multi-instance learning model after training is the multi-instance learning model whose update training has been completed;
  • When the electronic device determines that the reduction of the value of the loss function is less than the preset reduction range, the multi-instance learning model after these rounds of training is the multi-instance learning model whose update training with the newly added dot data sequence has been completed.
  • In this way, the electronic device can use the newly added dot data to form a new dot data sequence to update and train the multi-instance learning model, so that the model better fits the user's personalized needs and the divided subsequences are more accurate, making the intent recognition results better match user expectations.
  • both the training method of the multi-instance learning model and the steps in the update process of the multi-instance learning model can be executed by the electronic device.
  • In other embodiments, the electronic device can also send the dot data sequence to a server; after the server trains the multi-instance learning model, the trained or updated multi-instance learning model is sent back to the electronic device for use.
  • FIG. 34 is an interactive schematic diagram of the training method of the multi-example learning model in the embodiment of the application.
  • the process can be:
  • the electronic device determines the initial dot data sequence
  • This step is similar to step S1301 and will not be repeated here.
  • the electronic device sends the initial dot data sequence to the server
  • the server divides the initial dot data sequence into multiple sub-sequences according to the first preset rule.
  • the server determines examples and example tags in the multiple to-be-processed sequences.
  • the server determines the package and the package label according to multiple to-be-processed sequences, examples, and example labels;
  • the server extracts the feature vector matrix of the packet from the packet
  • the server inputs the feature vector matrix and the package label of each package into the multi-instance learning model to obtain a trained multi-instance learning model;
  • the server inputs the multiple to-be-processed sequences into the trained multi-example learning model to obtain multiple sub-sequences;
  • the server determines the value of the loss function of the multi-example learning model after the training.
  • the server determines whether the reduction range of the value of the loss function is less than a preset reduction range
  • the server determines that the multi-instance learning model after training is a multi-instance learning model that has been trained.
  • Steps S2603 to S2611 are executed by the server, and the specific actions performed are similar to the specific actions performed by the electronic device in steps S1302 to S1310, and will not be repeated here.
  • the server sends the trained multi-example learning model to the electronic device.
  • the server completes the training work of the multi-example learning model, which saves the processing resources of the electronic device and improves the training efficiency of the multi-example learning model.
  • FIG. 35 is an interactive schematic diagram of the update training process of the multi-example learning model in an embodiment of the application.
  • the process can be:
  • the electronic device determines the newly-added dotting data sequence
  • This step is similar to step S2501 and will not be repeated here.
  • the electronic device sends the newly added dot data sequence to the server
  • the server inputs the newly added dot data sequence into the multi-example learning model to obtain multiple sub-sequences;
  • the server determines examples and example tags in the multiple to-be-processed sequences.
  • the server determines the package and the package label according to multiple to-be-processed sequences, examples, and example labels;
  • the server extracts the feature vector matrix of the packet from the packet
  • the server inputs the feature vector matrix and the package label of each package into the multi-instance learning model to obtain a trained multi-instance learning model;
  • the server inputs the multiple to-be-processed sequences into the trained multi-example learning model to obtain multiple sub-sequences;
  • the server determines the value of the loss function of the multi-example learning model after the training.


Abstract

An intent recognition method and an electronic device, relating to the technical field of artificial intelligence (AI), and in particular to the technical field of decision reasoning. The method can obtain a complete description of the environment from the environment sensing of multiple devices and the user's multi-modal input and, by combining the user input, environment sensing and context information within a period of time, obtain a complete and unbiased intent system that reflects changes over time and can expand as the environment changes. Decisions are made on this basis, such as inferring the actions the user wants to perform or the services the user needs in the coming period, so as to decide on which device to respond to which user need, thereby accurately providing the user with the response or service he needs.

Description

Intent recognition method and electronic device
This application claims priority to the following Chinese patent applications filed with the China National Intellectual Property Administration: application No. 202010159364.X, entitled "Intent recognition method and electronic device", filed on March 9, 2020; application No. 202010791068.1, entitled "Intent recognition method, multi-instance learning model training method and related apparatus", filed on August 7, 2020; application No. 202010918192.X, entitled "Intent recognition method and apparatus", filed on September 3, 2020; application No. 202010973466.5, entitled "Model training method and related device", filed on September 16, 2020; application No. 202011111562.5, entitled "Neural-network-based data processing method and related device", filed on October 16, 2020; application No. 202110176533.5, entitled "Rule engine execution method and apparatus, and rule engine", filed on February 9, 2021; and application No. 202110246051.2, entitled "Intent recognition method and electronic device", filed on March 5, 2021; all of which are incorporated herein by reference in their entireties.
Technical Field
This application relates to the decision and reasoning sub-field of the artificial intelligence (AI) field, and in particular to an intent recognition method and an electronic device.
Background
In distributed scenarios, each user or household owns multiple smart devices, and in such an environment users expect their electronic devices to respond to their requests intelligently.
At present, an electronic device generally predicts the user's intent from the single-modal input of the current moment (the one input mode currently in use) combined with rules, and makes a decision for that intent. FIG. 1 shows an intent recognition scenario in the prior art. When the user enters the search term "fitness fruit" in a search input box, the term cannot express the user's intent completely and clearly, as it may carry many meanings. The electronic device intelligently recognizes the user's possible intents from the input and shows them to the user as candidate intents. When the user selects a candidate intent, the electronic device shows the search results for the selected intent.
However, the information obtained only from the user's single-modal input at the current moment cannot accurately predict the user's intent at that moment. On the one hand, the acquired information is insufficient to infer accurate behavioral logic and provides too little basis for predicting the intent; on the other hand, a chance event unrelated to the user's real intent is unavoidable at any given moment. Intent recognition in the prior art is therefore severely limited and rather inaccurate.
Summary
This application provides an intent recognition method and an electronic device that predict the user's intent from an entity sequence recognized from data acquired within a period of time, improving the accuracy of intent recognition.
In a first aspect, this application provides an intent recognition method, including: a first electronic device determines a first trigger; in response to the first trigger, the first electronic device acquires a first data sequence within a first time period, the first data sequence including multiple pieces of data, at least two of which are input in different input modes; the first electronic device determines the user's first intent from the first data sequence; and the first electronic device determines a first to-be-executed action from the first intent.
In this way, the electronic device can obtain a complete description of the environment from the environment sensing of multiple devices and the user's multi-modal input and, combining the user input, environment sensing and context information within a time period, obtain a complete and unbiased intent system that reflects changes over time and expands as the environment changes, and make decisions accordingly, such as inferring the actions the user wants to perform or the services the user needs in the coming period, so as to decide on which device to respond to which user need and thus accurately provide the user with the response or service he needs.
In a possible implementation, the first electronic device determining the user's first intent from the first data sequence includes: the first electronic device determines a first entity sequence from the first data sequence, the first entity sequence including at least one entity, an entity being an object, thing or action that exists objectively in the real world and can be distinguished from others; the first electronic device determines the first intent from the first entity sequence, the first intent being used to determine an action sequence. The electronic device can thus determine the user's intent from the data sequence.
In a possible implementation, the first electronic device determining the first to-be-executed action from the first intent includes: the first electronic device determines a first action sequence, including the first to-be-executed action, from the first entity sequence and the first intent; after the first electronic device determines the first to-be-executed action, the method further includes: the first electronic device executes the first to-be-executed action. The electronic device can thus determine the action to execute from the entities and the intent, and then execute it.
In a possible implementation, the first to-be-executed action contains a device identifier and an action to be executed, and the first electronic device executing the first to-be-executed action specifically includes: the first electronic device determines whether the device identifier in the first to-be-executed action is the first electronic device's device identifier; if so, the first electronic device executes the first to-be-executed action; otherwise, the first electronic device sends a first instruction to the second electronic device corresponding to the device identifier in the first to-be-executed action, the first instruction instructing the second electronic device to execute the first to-be-executed action.
In this way, the device that executes the first to-be-executed action may be the first electronic device or another electronic device; from the device identifier in the action, the first electronic device can decide whether to execute it itself or to send an instruction for the corresponding second electronic device to execute it, so that in a distributed scenario the first electronic device can conveniently control other electronic devices to respond to user needs.
In a possible implementation, the method further includes: the first electronic device determines, as a new entity, an abnormal feature vector set whose occurrence frequency exceeds a preset first frequency threshold, an abnormal feature vector set being a feature vector set that cannot be recognized as an entity and whose distinctness from the feature vector sets recognizable as entities exceeds a preset distinction threshold during entity recognition. By recognizing abnormal feature vectors, the first electronic device can expand its entity warehouse and thus dynamically expand the range of entities it can recognize, further improving the accuracy of intent recognition.
In a possible implementation, the method further includes: the first electronic device determines, as a new intent, an abnormal action whose occurrence frequency exceeds a preset second frequency threshold, an abnormal action being an action that has not appeared before and is not in the action sequences corresponding to existing intents; the first electronic device establishes the correspondence between the new intent and an entity sequence from the entity sequence recognized before the abnormal action appeared. By recognizing abnormal actions, the first electronic device can expand its intent warehouse and establish correspondences between new intents and action sequences, recognizing more of the user's personalized intents and providing decisions that better match user needs, improving user experience.
In a possible implementation, the first electronic device determining the first entity sequence from the first data sequence specifically includes: the first electronic device extracts feature vectors from the first data sequence to obtain a first feature vector set containing all feature vectors extracted from the first data sequence, a feature vector representing features of the first data sequence; the first electronic device inputs the first feature vector set into an entity recognition model to obtain the first entity sequence, the entity recognition model being the correspondence between feature vectors and entities trained from the entity data stored in the first electronic device, entity data being the storage form of an entity and including at least the entity's number and the feature vector set representing the entity.
In a possible implementation, the first electronic device determining the first intent from the first entity sequence specifically includes: the first electronic device determines multiple candidate intents from the first entity sequence and a stored knowledge graph; the first electronic device determines the first intent from the multiple candidate intents using a preset reinforcement learning algorithm. Recognizing the first intent based on the knowledge graph and reinforcement learning improves the accuracy of intent recognition.
In a possible implementation, the first electronic device determining multiple candidate intents from the first entity sequence and the stored knowledge graph specifically includes: determining the user's state information and scenario information from the first entity sequence and the knowledge graph, the state information representing the user's current state and the scenario information representing the environment the user is currently in;
determining, from the correspondence between state information, scenario information and candidate intents, the multiple candidate intents corresponding to the state information and scenario information.
In a possible implementation, determining the first intent from the multiple candidate intents using the preset reinforcement learning algorithm includes: determining intent arms in one-to-one correspondence with the multiple candidate intents; and determining the first intent from the multiple candidate intents according to the first entity sequence, the state information, the scenario information, the intent arms corresponding one-to-one with the multiple candidate intents, and the reinforcement learning algorithm (a bandit-style sketch of this selection is given below).
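As one concrete illustration of arm-based selection (this application does not fix a specific reinforcement learning algorithm), a UCB-style bandit over the candidate intents could look as follows; a contextual variant would additionally condition on the entity sequence, state information and scenario information:

```python
import math

class IntentArm:
    """One arm per candidate intent."""
    def __init__(self, intent):
        self.intent, self.pulls, self.reward = intent, 0, 0.0

def select_intent(arms):
    total = sum(arm.pulls for arm in arms)
    def ucb(arm):
        if arm.pulls == 0:
            return float("inf")           # try every candidate at least once
        mean = arm.reward / arm.pulls     # observed agreement with the real intent
        return mean + math.sqrt(2 * math.log(total) / arm.pulls)
    return max(arms, key=ucb)

def record_feedback(arm, value):
    # `value` encodes the user's reaction observed within the feedback window
    arm.pulls += 1
    arm.reward += value
```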
In a possible implementation, the first electronic device determining the first intent from the first entity sequence specifically includes: the first electronic device inputs the first entity sequence into an intent recognition model to obtain the first intent, the intent recognition model being the correspondence between entity sequences and intents trained from data of corresponding entity sequences and intents.
In a possible implementation, before the first electronic device inputs the first entity sequence into the intent recognition model, the method further includes: the first electronic device inputs test data into a first generator and obtains first simulated data after processing by the first generator; the first electronic device inputs the test data and the first simulated data into a first discriminator and obtains a first discrimination result indicating the difference between the test data and the first simulated data; the first electronic device updates the weight coefficients of the first generator according to the first discrimination result to obtain a second generator; the first electronic device generates second simulated data in the second generator; the first electronic device inputs first target simulated data, which includes the second simulated data, into a preset training network and trains it to obtain the intent recognition model.
In a possible implementation, the first electronic device is configured with a population coarse-grained model and a fine-grained model; before the first electronic device inputs the first entity sequence into the intent recognition model, the method further includes: the first electronic device obtains the mapping between fine-grained labels and coarse-grained labels; the first electronic device maps the fine-grained data in the training data set to coarse-grained data according to the mapping; the first electronic device inputs the coarse-grained data into the population coarse-grained model for training, the population coarse-grained model being updated through the joint learning of multiple node devices that include the first electronic device, and inputs the fine-grained data into the fine-grained model for training; the first electronic device combines the population coarse-grained model and the fine-grained model to obtain the intent recognition model, whose label space maps to the fine-grained labels and whose output is used to update the fine-grained model.
In a possible implementation, the first electronic device is further configured with an individual coarse-grained model whose label space maps to the coarse-grained labels; combining the population coarse-grained model and the fine-grained model to obtain the intent recognition model includes: the first electronic device combines the population coarse-grained model, the individual coarse-grained model and the fine-grained model to obtain the intent recognition model.
In a possible implementation, after the first electronic device executes the first to-be-executed action, the method further includes: the first electronic device determines a to-be-recognized dot data sequence composed of dot data, the dot data including user operation data recorded by the first electronic device and/or the first electronic device's response data to user operations; the first electronic device inputs the to-be-recognized dot data sequence into a multi-instance learning model, which has been trained with dot data sequences in the first electronic device, obtaining multiple subsequences; the first electronic device determines the intent of a first subsequence, one of the multiple subsequences, according to a preset intent rule used to determine a sequence's intent from the dot data in the sequence; the first electronic device updates the intent recognition model based on the determined intents of the multiple subsequences. Updating the intent recognition model with the user's operation data improves the accuracy of intent recognition.
In a possible implementation, the first electronic device determining the first action sequence from the first entity sequence and the first intent specifically includes: the first electronic device inputs the first entity sequence and the first intent into an action prediction model to obtain the first action sequence, the action prediction model being the correspondence between entity sequences, intents and action sequences trained from data of corresponding entity sequences, intents and action sequences.
In a possible implementation, the first electronic device determining the first action sequence from the first entity sequence and the first intent specifically includes: the first electronic device inputs the first entity sequence and the first intent into a rule engine to obtain the first action sequence, the rule engine containing correspondences between entity sequences, intents and action sequences set according to the user's usage habits or usage scenarios.
In a possible implementation, the rule engine includes a first node that includes at least a first type node and a second type node; the first type node is used to obtain, according to a first attribute of a first entity input into the rule engine, a first semantic object from memory to match the first entity and obtain a first matching result, the first attribute characterizing the change frequency of the first entity; the second type node is used to obtain, according to a second attribute of a second entity input into the rule engine, a second semantic object from a file to match the second entity and obtain a second matching result, the second attribute characterizing the change frequency of the second entity and being different from the first attribute; the first matching result and the second matching result are jointly used to determine whether to execute the first to-be-executed action.
In a possible implementation, the first time period corresponds to the first trigger.
In a possible implementation, the first data sequence is obtained by the first electronic device from at least two of the following input modes: touch operation input, sensor data input, text data input, voice data input, video data input, and input of data transmitted by smart devices interconnected with the first electronic device; the first to-be-executed action includes one of the following actions or services: starting a target application, starting a target service, loading a target application in the background, wirelessly connecting a target device, and sending a notification message.
Under the first aspect, embodiments of this application further provide an electronic device including: at least one memory for storing a program; and at least one processor for executing the program stored in the memory, the processor performing the method provided in the first aspect when the stored program is executed.
Under the first aspect, embodiments of this application further provide a computer storage medium storing instructions that, when run on a computer, cause the computer to perform the method provided in the first aspect.
Under the first aspect, embodiments of this application further provide a computer program product containing instructions that, when run on a computer, cause the computer to perform the method provided in the first aspect.
Under the first aspect, embodiments of this application further provide a rule engine execution apparatus, characterized in that the apparatus runs computer program instructions to perform the method provided in the first aspect.
In a second aspect, this application provides an intent recognition method, including: a first electronic device determines a first trigger; in response to the first trigger, the first electronic device acquires first data within a first time period, the first data being used to determine entities, an entity being an object, thing or action that exists objectively in the real world and can be distinguished from others; the first electronic device determines a first entity sequence, including at least one entity, from the first data; the first electronic device determines a first intent, used to determine an action sequence, from the first entity sequence; the first electronic device determines a first action sequence, including a first to-be-executed action, from the first entity sequence and the first intent; and the first electronic device executes the first to-be-executed action.
In this way, the electronic device can obtain a complete description of the environment from the environment sensing of multiple devices and the user's multi-modal input and, combining the user input, environment sensing and context information within a time period, obtain a complete and unbiased intent system that reflects changes over time and expands as the environment changes, and make decisions accordingly, such as inferring the actions the user wants to perform or the services the user needs in the coming period, so as to decide on which device to respond to which user need and thus accurately provide the user with the response or service he needs.
In a possible implementation, the first to-be-executed action contains a device identifier and an action to be executed, and the first electronic device executing the first to-be-executed action specifically includes: the first electronic device determines whether the device identifier in the first to-be-executed action is the first electronic device's device identifier; if so, the first electronic device executes the first to-be-executed action; otherwise, the first electronic device sends a first instruction to the second electronic device corresponding to the device identifier, the first instruction instructing the second electronic device to execute the first to-be-executed action.
In this way, from the device identifier in the action, the first electronic device can decide whether to execute the first to-be-executed action itself or to send an instruction for the corresponding second electronic device to execute it, so that in a distributed scenario the first electronic device can conveniently control other electronic devices to respond to user needs.
In a possible implementation, the method further includes: the first electronic device determines, as a new entity, an abnormal feature vector set whose occurrence frequency exceeds a preset first frequency threshold, an abnormal feature vector set being a feature vector set that cannot be recognized as an entity and whose distinctness from the feature vector sets recognizable as entities exceeds a preset distinction threshold during entity recognition.
By recognizing abnormal feature vectors, the first electronic device can expand its entity warehouse, dynamically expanding the range of entities it can recognize and further improving the accuracy of intent recognition.
In a possible implementation, the method further includes: the first electronic device determines, as a new intent, an abnormal action whose occurrence frequency exceeds a preset second frequency threshold, an abnormal action being an action that has not appeared before and is not in the action sequences corresponding to existing intents; the first electronic device establishes the correspondence between the new intent and an entity sequence from the entity sequence recognized before the abnormal action appeared.
By recognizing abnormal actions, the first electronic device can expand its intent warehouse and establish correspondences between new intents and action sequences, recognizing more of the user's personalized intents, providing decisions that better match user needs, and improving user experience.
In a possible implementation, the first electronic device determining the first entity sequence from the first data specifically includes: the first electronic device extracts feature vectors from the first data to obtain a first feature vector set containing all feature vectors extracted from the first data, a feature vector representing features of the first data; the first electronic device inputs the first feature vector set into an entity recognition model to obtain the first entity sequence, the entity recognition model being the correspondence between feature vectors and entities trained from the entity data stored in the first electronic device, entity data being the storage form of an entity and including at least the entity's number and the feature vector set representing the entity.
In a possible implementation, after the first electronic device inputs the first feature vector set into the entity recognition model and recognizes entities, the first entity sequence may be composed not only of the entities recognized this time but also of the entities historically output by the entity recognition model together with those recognized this time, which is not limited here.
In a possible implementation, the entity recognition model can be stored in different locations; for example, it is preset and stored in the first electronic device, or it is stored in a cloud server accessible to the first electronic device, which is not limited here.
In a possible implementation, the entity recognition model can be produced in different ways; for example, it is trained in advance by the manufacturer, or it is trained by the first electronic device from the entity data stored in the first electronic device, which is not limited here.
In a possible implementation, the first electronic device determining the first intent from the first entity sequence specifically includes: the first electronic device inputs the first entity sequence into an intent recognition model to obtain the first intent, the intent recognition model being the correspondence between entity sequences and intents trained from data of corresponding entity sequences and intents.
In a possible implementation, the intent recognition model can be stored in different locations; for example, it is preset and stored in the first electronic device, or it is stored in a cloud server accessible to the first electronic device, which is not limited here.
In a possible implementation, the intent recognition model can be produced in different ways; for example, it is trained in advance by the manufacturer; or it is trained by the first electronic device from the data of corresponding entity sequences and intents stored in the first electronic device; or it is trained by the first electronic device from data of corresponding entity sequences and intents shared by other users, which is not limited here.
In a possible implementation, the first electronic device determining the first action sequence from the first entity sequence and the first intent specifically includes: the first electronic device inputs the first entity sequence into an action prediction model to obtain the first action sequence, the action prediction model being the correspondence between entity sequences, intents and action sequences trained from data of corresponding entity sequences, intents and action sequences;
In this way, for complex application scenarios, the first electronic device can input the first entity sequence and the first intent into the action prediction model to predict the first action sequence, mining the user's latent needs and helping the user make decisions.
In a possible implementation, the first electronic device determining the first action sequence from the first entity sequence and the first intent specifically includes: the first electronic device determines, according to decision rules, the first action sequence corresponding to the first entity sequence and the first intent sequence, the decision rules being correspondences between entity sequences, intents and action sequences set according to the user's usage habits or usage scenarios.
In this way, for simple application scenarios, the first electronic device can determine the possibly needed actions directly from the prestored decision rules, without using the action prediction model to predict them, meeting user needs faster and more accurately.
In a possible implementation, the action prediction model can be stored in different locations; for example, it is preset and stored in the first electronic device, or it is stored in a cloud server accessible to the first electronic device, which is not limited here.
In a possible implementation, the action prediction model can be produced in different ways; for example, it is trained in advance by the manufacturer; or it is trained by the first electronic device from the data of corresponding entity sequences, intents and action sequences stored in the first electronic device; or it is trained by the first electronic device from data of corresponding entity sequences, intents and action sequences shared by other users, which is not limited here.
In a possible implementation, the decision rules can be stored in different locations; for example, they are preset and stored in the first electronic device, or they are stored in a cloud server accessible to the first electronic device, which is not limited here.
In a possible implementation, the decision rules are preset by the manufacturer; or they are set by the first electronic device according to the user's usage habits or usage scenarios; or they are shared by other users; or they are obtained by the user from a third-party data service provider, which is not limited here.
In a possible implementation, the first time period corresponds to the first trigger; when the first electronic device determines the first trigger, it can determine the first time period corresponding to the first trigger.
In a possible implementation, the first data is obtained by the first electronic device from at least two of the following input modes: touch operation input, sensor data input, text data input, voice data input, video data input, and input of data transmitted by smart devices interconnected with the first electronic device. It can be understood that, in some embodiments, the first data can also be obtained from other, further data input modes, which is not limited here.
In a possible implementation, the first to-be-executed action includes one of the following actions or services: starting a target application, starting a target service, loading a target application in the background, wirelessly connecting a target device, and sending a notification message. It can be understood that, in some embodiments, the first to-be-executed action can also be another action or service, which is not limited here.
Under the second aspect, embodiments of this application further provide an electronic device serving as the first electronic device, the first electronic device including one or more processors and a memory; the memory is coupled to the one or more processors and stores computer program code including computer instructions, which the one or more processors invoke to cause the first electronic device to execute: determining a first trigger; in response to the first trigger, acquiring first data within a first time period, the first data being used to determine entities, an entity being an object, thing or action that exists objectively in the real world and can be distinguished from others; determining a first entity sequence, including at least one entity, from the first data; determining a first intent, used to determine an action sequence, from the first entity sequence; determining a first action sequence, including a first to-be-executed action, from the first entity sequence and the first intent; and executing the first to-be-executed action.
In this way, the electronic device can obtain a complete description of the environment from the environment sensing of multiple devices and the user's multi-modal input and, combining the user input, environment sensing and context information within a time period, obtain a complete and unbiased intent system that reflects changes over time and expands as the environment changes, and make decisions accordingly, such as inferring the actions the user wants to perform or the services the user needs in the coming period, so as to decide on which device to respond to which user need and thus accurately provide the user with the response or service he needs.
In a possible implementation, the first to-be-executed action contains a device identifier and an action to be executed, and the one or more processors are specifically used to invoke the computer instructions to cause the first electronic device to execute: determining whether the device identifier in the first to-be-executed action is the first electronic device's device identifier; if so, executing the first to-be-executed action; otherwise, sending a first instruction to the second electronic device corresponding to the device identifier in the first to-be-executed action, the first instruction instructing the second electronic device to execute the first to-be-executed action.
In a possible implementation, the one or more processors are further used to invoke the computer instructions to cause the first electronic device to execute: determining, as a new entity, an abnormal feature vector set whose occurrence frequency exceeds a preset first frequency threshold, an abnormal feature vector set being a feature vector set that cannot be recognized as an entity and whose distinctness from the feature vector sets recognizable as entities exceeds a preset distinction threshold during entity recognition.
In a possible implementation, the one or more processors are further used to invoke the computer instructions to cause the first electronic device to execute: determining, as a new intent, an abnormal action whose occurrence frequency exceeds a preset second frequency threshold, an abnormal action being an action that has not appeared before and is not in the action sequences corresponding to existing intents; and establishing, from the entity sequence recognized before the abnormal action appeared, the correspondence between the new intent and an entity sequence.
In a possible implementation, the one or more processors are specifically used to invoke the computer instructions to cause the first electronic device to execute: extracting feature vectors from the first data to obtain a first feature vector set containing all feature vectors extracted from the first data, a feature vector representing features of the first data; and inputting the first feature vector set into an entity recognition model to obtain the first entity sequence, the entity recognition model being the correspondence between feature vectors and entities trained from the entity data stored in the memory, entity data being the storage form of an entity and including at least the entity's number and the feature vector set representing the entity.
In a possible implementation, after the first feature vector set is input into the entity recognition model and entities are recognized, the first entity sequence may be composed not only of the newly recognized entities but also of the entities historically output by the entity recognition model together with those recognized this time, which is not limited here.
In a possible implementation, the entity recognition model can be stored in different locations; for example, it is preset and stored in the memory, or stored in a cloud server accessible to the first electronic device, which is not limited here.
In a possible implementation, the entity recognition model can be produced in different ways; for example, it is trained in advance by the manufacturer, or trained by the first electronic device from the entity data stored in the memory, which is not limited here.
In a possible implementation, the one or more processors are specifically used to invoke the computer instructions to cause the first electronic device to execute: inputting the first entity sequence into an intent recognition model to obtain the first intent, the intent recognition model being the correspondence between entity sequences and intents trained from data of corresponding entity sequences and intents.
In a possible implementation, the intent recognition model can be stored in different locations; for example, it is preset and stored in the memory, or stored in a cloud server accessible to the first electronic device, which is not limited here.
In a possible implementation, the intent recognition model can be produced in different ways; for example, it is trained in advance by the manufacturer; or trained by the first electronic device from the data of corresponding entity sequences and intents stored in the memory; or trained by the first electronic device from data of corresponding entity sequences and intents shared by other users, which is not limited here.
In a possible implementation, the one or more processors are specifically used to invoke the computer instructions to cause the first electronic device to execute: inputting the first entity sequence into an action prediction model to obtain the first action sequence, the action prediction model being the correspondence between entity sequences, intents and action sequences trained from data of corresponding entity sequences, intents and action sequences;
In a possible implementation, the one or more processors are specifically used to invoke the computer instructions to cause the first electronic device to execute: determining, according to decision rules, the first action sequence corresponding to the first entity sequence and the first intent sequence, the decision rules being correspondences between entity sequences, intents and action sequences set according to the user's usage habits or usage scenarios.
In a possible implementation, the action prediction model can be stored in different locations; for example, it is preset and stored in the memory, or stored in a cloud server accessible to the first electronic device, which is not limited here.
In a possible implementation, the action prediction model can be produced in different ways; for example, it is trained in advance by the manufacturer; or trained by the first electronic device from the data of corresponding entity sequences, intents and action sequences stored in the memory; or trained by the first electronic device from data of corresponding entity sequences, intents and action sequences shared by other users, which is not limited here.
In a possible implementation, the decision rules can be stored in different locations; for example, they are preset and stored in the memory, or stored in a cloud server accessible to the first electronic device, which is not limited here.
In a possible implementation, the decision rules are preset by the manufacturer; or set by the first electronic device according to the user's usage habits or usage scenarios; or shared by other users; or obtained by the user from a third-party data service provider, which is not limited here.
In a possible implementation, the first time period corresponds to the first trigger; when the first trigger is determined, the first time period corresponding to the first trigger can be determined.
In a possible implementation, the first data is obtained from at least two of the following input modes: touch operation input, sensor data input, text data input, voice data input, video data input, and input of data transmitted by smart devices interconnected with the first electronic device. It can be understood that, in some embodiments, the first data can also be obtained from other, further data input modes, which is not limited here.
In a possible implementation, the first to-be-executed action includes one of the following actions or services: starting a target application, starting a target service, loading a target application in the background, wirelessly connecting a target device, and sending a notification message. It can be understood that, in some embodiments, the first to-be-executed action can also be another action or service, which is not limited here.
Under the second aspect, embodiments of this application further provide a chip applied to an electronic device, the chip including one or more processors used to invoke computer instructions to cause the electronic device to perform the method described in the second aspect and any possible implementation thereof.
Under the second aspect, embodiments of this application further provide a computer program product containing instructions that, when run on an electronic device, cause the electronic device to perform the method described in the second aspect and any possible implementation thereof.
Under the second aspect, embodiments of this application further provide a computer-readable storage medium including instructions that, when run on an electronic device, cause the electronic device to perform the method described in the second aspect and any possible implementation thereof.
In a third aspect, an embodiment of this application provides an intent recognition method that can acquire user-sensed data and determine multiple candidate intents from the user-sensed data and a stored knowledge graph, and then determine a target intent from the multiple candidate intents using a preset reinforcement learning algorithm. The user-sensed data represents the user's behavior information. In one example, the user-sensed data may include multiple pieces of data, at least two of which are input in different input modes.
With the intent recognition method provided by the embodiments of this application, after acquiring user-sensed data representing the user's behavior information, multiple candidate intents can be determined from the user-sensed data and the stored knowledge graph, and a target intent determined from them using a preset reinforcement learning algorithm. Since the user-sensed data only represents the user's behavior information and does not state the user's intent, the user's intent is recognized proactively even when the user has not expressed it, improving user experience.
In a possible implementation, the above method of "determining multiple candidate intents from the user-sensed data and the stored knowledge graph" may include: the intent recognition apparatus determines the entities in the user-sensed data and the entities' description data, and determines the user's state information and scenario information from the entities, their description data, and the knowledge graph. The intent recognition apparatus then determines, from the correspondence between state information, scenario information and candidate intents, the multiple candidate intents corresponding to the state information and scenario information. The state information represents the user's current state, and the scenario information represents the environment the user is currently in.
In a possible implementation, the above method of "determining the target intent from the multiple candidate intents using the preset reinforcement learning algorithm" may include: the intent recognition apparatus determines intent arms in one-to-one correspondence with the multiple candidate intents, and determines the target intent from the user-sensed data, the state information, the scenario information, the intent arms corresponding one-to-one with the candidate intents, and the reinforcement learning algorithm.
With different reinforcement learning algorithms, the target intent is determined from the multiple candidate intents in different ways.
In a possible implementation, the intent recognition method provided by the embodiments of this application may further include: the intent recognition apparatus determines, from the user-sensed data, the state information, the scenario information and the intent arm corresponding to the target intent, an intent confidence for the target intent, and determines from this confidence the target interaction mode used to present the target intent. The intent recognition apparatus then presents the content of the target intent using the target interaction mode. The intent confidence represents the predicted degree of agreement between the target intent and the real intent.
Unlike the prior art, which relies only on confidence to present intents, i.e. presents intents whose confidence exceeds a threshold, this application can select the target interaction mode for presenting the target intent according to confidence intervals and the interaction modes of the levels corresponding to those intervals, mitigating the degraded user experience caused by presenting low-confidence intents.
In a possible implementation, the above method of "determining, from the intent confidence, the target interaction mode used to present the target intent" may include: the intent recognition apparatus determines, among multiple prestored confidence intervals, the target confidence interval the intent confidence belongs to, and determines the target interaction mode, according to the service corresponding to the target intent, from the interaction modes of the level corresponding to the target confidence interval. One confidence interval corresponds to one level of interaction modes, and one level includes one or more interaction modes.
In a possible implementation, the intent recognition method provided by the embodiments of this application may further include: within a preset time period of presenting the content of the target intent with the target interaction mode, the intent recognition apparatus recognizes a target operation on the target intent and determines, from the target operation and preset rules, the target value corresponding to the target operation. The intent recognition apparatus then updates the multiple candidate intents according to the target value and updates the parameters used by the reinforcement learning algorithm to determine the target intent. The target value represents the actual degree of agreement between the target intent and the real intent.
In the prior art, after presenting an intent the mobile phone considers only whether the user taps it; in practice, however, the user's feedback may include operations other than tapping, making the analyzed feedback inaccurate. This application considers the feedback operations within a preset time period, covering more types of feedback operations, and different feedback operations yield different target values, increasing the accuracy of the feedback information.
In a possible implementation, the above method of "updating the multiple candidate intents according to the target value" may include: the intent recognition apparatus deletes the target intent from the multiple candidate intents when the target value is less than a preset threshold, or when the number of times the target value is less than the preset threshold equals a preset count.
Whereas the arm set in the prior art is fixed and contains all intent arms prestored in the mobile phone, this application lets the arm set change as the candidate intents change, quickly supporting shifts of user interest and intent changes and improving user experience.
Under the third aspect, embodiments of this application further provide an intent recognition apparatus including the modules for performing the intent recognition method of the third aspect or any possible implementation thereof.
Embodiments of this application further provide an intent recognition apparatus including a memory and a processor coupled to each other; the memory stores computer program code including computer instructions, and when the processor executes the computer instructions, the intent recognition apparatus performs the intent recognition method of the third aspect and any possible implementation thereof.
Under the third aspect, embodiments of this application further provide a chip system applied to the intent recognition apparatus mentioned in the third aspect. The chip system includes one or more interface circuits and one or more processors, interconnected by lines; the interface circuit receives signals from the memory of the intent recognition apparatus, the signals including the computer instructions stored in the memory, and sends them to the processor; when the processor executes the computer instructions, the intent recognition apparatus performs the intent recognition method as in the first aspect and any possible implementation thereof.
Under the third aspect, embodiments of this application further provide a computer-readable storage medium including computer instructions that, when run on the intent recognition apparatus, cause it to perform the intent recognition method of the third aspect and any possible implementation thereof.
Under the third aspect, embodiments of this application further provide a computer program product including computer instructions that, when run on the intent recognition apparatus, cause it to perform the intent recognition method of the third aspect and any possible implementation thereof.
In a fourth aspect, an embodiment of this application provides a model training method applied to any one of multiple node devices, the node device being configured with a population coarse-grained model and a fine-grained model. The method includes:
the node device obtains the mapping between fine-grained labels and coarse-grained labels and maps the fine-grained data in the training data set to coarse-grained data according to the mapping; it then inputs the coarse-grained data into the population coarse-grained model for training and the fine-grained data into the fine-grained model for training; the population coarse-grained model and the fine-grained model have their own update occasions, the population coarse-grained model being updated through the joint learning of the multiple node devices; the node device combines the population coarse-grained model and the fine-grained model to obtain a joint model, whose label space maps to the fine-grained labels and whose output is used to update the fine-grained model.
In this example, the label space of the sample data in a node device's training data set maps to fine-grained labels. This application introduces coarse-grained labels to unify the label spaces of the node devices, ensuring that even when the fine-grained tasks on the device sides differ, the node devices are unified on the coarse-grained task and can train jointly. The node device obtains the mapping between fine-grained and coarse-grained labels and maps the fine-grained data in the training set to coarse-grained data; it trains the population coarse-grained model locally on the coarse-grained data and updates it through the joint learning of the multiple node devices until the coarse-grained labels converge, giving the coarse-grained model population-level characteristics. The node device also inputs the fine-grained data into the fine-grained model for training, updating the fine-grained model backwards, via the loss function, from the joint model's output (fine-grained labels) until the fine-grained labels converge. The joint model thus captures population-level characteristics, while each node device's fine-grained model matches the population coarse-grained model onto concrete fine-grained labels, making the joint model's label space the device-side fine-grained label space and preserving each node device's individual characteristics.
In a possible implementation, inputting the coarse-grained data into the population coarse-grained model for training may specifically include: the node device inputs the coarse-grained data into the population coarse-grained model for training and determines first information corresponding to the population coarse-grained model, the first information being gradients, model parameters (such as weight values), or the model itself (network architecture and parameters). The update of the population coarse-grained model may be: the node device sends the first information to a central control device and then receives second information, used to update the population coarse-grained model, obtained by the central control device integrating the first information uploaded by multiple node devices.
In this example, each node device trains the population coarse-grained model on local data; to achieve joint training across multiple node devices, each node device transmits only its own first information (such as parameter values) to the central control device, preserving the privacy of each node device's local data. The central control device integrates the received parameter values, i.e. integrates the characteristics of the local data of the node devices, and distributes the integrated values to the node devices, each of which updates its local population coarse-grained model accordingly, completing one update and making the population coarse-grained model population-oriented.
In a possible implementation, the node device is further configured with an individual coarse-grained model; combining the population coarse-grained model and the fine-grained model to obtain the joint model may specifically include: combining the population coarse-grained model, the individual coarse-grained model and the fine-grained model to obtain the joint model. The node device uploads the individual coarse-grained model to the central control device and may then receive the updated individual coarse-grained model sent by the central control device, obtained by the central control device selecting, from the individual coarse-grained models uploaded by multiple node devices, at least two whose correlation exceeds a threshold and integrating them.
In this example, the population coarse-grained model, the individual coarse-grained model and the fine-grained model are combined into one overall model. The population coarse-grained model mines population-level regularities and provides a good starting point for the fine-grained model on the node device; where the gap between population regularities and individual characteristics is large, the individual coarse-grained model bridges that gap in the minority of cases.
In a possible implementation, combining the population coarse-grained model and the fine-grained model includes:
combining the coarse-grained model and the fine-grained model based on the weights of the population coarse-grained model and the weights of the fine-grained model.
In a possible implementation, this may include: at the output layer of the joint model, merging, according to the mapping between fine-grained and coarse-grained labels, the weight value of each coarse-grained label in the coarse-grained model's label space into the weight value of each fine-grained label in the fine-grained model's label space.
In this example, the two models can be combined based on the weights of the population coarse-grained model and of the fine-grained model, adding the weights of the two to obtain the weights of the overall model. The weight of a fine-grained label takes the weight of its corresponding coarse-grained label as the base, so the fine-grained label's weight is equivalent to an offset maintained by the fine-grained model; the output of the overall (joint) model maps to individual fine-grained labels, personalizing the joint model's output on the device side. (A sketch of this combination is given below.)
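As a minimal sketch (with illustrative names `coarse_w`, `fine_w`, `fine_to_coarse` that are not defined by this application), the weight combination at the output layer could look as follows: each fine-grained label's score uses its coarse-grained label's weight as the base, plus the offset maintained by the fine-grained model.

```python
import numpy as np

def joint_logits(features, coarse_w, fine_w, fine_to_coarse):
    """features: (d,) vector; coarse_w / fine_w: label -> (d,) weight vectors."""
    logits = {}
    for fine_label, coarse_label in fine_to_coarse.items():
        w = coarse_w[coarse_label] + fine_w[fine_label]  # base weight + offset
        logits[fine_label] = float(features @ w)
    return logits   # the joint model's label space is the fine-grained labels
```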
In a possible implementation, the node device mapping the fine-grained data in the training data set to coarse-grained data according to the mapping may specifically include: the node device obtains the training data set, in which the label space of the sample data is the fine-grained labels; the node device then replaces the label space of the sample data with the coarse-grained labels according to the mapping between fine-grained and coarse-grained labels, obtaining the coarse-grained data, which is used to train the population coarse-grained model.
In a possible implementation, the joint model is an application prediction model; the coarse-grained labels are the category labels obtained by classifying applications according to their functions, the fine-grained labels are application names, and the sample data in the training data set are time information and the corresponding application names.
In a possible implementation, after the population coarse-grained model and the fine-grained model are combined into the joint model, the method further includes: the node device obtains the current time information and inputs it into the trained joint model; the joint model outputs a prediction result indicating a target application, and the target application is preloaded.
In this example, the joint model may be an application prediction model with which the node device predicts which application the user is likely to use and preloads that target application in advance, saving the response time of starting the target application and improving user experience.
Under the fourth aspect, embodiments of this application further provide another model training method applied to a joint learning system including multiple node devices and a central control device, the node devices being configured with population coarse-grained models and fine-grained models. Applied to the central control device, the method includes: the central control device obtains the fine-grained labels of the multiple node devices, classifies the multiple fine-grained labels to determine multiple categories, takes the categories as coarse-grained labels, determines the mapping between fine-grained and coarse-grained labels, and sends the mapping to the multiple node devices, so that each node device maps the fine-grained data in its training data set to coarse-grained data according to the mapping, inputs the coarse-grained data into the population coarse-grained model for training (the model being updated through the joint learning of the multiple node devices), inputs the fine-grained data into the fine-grained model for training, and combines the population coarse-grained model and the fine-grained model into a joint model whose label space is the fine-grained labels and whose output is used to update the fine-grained model.
In a possible implementation, the method further includes: the central control device receives the first information sent by the multiple node devices, integrates the received first information to obtain second information, and then sends the second information, used to update the population coarse-grained models, to the multiple node devices.
In this example, each node device trains its population coarse-grained model on local data; to achieve joint training, each node device transmits only its own first information (such as parameter values) to the central control device, preserving the privacy of each node device's local data. The central control device integrates the received parameter values, i.e. integrates the characteristics of the local data of the node devices, and distributes the integrated values to the node devices, each of which updates its local population coarse-grained model accordingly, completing one update and making the local population coarse-grained model population-oriented.
In a possible implementation, the node devices are further configured with individual coarse-grained models; the central control device receives the individual coarse-grained models sent by the multiple node devices and determines the correlations among them; it then selects, from the uploaded individual coarse-grained models, at least two target individual coarse-grained models whose correlation exceeds a threshold and integrates them, obtaining updated individual coarse-grained models; finally, it sends the updated individual coarse-grained models to the node devices corresponding to the target individual coarse-grained models.
In this example, the population coarse-grained model, the individual coarse-grained model and the fine-grained model combine into one overall model; the population coarse-grained model mines population-level regularities and provides a good starting point for the fine-grained model on the node device, while the individual coarse-grained model bridges the gap between population regularity and individual characteristics in the minority of cases where that gap is large.
In a possible implementation, determining the correlations among the individual coarse-grained models uploaded by the multiple node devices may include: the central control device determines the user profile of the user each node device belongs to, and then determines the correlations among the node devices' individual coarse-grained models from the similarity of the user profiles.
In this example, the individual coarse-grained models of users with the same or similar characteristics can be integrated according to the user profiles, letting the individual coarse-grained models bridge the gap between population and individual in the minority of cases.
In a possible implementation, determining the correlations among the uploaded individual coarse-grained models may also include: the central control device determines the distribution information of the multiple coarse-grained labels output by each individual coarse-grained model, and then determines the correlations among the individual coarse-grained models from that distribution information.
In this example, the central control device does not need to obtain the user's related data; it determines the correlations among the individual coarse-grained models from the distribution information of the coarse-grained labels they output, protecting user privacy.
Under the fourth aspect, embodiments of this application further provide a node device configured with a population coarse-grained model and a fine-grained model, the node device including a transceiver module and a processing module:
the transceiver module is used to obtain the mapping between fine-grained labels and coarse-grained labels;
the processing module is used to map the fine-grained data in the training data set to coarse-grained data according to the mapping obtained by the transceiver module;
the processing module is further used to input the coarse-grained data into the population coarse-grained model for training;
the transceiver module is used to update the population coarse-grained model through the joint learning of multiple node devices;
the processing module is further used to input the fine-grained data into the fine-grained model for training, and to combine the population coarse-grained model and the fine-grained model into a joint model whose label space maps to the fine-grained labels and whose output is used to update the fine-grained model.
In a possible implementation, the processing module is further used to input the coarse-grained data into the population coarse-grained model for training and determine the first information corresponding to the population coarse-grained model;
the transceiver module is further used to send the first information to the central control device and to receive the second information, obtained by the central control device integrating the first information uploaded by multiple node devices and used to update the population coarse-grained model.
In a possible implementation, the node device further includes an individual coarse-grained model;
the processing module is further used to combine the population coarse-grained model, the individual coarse-grained model and the fine-grained model into the joint model.
In a possible implementation, the transceiver module is further used to upload the individual coarse-grained model to the central control device and to receive the updated individual coarse-grained model sent by it, obtained by the central control device selecting, from the individual coarse-grained models uploaded by multiple node devices, at least two whose correlation exceeds a threshold and integrating them.
In a possible implementation, the processing module is further used to combine the coarse-grained model and the fine-grained model based on the weight values of the population coarse-grained model and of the fine-grained model.
In a possible implementation, the processing module is further used to merge, at the output layer of the joint model and according to the mapping between fine-grained and coarse-grained labels, the weight value of each coarse-grained label in the coarse-grained model's label space into the weight value of each fine-grained label in the fine-grained model's label space.
In a possible implementation, the processing module is further used to obtain the training data set, in which the label space of the sample data is the fine-grained labels, and to replace that label space with the coarse-grained labels according to the mapping, obtaining the coarse-grained data.
In a possible implementation, the joint model is an application prediction model; the coarse-grained labels are the category labels obtained by classifying applications according to their functions, and the fine-grained labels are application names.
In a possible implementation, the processing module is further used to obtain the current time information and input it into the trained joint model, which outputs a prediction result indicating a target application; the target application is preloaded.
Under the fourth aspect, embodiments of this application further provide a central control device applied to a joint learning system including multiple node devices and the central control device, the node devices being configured with population coarse-grained models and fine-grained models, the central control device including a processing module and a transceiver module:
the transceiver module is used to obtain the fine-grained labels of the multiple node devices;
the processing module is used to classify the multiple fine-grained labels, determine multiple categories, take the categories as coarse-grained labels, and determine the mapping between fine-grained and coarse-grained labels;
the transceiver module is further used to send the mapping to the multiple node devices, so that each node device maps the fine-grained data in its training data set to coarse-grained data according to the mapping, inputs the coarse-grained data into the population coarse-grained model for training (the model being updated through the joint learning of the multiple node devices), inputs the fine-grained data into the fine-grained model for training, and combines the two models into a joint model whose label space is the fine-grained labels and whose output is used to update the fine-grained model.
In a possible implementation, the transceiver module is used to receive the first information sent by the multiple node devices;
the processing module is further used to integrate the received first information uploaded by the multiple node devices into second information; the transceiver module is further used to send the second information, used to update the population coarse-grained models, to the multiple node devices.
In a possible implementation, the node devices are further configured with individual coarse-grained models;
the transceiver module is further used to receive the individual coarse-grained models sent by the multiple node devices;
the processing module is further used to determine the correlations among the uploaded individual coarse-grained models and to select, from them, at least two target individual coarse-grained models whose correlation exceeds a threshold and integrate them, obtaining updated individual coarse-grained models;
the transceiver module is further used to send the updated individual coarse-grained models to the node devices corresponding to the target individual coarse-grained models.
In a possible implementation, the processing module is further used to determine the user profile of the user each node device belongs to;
the processing module is further used to determine the correlations among the node devices' individual coarse-grained models from the similarity of the user profiles.
In a possible implementation, the processing module is further used to determine the distribution information of the multiple coarse-grained labels output by each individual coarse-grained model, and to determine the correlations among the individual coarse-grained models from that distribution information.
Under the fourth aspect, embodiments of this application further provide a node device including a processor coupled to a memory, the memory storing program instructions that, when executed by the processor, implement the method of any item of the fourth aspect.
Under the fourth aspect, embodiments of this application further provide a central control device including a processor coupled to a memory, the memory storing program instructions that, when executed by the processor, implement the method of the fourth aspect.
Under the fourth aspect, embodiments of this application further provide a computer-readable storage medium including a program that, when run on a computer, causes the computer to perform the method of any item of the fourth aspect.
Under the fourth aspect, embodiments of this application further provide a chip system including a processor for supporting a node device in implementing the functions involved in the fourth aspect.
In a possible implementation, the chip system further includes a memory for saving the program instructions and data necessary for the node device, or for saving the program instructions and data necessary for the central control device. The chip system may consist of a chip, or may include a chip and other discrete devices.
In a fifth aspect, an embodiment of this application provides a neural-network-based data processing method that can be applied to a server in the process of generating simulated data, or to a component of the server (such as a processor, chip or chip system). In the method, the server first inputs test data into a first generator and obtains first simulated data after processing by the first generator; the server then inputs the test data and the first simulated data into the first discriminator and obtains a first discrimination result indicating the difference between the test data and the first simulated data; the server then updates the weight coefficients of the first generator according to the first discrimination result to obtain a second generator; finally, the server generates second simulated data in the second generator. Through the processing of the first generator and the first discriminator of the generative adversarial network, the server updates and optimizes the weight coefficients of the first generator to obtain the second generator, using the characteristics of generative adversarial networks to reduce the deviation between the simulated data generated in the generator and the originally input test data, thereby improving the data quality of the simulated data generated by the neural network.
In a possible implementation, after the server generates the second simulated data in the second generator, the method further includes: the server inputs first target simulated data, which includes the second simulated data, into a preset training network and trains it to obtain a prediction model.
In this embodiment, the server can use the second simulated data generated by the second generator obtained via the generative adversarial network as part of the input data of the preset training network to train the prediction model. Since the deviation between the second simulated data and the originally input test data is small, having the second simulated data participate in the training of the training network improves the prediction effect of the resulting prediction model, so that a better prediction model is trained in the simulated environment.
In a possible implementation, the method further includes: the server inputs second target simulated data, which includes the second simulated data, into the prediction model and obtains a target prediction result after processing by the prediction model.
In this embodiment, the server can use the second simulated data as part of the prediction model's input data, obtaining the target prediction results corresponding to the generated simulated data in the prediction model and solving the problem of too little training data in the prediction model.
In a possible implementation, the method further includes: the server sends the prediction model to a client; the server then receives an initial prediction result sent by the client, obtained by the prediction model training on user operation data; the server then inputs the target prediction result and the initial prediction result into a second discriminator for training and outputs a second discrimination result indicating the difference between the target prediction result and the initial prediction result; further, the server updates the weight coefficients of the second generator according to the second discrimination result to obtain a third generator; finally, the server generates third simulated data in the third generator.
In this embodiment, the server can send the prediction model to the client and receive the initial prediction result the client obtains by training the prediction model on user operation data; it takes the target prediction result obtained from simulated data in the prediction model together with the initial prediction result as input to the second discriminator, obtains the result used to update the second generator's weight coefficients, updates the second generator into the third generator, and generates third simulated data in the third generator. The third simulated data, obtained by the server updating the second generator's weight coefficients with the second discriminator, can further exploit the characteristics of generative adversarial networks to reduce the deviation between the generated simulated data and the originally input test data even below that of the second simulated data, further improving the data quality of the simulated data generated by the neural network. (A minimal sketch of one generator/discriminator update follows.)
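A minimal sketch of one generator/discriminator update of the kind described here, assuming PyTorch, a discriminator ending in a sigmoid, and a generator exposing a `latent_dim` attribute (all assumptions for illustration, not details fixed by this application):

```python
import torch

def gan_step(generator, discriminator, test_batch, g_opt, d_opt):
    noise = torch.randn(test_batch.size(0), generator.latent_dim)
    simulated = generator(noise)                       # simulated data
    # Discriminator: measure the difference between test data and simulated data
    d_loss = -(torch.log(discriminator(test_batch)).mean()
               + torch.log(1 - discriminator(simulated.detach())).mean())
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()
    # Generator: update its weight coefficients from the discrimination result
    g_loss = -torch.log(discriminator(simulated)).mean()
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```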
在一种可能的实现方式中,服务器根据该第二判别结果更新该第二生成器的权重系数,得到第三生成器包括:若满足第一条件,则根据该第二判别结果更新该第二生成器的权重系数,得到该第三生成器;其中,该第一条件包括:在该目标预设结果和该初始预测结果之间的经验分布度量小于第一预设值时;和/或,在该第二判别器对应的损失函数的取值大于第二预设值时;和/或,在该预测模型的损失函数小于第三预设值时。
本实施例中,服务器可以在满足上述第一条件时再执行根据第二判别结果更新第二生成器的权重系数的过程,即通过第一条件的限制,在第二判别器和/或预测模型的模型效果达到一定条件时,服务器才执行更新第二生成器的权重系数的过程,可以进一步优化更新得到的第三生成器所生成的第三模拟数据的数据质量。
在一种可能的实现方式中,该第一目标模拟数据还包括该测试数据。
本实施例中,服务器输入到预设的训练网络进行训练得到预测模型的输入数据中,该第一目标模拟数据还可以包括测试数据,可以进一步丰富训练网络的输入,使得训练网络可以训练得到更多的数据特征,从而提升预测模型在后续执行预测过程的预测效果。
在一种可能的实现方式中,服务器根据该第一判别结果更新该第一生成器的权重系数,得到第二生成器包括:若满足第二条件,则根据该第一判别结果更新该第一生成器的权重系数,得到第二生成器;其中,该第二条件包括:在该测试数据和该第一模拟数据之间的经验分布度量小于第四预设值时;和/或,在该第一判别器对应的损失函数的取值大于第五预设值时。
本实施例中,服务器可以在满足上述第二条件时再执行根据第一判别结果更新该第一生成器的权重系数的过程,即通过第二条件的限制,在第一判别器的模型效果达到一定条件时,服务器才执行更新第一生成器的权重系数的过程,可以进一步优化更新得到的第二生成器所 生成的第二模拟数据的数据质量。
在一种可能的实现方式中,在该第二生成器中生成第二模拟数据之前,若不满足该第二条件时,该方法还包括:将该测试数据输入至该第二生成器,经过该第二生成器处理后得到第四模拟数据;将该测试数据和该第四模拟数据输入至该第一判别器,经过该第一判别器处理后得到第三判别结果,该第三判别结果用于指示该测试数据和该第四模拟数据之间的差异;根据该第三判别结果更新该第二生成器的权重系数。
本实施例中,服务器可以在不满足上述第二条件时,执行将测试数据输入至第二生成器,并通过第一判别器的进一步处理得到用于更新第二生成器的第三判别结果,即可以进一步利用生成式对抗网络的特性,对第二生成器的权重系数进行优化。
在一种可能的实现方式中,该预测模型为意图决策模型。
本实施例中,该方法可以应用于意图决策判别过程中,相对应的,该预测模型在该过程中可以为意图决策模型,从而,提供了该预测模型的一种具体的实现方式,提升方案的可实现性。
在第五方面中,本申请实施例还提供了另一种基于神经网络的数据处理方法,该方法可以应用于在模拟数据的生成过程中的客户端中,或者是客户端的部件(例如处理器、芯片或芯片系统等),在该方法中,客户端接收来自服务器的预测模型;然后,该客户端获取用户操作数据;此后,该客户端将该用户操作数据输入至该预测模型,经过训练得到初始预测结果;
最后,该客户端向该服务器发送该初始预测结果,该初始预测结果用于作为判别器的输入,经过该判别器的处理得到用于更新生成器权重系数的判别结果。其中,客户端可以根据使用用户操作数据作为服务器所发送的预测模型的输入数据,并训练得到初始预测结果之后,向该服务器发送初始预测结果,其中,该初始预测结果用于作为判别器的输入,经过该判别器的处理得到用于更新生成器权重系数的判别结果,使得服务器可以利用生成式对抗网络的特性,降低在生成器中所生成的模拟数据与原始输入的测试数据之间的偏差,从而,提升神经网络所生成的模拟数据的数据质量;此外,由于客户端仅需要向服务器发送用户操作数据对应的初始预测结果,相比于客户端向服务器发送用户操作数据的方式,可以避免用户的隐私泄露,从而提升用户体验。
在一种可能的实现方式中,客户端获取用户操作数据的过程具体包括:客户端响应于用户操作,获取该用户操作对应的初始操作数据;此后,该客户端提取该初始操作数据的数据特征,得到该用户操作数据。
本实施例中,客户端可以通过获取用户操作对应的初始操作数据并进行特征提取的方式,获取得到输入到预测模型中的用户操作数据,提供了客户端获取用户操作数据的一种具体的实现方式,提升方案的可实现性。
在第五方面中,本申请实施例还提供了一种基于神经网络的数据处理装置,该装置包括:
第一处理单元,用于将测试数据输入至第一生成器,经过该第一生成器处理后得到第一模拟数据;
第二处理单元,用于将该测试数据和该第一模拟数据输入至该第一判别器,经过该第一判别器处理后得到第一判别结果,该第一判别结果用于指示该测试数据和该第一模拟数据之间的差异;
第一更新单元,用于根据该第一判别结果更新该第一生成器的权重系数,得到第二生成器;
第一生成单元,用于在该第二生成器中生成第二模拟数据。
本实施例中,第一处理单元和第二处理单元通过生成式对抗神经网络中的第一生成器和第一判别器的处理过程,第一更新单元对第一生成器中权重系数的进行更新优化以得到第二生成器,并通过第一生成单元在第二生成器中生成第二模拟数据,即利用生成式对抗网络的特性,降低在生成器中所生成的模拟数据与原始输入的测试数据之间的偏差,从而,提升神经网络所生成的模拟数据的数据质量。
在一种可能的实现方式中,该装置还包括:
第一训练单元,用于利用第一目标模拟数据输入预设的训练网络,训练得到预测模型,该第一目标模拟数据包括该第二模拟数据。
在一种可能的实现方式中,该装置还包括:
第三处理单元,用于将第二目标模拟数据输入该预测模型,经过该预测模型处理得到目标预测结果,该第二目标模拟数据包括该第二模拟数据。
在一种可能的实现方式中,该装置还包括:
发送单元,用于向客户端发送该预测模型;
接收单元,用于接收该客户端发送的初始预测结果,该初始预测结果为该预测模型对用户操作数据进行训练得到;
第二训练单元,用于将该目标预测结果和该初始预测结果输入至第二判别器进行训练,输出第二判别结果,该第二判别结果用于指示该目标预测结果和该初始预测结果之间的差异;
第二更新单元,用于根据该第二判别结果更新该第二生成器的权重系数,得到第三生成器;
第二生成单元,用于在该第三生成器中生成第三模拟数据。
在一种可能的实现方式中,该第二更新单元具体用于:
若满足第一条件,则根据该第二判别结果更新该第二生成器的权重系数,得到该第三生成器;其中,该第一条件包括:
在该目标预设结果和该初始预测结果之间的经验分布度量小于第一预设值时;和/或,
在该第二判别器对应的损失函数的取值大于第二预设值时;和/或,
在该预测模型的损失函数小于第三预设值时。
在一种可能的实现方式中,该第一目标模拟数据还包括该测试数据。
在一种可能的实现方式中,该第一更新单元具体用于:
若满足第二条件,则根据该第一判别结果更新该第一生成器的权重系数,得到第二生成器;其中,该第二条件包括:
在该测试数据和该第一模拟数据之间的经验分布度量小于第四预设值时;和/或,
在该第一判别器对应的损失函数的取值大于第五预设值时。
在一种可能的实现方式中,若不满足该第二条件时,该装置还包括:
第四处理单元,用于将该测试数据输入至该第二生成器,经过该第二生成器处理后得到第四模拟数据;
第五处理单元,用于将该测试数据和该第四模拟数据输入至该第一判别器,经过该第一判别器处理后得到第三判别结果,该第三判别结果用于指示该测试数据和该第四模拟数据之间的差异;
第三更新单元,用于根据该第三判别结果更新该第二生成器的权重系数。
在一种可能的实现方式中,该预测模型为意图决策模型。
在第五方面中,本申请实施例还提供了一种基于神经网络的数据处理装置,该装置包括:
收发单元,用于接收来自服务器的预测模型;
该收发单元,用于获取用户操作数据;
训练单元,用于将该用户操作数据输入至该预测模型,经过训练得到初始预测结果;
该收发单元,用于向该服务器发送该初始预测结果,该初始预测结果用于作为判别器的输入,经过该判别器的处理得到用于更新生成器权重系数的判别结果。
本实施例中，训练单元可以将用户操作数据作为服务器所发送的预测模型的输入数据，在训练得到初始预测结果之后，由收发单元向该服务器发送初始预测结果，其中，该初始预测结果用于作为判别器的输入，经过该判别器的处理得到用于更新生成器权重系数的判别结果，使得服务器可以利用生成式对抗网络的特性，降低在生成器中所生成的模拟数据与原始输入的测试数据之间的偏差，从而提升神经网络所生成的模拟数据的数据质量；此外，由于客户端仅需要向服务器发送用户操作数据对应的初始预测结果，相比于客户端向服务器发送用户操作数据的方式，可以避免用户的隐私泄露，从而提升用户体验。
在一种可能的实现方式中,该收发单元具体用于:
响应于用户操作,获取该用户操作对应的初始操作数据;
提取该初始操作数据的数据特征,得到该用户操作数据。
在第五方面中,本申请实施例还提供了一种服务器,包括处理器,处理器和存储器耦合,存储器存储有程序指令,当存储器存储的程序指令被处理器执行时使得装置实现上述第五方面及其任意一种实现方式中的基于神经网络的数据处理方法。装置可以为电子设备(如终端设备或服务器设备);或可以为电子设备中的一个组成部分,如芯片。
在第五方面中,本申请实施例还提供了一种客户端,包括处理器,处理器和存储器耦合,存储器存储有程序指令,当存储器存储的程序指令被处理器执行时使得装置实现上述第五方面及其任意一种实现方式中的基于神经网络的数据处理方法。装置可以为电子设备(如终端设备或服务器设备);或可以为电子设备中的一个组成部分,如芯片。
在第五方面中,本申请实施例还提供了一种计算机可读存储介质,计算机可读存储介质中存储有计算机程序,当其在计算机上运行时,使得计算机执行上述第五方面及其任意一种实现方式中的基于神经网络的数据处理方法。
在第五方面中,本申请实施例还提供了一种电路系统,电路系统包括处理电路,处理电路配置为执行上述第五方面及其任意一种实现方式中的基于神经网络的数据处理方法。
在第五方面中,本申请实施例还提供了一种计算机程序,当其在计算机上运行时,使得计算机执行上述第五方面及其任意一种实现方式中的基于神经网络的数据处理方法。
在第五方面中，本申请实施例还提供了一种芯片系统，该芯片系统包括处理器，用于支持服务器实现上述第五方面及其任意一种实现方式中所涉及的功能，例如，发送或处理上述方法中所涉及的数据和/或信息。在一种可能的设计中，芯片系统还包括存储器，用于保存数据处理设备或通信设备必要的程序指令和数据。该芯片系统，可以由芯片构成，也可以包括芯片和其他分立器件。
第六方面，本申请实施例提供了一种意图识别方法，包括：电子设备确定待识别的打点数据序列，该待识别的打点数据序列由打点数据组成，该打点数据包括该电子设备记录的用户的操作数据和/或该电子设备对用户操作的响应数据；该电子设备将该待识别的打点数据序列输入多示例学习模型，得到多个子序列；该多示例学习模型为已采用电子设备中的打点数据序列训练过的多示例学习模型；该电子设备按照预设意图规则确定第一子序列的意图，该第一子序列为该多个子序列中的一个子序列，该预设意图规则用于根据序列中的打点数据确定序列的意图。
本申请实施例中，电子设备可以采用训练好的多示例学习模型，将用户操作产生的打点数据序列作为待识别的打点数据序列划分为粒度更小的多个子序列，再按照预设意图规则（即第二预设规则）确定出各个子序列的意图。由于该多示例学习模型是使用用户自己的打点数据训练出来的，其划分的子序列更符合用户个性化的使用习惯，从而使得识别出的意图更准确。
在一些实施例中,该电子设备确定待识别的打点数据序列,具体包括:响应于用户的连续操作,该电子设备生成多个打点数据;该电子设备将该多个打点数据确定为该待识别的打点数据序列。
上述实施例中,待识别的打点数据序列的打点数据可以由用户的连续操作生成的打点数据组成,这样的数据使用其他的意图识别方式非常难以确定其中各打点数据的意图。但将其输入本申请实施例中的多示例学习模型后,可以将其拆分为多个子序列,再分别确定各子序列的意图,使得识别出的意图更准确。
在一些实施例中,待识别的打点数据序列中也可以包括由非连续操作产生的打点数据,此处不作限定。
可选的,电子设备可以将预设时间周期内产生的打点数据组成为该待识别的打点数据序列;
可选的,电子设备可以在未识别的打点数据累积到预设累积数目时,将达到预设累积数目的所有未识别的打点数据组成待识别的打点数据序列。
在一些实施例中,该电子设备确定待识别的打点数据序列的步骤之前,该方法还包括:该电子设备使用初始打点数据序列训练预置多示例学习模型,得到该多示例学习模型;该初始打点数据序列中包括用户使用该电子设备产生的打点数据,和/或,出厂预置的打点数据。
在一些实施例中,该电子设备使用初始打点数据序列训练预置多示例学习模型,得到该多示例学习模型,具体包括:该电子设备按照预设拆分规则将该初始打点数据序列拆分为多个分序列;该预设拆分规则用于将打点数据序列划分为不同的分序列,且一个分序列根据该预设意图规则至少可以确定一个明确的意图;该电子设备将该多个分序列作为多个待处理序列,从该多个待处理序列中提取训练数据;该电子设备使用该训练数据训练该预置多示例学习模型,得到该多示例学习模型。
上述实施例中,电子设备可以使用初始打点数据序列训练预置多示例学习模型,从而得到可使用的多示例学习模型,不需要通过大量的人工标注打点数据,提升了打点数据的标注效率和范围,节省了时间和成本。
在一些实施例中,该方法还包括:该电子设备使用该待识别的打点数据序列对该多示例学习模型进行训练,更新该多示例学习模型。
上述实施例中,电子设备可以使用该待识别的打点数据序列对该多示例学习模型进行训练,通过增量训练的方式更新多示例学习模型,提升了多示例学习模型拆分子序列的准确性。
在第六方面中，本申请实施例还提供了一种电子设备，该电子设备包括：一个或多个处理器和存储器；该存储器与该一个或多个处理器耦合，该存储器用于存储计算机程序代码，该计算机程序代码包括计算机指令，该一个或多个处理器调用该计算机指令以使得该电子设备执行：确定待识别的打点数据序列，该待识别的打点数据序列由打点数据组成，该打点数据包括该电子设备记录的用户的操作数据和/或该电子设备对用户操作的响应数据；将该待识别的打点数据序列输入多示例学习模型，得到多个子序列；该多示例学习模型为已采用该电子设备中的打点数据序列训练过的多示例学习模型；按照预设意图规则确定第一子序列的意图，该第一子序列为该多个子序列中的一个子序列，该预设意图规则用于根据序列中的打点数据确定序列的意图。
本申请实施例中，电子设备可以采用训练好的多示例学习模型，将用户操作产生的打点数据序列作为待识别的打点数据序列划分为粒度更小的多个子序列，再按照预设意图规则（即第二预设规则）确定出各个子序列的意图。由于该多示例学习模型是使用用户自己的打点数据训练出来的，其划分的子序列更符合用户个性化的使用习惯，从而使得识别出的意图更准确。
在一些实施例中,该一个或多个处理器,具体用于调用该计算机指令以使得该电子设备执行:响应于用户的连续操作,该电子设备生成多个打点数据;该电子设备将该多个打点数据确定为该待识别的打点数据序列。
在一些实施例中,待识别的打点数据序列中也可以包括由非连续操作产生的打点数据,此处不作限定。
可选的,电子设备可以将预设时间周期内产生的打点数据组成为该待识别的打点数据序列;
可选的,电子设备可以在未识别的打点数据累积到预设累积数目时,将达到预设累积数目的所有未识别的打点数据组成待识别的打点数据序列。
在一些实施例中,该一个或多个处理器,还用于调用该计算机指令以使得该电子设备执行:使用初始打点数据序列训练预置多示例学习模型,得到该多示例学习模型;该初始打点数据序列中包括用户使用该电子设备产生的打点数据,和/或,出厂预置的打点数据。
在一些实施例中,该一个或多个处理器,具体用于调用该计算机指令以使得该电子设备执行:按照预设拆分规则将该初始打点数据序列拆分为多个分序列;该预设拆分规则用于将打点数据序列划分为不同的分序列,且一个分序列根据该预设意图规则至少可以确定一个明确的意图;将该多个分序列作为多个待处理序列,从该多个待处理序列中提取训练数据;使用该训练数据训练该预置多示例学习模型,得到该多示例学习模型。
在一些实施例中,该一个或多个处理器,还用于调用该计算机指令以使得该电子设备执行:使用该待识别的打点数据序列对该多示例学习模型进行训练,更新该多示例学习模型。
在第六方面中,本申请实施例还提供了一种芯片系统,该芯片系统应用于电子设备,该芯片系统包括一个或多个处理器,该处理器用于调用计算机指令以使得该电子设备执行如第六方面以及第六方面中任一可能的实现方式描述的方法。
在第六方面中,本申请实施例还提供一种包含指令的计算机程序产品,当上述计算机程序产品在电子设备上运行时,使得上述电子设备执行如第六方面以及第六方面中任一可能的实现方式描述的方法。
在第六方面中，本申请实施例还提供一种计算机可读存储介质，包括指令，当上述指令在电子设备上运行时，使得上述电子设备执行如第六方面以及第六方面中任一可能的实现方式描述的方法。
在第六方面中，本申请实施例还提供了一种多示例学习模型训练方法，包括：将多个分序列或多个子序列作为多个待处理序列，从该多个待处理序列中提取训练数据；该多个分序列由该电子设备按照第一预设规则将初始打点数据序列划分得到，该多个子序列由该电子设备将打点数据序列输入多示例学习模型后输出得到；该第一预设规则（即预设拆分规则）用于将打点数据序列划分为不同的分序列，且一个分序列根据预设意图规则至少可以确定一个明确的意图；该预设意图规则用于根据序列中的打点数据确定序列的意图；该打点数据包括电子设备记录的用户的操作数据和/或该电子设备对用户操作的响应数据；该训练数据中包括包标签和包的特征向量矩阵；使用该训练数据训练该多示例学习模型。
本申请实施例中,训练装置可以直接从待处理序列中提取训练数据,对多示例学习模型进行训练,而不需要人工标注打点数据作为训练数据,节省了训练数据的标注时间,提升了训练装置的训练效率。
在一些实施例中,该使用该训练数据训练该多示例学习模型的步骤之后,该方法还包括:将该多个待处理序列,输入该多示例学习模型,得到多个子序列;确定本轮训练后的多示例学习模型的损失函数的值;确定相比于上一轮训练后得到的多示例学习模型的损失函数的值,本轮训练后得到的多示例学习模型的损失函数的值的减小幅度是否小于预设减小幅度;当确定不小于该预设减小幅度时,将该多个子序列作为多个待处理序列,执行该电子设备将多个分序列或多个子序列作为多个待处理序列,从该多个待处理序列中提取训练数据的步骤;当确定小于该预设减小幅度时,确定本轮训练得到的该多示例学习模型为训练完成的多示例学习模型。
上述实施例中,可以采用迭代训练的方式对多示例学习模型进行训练,得到更加准确的多示例学习模型。
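上述迭代训练的停止条件可以用如下示意性流程表达（Python）。其中 extract_training_data、train_one_round、split_into_subsequences 均为假定的占位函数，分别对应提取训练数据、单轮训练和子序列划分，并非本申请限定的实现：

def iterative_train(model, sequences, extract_training_data,
                    train_one_round, split_into_subsequences,
                    min_decrease=1e-3, max_rounds=50):
    """迭代训练多示例学习模型，直至损失减小幅度小于预设减小幅度（仅为示意）"""
    prev_loss = float("inf")
    for _ in range(max_rounds):
        train_data = extract_training_data(sequences)  # 提取包标签与包的特征向量矩阵
        loss = train_one_round(model, train_data)      # 本轮训练并返回损失函数的值
        if prev_loss - loss < min_decrease:
            return model                               # 减小幅度过小，本轮模型即训练完成的模型
        prev_loss = loss
        # 将本轮划分得到的多个子序列作为下一轮的多个待处理序列
        sequences = split_into_subsequences(model, sequences)
    return model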
在一些实施例中,该方法还包括:将新增打点数据序列输入该多示例学习模型,得到多个子序列;该新增打点数据序列为该电子设备中新增加的打点数据组成的打点数据序列;将该多个子序列作为多个待处理序列,从该多个待处理序列中提取训练数据;使用该训练数据对该多示例学习模型进行训练,更新该多示例学习模型。
上述实施例中,电子设备可以使用新增加的打点数据对该多示例学习模型进行训练,通过增量训练的方式更新多示例学习模型,提升了多示例学习模型拆分子序列的准确性。
在一些实施例中,该使用该训练数据对该多示例学习模型进行训练,更新该多示例学习模型的步骤之后,该方法还包括:确定本轮训练后的多示例学习模型的损失函数的值;确定相比于上一轮训练后得到的多示例学习模型的损失函数的值,本轮训练后得到的多示例学习模型的损失函数的值的减小幅度是否小于预设减小幅度;当确定不小于该预设减小幅度时,将该多个子序列作为多个待处理序列,执行该将该多个子序列作为多个待处理序列,从该多个待处理序列中提取训练数据的步骤;当确定小于该预设减小幅度时,确定本轮训练得到的多示例学习模型为训练完成的多示例学习模型,更新该多示例学习模型。
上述实施例中,可以采用迭代训练的方式对多示例学习模型进行增量训练,得到更加准确的多示例学习模型。
在一些实施例中，该从该多个待处理序列中提取训练数据，具体包括：确定该多个待处理序列中的示例和示例标签；该示例由相邻的两条打点数据组成；该示例标签用于表示该示例为正示例或负示例；根据该多个待处理序列、该示例和示例标签，确定包和包标签；该包标签用于表示该包为正包或负包；该正包中包括同一个待处理序列中的打点数据组成的示例；该负包中包括位于一个待处理序列中的最后一个打点数据和与该待处理序列连续的下一个待处理序列中的第一个打点数据组成的示例；提取每个包的特征向量矩阵，将该每个包的特征向量矩阵与相应的包标签作为该训练数据。
上述实施例中,可以通过确定示例和示例标签、确定包和包标签,并提取每个包的特征向量矩阵和相应的包标签作为训练数据,实现了训练数据的自标注,提升了训练数据的标注效率。
在第六方面中，本申请实施例还提供了一种训练装置，该训练装置包括：一个或多个处理器和存储器；该存储器与该一个或多个处理器耦合，该存储器用于存储计算机程序代码，该计算机程序代码包括计算机指令，该一个或多个处理器调用该计算机指令以使得该训练装置执行：将多个分序列或多个子序列作为多个待处理序列，从该多个待处理序列中提取训练数据；该多个分序列由电子设备按照第一预设规则将初始打点数据序列划分得到，该多个子序列由该电子设备将打点数据序列输入多示例学习模型后输出得到；该第一预设规则（即预设拆分规则）用于将打点数据序列划分为不同的分序列，且一个分序列根据预设意图规则至少可以确定一个明确的意图；该预设意图规则用于根据序列中的打点数据确定序列的意图；该打点数据包括电子设备记录的用户的操作数据和/或该电子设备对用户操作的响应数据；该训练数据中包括包标签和包的特征向量矩阵；使用该训练数据训练该多示例学习模型。
本申请实施例中,训练装置可以直接从待处理序列中提取训练数据,对多示例学习模型进行训练,而不需要人工标注打点数据作为训练数据,节省了训练数据的标注时间,提升了训练装置的训练效率。
在一些实施例中,该一个或多个处理器,还用于调用该计算机指令以使得该训练装置执行:将该多个待处理序列,输入该多示例学习模型,得到多个子序列;确定本轮训练后的多示例学习模型的损失函数的值;确定相比于上一轮训练后得到的多示例学习模型的损失函数的值,本轮训练后得到的多示例学习模型的损失函数的值的减小幅度是否小于预设减小幅度;当确定不小于该预设减小幅度时,将该多个子序列作为多个待处理序列,执行该电子设备将多个分序列或多个子序列作为多个待处理序列,从该多个待处理序列中提取训练数据的步骤;当确定小于该预设减小幅度时,确定本轮训练得到的该多示例学习模型为训练完成的多示例学习模型。
在一些实施例中,该一个或多个处理器,还用于调用该计算机指令以使得该训练装置执行:将新增打点数据序列输入该多示例学习模型,得到多个子序列;该新增打点数据序列为该电子设备中新增加的打点数据组成的打点数据序列;将该多个子序列作为多个待处理序列,从该多个待处理序列中提取训练数据;使用该训练数据对该多示例学习模型进行训练,更新该多示例学习模型。
在一些实施例中,该一个或多个处理器,还用于调用该计算机指令以使得该训练装置执行:确定本轮训练后的多示例学习模型的损失函数的值;确定相比于上一轮训练后得到的多示例学习模型的损失函数的值,本轮训练后得到的多示例学习模型的损失函数的值的减小幅度是否小于预设减小幅度;当确定不小于该预设减小幅度时,将该多个子序列作为多个待处理序列,执行该将该多个子序列作为多个待处理序列,从该多个待处理序列中提取训练数据的步骤;当确定小于该预设减小幅度时,确定本轮训练得到的多示例学习模型为训练完成的多示例学习模型,更新该多示例学习模型。
在一些实施例中,该一个或多个处理器,具体用于调用该计算机指令以使得该训练装置执行:确定该多个待处理序列中的示例和示例标签;该示例由相邻的两条打点数据组成;该示例标签用于表示该示例为正示例或负示例;根据该多个待处理序列、该示例和示例标签,确定包和包标签;该包标签用于表示该包为正包或负包;该正包中包括同一个待处理序列中的打点数据组成的示例;该负包中包括位于一个待处理序列中的最后一个打点数据和与该待处理序列连续的下一个待处理序列中的第一个打点数据组成的示例;提取每个包的特征向量矩阵,将该每个包的特征向量矩阵与相应的包标签作为该训练数据。
在第六方面中,本申请实施例还提供了一种训练数据生成方法,包括:确定多个待处理序列中的示例和示例标签;该多个待处理序列为多个子序列或多个分序列;该多个分序列由该电子设备按照第一预设规则将初始打点数据序列划分得到,该多个子序列由该电子设备将打点数据序列输入多示例学习模型后输出得到;该第一预设规则用于将打点数据序列划分为不同的分序列,且一个分序列根据第二预设规则至少可以确定一个明确的意图;该第二预设规则用于根据序列中的打点数据确定序列的意图;该示例由相邻的两条打点数据组成;该打点数据包括电子设备记录的用户的操作数据和/或该电子设备对用户操作的响应数据;该示例标签用于表示该示例为正示例或负示例;根据该多个待处理序列、该示例和示例标签,确定包和包标签;该包标签用于表示该包为正包或负包;该正包中包括同一个待处理序列中的打点数据组成的示例;该负包中包括位于一个待处理序列中的最后一个打点数据和与该待处理序列连续的下一个待处理序列中的第一个打点数据组成的示例;提取每个包的特征向量矩阵,将该每个包的特征向量矩阵与相应的包标签作为该训练数据。
本申请实施例中,训练设备可以通过从待处理序列中提取示例和示例标签,确定包和包标签,然后提取每个包的特征向量矩阵,将每个包的特征向量矩阵与相应的包标签作为该训练数据,从而实现了训练数据的自标注,提升了训练数据的标注效率。
在一些实施例中,该提取每个包的特征向量矩阵,将该每个包的特征向量矩阵与相应的包标签作为该训练数据,具体包括:分别提取每个包中每个示例的J维特征向量,该J为正整数;将一个包中K个示例的J维特征向量组成该包的特征向量矩阵,将该包的特征向量矩阵与该包的包标签作为该训练数据中的一个训练数据,该K为正整数。
上述实施例中,通过提取一个包中每个示例的J维特征向量,组成特征向量矩阵,与该包的包标签作为该训练数据中的一个训练数据,使得训练数据中包含更多的信息,提升了使用该训练数据进行多示例学习模型训练的训练效果。
在一些实施例中，该J维特征向量用于表示：示例的文本特征，和/或，示例的上下文特征，和/或，示例中各打点数据特有的特征，和/或，示例中打点数据的统计特征。
上述实施例中,示例的J维特征向量可以包括示例的各方面的特征,使得训练数据中包含更多方面的信息,提升了使用该训练数据进行多示例学习模型训练的训练效果。
在第六方面中，本申请实施例还提供了一种训练装置，该训练装置包括：一个或多个处理器和存储器；该存储器与该一个或多个处理器耦合，该存储器用于存储计算机程序代码，该计算机程序代码包括计算机指令，该一个或多个处理器调用该计算机指令以使得该训练装置执行：确定多个待处理序列中的示例和示例标签；该多个待处理序列为多个子序列或多个分序列；该多个分序列由该电子设备按照第一预设规则将初始打点数据序列划分得到，该多个子序列由该电子设备将打点数据序列输入多示例学习模型后输出得到；该第一预设规则用于将打点数据序列划分为不同的分序列，且一个分序列根据第二预设规则至少可以确定一个明确的意图；该第二预设规则用于根据序列中的打点数据确定序列的意图；该示例由相邻的两条打点数据组成；该打点数据包括电子设备记录的用户的操作数据和/或该电子设备对用户操作的响应数据；该示例标签用于表示该示例为正示例或负示例；根据该多个待处理序列、该示例和示例标签，确定包和包标签；该包标签用于表示该包为正包或负包；该正包中包括同一个待处理序列中的打点数据组成的示例；该负包中包括位于一个待处理序列中的最后一个打点数据和与该待处理序列连续的下一个待处理序列中的第一个打点数据组成的示例；提取每个包的特征向量矩阵，将该每个包的特征向量矩阵与相应的包标签作为该训练数据。
本申请实施例中,训练设备可以通过从待处理序列中提取示例和示例标签,确定包和包标签,然后提取每个包的特征向量矩阵,将每个包的特征向量矩阵与相应的包标签作为该训练数据,从而实现了训练数据的自标注,提升了训练数据的标注效率。
在一些实施例中,该一个或多个处理器,具体用于调用该计算机指令以使得该训练装置执行:分别提取每个包中每个示例的J维特征向量,该J为正整数;将一个包中K个示例的J维特征向量组成该包的特征向量矩阵,将该包的特征向量矩阵与该包的包标签作为该训练数据中的一个训练数据,该K为正整数。
在一些实施例中，该J维特征向量用于表示：示例的文本特征，和/或，示例的上下文特征，和/或，示例中各打点数据特有的特征，和/或，示例中打点数据的统计特征。
第七方面,本申请实施例提供了一种规则引擎的执行方法,该方法可以包括:确定输入规则引擎中的第一事实数据;根据第一事实数据的第一属性,从内存中获取第一语义对象对第一事实数据进行匹配,第一属性用于表征第一事实数据的变化频率;确定输入规则引擎中的第二事实数据;根据第二事实数据的第二属性,从文件中获取第二语义对象对第二事实数据进行匹配,第二属性用于表征第二事实数据的变化频率,其中,第二属性不同于第一属性;根据第一事实数据对应的第一匹配结果和第二事实数据对应的第二匹配结果,确定是否执行第一操作。
由此,基于事实数据的属性,确定从内存或文件中加载语义对象,并基于确定的语义对象匹配事实数据,从而使得可以将规则引擎中的一部分用于匹配事实数据的语义对象存储至内存中,另一部分用于匹配事实数据的语义对象存储在文件中,进而可以释放一些冗余内存,降低了规则引擎运行过程中的内存开销,提升了规则引擎的能力。
在一种可能的实现方式中,规则引擎包括第一节点,第一节点至少包括第一类型节点和第二类型节点,其中,第一类型节点与第一属性相关,第二类型节点与第二属性相关;根据第一事实数据的第一属性,从内存中获取第一语义对象对第一事实数据进行匹配,具体包括:根据第一属性对应的第一类型节点的第一语义索引,从第一语义索引指示的内存中获取第一语义对象,及基于第一语义对象对第一事实数据进行匹配;根据第二事实数据的第二属性,从文件中获取第二语义对象对第二事实数据进行匹配,具体包括:根据第二属性对应的第二类型节点的第二语义索引,从第二语义索引指示的文件中获取第二语义对象,及基于第二语义对象对第二事实数据进行匹配。
由此,在基于规则引擎进行决策推理时,可以基于不同类型的事实数据对应的节点的语义索引,确定从内存或文件中获取语义对象。
在一种可能的实现方式中，根据第一属性对应的第一类型节点的第一语义索引，从第一语义索引指示的内存中获取第一语义对象之前，还包括：确定第一类型节点中记录的第一事实数据的变化次数与输入至规则引擎中的第一事实数据的变化次数不同。
由此,仅在第一类型节点中记录的第一事实数据的变化次数与输入至规则引擎中的第一事实数据的变化次数不同时,才从内存中加载语义对象进行匹配,避免了频繁加载语义对象的情况,提升了匹配效率。
在一种可能的实现方式中,根据第二属性对应的第二类型节点的第二语义索引,从第二语义索引指示的文件中获取第二语义对象之前,还包括:确定第二类型节点中记录的第二事实数据的变化次数与输入至规则引擎中的第二事实数据的变化次数不同。
由此,仅在第二类型节点中记录的第二事实数据的变化次数与输入至规则引擎中的第二事实数据的变化次数不同时,才从文件中加载语义对象进行匹配,避免了频繁加载语义对象的情况,提升了匹配效率。
在一种可能的实现方式中,该方法还包括以下一项或多项:确定第一类型节点中记录的第一事实数据的变化次数与输入至规则引擎中的第一事实数据的变化次数相同,使用第一类型节点记录的前次匹配结果作为第一匹配结果;确定第二类型节点中记录的第二事实数据的变化次数与输入至规则引擎中的第二事实数据的变化次数相同,使用第二类型节点记录的前次匹配结果作为第二匹配结果。
由此,当类型节点中记录的事实数据的变化次数与输入至规则引擎中的事实数据的变化次数相同时,直接采用前次的匹配结果,避免了频繁加载语义对象的情况,提升了匹配效率。
在一种可能的实现方式中,该方法还包括以下一项或多项:在重构规则引擎中的规则时,确定第一类型节点中记录的第一事实数据的第一变化次数;若第一变化次数小于预设次数阈值,则将第一类型节点切换为第二类型节点;在重构规则引擎中的规则时,确定第二类型节点中记录的第二事实数据的第二变化次数;若第二变化次数大于预设次数阈值,则将第二类型节点切换为第一类型节点。
由此,实现节点类型的切换,避免出现变化频率低的事实数据对应的语义对象持久的占用内存。另外,也避免出现变化频率高的事实数据对应的语义对象由文件加载时加载效率慢的问题。
在一种可能的实现方式中,该规则引擎包括第二节点;根据第一事实数据对应的第一匹配结果和第二事实数据对应的第二匹配结果,确定是否执行第一操作,具体包括:当第一匹配结果指示匹配成功,且第二匹配结果指示匹配成功时,从第二节点的语义索引指示的文件中获取第三语义对象,及执行第三语义对象对应的第一操作。由此,在基于规则引擎进行决策推理时,可以将相应的规则所需执行的语义对象持久化在文件中,避免该语义对象长期占用内存的情况,进而可以释放一些冗余内存。
在一种可能的实现方式中,第一事实数据包括时间和位置中的至少一项;第二事实数据包括年龄和季节中的至少一项。
在一种可能的实现方式中,第一操作包括以下一项或多项:提醒天气,提醒路况,提醒用户休息、娱乐或工作,推荐使用手册,预加载动作或服务。
在第七方面中，本申请实施例还提供了一种规则引擎，规则引擎包括：第一节点，第一节点至少包括第一类型节点和第二类型节点；第一类型节点，用于根据输入规则引擎中的第一事实数据的第一属性，从内存中获取第一语义对象对第一事实数据进行匹配，得到第一匹配结果，第一属性用于表征第一事实数据的变化频率；第二类型节点，用于根据输入规则引擎中的第二事实数据的第二属性，从文件中获取第二语义对象对第二事实数据进行匹配，得到第二匹配结果，第二属性用于表征第二事实数据的变化频率，第二属性不同于第一属性；其中，第一匹配结果和第二匹配结果共同用于确定是否执行第一操作。示例性的，该规则引擎可以为人工智能(Artificial Intelligence，AI)模型。
由此,实现在规则引擎中将一部分节点的语义对象存储在内存中,将另外一部分节点的语义对象存储在文件中,进而释放一些冗余内存,降低了规则引擎运行过程中的内存开销,提升了规则引擎的能力。
在一种可能的实现方式中,第一类型节点,具体用于根据第一属性对应的第一语义索引,从第一语义索引指示的内存中获取第一语义对象,及基于第一语义对象对第一事实数据进行匹配;第二类型节点,具体用于根据第二属性对应的第二语义索引,从第二语义索引指示的文件中获取第二语义对象,及基于第二语义对象对第二事实数据进行匹配。
在一种可能的实现方式中,第一类型节点在从内存中获取第一语义对象对第一事实数据进行匹配之前,还用于确定第一类型节点中记录的第一事实数据的变化次数与输入至规则引擎中的第一事实数据的变化次数不同。
在一种可能的实现方式中,第二类型节点在从文件中获取第二语义对象对第二事实数据进行匹配之前,还用于确定第二类型节点中记录的第二事实数据的变化次数与输入至规则引擎中的第二事实数据的变化次数不同。
在一种可能的实现方式中,第一类型节点,还用于在第一类型节点中记录的第一事实数据的变化次数与输入至规则引擎中的第一事实数据的变化次数相同时,使用第一类型节点记录的前次匹配结果作为第一匹配结果。
在一种可能的实现方式中,第二类型节点,还用于在第二类型节点中记录的第二事实数据的变化次数与输入至规则引擎中的第二事实数据的变化次数相同时,使用第二类型节点记录的前次匹配结果作为第二匹配结果。
在一种可能的实现方式中,规则引擎还包括第二节点,第二节点用于当第一匹配结果指示匹配成功,且第二匹配结果指示匹配成功时,从第二节点的语义索引指示的文件中获取第三语义对象,及执行第三语义对象对应的第一操作。
在一种可能的实现方式中,第一事实数据包括时间和位置中的至少一项;第二事实数据包括年龄和季节中的至少一项。
在一种可能的实现方式中,第一操作包括以下一项或多项:提醒天气,提醒路况,提醒用户休息、娱乐或工作,推荐使用手册,预加载动作或服务。
在第七方面中,本申请实施例还提供了一种规则引擎的执行装置,包括:至少一个存储器,用于存储程序;至少一个处理器,用于执行存储器存储的程序,当存储器存储的程序被执行时,处理器用于执行第七方面中所提供的方法。
在第七方面中,本申请实施例还提供了一种计算机存储介质,计算机存储介质中存储有指令,当指令在计算机上运行时,使得计算机执行第七方面中所提供的方法。
在第七方面中,本申请实施例还提供了一种包含指令的计算机程序产品,当指令在计算机上运行时,使得计算机执行第七方面中所提供的方法。
在第七方面中，本申请实施例还提供了一种规则引擎的执行装置，该装置运行计算机程序指令，以执行如第七方面中所提供的方法。示例性的，该装置可以为芯片，或处理器。在一个例子中，该装置可以包括处理器，该处理器可以与存储器耦合，读取存储器中的指令并根据该指令执行如第七方面中所提供的方法。其中，该存储器可以集成在芯片或处理器中，也可以独立于芯片或处理器之外。
附图说明
图1是现有技术中一个意图识别的场景示意图；
图2是本申请实施例中一个实体识别场景示意图;
图3是本申请实施例中一个意图和槽位关系示意图;
图4是本申请实施例中产生打点数据的一个场景示意图;
图5是本申请实施例中产生打点数据的另一个场景示意图;
图6是本申请实施例中打点数据序列的一个示例性示意图;
图7是本申请实施例中将打点数据序列划分为分序列的一个示例性示意图;
图8是本申请实施例中将打点数据序列划分为分序列的另一个示例性示意图;
图9是本申请实施例中使用多示例学习模型的一个示例性示意图;
图10是本申请实施例中打点数据的一个示例性示意图;
图11是本申请实施例提供的知识图谱的基本结构示意图;
图12是本申请实施例中节点设备侧中模型学习目标的形式化示意图;
图13是本申请实施例中一个电子设备的示例性结构示意图;
图14是本申请实施例中一个电子设备的示例性软件结构框图;
图15是本申请实施例中一个意图识别决策系统的示例性软件结构框图;
图16是本申请实施例中一个意图识别的场景示意图;
图17是本申请实施例提供的规则引擎中的一种规则拓扑图的示意图;
图18是图17所示的规则拓扑图中一种模式节点的结构示意图;
图19是图17所示的规则拓扑图中模式节点和结果节点的类型切换示意图;
图20是本申请实施例提供的规则引擎中的另一种规则拓扑图的示意图;
图21是本申请实施例提供的一种规则引擎的执行方法的流程示意图;
图22是本申请实施例提供的一种规则引擎的结构示意图;
图23是本申请实施例中多示例学习模型的训练方法中一个数据流向示意图;
图24是本申请实施例中多示例学习模型的训练方法中一个流程示意图;
图25是本申请实施例中确定示例和示例标签的一个示例性示意图;
图26是本申请实施例中确定包和包标签的一个示例性示意图;
图27是本申请实施例中提取包的特征向量矩阵的一个示例性示意图;
图28是本申请实施例中训练多示例学习模型的一个示例性示意图;
图29是本申请实施例中多示例学习模型将待处理序列划分为子序列的示例性示意图;
图30是本申请实施例中多示例学习模型迭代训练的一个示例性示意图;
图31是本申请实施例多示例学习模型迭代生成子序列的一个示例性示意图;
图32是本申请实施例中多示例学习模型的更新过程一个数据流向示意图;
图33是本申请实施例中多示例学习模型的更新过程一个流程示意图;
图34是本申请实施例中多示例学习模型的训练方法一个交互示意图;
图35是本申请实施例中多示例学习模型的更新训练过程一个交互示意图;
图36是本申请实施例提供的一种人工智能主体框架示意图;
图37是本申请实施例提供的一种应用环境示意图;
图38是本申请实施例提供的另一种应用环境示意图;
图39是本申请实施例提供的一种基于神经网络的数据处理方法的一个示意图;
图40是本申请实施例提供的一种基于神经网络的数据处理方法的另一个示意图;
图41a是本申请实施例提供的一种基于神经网络的数据处理方法的另一个示意图；
图41b是本申请实施例提供的一种基于神经网络的数据处理方法的另一个示意图；
图42是本申请实施例中联合学习系统的一种架构示意图;
图43是本申请实施例中一种模型训练方法的一个实施例的步骤流程示意图;
图44a是本申请实施例中群体粗粒度模型与粗粒度标签映射的示意图;
图44b是本申请实施例中群体粗粒度模型和细粒度模型的联合模型与细粒度标签映射的示意图;
图45是本申请实施例中端云协同更新群体粗粒度模型和个体粗粒度模型的示意图;
图46a是本申请实施例中个体粗粒度模型与粗粒度标签映射的示意图;
图46b是本申请实施例中群体粗粒度模型、个体粗粒度模型和细粒度模型的联合模型与细粒度标签映射的示意图;
图47是本申请实施例中意图识别方法的一个数据流向示意图;
图48是本申请实施例中意图识别方法的一个流程示意图;
图49是本申请实施例中多示例学习模型将输入序列划分为子序列的一个示例性示意图;
图50是本申请实施例提供的意图识别方法的流程示意图之一;
图51是本申请实施例提供的意图识别方法的流程示意图之二;
图52是本申请实施例提供的目标意图的内容的展示示意图之一;
图53是本申请实施例提供的目标意图的内容的展示示意图之二;
图54是本申请实施例提供的意图识别方法的流程示意图之三;
图55是本申请实施例提供的目标操作的示意图之一;
图56是本申请实施例提供的目标操作的示意图之二;
图57是本申请实施例提供的目标操作的示意图之三;
图58是本申请实施例提供的候选意图发生变化的场景示意图;
图59是本申请实施例中意图识别方法一个流程示意图;
图60是本申请实施例中一个多设备互联的分布式场景的示例示意图;
图61是本申请实施例中实体扩展的一个信息流示意图;
图62是本申请实施例中意图扩展的一个信息流示意图;
图63是本申请实施例中另一个电子设备的示例性结构示意图。
具体实施方式
本申请以下实施例中所使用的术语只是为了描述特定实施例的目的,而并非旨在作为对本申请的限制。如在本申请的说明书和所附权利要求书中所使用的那样,单数表达形式“一个”、“一种”、“所述”、“上述”、“该”和“这一”旨在也包括复数表达形式,除非其上下文中明确地有相反指示。还应当理解,本方案中使用的术语“和/或”是指并包含一个或多个所列出项目的任何或所有可能组合。
以下，术语“第一”、“第二”仅用于描述目的，而不能理解为指示或暗示相对重要性或者隐含指明所指示的技术特征的数量。由此，限定有“第一”、“第二”的特征可以明示或者隐含地包括一个或者更多个该特征，在本申请实施例的描述中，除非另有说明，“多个”的含义是两个或两个以上。
由于本申请实施例涉及意图识别与决策相关技术,为了便于理解,下面先对本申请实施例涉及的相关术语及相关概念进行介绍。
(1)单模态输入
单模态输入指仅采用单一输入方式的数据。例如仅采用传感器检测的数据,或仅采用用户文本输入的数据。
(2)多模态输入
多模态输入指对多种输入方式的数据均可予以采用。
例如,电子设备中一般有用户操作输入、环境感知输入、文本输入、语音输入、视觉输入等多种数据输入方式。
此外,多模态输入还可以包括从与该电子设备互联的其他智能设备中获取的数据输入。具体互联方式并不限定,可以为点对点的直接连接,如通过蓝牙连接,也可以为通过局域网方式连接,还可以为通过互联网方式连接等。例如,电子设备可以从与其互联的智能音响中获取用户的语音控制命令作为一种输入方式,可以从与其互联的智能音响中获取用户歌曲播放列表作为一种输入方式,可以从与其互联的电视中获取用户开关机时间记录和节目播放记录作为一个输入方式,可以从与其互联的灯中获取用户开关灯的时间记录作为一种输入方式,可以从与其互联的洗衣机中获取用户的洗衣时间与洗衣重量作为一种输入方式,可以从与其互联的空调中获取用户最常使用的温度作为一种输入方式,可以从与其互联的摄像头中获取识别出的人物信息作为一种输入方式等等,此处不作限定。
多模态输入即指可以采用这些不同种输入方式的数据。
可以理解的是,在有些情况下,多模态输入可以采用所有输入方式的数据,在有些情况下,多模态输入包括至少两种输入方式的数据,在有些情况下,多模态输入也可能只能获取到一种输入方式的数据,具体根据当前的输入环境以及需求确定,并非多模态输入就一定限定为必须采用两种以上输入方式的数据。
本申请实施例中之所以采用多模态输入,是因为实体学习框架(含实体识别、上下文)要求对环境的状态描述足够准确,但有些设备受限于硬件性能、可获取资源等客观因素,感知、描述环境的能力弱,比如精确度低、噪声大等,或者只能观察、描述某些特定环境,因而需要将这些设备获取到的信息综合起来,以提供完整的环境描述。
(3)上下文信息
上下文,在编程语言中,一般指与现在这个工作相关的周围环境。例如与当前操作相关的前一步状态和下一步状态。
在本申请实施例中,上下文信息一般指在电子设备中当前时刻的数据,以及在当前时刻之前一段时间窗格内电子设备中的数据。
(4)时间窗格
时间窗格指一段时间。
例如从此刻开始直到20秒后,这20秒就是一段时间窗格。
(5)实体
本申请实施例中,实体指现实世界中客观存在的并可以相互区分的对象、事物或动作。
简单的理解,实体,可以认为是某一个概念的实例。例如,“人名”是一种概念,或者说实体类型,那么“小明”就是一种“人名”实体了;“时间”是一种实体类型,那么“中秋节”就是一种“时间”实体了。
不同设备感知到的多模态输入可以映射为不同的实体。图2为一个实体识别场景示意图。如图2所示:拍摄的照片通过对象识别算法映射到不同的对象实体,如学生、帽子、外套等;用户历史打开过的应用通过应用市场分类可以映射到游戏、娱乐、视频、美食等实体;语音识别到的对话或者文字输入,可以映射为订机票、南京、上海等动作、地点类的实体。
具体地，令X_m表示第m个模态的输入，ε_m表示第m个模态对应的实体空间，Ψ_m表示第m个模态输入到实体空间的映射函数：Ψ_m：X_m→ε_m（某些场景下可以利用其它X_m作为增广）。Ψ_m可以通过收集标注数据，利用学习算法学习得到，也可以使用类似应用市场中对应用人工分类打标签这种人为的预置规则得到。在实体学习框架下，统一的特征空间为ε，Ψ即为输入X到统一特征空间ε的映射函数。
实体在电子设备中可以采用【实体标识(id),实体名,实体表示】的方式存储。其中,实体id用于唯一标识一个实体;实体名为该实体对应与现实世界中对象、事物或动作的名词,该实体名可以有也可以没有;实体表示由一些特征(embedding)向量组成,用于表示该实体的特征。应理解的是,实体表示也可以由其他形式的特征向量组成,比如文本形式,在此不做限定。
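按照上述【实体标识(id)，实体名，实体表示】的存储方式，一个实体可以用如下示意性结构表示（Python，字段名为本示例假定）：

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Entity:
    entity_id: str                        # 实体标识，用于唯一标识一个实体
    name: Optional[str] = None            # 实体名，可有可无
    embedding: List[float] = field(default_factory=list)  # 实体表示：特征(embedding)向量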
(6)实体识别
实体识别,就是将想要获取到的实体类型,从获取到的输入数据中识别出来的过程。
可以通过实体学习的方式进行实体识别,也可以通过预置规则进行实体识别,此处不作限定。
实现实体识别的方式有很多,对不同的输入类型,也可以采用不同的实体识别方式。例如,可以采用分词和深度条件随机场对文本输入数据进行实体识别;可以采用快速目标检测算法(FastRCNN)对视觉输入数据进行实体识别;可以提取profiling数据对用户操作进行实体识别;可以调用传感器应用程序编程接口(Application Programming Interface,API)对环境感知数据进行实体识别;可以采用命名实体识别(Named Entity Recognition,NER)对语音输入数据进行实体识别等,可以理解的是,对每种输入类型,都可以采用很多不同的机器学习技术进行实体识别,例如,逻辑回归等机器学习技术,此处不作限定。
(7)实体序列
实体序列指在一个时间段内识别出的实体的集合,其中至少包含一个实体。
例如,从此时开始触发实体识别,此次实体识别的时间窗格长度是30秒。在这30秒中识别到的实体为:进入车库,走近车辆,时间为早上8点,则此次实体识别的内容可形成实体序列【进入车库;走近车辆;时间为早上8点】。若在此前的一次实体识别的触发后形成的实体序列为【打开支付宝;进行支付;收到购物短信】,则它们可以组成更长的实体序列为【打开支付宝;进行支付;收到购物短信;进入车库;走近车辆;时间为早上8点】。
根据使用场景的需求,实体序列中的实体排列可以具有顺序特征,也可以不具有顺序特征:
在不具有顺序特征的实体序列中,如果实体序列中的实体相同,那么其中的实体可以任意交换存储位置而不影响将该实体序列确认为同一个实体序列。例如,在这种情况下,实体序列【进入车库;走近车辆;时间为早上8点】与实体序列【时间为早上8点;进入车库;走近车辆】可以认为是相同的实体序列。
在具有顺序特征的实体序列中,即使实体序列中的实体相同,如果其中实体的排序不同,也会被认为是不同的实体序列。例如,在这种情况下,实体序列【进入车库;走近车辆;时间为早上8点】与实体序列【时间为早上8点;进入车库;走近车辆】则可以被认为是不同的实体序列。
具有顺序特征的实体序列中，确定其中实体的顺序的方式有很多种。可以按照实体被识别出的时间顺序来进行排序，例如，若识别得到实体的顺序为进入车库、走近车辆、时间为早上8点，则可以组成按时间排序的实体序列【进入车库；走近车辆；时间为早上8点】。电子设备中也可以存储有实体优先级列表，按照该实体优先级列表中各实体的优先级，将识别出的实体按照优先级从高到低或从低到高的顺序排序，同一优先级的实体按预存的默认实体排序组成实体序列，例如，若实体优先级列表中时间实体为最高优先级，动作实体为第二优先级，识别得到实体的顺序为进入车库、走近车辆、时间为早上8点，则可以组成按优先级排序的实体序列【时间为早上8点；进入车库；走近车辆】。此外，还可以有很多其他确定实体顺序的方式，此处不作限定。
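以按实体优先级列表排序为例，下面给出一个示意性实现（Python，优先级取值与默认排序均为本示例假定，数值越小优先级越高）：

priority = {"时间": 0, "动作": 1}            # 假定的实体优先级列表
default_order = ["进入车库", "走近车辆"]      # 同一优先级内预存的默认实体排序

entities = [("进入车库", "动作"), ("走近车辆", "动作"), ("时间为早上8点", "时间")]
ordered = sorted(
    entities,
    key=lambda e: (priority[e[1]],
                   default_order.index(e[0]) if e[0] in default_order else 0))
# 结果：[("时间为早上8点", "时间"), ("进入车库", "动作"), ("走近车辆", "动作")]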
(8)意图和槽位
8.1、意图和槽位的定义
意图,是指电子设备识别用户实际的或潜在的需求是什么。从根本来说,意图识别是一个分类器,将用户需求划分为某个类型;或者,意图识别是一个排序器,将用户的潜在可能需求集合按照可能性进行排序。
意图和槽位共同构成了“用户动作”,电子设备无法直接理解自然语言,因此意图识别的作用便是将自然语言或操作映射为机器能够理解的结构化语义表示。
意图识别,也被称为SUC(Spoken Utterance Classification),顾名思义,是将用户输入的自然语言会话进行类别(classification)划分,划分的类别对应的就是用户意图。例如“今天天气如何”,其意图为“询问天气”。自然地,可以将意图识别看作一个典型的分类问题。示例性的,意图的分类和定义可参考ISO-24617-2标准,其中共有56种详细的定义。意图的定义与系统自身的定位和所具有的知识库有很大关系,即意图的定义具有非常强的领域相关性。可以理解的是,本申请实施例中,意图的分类和定义不局限于ISO-24617-2标准。
槽位,即意图所带的参数。一个意图可能对应若干个槽位,例如询问公交车路线时,需要给出出发地、目的地、时间等必要参数。以上参数即“询问公交车路线”这一意图对应的槽位。
例如,语义槽位填充任务的主要目标是在已知特定领域或特定意图的语义框架(semantic frame)的前提下,从输入语句中抽取该语义框架中预先定义好的语义槽的值。语义槽位填充任务可以转化为序列标注任务,即运用经典的IOB标记法,标记某一个词是某一语义槽的开始(begin)、延续(inside),或是非语义槽(outside)。
要使一个系统能正常工作,首先要设计意图和槽位。意图和槽位能够让系统知道该执行哪项特定任务,并且给出执行该任务时需要的参数类型。
以一个具体的“询问天气”的需求为例,介绍面向任务的对话系统中对意图和槽位的设计:
用户输入示例:“今天上海天气怎么样”;
用户意图定义:询问天气,Ask_Weather;
槽位定义:槽位一:时间,Date;槽位二:地点,Location。
图3为本申请实施例中一个意图和槽位关系示意图。如图3中(a)所示,在该示例中,针对“询问天气”任务定义了两个必要的槽位,它们分别是“时间”和“地点”。对于一个单一的任务,上述定义便可解决任务需求。但在真实的业务环境下,一个系统往往需要能够同时处理若干个任务,例如气象台除了能够回答“询问天气”的问题,也应该能够回答“询问温度”的问题。
对于同一系统处理多种任务的复杂情况,一种优化的策略是定义更上层的领域,如将“询问天气”意图和“询问温度”意图均归属于“天气”领域。在这种情况下,可以简单地将领域理解为意图的集合。定义领域并先进行领域识别的优点是可以约束领域知识范围,减少后续意图识别和槽位填充的搜索空间。此外,对于每一个领域进行更深入的理解,利用好任务及领域相关的特定知识和特征,往往能够显著地提升自然语言理解(Natural Language Understanding,NLU)的效果。据此,对图3中(a)的示例进行改进,加入“天气”领域:
用户输入示例:
1.“今天上海天气怎么样”;
2.“上海现在气温多少度”;
领域定义:天气,Weather;
用户意图定义:
1.询问天气,Ask_Weather;
2.询问温度,Ask_Temperature;
槽位定义:
槽位一:时间,Date;
槽位二:地点,Location。
改进后的“询问天气”的需求对应的意图和槽位如图3中(b)所示。
8.2、意图识别和槽位填充
做好意图和槽位的定义后,可以从用户输入中识别用户意图和相应槽对应的槽值。
意图识别的目标是从输入中识别用户意图,单一任务可以简单地建模为一个二分类问题,如“询问天气”意图,在意图识别时可以被建模为“是询问天气”或者“不是询问天气”二分类问题。当涉及需要系统处理多种任务时,系统需要能够判别各个意图,在这种情况下,二分类问题就转化成了多分类问题。
槽位填充的任务是从数据中提取信息并填充到事先定义好的槽位中，例如在图3中已经定义好了意图和相应的槽位，对于用户输入“今天上海天气怎么样”系统应当能够提取出“今天”和“上海”并分别将其填充到“时间”和“地点”槽位。基于特征提取的传统机器学习模型已经在槽位填充任务上得到了广泛应用。近年来，随着深度学习技术在自然语言处理领域的发展，基于深度学习的方法也逐渐被应用于槽位填充任务。相比于传统的机器学习方法，深度学习模型能够自动学习输入数据的隐含特征。例如，将可以利用更多上下文特征的最大熵马尔可夫模型引入槽位填充的过程中，类似地，也有研究将条件随机场模型引入槽位填充。
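以“今天上海天气怎么样”为例，采用IOB标记法的槽位填充结果可示意如下（Python，分词粒度与标签名为本示例假定）：

tokens = ["今天", "上海", "天气", "怎么样"]
labels = ["B-Date", "B-Location", "O", "O"]  # "今天"为时间槽的开始，"上海"为地点槽的开始
slots = {}
for tok, lab in zip(tokens, labels):
    if lab.startswith("B-"):
        slots[lab[2:]] = tok        # 抽取槽值并填充到对应槽位
    elif lab.startswith("I-"):
        slots[lab[2:]] += tok       # 槽值的延续部分拼接到已有槽值之后
# slots == {"Date": "今天", "Location": "上海"}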
(9)动作序列
一个动作序列可以包含至少一个待执行动作。
在一些实施例中,一个待执行动作为本设备需要执行的一个动作或服务。
在一些实施例中,一个待执行动作中可以至少包含设备标识与动作/服务标识。
示例性的,一个待执行动作的表现形式可以为【序号、设备标识、动作/服务】,其中序号可以表示该待执行动作的编号,也可以表示该待执行动作在该动作序列中的排序,设备标识表示需要由哪个设备来执行这个待执行动作,动作/服务表示这个待执行动作具体是要执行什么样的动作或服务。
一个动作序列中可以仅包含有一个待执行动作,也可以包含有多个待执行动作,这些待执行动作中的设备标识可以为确定该动作序列的电子设备自己,也可以为其他电子设备,此处不作限定。
本申请实施例中,动作序列中的待执行动作大部分为预加载类动作/服务,例如后台预加载应用程序等,在实际应用中,也可以为直接执行的动作/服务,例如连接蓝牙等,此处不做限定。
下面举例对本申请实施例中可能采用的不同的动作序列进行描述:
1、若手机A确定了一个动作序列,该动作序列中仅包含一个待执行动作,这一个待执行动作中的设备标识为手机A自己:
该待执行动作的表现形式中可以有序号,例如【1、手机A、打开蓝牙】,也可以没有序号,例如【手机A、打开蓝牙】。由于确定的动作序列中只有一个待执行动作,且该待执行动作中的设备标识对应于手机A自己,因此手机A直接执行该待执行动作,打开蓝牙。
2、若手机A确定了一个动作序列,该动作序列中包含多个待执行动作,这多个待执行动作中的设备标识均为手机A自己:
2.1、这多个待执行动作的表现形式中没有序号,或有序号,但该序号仅为待执行动作的编号,并不设定为待执行动作的执行顺序:
例如2个待执行动作为【手机A、打开蓝牙】【手机A、打开WIFI】，或【1、手机A、打开蓝牙】【2、手机A、打开WIFI】。由于确定的动作序列中的2个待执行动作中设备标识均为手机A自己，手机A执行这两个待执行动作，打开蓝牙并打开WIFI，并不完全限定这两个待执行动作的执行顺序。
2.2、这多个待执行动作的表现形式中有序号,且序号设定为待执行动作的执行顺序:
例如2个待执行动作为【1、手机A、打开蓝牙】【2、手机A、打开WIFI】,由于确定的动作序列中的2个待执行动作中设备标识均为手机A自己,且具有标识执行顺序的编号,因此,手机A先打开蓝牙,再打开WIFI。
3、若手机A确定了一个动作序列,该动作序列中包含多个待执行动作,这多个待执行动作中的设备标识均为智能设备B:
3.1、这多个待执行动作的表现形式中没有序号,或有序号,但该序号仅为待执行动作的编号,并不设定为待执行动作的执行顺序:
例如2个待执行动作为【智能设备B、切换低温模式】【智能设备B、除湿】，或【1、智能设备B、切换低温模式】【2、智能设备B、除湿】。由于确定的动作序列中的2个待执行动作中设备标识均为智能设备B，手机A可以发送两条指令到智能设备B，也可以仅发送一条指令到智能设备B，指示智能设备B切换低温模式、除湿，且不限定其执行动作的顺序。
3.2、这多个待执行动作的表现形式中有序号,且序号设定为待执行动作的执行顺序:
例如2个待执行动作为【1、智能设备B、唤醒】【2、智能设备B、除湿】，由于确定的动作序列中的2个待执行动作中设备标识均为智能设备B，且具有标识执行顺序的编号，因此，手机A可以发送两条指令到智能设备B，也可以仅发送一条指令到智能设备B，接收到指令后，智能设备B按照序号的顺序，先唤醒，然后再除湿。
4、若手机A确定了一个动作序列,该动作序列中包含多个待执行动作,这多个待执行动作中的设备标识为多个设备,这多个设备中有手机A自己:
4.1、这多个待执行动作的表现形式中没有序号,或有序号,但该序号仅为待执行动作的编号,并不设定为待执行动作的执行顺序:
例如，3个待执行动作为【智能设备B、切换低温模式】【手机A、打开蓝牙】【智能设备C、切换为护眼模式】，或【1、智能设备B、切换低温模式】【2、手机A、打开蓝牙】【3、智能设备C、切换为护眼模式】。手机A根据这三个待执行动作中设备标识对应的设备，发送指令给智能设备B，智能设备B切换低温模式，自己执行打开蓝牙操作，发送指令给智能设备C，智能设备C切换为护眼模式，且这三个动作的执行并不限制执行顺序。
4.2、这多个待执行动作的表现形式中有序号,且序号设定为待执行动作的执行顺序:
例如,3个待执行动作为【1、智能设备B、切换低温模式】【2、手机A、打开蓝牙】【3、智能设备C、切换为护眼模式】。手机A根据这三个待执行动作中设备标识对应的设备以及表示执行顺序的序号,先发送指令给智能设备B,智能设备B切换低温模式,然后自己执行打开蓝牙操作,最后发送指令给智能设备C,智能设备C切换为护眼模式。
5、若手机A确定了一个动作序列,该动作序列中包含多个待执行动作,这多个待执行动作中的设备标识为多个设备,这多个设备中没有手机A自己:
5.1、这多个待执行动作的表现形式中没有序号,或有序号,但该序号仅为待执行动作的编号,并不设定为待执行动作的执行顺序:
例如，3个待执行动作为【智能设备B、切换低温模式】【智能设备B、换气】【智能设备C、切换为护眼模式】，或【1、智能设备B、切换低温模式】【2、智能设备B、换气】【3、智能设备C、切换为护眼模式】。手机A根据这三个待执行动作中设备标识对应的设备，可以发送一个或两个指令给智能设备B，智能设备B切换低温模式并换气，发送指令给智能设备C，智能设备C切换为护眼模式，且这三个动作的执行并不限制执行顺序。
5.2、这多个待执行动作的表现形式中有序号,且序号设定为待执行动作的执行顺序:
例如,3个待执行动作为【1、智能设备B、切换低温模式】【2、智能设备B、换气】【3、智能设备C、切换为护眼模式】。手机A根据这三个待执行动作中设备标识对应的设备以及表示执行顺序的序号,先发送一个或两个指令给智能设备B,智能设备B先切换低温模式,然后再换气,最后发送指令给智能设备C,智能设备C切换为护眼模式。
根据实际情况需求,本申请实施例中的待执行动作可以为以上任一种情况,此处不作限定。
(10)实体序列、意图与动作序列的关系
1、根据实体序列与意图的对应关系,一个实体序列可以对应一个意图,也可以对应多个意图。
即同一个实体序列既可以对应一个意图，也可以对应多个意图。其中，当多个意图之间存在层次关系或者关联关系时，一个实体序列可以对应多个意图。例如，玩游戏和娱乐这两种意图存在层次关系，当一个实体序列对应的意图为玩游戏时，该实体序列对应的意图也为娱乐。然而两个不同的实体序列，其可能对应两个不同的意图，也有可能对应一个相同的意图，此处不作限定。
例如,一个实体序列【播放,机器猫,第四集,打开电视】对应的意图可以为:“播放视频”,对应的槽位可以为:“设备,电视”、“内容,机器猫”、“选集,四”;另一个不同的实体序列【上午8点,开灯】对应的意图可以为:“提高环境亮度”,对应的槽位可以为:“时间,上午8点”、“设备,灯”,两个不同的实体序列对应了两个不同的意图和槽位。
再如,一个实体序列【播放,机器猫,第四集,打开电视】对应的意图可以为:“播放视频”,对应的槽位可以为:“设备,电视”、“内容,机器猫”、“选集,四”;而另一个不同的实体序列【播放,机器猫,第四集,打开投影仪】对应的意图也可以为:“播放视频”,对应的槽位可以为:“设备,投影仪”、“内容,机器猫”、“选集,四”,两个不同的实体序列可以对应一个相同的意图。
2、根据实体序列、意图与动作序列的对应关系，一组实体序列和意图对应一个动作序列。
例如,一组实体序列【播放,机器猫,第四集,打开电视】和意图播放视频,对应的动作序列可以为【1、电视、播放器预加载机器猫第四集】,另一组实体序列【上午8点,开灯】和意图提高环境亮度,对应的动作序列可以为【1、智能窗帘、打开窗帘】。每组实体序列和意图可以对应一个动作序列。
可以理解的是,对应的动作序列中也可以有多个待执行动作,为便于描述,此处仅以动作序列中有一个待执行动作和待执行动作的一种表现形式为例。对动作序列的具体描述可参阅术语动作序列部分的描述,此处不再赘述。
(11)打点数据:
本申请实施例中,打点数据为电子设备在本地记录的用户日常的操作数据和/或电子设备对用户操作的响应数据。在一个例子中,打点数据可以是在电子设备执行确定出的待执行动作后所记录的用户的操作数据和/或对用户操作的响应数据。示例性的,当待执行动作为打开应用A时,电子设备可以打开应用A;若用户未使用该应用A,而是将应用A关闭,则记录用户关闭应用A的操作;若用户使用该应用A,则记录用户使用该应用A的操作。在一个例子中,打点数据的输入方式也可以为多模态输入。
当用户在电子设备中做一些操作时,如:输入内容、点击按钮、进入某页面、打开某弹框、打开某应用程序等,电子设备会通过预设打点接口,记录用户做的操作以及电子设备基于该操作的响应动作。电子设备记录的这些用户操作及电子设备的响应动作,即为一条条的打点数据。
如图4为本申请实施例中产生打点数据的一个场景示意图。示例性的,用户在使用语音助手打开视频应用程序A(例如应用程序华为视频等)时,其过程可以为:
如图4中的(a)所示,步骤1、用户唤醒语音助手,向语音助手表述打开视频应用程序A;
如图4中的(b)所示,步骤2、语音助手根据用户的表述打开视频应用程序A。
在这个过程中,就可以产生至少两条打点数据:
打点数据1:语音助手产生的接收到用户表述要打开视频应用程序A的打点数据;
打点数据2:电子设备打开视频应用程序A的打点数据。
若用户此时又想使用应用程序音乐,则其过程可以为:
如图4中的(c)所示,步骤1、用户操作电子设备回到主界面;
如图4中的(d)所示,步骤2、响应用户点击,打开应用程序音乐。
在这个过程中,可以又产生至少两条打点数据:
打点数据3:返回主界面;
打点数据4:电子设备打开应用程序音乐。
可以理解的是,电子设备保存打点数据的格式可以有很多,可以以数据交换格式的方式来保存打点数据,例如使用JS对象简谱(javascript object notation,JSON)等,也可以以表格、数据库等方式保存打点数据,还可以以其他方式保存打点数据,此处不作限定。
电子设备还可以为各打点数据加上标签,来表明各打点数据的产生方式和作用等。例如,可以标注出打点数据的编号、产生时间、来源的应用程序、意图等等,此处不作限定。且由于应用程序不同或运行环境不同等因素,各条打点数据加上的标签经常会不完整。
除了上述图4中示例性的用户使用语音助手或直接打开应用程序时会产生打点数据外,用户在电子设备上进行其他操作时也可以产生打点数据:
如图5所示,为本申请实施例中产生打点数据的另一个场景示意图。示例性的,若用户想要搜索某个内容,其过程可以为:
如图5中的(a)所示,步骤1:用户打开浏览器;
如图5中的(b)所示,步骤2:用户在浏览器出现的默认搜索引擎中搜索关键词1;
如图5中的(c)所示,步骤3:用户从多个搜索结果中选择想要的搜索结果3;
如图5中的(d)所示,步骤4:用户查看搜索结果3的内容。
在这个过程中,电子设备可以产生如下打点数据:
打点数据5:电子设备打开浏览器;
打点数据6:默认搜索引擎中接收到关键词1;
打点数据7:关键词1搜索到的多个搜索结果中被确定的是搜索结果3;
打点数据8:电子设备显示搜索结果3的内容。
可以理解的是,电子设备中还可以有很多其他操作可以产生打点数据的场景,此处不作限定。
(12)打点数据序列:
电子设备中保存的连续多条打点数据形成了打点数据序列。
示例性的,图4所示场景中即产生了【打点数据1】【打点数据2】【打点数据3】【打点数据4】这样的打点数据序列。
示例性的，如果在图4所示场景中的用户操作之后，又连续进行了图5所示场景中的用户操作，则图4所示场景中产生的打点数据可以与图5所示场景中产生的打点数据连续保存，产生【打点数据1】【打点数据2】【打点数据3】【打点数据4】【打点数据5】【打点数据6】【打点数据7】【打点数据8】这样的打点数据序列。
可以理解的是,打点数据序列可以使用列表、数组、矩阵等形式表示,此处不作限定。
一般的,用户的连续操作产生的打点数据序列往往对应相同的意图。例如图4中的(a)和(b),表示用户的意图为打开视频应用程序A。图4中的(c)和(d)表示用户的意图为打开应用程序音乐。图5中的(a)、(b)(c)、(d),表示用户的意图为得到搜索结果3的内容。
但由于当前产生的打点数据的标签经常不够准确和完整,如果用户在短时间内连续操作电子设备,产生的打点数据序列中可能包含有多个意图。则很难用现有模型或规则预测哪些连续打点数据对应哪个意图。而采用本申请实施例中的方法,可以更准确的识别出打点数据序列中的各意图。
用户的连续操作具体可以理解为:用户进行了多次操作且多次操作之间的时间间隔小于第一预设时间间隔。例如,用户可以在进行了图4中的(a)操作后的2秒内,又进行了图4中的(c)操作;在进行图4中的(c)操作的2秒内又进行了图5中的(a)操作。这样,用户进行的图4中的(a)操作、图4中的(c)和图5中的(a)操作就可以被称为用户的连续操作。
可以理解的是,本申请实施例中并不限定打点数据序列是由用户的连续操作产生的,用户的连续操作产生的打点数据可以组成打点数据序列,用户的非连续操作产生的打点数据也可以组成打点数据序列。只是用户的连续操作产生的打点数据组成的打点数据序列按照常规方法较难用现有模型或规则预测出其中的哪些连续打点数据对应哪个意图。
示例性的,图6为本申请实施例中打点数据序列的一个示例性示意图。以日常使用电子设备为例,用户用到的最多的操作就是打开某个应用和返回主界面,有时候也会用到语音助手执行一些动作。图6是从真实场景中获取的部分用户操作电子设备的打点数据。为便于查看,将语音助手的打点数据标记为V,将电子设备执行操作的打点数据标记为A,将电子设备返回桌面的打点数据标记为L。则按产生的打点数据的顺序即可以得到图6所示的打点数据序列【V,唤醒语音助手-执行导航】【A,语音助手拉起导航应用】【L,返回桌面】【A,用户主动打开地图导航应用】【V,唤醒语音助手-执行打开视频应用程序A】【L,返回桌面】【A,打开视频应用程序A】【L,返回桌面】【A,打开录音机】【L,返回桌面】【A,打开天气】【L,返回桌面】【…,…】。
可以理解的是,图6为展示打点数据序列与打点数据之间关系的一个示例性示意图,并不表示其为实际应用中打点数据和打点数据序列的存储和显示方式。在实际应用中,打点数据和打点数据序列可以采用表格、数组、矩阵、数据库等等方式进行存储和显示,此处不作限定。
(13)第一预设规则、第二预设规则和分序列:
本申请实施例中,第二预设规则用于根据各序列中的打点数据确定各序列的意图。第一预设规则用于将打点数据序列划分为不同的分序列,且一个分序列根据该第二预设规则至少可以确定一个明确的意图。
本申请实施例中，该第一预设规则也可以被称为预设拆分规则，该第二预设规则也可以被称为预设意图规则。
在一些实施例中,该第一预设规则和第二预设规则可以合并为一个规则或规则集合,也可以是两个分别运行的规则或规则集合,此处不作限定。
该第一预设规则和第二预设规则可以出厂预设,也可以从服务器中下载或更新,此处不作限定。
如图7所示,为本申请实施例中将打点数据序列划分为分序列的一个示例性示意图。示例性的,若第一预设规则为:将用户每次从亮屏到息屏一系列连续操作产生的打点数据划分为一个分序列。若第二预设规则为:用户息屏前关闭的最后一个使用的应用为用户的意图。
若打点数据序列A1中:序列B1段的打点数据为在一次亮屏后到息屏间的一系列连续操作产生的;序列B2段的打点数据为在另一次亮屏后到息屏间的一系列连续操作产生的;序列B3段的打点数据为在另一次亮屏后到息屏间的一系列连续操作产生的。
则根据该第一预设规则,电子设备可以将该打点数据序列A1划分成3个分序列:分序列B1,分序列B2与分序列B3。
且根据第二预设规则,电子设备可以确定,每个分序列根据第二预设规则至少可以确定一个明确的意图。分序列B1的意图为息屏前关闭的最后一个使用的应用:打开视频应用程序A。分序列B2的意图为息屏前关闭的最后一个使用的应用:打开录音机。分序列B3的意图为息屏前关闭的最后一个使用的应用:打开天气。
如图8所示,为本申请实施例中将打点数据序列划分为分序列的另一个示例性示意图。示例性的,若第一预设规则为:将产生相邻两条打点数据的时间间隔小于预设打点时间间隔的打点数据划分为一个分序列。若第二预设规则为:各分序列中打开的最后一个应用是用户的意图。
若打点数据序列A2中:产生序列C1段的各相邻打点数据的时间间隔小于预设打点时间间隔;产生序列C2段的各相邻打点数据的时间间隔小于预设打点时间间隔;产生序列C3段的各相邻打点数据的时间间隔小于预设打点时间间隔;产生序列C1段最后的一个打点数据与产生序列C2段第一个打点数据的时间间隔不小于预设打点时间间隔;产生序列C2段最后的一个打点数据与产生序列C3段第一个打点数据的时间间隔不小于预设打点时间间隔。
则根据该第一预设规则,电子设备可以将该打点数据序列A2划分成3个分序列:分序列C1,分序列C2与分序列C3。
且根据第二预设规则,电子设备可以确定,每个分序列根据第二预设规则至少可以确定一个明确的意图。分序列C1的意图为分序列中最后一个打开的应用:打开地图导航。分序列C2的意图为分序列中最后一个打开的应用:打开录音机。分序列C3的意图为分序列中最后一个打开的应用:打开天气。
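图8所示按打点时间间隔拆分的第一预设规则可以用如下示意代码表达（Python，events为按时间排序的打点数据列表，ts字段与阈值取值均为本示例假定）：

def split_by_gap(events, max_gap=30.0):
    """相邻打点数据的时间间隔不小于 max_gap 秒时，切分出新的分序列（仅为示意）"""
    sub_seqs, current = [], []
    for e in events:
        if current and e["ts"] - current[-1]["ts"] >= max_gap:
            sub_seqs.append(current)  # 间隔达到预设打点时间间隔，结束当前分序列
            current = []
        current.append(e)
    if current:
        sub_seqs.append(current)
    return sub_seqs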
可以理解的是,图7和图8是本申请实施例中按照第一预设规则和第二预设规则将打点数据划分为分序列的两个示例性示意图,在实际应用中,还可以有很多其他的第一预设规则和第二预设规则的设定,从而达到第一预设规则用于将打点数据序列划分为不同的分序列,且一个分序列根据该第二预设规则至少可以确定一个明确的意图的效果,此处不作限定。
需要说明的是,第二预设规则只用于确定出序列的意图,第二预设规则确定的序列的意图是该序列的多个意图中的一个,还是该序列的唯一的意图,此处不作限定。
在一些实施例中,第二预设规则可以为根据深度学习模型从序列中提取打点数据的意图信息和槽位信息,从而确定出该序列的意图,此处不作限定。
(14)多示例学习模型、示例和示例标签(Label)、包和包标签
本申请实施例中,多示例学习模型用于根据各待处理序列中连续的打点数据属于同一意图的可能性,将各待处理序列中可能不属于同一个意图的连续的打点数据划分到不同的粒度更小的子序列中,得到多个子序列。
该待处理序列可以为使用该第一预设规则将打点数据序列划分成的分序列,也可以为使用该多示例学习模型将该分序列划分成的更小粒度的子序列。本申请实施例中,待处理序列也可以理解为输入多示例学习模型的打点数据序列。
本申请实施例中使用的多示例学习模型可以为任一种多示例学习模型,例如ORLR模型,Citation-kNN模型,MI-SVM模型,C4.5-MI模型,BP-MIP模型,Ensemble Learning-MIP模型等,此处不作限定。
多示例学习(multi-instance learning,MIL)最初是用在制药领域中药物分子形状与药物活性的分类问题中。多示例学习以包(bag)为训练单元,包为示例(Instance,或Pair)的集合。
示例和示例标签:
本申请实施例中，相邻的两条打点数据可以组成一个示例。每个示例可以具有标签，示例标签包括正(Positive)和负(Negative)。可以将示例标签为正的示例称为正示例，将示例标签为负的示例称为负示例。
可以使用不同的数值分别表示示例标签的正或负。例如可以使用示例标签为0表示该示例为正示例,使用示例标签为1表示该示例为负示例;也可以使用示例标签为1表示该示例为正示例,使用示例标签为0表示该示例为负示例;还可以使用其他的数值作为示例标签来分别表示示例为正示例还是负示例,此处不作限定。
本申请实施例中，位于同一个待处理序列中的两条相邻的打点数据组成的示例为正示例，位于不同待处理序列中的两条相邻的打点数据组成的示例为负示例。两条相邻的打点数据可以指这两条打点数据的开始时间相邻。
本申请实施例中,示例是为了确定连续打点数据是否对应相同的意图。一般的,此时可以认为在同一个待处理序列中相邻的两个打点数据对应相同的意图,所以将其组成的示例标记为正示例,表示这两个打点数据连续。此时可以认为在不同待处理序列中的打点数据对应不同的意图,所以将其组成的示例标记为负示例,表示这两个打点数据不连续。
包和包标签:
本申请实施例的多示例学习模型中,训练集由一组包(bag)组成,每个包具有包标签,包标签包括正和负。可以将包标签为正的包称为正包,将包标签为负的包称为负包。
可以理解的是,可以使用不同的数值作为包标签,分别表示该包为正包还是负包,此处不作限定。
每个包含有若干个示例。如果包中至少含有一个正示例,则该包为正包。如果包中所有示例都是负示例,则该包为负包。
多示例学习模型可以用包内的示例的特征和包标签训练模型,最后用训练的模型预测未知示例的示例标签。
本申请实施例中，位于同一个待处理序列中的打点数据组成的示例可以共同作为一个正包，该正包中含有至少一个正示例。位于一个待处理序列中的最后一个打点数据和与该待处理序列连续的下一个待处理序列中的第一个打点数据组成的示例可以作为一个负包，该负包中的示例均为负示例。
下面举例说明确定示例、示例标签、包和包标签的过程:
示例性的,若打点数据序列【A】【B】【C】【D】【E】根据第一预设规则划分为了分序列1:【A】【B】【C】和分序列2:【D】【E】,作为两个连续的待处理序列。
确定示例和示例标签:
该打点数据序列中相邻的两条打点数据组成一个示例,即可以得到4个示例:示例【A、B】、示例【B、C】、示例【C、D】和示例【D、E】。
由于示例【A、B】、示例【B、C】是由位于同一个待处理序列(分序列1)中的相邻两条打点数据组成的示例,因此,示例【A、B】和示例【B、C】都是正示例;
由于示例【C、D】是由不同待处理序列(分序列1和分序列2)中的两条相邻打点数据组成的示例,因此示例【C、D】是负示例;
由于示例【D、E】是由位于同一个待处理序列(分序列2)中的相邻两条打点数据组成的示例,因此,示例【D、E】是正示例;
则得到了:
正示例【A、B】，正示例【B、C】，负示例【C、D】和正示例【D、E】。
确定包和包标签:
位于同一个分序列1中的打点数据【A】【B】【C】组成的示例“示例【A、B】、示例【B、C】”作为一个正包;
位于分序列1中的最后一个打点数据【C】和与该分序列1连续的分序列2中的第一个打点数据【D】组成的示例“示例【C、D】”作为一个负包;
位于同一个分序列2中的打点数据【D】【E】组成的示例“示例【D、E】”作为一个正包；
则形成了:
正包“示例【A、B】、示例【B、C】”，负包“示例【C、D】”，正包“示例【D、E】”。
可以理解的是,若打点数据序列中有M个打点数据,则可以组成M-1个示例。若待处理序列的数目为N,则可以得到2N-1个包。M和N均为正整数。
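由待处理序列构造示例与包的过程可以用如下示意代码实现（Python），并可验证M条打点数据组成M-1个示例、N个待处理序列得到2N-1个包：

def build_bags(sub_sequences):
    """sub_sequences：按先后顺序排列的待处理序列列表，每个序列为打点数据列表（仅为示意）"""
    bags = []
    for i, seq in enumerate(sub_sequences):
        # 正包：同一待处理序列内相邻两条打点数据组成的全部示例
        positive = [(seq[j], seq[j + 1]) for j in range(len(seq) - 1)]
        bags.append(("正包", positive))
        # 负包：本序列最后一条打点数据与下一序列第一条打点数据组成的示例
        if i + 1 < len(sub_sequences):
            bags.append(("负包", [(seq[-1], sub_sequences[i + 1][0])]))
    return bags

bags = build_bags([["A", "B", "C"], ["D", "E"]])
# 得到2×2-1=3个包：正包[AB、BC]、负包[CD]、正包[DE]，示例总数为M-1=4个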
如图9所示,为本申请实施例中使用多示例学习模型将各待处理序列划分为更小粒度的序列的一个示例性示意图。
下面结合图9举例说明多示例学习模型将各待处理序列划分为更小粒度的序列的过程:
以根据第一预设规则划分得到的两个分序列作为待处理序列,以数字序号和打点数据标记:V、A或L表示各打点数据为例,得到的两个待处理序列为:
待处理序列I1:1V、2A、3L、4A、5V、6A、7L、8A、9L、10A、11L;
待处理序列I2:12V、13A、14L、15V、16A、17L、18V、19A、20L、21A。
经过上述确定示例和示例标签,包和包标签的过程,这两个待处理序列I1和I2可以产生3个包,分别为:
B1:正包,包括10个正示例:【1V、2A】【2A、3L】【3L、4A】【4A、5V】【5V、6A】【6A、7L】【7L、8A】【8A、9L】【9L、10A】【10A、11L】;
B2：负包，包括1个负示例：【11L、12V】；
B3:正包,包括9个正示例:【12V、13A】【13A、14L】【14L、15V】【15V、16A】【16A、17L】【17L、18V】【18V、19A】【19A、20L】【20L、21A】。
此时可以使用本申请实施例中的特征提取方法提取B1、B2、B3的每个包中每个示例的特征，得到每个示例的特征向量。若每个示例的特征向量的维度为J，若一个包中有K个示例，则从该包中提取的特征可以组成特征向量矩阵JxK。具体的提取示例的特征，并组成特征向量矩阵的过程可以参考术语描述中下述(16)打点数据序列包内示例的特征和包的特征向量矩阵中的内容，此处不作赘述。
得到B1、B2和B3每个包的特征向量矩阵后,可以将一个包作为一个训练单元,将一个包的特征向量矩阵和该包的包标签输入多示例学习模型中对该多示例学习模型进行训练。例如,先输入B1的特征向量矩阵和B1的包标签,再输入B2的特征向量矩阵和B2的包标签,再输入B3的特征向量矩阵和B3的包标签,以此类推。
在输入包的特征向量矩阵和包标签对多示例学习模型进行训练后,可以使用训练得到的多示例模型将待处理序列I1和I2划分为更小粒度的子序列。
由于多示例学习模型训练时使用了包的特征向量矩阵和包的标签,训练完成的模型可以直接预测示例的示例标签,因此,用待处理序列直接输入到多示例学习模型中就可以重新预测待处理序列中每个示例的示例标签,根据示例标签可以将待处理序列划分为更小粒度的序列,每个序列都对应一个独立的意图。
如图9所示,待处理序列I1和I2输入训练完的多示例学习模型后被划分成了更小粒度的子序列:
子序列i1:1V、2A、3L、4A;
子序列i2:5V、6A、7L;
子序列i3:8A、9L;
子序列i4:10A、11L;
子序列i5:12V、13A、14L;
子序列i6:15V、16A、17L;
子序列i7:18V、19A、20L、21A。
此时还可以使用第二预设规则确定每个子序列的意图。
(15)损失函数和训练完成的多示例学习模型:
损失函数是衡量预测模型在预测预期结果方面表现好坏的指标。每种机器学习模型都有其对应的损失函数。模型的预测结果越好，则损失函数的值越小。
本申请实施例中，在使用已有的打点数据序列根据第一预设规则划分的分序列作为待处理序列，对多示例学习模型进行训练，并将该待处理序列划分为更小粒度的序列后，电子设备还可以继续将划分得到的更小粒度的序列作为待处理序列，迭代对多示例学习模型进行训练，从而将此时的待处理序列划分为更小粒度的序列。
在每次使用训练得到的多示例学习模型将待处理序列划分为更小粒度的序列后,电子设备可以得到该多示例学习模型的损失函数的值。当该损失函数的值不再减少,或减少的幅度小于预设减少阈值时,电子设备可以确定使用已有的打点数据序列不再对多示例模型的训练有较大的增益,电子设备可以将最后得到的多示例学习模型作为训练完成的多示例学习模型。
电子设备可以使用训练完成的多示例学习模型对新的打点数据序列进行序列划分。
(16)打点数据序列包内示例的特征和包的特征向量矩阵
本申请实施例中,示例由打点数据序列中相邻的两个打点数据组成。电子设备可以从示例的这两个打点数据中提取该示例的特征,组成该示例的特征向量。
一个示例的特征可以包含有多个维度。由于示例中包含相邻的两个打点数据,因此示例的特征与打点数据的特征密切相关。如图10所示,为本申请实施例中打点数据的一个示例性示意图。图10所示的示例中打点数据使用JSON结构体的格式保存,在实际应用中,打点数据还可以以其他方式保存,此处不作限定。图10中的(a)(b)(c)为打点数据序列中相邻的3个打点数据。图10中的(a)为一个语音助手打点数据V的示例;图10中的(b)为一个动作打点数据A的示例;图10中的(c)为一个返回桌面打点数据L的示例。
以下结合图10所示的打点数据的示例性示意图,以X为示例中的第一个打点数据,Y为示例中的第二个打点数据为例,对本申请实施例中的多个维度的示例的特征分不同的类型进行描述:
1、示例的文本特征;
有些用户操作产生的打点数据会包含很多内容(如语音助手的打点数据),有些用户操作产生的打点数据包含的内容则较少(如打开应用程序的打点数据),通过示例的文本特征可以反应示例中打点数据内容的多少。
具体的,示例的文本特征可以包括示例中打点数据中关键字的总数目,以及示例中打点数据字符串的总长度等。
可选的,若打点数据以JSON结构体的格式保存,则示例的文本特征可以包括:
a)示例中X和Y的JSON结构体的关键字的总个数;
b)示例中X和Y对应的JSON字符串的总长度。
可以理解的是,还可以从打点数据中提取其他的文本特征作为示例的文本特征,例如word2vec特征、分词特征等,此处不作限定。
示例性的,以图10中的(a)所示的语音助手打点数据V和图10中的(b)所示的动作打点数据A组成一个示例为例。如果示例中第一个打点数据的字符串很长,第二个打点数据的字符串很短,则这个示例对应的两条打点数据很可能是连续的,对应着相同的意图。在使用文本特征描述示例的特征时,打点数据X(语音助手打点数据V)的JSON结构体中有25个关键字(图10中以粗体表示),打点数据Y(动作打点数据A)的结构体中有19个关键字,则示例中关键字的总个数为25+19=44个。同理,示例中字符串的总长度=打点数据X的JSON字符串长度+打点数据Y的JSON字符串的长度。
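以JSON格式保存的打点数据为例，上述文本特征可以按如下方式提取（Python，其中递归统计关键字个数的方式为本示例假定）：

import json

def count_keys(obj):
    """递归统计JSON结构体中关键字的个数"""
    if isinstance(obj, dict):
        return len(obj) + sum(count_keys(v) for v in obj.values())
    if isinstance(obj, list):
        return sum(count_keys(v) for v in obj)
    return 0

def text_features(x, y):
    """x、y 为示例中的两条打点数据（dict形式），返回[关键字总个数, 字符串总长度]"""
    total_keys = count_keys(x) + count_keys(y)
    total_len = (len(json.dumps(x, ensure_ascii=False))
                 + len(json.dumps(y, ensure_ascii=False)))
    return [total_keys, total_len]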
2、示例的上下文特征;
相邻的两条打点数据中总有一些信息是相关联的,比如用户当前的操作是“打开通讯录”,下一次操作是“打电话”。如果打开通讯录时点击的联系人与打电话的联系人相同,则这两条相邻的打点数据很可能对应着相同的意图。类似的上下文特征还可以有很多。
具体的,示例的上下文特征可以包括:
a)示例中两条打点数据的应用程序包名的特征;
b)示例中两条打点数据的时间戳的差;
c)示例中两条打点数据间某些关键字的值是否相同。
例如若打点数据以JSON结构体的格式保存,则某些JSON关键字的值是否相同。比如打点数据X和打点数据Y的场景信息是否相同等。
可以理解的是，还可以从示例中打点数据间提取其他的上下文特征作为示例的上下文特征，此处不作限定。
示例性的，以图10中的(a)所示的语音助手打点数据V和图10中的(b)所示的动作打点数据A组成一个示例为例。打点数据X(语音助手打点数据V)的应用程序包名为"com.huawei.hivoice"表示语音助手打点。打点数据Y(动作打点数据A)的应用程序包名为"com.ali.pay"表示“打开某购物应用程序”。可以维护一个白名单，把应用程序包名映射成one-hot，或者用word2Vec的方法转换成特征向量。时间戳的差则是打点数据X中tm与打点数据Y中的tm的差值。此外，还可以对比打点数据X的场景(scenes)中包含的信息是否与打点数据Y的场景中的信息相同。
3、示例中各打点数据特有的特征;
一个示例由两条打点数据组成,上述示例的文本特征和示例的上下文特征都是示例中打点数据X和打点数据Y共同的特征,此外还可以提取打点数据X或打点数据Y特有的特征。
可选的,示例中各打点数据特有的特征,可以包括:
a)打点数据X或打点数据Y记录的操作的使用时间；
b)打点数据X或打点数据Y的使用时间是否小于预设使用时间阈值。
可以理解的是,还可以从示例中各打点数据中提取其他的特征作为示例中各打点数据特有的特征,此处不作限定。
4、示例中打点数据的统计特征。
除了可以考虑示例中打点数据本身的文本特征和内容特征外,还可以考虑打点数据的统计特征,即打点数据的统计信息的特征。统计信息能反映不同用户的差异,比如用户1日常使用某个应用的平均时间为t1,用户2日常使用相同应用的平均时间为t2,在t1内对用户1来说就是一个完整的意图,但对用户2来说可能并不是。
可选的,示例中各打点数据的统计特征,可以包括:
a)打点数据X或打点数据Y的使用时间是否大于平均使用时间;
b)打点数据X或打点数据Y输入打点数据序列的持续时间是否小于平均持续时间。
可以理解的是,还可以从示例中各打点数据中提取其他的统计信息的特征作为示例中打点数据的统计特征,此处不作限定。
可以理解的是,本申请实施例中并不限定还可以从示例的打点数据中提取其他类型的特征作为示例的特征,除上述举例外,各类型的特征中也还可以有其他不同的同类特征作为示例的特征,此处不作限定。
可以根据实际需求,确定J个特征来作为示例的特征。示例的一个不同的特征,可以作为示例特征的一个维度,示例的J个特征,即可以组成示例的J维特征向量。
若以x^(i)表示第i个示例的特征向量，x^(i)_1表示从该第i个示例中提取出的第一个特征，x^(i)_2表示从该第i个示例中提取出的第二个特征，依次类推，x^(i)_c表示从该第i个示例中提取出的第c个特征，直到从该第i个示例中提取出第J个特征，则第i个示例的特征向量x^(i)=(x^(i)_1，x^(i)_2，...，x^(i)_J)。
一个包中包含一个或多个示例,一个示例包含有一个多维度的特征向量。因此,一个包中示例的特征可以组成一个特征向量矩阵。若示例的特征向量为J维特征向量,包内含有K个示例,则该包的特征向量矩阵为一个J×K的特征向量矩阵。
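一个包的J×K特征向量矩阵可以按如下方式组装（Python，extract_features为假定的占位函数，返回单个示例的J维特征向量）：

import numpy as np

def bag_feature_matrix(bag_instances, extract_features):
    """bag_instances：包内的K个示例；返回该包的J×K特征向量矩阵（仅为示意）"""
    # 每个示例的J维特征向量作为矩阵的一列
    return np.stack([extract_features(inst) for inst in bag_instances], axis=1)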
(17)知识图谱
知识图谱是结构化的语义知识库,其基本组成单位是“实体、关系、实体”三元组,或者是“实体、属性、属性值”三元组。通常,属性值也可以理解为是一种常量实体。且知识图谱通常由通用知识和个人知识两部分构成。其中,通用知识可以包括:群体行为、心理学、社会学、行为学、用户标签、用户调研结果等。个人知识可以包括:用户行为的数据挖掘、人际网络、财产信息、兴趣、爱好、习惯等,个人知识是可以实时更新的。本申请实施例在此对通用知识或个人知识具体包括哪些内容不做具体限定。
知识图谱通常由节点和边组成,节点表示实体或属性值,边表示属性或关系。在知识图谱中,边将各个节点连接起来,形成网状结构。其中,每个节点对应一个唯一的身份标识(identity,ID),每条边对应一个唯一的身份标识。知识图谱可应用于知识推理、搜索、自然语言理解、电子商务、问答等相关场景,并能做精准精细化的回答。
示例性的,如图11所示,图11示出了知识图谱的基本结构。该知识图谱包括节点11、节点13和节点14,节点11和节点13通过边12连接,节点11和节点14通过边15连接。其中,节点11表示实体A,边12表示关系F,节点13表示实体B,节点14表示属性值C,边15表示属性J。节点11、边12和节点13形成“实体、关系、实体”的三元组,具体用于表示“实体A和实体B之间存在关系F”。节点11、节点14和边15形成“实体、属性、属性值”的三元组,具体用于表示“实体A的属性J的属性值为属性值C”。
本申请实施例中的实体可以为人名、物体名称、地名、职业等。属性可以为姓名、年龄、身高、体重、经度、纬度、品牌、油耗等。关系可以为父子、母子、配偶、地理区域所属关系、从属关系等。
例如,对于“用户A有一辆车”这一事实,“用户A”与“车”这两个实体可以分别为节点11和节点13,边12表明“用户A”对“车”的“拥有”关系。属性可以为年龄(边15),属性值可以为20岁(节点14),很容易知道,用户A的年龄为20岁。
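上述两类三元组可以用如下示意性结构存储（Python，节点与边的标识沿用图11中的编号，字段名为本示例假定）：

knowledge_graph = {
    "nodes": {
        "11": {"kind": "实体", "value": "用户A"},
        "13": {"kind": "实体", "value": "车"},
        "14": {"kind": "属性值", "value": "20岁"},
    },
    "edges": [
        ("12", "11", "13", "拥有"),  # "实体、关系、实体"三元组：用户A 拥有 车
        ("15", "11", "14", "年龄"),  # "实体、属性、属性值"三元组：用户A 的年龄为 20岁
    ],
}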
(18)多臂老虎机(multi-armed bandit,MAB)
在MAB问题中，对于有k个摇臂的老虎机，每个摇臂的回报率r_i未知且不全相同。玩家的目标是在有限次按下摇臂的机会下，获得最大的回报q。一种解决方案是：对每个摇臂都尝试足够多的次数，统计得到每个摇臂的平均回报，并利用每个摇臂的平均回报来估计每个摇臂真实的回报率r_i。之后选取回报率最大的摇臂执行剩余步骤。在上述过程中，用于探索(exploration)的次数越多，得到的每个摇臂的平均回报越准确，在得到每个摇臂的准确的平均回报后，回报率最大的摇臂利用(exploitation)的次数越多，最后得到的回报越高。显而易见，探索与利用的次数不可能同时都多，这会导致MAB问题中的利用与探索两难困境(exploitation-exploration dilemma，E&E)。
在本申请实施例中,意图识别领域会有MAB问题中的利用与探索两难困境。例如,电子设备识别用户的意图,并展示识别的意图的相关内容给用户,并期待用户的正反馈操作。每个意图可以视为一个摇臂,每次展示意图的相关内容可以视为按下摇臂,通过对每个意图都进行多次探索才能准确的评估每个意图的正确概率。
MAB问题及其衍生问题的解决方法是强化学习算法,例如,bandit算法。bandit算法可以分为“无上下文信息的bandit算法(context-free bandit)”和“使用上下文信息的bandit算法(contextual bandit)”两类。bandit算法能够对摇臂的探索与利用进行折中,同时兼顾探索过程与利用过程,使得不仅会展示回报率高(置信度大)的摇臂,还会展示置信度较低且探索次数较少的摇臂。
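对探索与利用进行折中的一种最简单的bandit算法是ε-greedy，下面给出一个无上下文信息场景下的示意实现（Python，仅为示意，并非本申请限定的算法）：

import random

def choose_intent(avg_reward, epsilon=0.1):
    """avg_reward：每个意图（摇臂）的平均回报列表"""
    if random.random() < epsilon:
        return random.randrange(len(avg_reward))  # 探索：随机展示一个意图的相关内容
    return max(range(len(avg_reward)), key=lambda i: avg_reward[i])  # 利用：展示平均回报最高的意图

def update_reward(avg_reward, counts, arm, reward):
    """根据用户的正/负反馈增量式更新所选意图的平均回报"""
    counts[arm] += 1
    avg_reward[arm] += (reward - avg_reward[arm]) / counts[arm]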
(19)特征空间,标记空间
所有特征向量存在的空间，每个具体的输入是一个实例，通常由特征向量表示。请参阅图12所示，令X∈R表示特征空间，令(X^(1),Y^(1))，(X^(2),Y^(2))，...，(X^(m),Y^(m))表示m个节点设备的私有数据集。其中，(X^(1),Y^(1))中X^(1)表示第1个节点设备的特征空间，Y^(1)表示第1个节点设备的标记空间；(X^(2),Y^(2))中X^(2)表示第2个节点设备的特征空间，Y^(2)表示第2个节点设备的标记空间；依此类推，(X^(i),Y^(i))中X^(i)表示第i个节点设备的特征空间，Y^(i)表示第i个节点设备的标记空间。
其中,该“特征空间”可以理解为输入数据的集合。“标记空间”可以理解为输出数据的集合。
x^(i)_j∈X^(i)表示X^(i)中的第j个示例，y^(i)_j∈Y^(i)表示x^(i)_j对应的标记向量，即第i个节点设备的输出数据集（标记空间）中的一个输出向量。(x^(i)_j，y^(i)_j)实际存在的一个组合就是第i个节点设备中的第j个样本数据。
(20)标签
本申请实施例中，标签可以是标记空间中的标记向量，或者，也可以理解为标记空间中的一个输出向量，如y^(i)_j。在一个例子中，标签可以是一个标记，也可以是多个标记组成的集合。
(21)粗粒度标签,细粒度标签
本申请实施例中,“粗粒度”和“细粒度”实际上是提供了两个层级。第一个层级是粗粒度标签,第二个层级是细粒度标签。可以理解的是,本方案中,在原本细粒度标签的基础上,增加了一个层级的标签,粗粒度标签为第一个层级的输出,而细粒度标签为在粗粒度标签下进一步细分的标签。例如,以应用(application,APP)来举例,粗粒度标签为“音乐”类应用、“视频”类应用。而细粒度标签为“酷狗音乐”,“QQ音乐”,“网易音乐”,“腾讯视频”,“爱奇艺视频”,“西瓜视频”等。在一个例子中,粗粒度标签可以理解为是隐去了动作的意图;细粒度标签可以理解为隐去了动作的服务,或者为待执行的动作等。也即是说,粗粒度标签与意图对应,细粒度标签与服务或者待执行的动作对应。例如,粗粒度标签为“音乐”类应用时,可以理解为此时用户的意图是打开音乐类应用;细粒度标签为“酷狗音乐”时,可以理解为此时需要执行的服务是打开酷狗音乐;此外,细粒度标签为“显示一张提示卡片”时,可以理解为此时需要待执行的动作为显示一张提示卡片。
通过一个场景例子对上述的词语进行举例说明。其中,本申请实施例中,节点设备可以为终端设备(或也称为用户设备)。其中,该终端设备可以表示任何计算设备。例如,该终端设备可以为智能手机、平板电脑、可穿戴设备(如眼镜、手表、耳机等)、个人计算机、计算机工作站、车载终端、无人驾驶中的终端、辅助驾驶中的终端、智能家居中的终端(如音箱,智慧屏,扫地机器人,空调等)等。例如,多个节点设备可以均可以以手机为例。本方案中,节点设备也可以简称为“端侧”。中控设备可以是云端服务器,或者,也可以是服务器,本方案中,该中控设备以云端服务器为例。该中控设备也可以简称为“云侧”。
在APP推荐这个应用场景中，APP推荐是指根据端侧用户对于APP的操作习惯，为用户推荐应用，从而提供预先加载应用的服务，提高应用的响应速度，以提升用户体验。例如，在这个应用场景中，节点设备的数量并不限定，为了方便说明，节点设备的数量以3个为例进行说明，3个节点设备分别为节点设备1、节点设备2和节点设备3。
节点设备1、节点设备2和节点设备3中每个节点设备中下载的应用不完全相同,三个节点设备中下载的应用如下表1所示:
表1
节点设备1 QQ音乐 网易音乐 腾讯视频 今日头条 淘宝 高德地图
节点设备2 酷狗音乐 咪咕音乐 爱奇艺 网易新闻 天猫 网易严选
节点设备3 酷我音乐 优酷视频 哔哩哔哩 淘宝 京东 百度地图
需要说明的是,上表1中对于三个节点设备中下载的应用仅是为了方便说明而举的例子,并不造成限定。
例如，在“节点设备1”中的第一个数据样本为：8:00打开QQ音乐。在这个数据样本中，(x^(1)_1,y^(1)_1)中x^(1)_1对应“8:00”，y^(1)_1对应“QQ音乐”。
在“节点设备2”中的第一个数据样本为：8:10打开酷狗音乐。在这个数据样本中，(x^(2)_1,y^(2)_1)中x^(2)_1对应“8:10”，y^(2)_1对应“酷狗音乐”。
在“节点设备3”中的第一个数据样本为：7:30打开百度地图。在这个数据样本中，(x^(3)_1,y^(3)_1)中x^(3)_1对应“7:30”，y^(3)_1对应“百度地图”。
需要说明的是，此处x^(i)_j仅是以时间进行举例说明，本方案中并不限定输入特征，例如输入特征还可以包括用户场景信息，用户状态信息等，如用户场景信息可以为用户在室内还是室外等，用户状态信息可以包括：用户是行走、坐或卧的状态，用户心情(可由心率等一些感知信息得到)等。
参阅上表1,由于每个节点设备中下载的应用不同,在端侧进行学习的过程中,每个端侧的标记向量(或称为“标签”)各不相同。在“节点设备1”中,标签可以包括:QQ音乐、网易音乐、腾讯视频等。而在“节点设备2”中,标签可以包括:酷狗音乐、咪咕音乐、爱奇艺、网易新闻等。在“节点设备3”中,标签可以包括:酷我音乐、优酷视频、哔哩哔哩、淘宝等。每个节点设备中的标记空间各不相同。此时要想对各端侧数据进行联合训练需要统一端侧任务,即要统一端侧的标记空间(或也可以称为“标签空间”)。
统一端侧的标记空间,一种实现方式可以是暴力的取所有端侧标签空间的并集,获得统一的端侧标签空间。请参阅上表1,可以取“节点设备1”、“节点设备2”和“节点设备3”中下载的所有应用的并集,然而随着节点设备数目增多,统一端侧标签空间,会使得标签的数量急剧增大。如在应用预测场景下,应用总数有数十万个,不同用户下载的应用不完全相同,随着用户增加,端侧标签空间的大小会逼近应用总数。标签数目巨大会使得模型训练开销增大,且APP预测模型效果也无法保证。同时每个节点设备下载的应用数量在几十到百来个之间,远小于应用总数,因此暴力的设置统一的端侧标签显然也是不合理的。
由此，本方案中，将原有的标签作为细粒度标签，引入了细粒度标签上一个层级的标签，通过上一个层级的标签来统一各端侧任务不统一的情形。例如，第一层级标签（也可以称为“粗粒度标签”），第二层级标签（也可以称为“细粒度标签”）等，通过粗粒度标签来统一各节点设备的标签空间（也称为标记空间），可以使得在各端侧细粒度任务不统一的情况下，实现各节点设备在粗粒度任务上的统一，多个节点设备也可以进行联合训练。在该APP预测模型训练的场景中，细粒度标签可以为QQ音乐、酷狗音乐、咪咕音乐、爱奇艺、网易新闻等各应用，通过对上述所有应用进行分类，从而将类别作为粗粒度标签。例如，粗粒度标签包括“音乐”标签，“视频”标签，“网购”标签和“地图”标签等。多个节点设备进行联合训练的方法请参阅下述实施例的说明。需要说明的是，本方案中并不限定应用场景，上述应用场景仅是示例性说明。
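细粒度标签到粗粒度标签的映射可以用一个简单的映射表实现，从而统一各端侧的标签空间（Python，应用归类仅为示意）：

coarse_of = {
    "QQ音乐": "音乐", "网易音乐": "音乐", "酷狗音乐": "音乐", "咪咕音乐": "音乐", "酷我音乐": "音乐",
    "腾讯视频": "视频", "爱奇艺": "视频", "优酷视频": "视频",
    "淘宝": "网购", "天猫": "网购", "京东": "网购",
    "高德地图": "地图", "百度地图": "地图",
}

def to_coarse(fine_samples):
    """将端侧的细粒度样本（输入特征，细粒度标签）映射为粗粒度样本"""
    return [(x, coarse_of[y]) for x, y in fine_samples]

# 例：("8:00", "QQ音乐")与("8:10", "酷狗音乐")被映射到统一的粗粒度标签"音乐"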
(22)群体粗粒度模型,细粒度模型
本申请实施例中，每个节点设备内装载“群体粗粒度模型”和“细粒度模型”。其中，“群体粗粒度模型”和“细粒度模型”可以根据不同的应用场景使用不同的训练数据集进行训练，并不限定应用场景。其中，群体粗粒度模型的标记空间映射为粗粒度标签，细粒度模型的标记空间映射为细粒度标签。每个节点设备内的群体粗粒度模型是由系统中的多个节点设备联合训练得到的，而细粒度模型是在节点设备本地训练并更新得到的。
(23)规则
规则是由条件和结论构成的推理语句,当存在事实满足条件时,相应的结论可以被激活。其中,规则可以包含条件部分(left hand side,LHS)和结论部分(right hand side,RHS)。一般的,如果将一条规则看做是if-then语句,那么则可以将规则中的条件部分称为if部分,将规则中的结论部分称为then部分。
(24)模式
模式是由规则的条件部分分割出的最小的一个条件。多个模式可以组成规则的条件部分。例如,规则的条件部分为“年龄大于20岁,且年龄小于30岁”,则该规则中的有两个模式,其中一个模式为“年龄大于20岁”,另一个模式为“年龄小于30岁”。
(25)事实对象
事实对象是对于真实事物或者事实的承载对象,其可以理解为规则引擎所需要的输入参数。例如:登录事实对象,可能包含以下事实:登录名,登录设备,近一小时内登录成功次数,近一小时登录失败次数。
以上即是对本申请实施例中涉及的部分或全部相关术语及相关概念的介绍。接下来对本申请实施例中涉及的意图识别内容进行介绍。
现有技术中电子设备仅根据用户当前时刻的单模态输入获取的信息来预测用户意图，然而只利用当下时刻的用户数据和设备信息，无法准确预测其当下时刻的意图。因为用户一段时间内的连续行为和设备状态变化等会反映事件发生的潜在逻辑，为预测其意图提供根据，但如果忽视上下文信息，则难免在某一时刻依据与用户真实意图并无关联的偶然事件进行预测，导致现有技术中对用户意图的识别具有极大的局限性且准确性较差。
比如双十一的晚上用户打开了淘宝、京东等购物软件,则接下来他可能会打开支付宝、微信等进行支付,打开购物软件的行为和打开支付软件的行为存在逻辑上的关联性。现有技术则可能会忽视这些上下文信息间逻辑上的关联性,使得意图识别不够准确。
而本申请实施例中,电子设备可以根据完整的环境描述和多模态的用户输入,结合领域知识和已有规则,准确无偏颇地识别出用户意图,为用户做出意图决策,如在合适的设备上响应合适的用户需求或为其提供合适的服务。
如图16所示，为本申请实施例中意图识别的一个场景示意图。电子设备可以通过操作输入、环境感知、文本输入、语音输入与视觉输入等多模态输入获取的信息来预测用户意图。示例性的，电子设备在连接wifi时可以触发30分钟时长的实体识别，然后通过当前连接的WiFi信息、打开支付宝进行手机支付的动作和收到购物短信这三个先后发生的独立事件组成的上下文实体序列，判断出用户可能在商场逛街。当用户打开相机对某商品（比如一个包包）拍照时，判断出用户很可能想要购买该包包，但是又不会在商场直接购买，因此用户下一时刻很可能打开购物软件搜索该商品。根据用户历史使用购物软件的频率，确定用户使用频率最高的两个购物软件是京东和淘宝，提前在后台加载这两个购物软件，以保证用户打开时无卡顿。
本申请实施例中,在分布式场景下,电子设备可以根据多设备的环境感知和用户的多模态输入获得对环境的完整描述,并结合一定时间窗格内的用户输入、环境感知和上下文信息,获取一个能反应随时间变化、并且能随环境变化而扩展的完整无偏颇的意图体系,据此做出决策,如推断出接下来一段时间内用户想执行的动作或需要的服务,以决策在何种设备上响应用户的何种需求。本申请实施例提供的方案适用于在信息输入多源且复杂、并依赖于时间因素的分布式场景下,为用户精准地提供他所需要的响应或服务的决策。
下面首先介绍本申请实施例提供的示例性电子设备100。示例性的,该电子设备100可以为上文所描述的电子设备,节点设备等。
图13是本申请实施例提供的电子设备100的结构示意图。
下面以电子设备100为例对实施例进行具体说明。应该理解的是,电子设备100可以具有比图中所示的更多的或者更少的部件,可以组合两个或多个的部件,或者可以具有不同的部件配置。图中所示出的各种部件可以在包括一个或多个信号处理和/或专用集成电路在内的硬件、软件、或硬件和软件的组合中实现。
电子设备100可以包括:处理器110,外部存储器接口120,内部存储器121,通用串行总线(universal serial bus,USB)接口130,充电管理模块140,电源管理模块141,电池142,天线1,天线2,移动通信模块150,无线通信模块160,音频模块170,扬声器170A,受话器170B,麦克风170C,耳机接口170D,传感器模块180,按键190,马达191,指示器192,摄像头193,显示屏194,用户标识模块(subscriber identification module,SIM)卡接口195,以及定位装置(图中未示出)等。其中传感器模块180可以包括压力传感器180A,陀螺仪传感器180B,气压传感器180C,磁传感器180D,加速度传感器180E,距离传感器180F,接近光传感器180G,指纹传感器180H,温度传感器180J,触摸传感器180K,环境光传感器180L,骨传导传感器180M等。
可以理解的是,本发明实施例示意的结构并不构成对电子设备100的具体限定。在本申请另一些实施例中,电子设备100可以包括比图示更多或更少的部件,或者组合某些部件,或者拆分某些部件,或者不同的部件布置。图示的部件可以以硬件,软件或软件和硬件的组合实现。
处理器110可以包括一个或多个处理单元，例如：处理器110可以包括应用处理器(application processor，AP)，调制解调处理器，图形处理器(graphics processing unit，GPU)，图像信号处理器(image signal processor，ISP)，控制器，存储器，视频编解码器，数字信号处理器(digital signal processor，DSP)，基带处理器，和/或神经网络处理器(neural-network processing unit，NPU)等。其中，不同的处理单元可以是独立的器件，也可以集成在一个或多个处理器中。在一个例子中，处理器110可以从内存中获取语义对象对事实数据进行匹配，也可以从文件中获取语义对象对事实数据进行匹配，亦可以根据匹配结果，确定是否执行相应的操作，即执行下文图21中所描述的步骤；此外，处理器110也可以用于构建规则引擎中的规则拓扑图。在一个例子中，处理器110可以对意图识别模型、动作预测模型、多示例学习模型等进行训练，或者更新模型中的参数等。在一个例子中，处理器110可以用于执行本方案中提供的意图识别方法。
其中,控制器可以是电子设备100的神经中枢和指挥中心。控制器可以根据指令操作码和时序信号,产生操作控制信号,完成取指令和执行指令的控制。
处理器110中还可以设置存储器,用于存储指令和数据。在一些实施例中,处理器110中的存储器为高速缓冲存储器。该存储器可以保存处理器110刚用过或循环使用的指令或数据。如果处理器110需要再次使用该指令或数据,可从所述存储器中直接调用。避免了重复存取,减少了处理器110的等待时间,因而提高了系统的效率。在一个例子中,存储器中可以存储有群体粗粒度模型,个体粗粒度模型和细粒度模型等。
在一些实施例中,处理器110可以包括一个或多个接口。接口可以包括集成电路(inter-integrated circuit,I2C)接口,集成电路内置音频(inter-integrated circuit sound,I2S)接口,脉冲编码调制(pulse code modulation,PCM)接口,通用异步收发传输器(universal asynchronous receiver/transmitter,UART)接口,移动产业处理器接口(mobile industry processor interface,MIPI),通用输入输出(general-purpose input/output,GPIO)接口,用户标识模块(subscriber identity module,SIM)接口,和/或通用串行总线(universal serial bus,USB)接口等。
可以理解的是,本发明实施例示意的各模块间的接口连接关系,只是示意性说明,并不构成对电子设备100的结构限定。在本申请另一些实施例中,电子设备100也可以采用上述实施例中不同的接口连接方式,或多种接口连接方式的组合。
充电管理模块140用于从充电器接收充电输入。其中,充电器可以是无线充电器,也可以是有线充电器。
电源管理模块141用于连接电池142,充电管理模块140与处理器110。电源管理模块141接收电池142和/或充电管理模块140的输入,为处理器110,内部存储器121,外部存储器,显示屏194,摄像头193,和无线通信模块160等供电。
电子设备100的无线通信功能可以通过天线1,天线2,移动通信模块150,无线通信模块160,调制解调处理器以及基带处理器等实现。
天线1和天线2用于发射和接收电磁波信号。电子设备100中的每个天线可用于覆盖单个或多个通信频带。不同的天线还可以复用,以提高天线的利用率。例如:可以将天线1复用为无线局域网的分集天线。在另外一些实施例中,天线可以和调谐开关结合使用。
移动通信模块150可以提供应用在电子设备100上的包括2G/3G/4G/5G等无线通信的解决方案。移动通信模块150可以包括至少一个滤波器,开关,功率放大器,低噪声放大器(low noise amplifier,LNA)等。移动通信模块150可以由天线1接收电磁波,并对接收的电磁波进行滤波,放大等处理,传送至调制解调处理器进行解调。移动通信模块150还可以对经调制解调处理器调制后的信号放大,经天线1转为电磁波辐射出去。在一些实施例中,移动通信模块150的至少部分功能模块可以被设置于处理器110中。在一些实施例中,移动通信模块150的至少部分功能模块可以与处理器110的至少部分模块被设置在同一个器件中。
调制解调处理器可以包括调制器和解调器。其中，调制器用于将待发送的低频基带信号调制成中高频信号。解调器用于将接收的电磁波信号解调为低频基带信号。随后解调器将解调得到的低频基带信号传送至基带处理器处理。低频基带信号经基带处理器处理后，被传递给应用处理器。应用处理器通过音频设备（不限于扬声器170A，受话器170B等）输出声音信号，或通过显示屏194显示图像或视频。在一些实施例中，调制解调处理器可以是独立的器件。在另一些实施例中，调制解调处理器可以独立于处理器110，与移动通信模块150或其他功能模块设置在同一个器件中。
无线通信模块160可以提供应用在电子设备100上的包括无线局域网(wireless local area networks,WLAN)(如无线保真(wireless fidelity,Wi-Fi)网络),蓝牙(bluetooth,BT),全球导航卫星系统(global navigation satellite system,GNSS),调频(frequency modulation,FM),近距离无线通信技术(near field communication,NFC),红外技术(infrared,IR)等无线通信的解决方案。无线通信模块160可以是集成至少一个通信处理模块的一个或多个器件。无线通信模块160经由天线2接收电磁波,将电磁波信号调频以及滤波处理,将处理后的信号发送到处理器110。无线通信模块160还可以从处理器110接收待发送的信号,对其进行调频,放大,经天线2转为电磁波辐射出去。在一个例子中,蓝牙可以用于实现电子设备100与其他短距离的设备(例如手机、智能手表等)之间的数据交换。本申请实施例中的蓝牙可以是集成电路或者蓝牙芯片等。
在一些实施例中,电子设备100的天线1和移动通信模块150耦合,天线2和无线通信模块160耦合,使得电子设备100可以通过无线通信技术与网络以及其他设备通信。
电子设备100通过GPU,显示屏194,以及应用处理器等实现显示功能。GPU为图像处理的微处理器,连接显示屏194和应用处理器。GPU用于执行数学和几何计算,用于图形渲染。处理器110可包括一个或多个GPU,其执行程序指令以生成或改变显示信息。
显示屏194用于显示图像,视频等。在一个例子中,显示屏194可以为触摸屏,该触摸屏可以具体可以包括触控板和显示器。其中,触控板可采集电子设备100的用户在其上或附近的触摸事件(比如用户使用手指、触控笔等任何适合的物体在触控板上或在触控板附近的操作),并将采集到的触摸信息发送至其他器件(例如处理器110)。显示器可用于显示由用户输入的信息或提供给用户的信息以及电子设备100的各种菜单。可以采用液晶显示器、有机发光二极管等形式来配置显示器。
电子设备100可以通过ISP,摄像头193,视频编解码器,GPU,显示屏194以及应用处理器等实现拍摄功能。
摄像头193用于捕获静态图像或视频。物体通过镜头生成光学图像投射到感光元件。感光元件可以是电荷耦合器件(charge coupled device,CCD)或互补金属氧化物半导体(complementary metal-oxide-semiconductor,CMOS)光电晶体管。感光元件把光信号转换成电信号,之后将电信号传递给ISP转换成数字图像信号。ISP将数字图像信号输出到DSP加工处理。DSP将数字图像信号转换成标准的RGB,YUV等格式的图像信号。在一些实施例中,电子设备100可以包括1个或N个摄像头193,N为大于1的正整数。
数字信号处理器用于处理数字信号,除了可以处理数字图像信号,还可以处理其他数字信号。例如,当电子设备100在频点选择时,数字信号处理器用于对频点能量进行傅里叶变换等。
视频编解码器用于对数字视频压缩或解压缩。电子设备100可以支持一种或多种视频编解码器。这样，电子设备100可以播放或录制多种编码格式的视频，例如：动态图像专家组(moving picture experts group，MPEG)1，MPEG2，MPEG3，MPEG4等。
NPU为神经网络(neural-network,NN)计算处理器,通过借鉴生物神经网络结构,例如借鉴人脑神经元之间传递模式,对输入信息快速处理,还可以不断的自学习。通过NPU可以实现电子设备100的智能认知等应用,例如:图像识别,人脸识别,语音识别,文本理解等。
本申请的一些实施例中,可以使用NPU进行语音识别、图像识别或文本理解等生成打点数据。本申请的一些实施例中,可以使用NPU从打点数据序列中提取训练数据,对多示例学习模型进行训练。本申请的一些实施例中,可以使用NPU按照预设意图规则确定子序列的意图。此处不作限定。本申请的一些实施例中,通过NPU可以实现规则引擎的智能认知等应用,例如:文本理解,决策推理等。
外部存储器接口120可以用于连接外部存储卡,例如Micro SD卡,实现扩展电子设备100的存储能力。外部存储卡通过外部存储器接口120与处理器110通信,实现数据存储功能。例如将音乐,视频等文件保存在外部存储卡中。
内部存储器121可以用于存储计算机可执行程序代码,所述可执行程序代码包括指令。处理器110通过运行存储在内部存储器121的指令,从而执行电子设备100的各种功能应用以及数据处理。内部存储器121可以包括存储程序区和存储数据区。其中,存储程序区可存储操作系统,至少一个功能所需的应用(比如人脸识别功能,指纹识别功能、移动支付功能等)等。存储数据区可存储电子设备100使用过程中所创建的数据(比如人脸信息模板数据,指纹信息模板等)等。此外,内部存储器121可以包括高速随机存取存储器,还可以包括非易失性存储器,例如至少一个磁盘存储器件,闪存器件,通用闪存存储器(universal flash storage,UFS)等。
电子设备100可以通过音频模块170,扬声器170A,受话器170B,麦克风170C,耳机接口170D,以及应用处理器等实现音频功能。例如音乐播放,录音等。
音频模块170用于将数字音频信息转换成模拟音频信号输出,也用于将模拟音频输入转换为数字音频信号。音频模块170还可以用于对音频信号编码和解码。在一些实施例中,音频模块170可以设置于处理器110中,或将音频模块170的部分功能模块设置于处理器110中。
扬声器170A,也称“喇叭”,用于将音频电信号转换为声音信号。电子设备100可以通过扬声器170A收听音乐,或收听免提通话。
受话器170B,也称“听筒”,用于将音频电信号转换成声音信号。当电子设备100接听电话或语音信息时,可以通过将受话器170B靠近人耳接听语音。
麦克风170C,也称“话筒”,“传声器”,用于将声音信号转换为电信号。当拨打电话或发送语音信息时,用户可以通过人嘴靠近麦克风170C发声,将声音信号输入到麦克风170C。电子设备100可以设置至少一个麦克风170C。在另一些实施例中,电子设备100可以设置两个麦克风170C,除了采集声音信号,还可以实现降噪功能。在另一些实施例中,电子设备100还可以设置三个,四个或更多麦克风170C,实现采集声音信号,降噪,还可以识别声音来源,实现定向录音功能等。
压力传感器180A用于感受压力信号，可以将压力信号转换成电信号。在一些实施例中，压力传感器180A可以设置于显示屏194。压力传感器180A的种类很多，如电阻式压力传感器，电感式压力传感器，电容式压力传感器等。电容式压力传感器可以是包括至少两个具有导电材料的平行板。当有力作用于压力传感器180A，电极之间的电容改变。电子设备100根据电容的变化确定压力的强度。当有触摸操作作用于显示屏194，电子设备100根据压力传感器180A检测所述触摸操作强度。电子设备100也可以根据压力传感器180A的检测信号计算触摸的位置。在一些实施例中，作用于相同触摸位置，但不同触摸操作强度的触摸操作，可以对应不同的操作指令。例如：当有触摸操作强度小于第一压力阈值的触摸操作作用于短消息应用图标时，执行查看短消息的指令。当有触摸操作强度大于或等于第一压力阈值的触摸操作作用于短消息应用图标时，执行新建短消息的指令。
陀螺仪传感器180B可以用于确定电子设备100的运动姿态。在一些实施例中,可以通过陀螺仪传感器180B确定电子设备100围绕三个轴(即,x,y和z轴)的角速度。陀螺仪传感器180B可以用于拍摄防抖。示例性的,当按下快门,陀螺仪传感器180B检测电子设备100抖动的角度,根据角度计算出镜头模组需要补偿的距离,让镜头通过反向运动抵消电子设备100的抖动,实现防抖。陀螺仪传感器180B还可以用于导航,体感游戏场景。
气压传感器180C用于测量气压。在一些实施例中,电子设备100通过气压传感器180C测得的气压值计算海拔高度,辅助定位和导航。
磁传感器180D包括霍尔传感器。电子设备100可以利用磁传感器180D检测翻盖皮套的开合。在一些实施例中,当电子设备100是翻盖机时,电子设备100可以根据磁传感器180D检测翻盖的开合。进而根据检测到的皮套的开合状态或翻盖的开合状态,设置翻盖自动解锁等特性。
加速度传感器180E可检测电子设备100在各个方向上(一般为三轴)加速度的大小。当电子设备100静止时可检测出重力的大小及方向。还可以用于识别电子设备姿态,应用于横竖屏切换,计步器等应用。
距离传感器180F,用于测量距离。电子设备100可以通过红外或激光测量距离。在一些实施例中,拍摄场景,电子设备100可以利用距离传感器180F测距以实现快速对焦。
接近光传感器180G可以包括例如发光二极管(LED)和光检测器,例如光电二极管。发光二极管可以是红外发光二极管。电子设备100通过发光二极管向外发射红外光。电子设备100使用光电二极管检测来自附近物体的红外反射光。当检测到充分的反射光时,可以确定电子设备100附近有物体。当检测到不充分的反射光时,电子设备100可以确定电子设备100附近没有物体。电子设备100可以利用接近光传感器180G检测用户手持电子设备100贴近耳朵通话,以便自动熄灭屏幕达到省电的目的。接近光传感器180G也可用于皮套模式,口袋模式自动解锁与锁屏。
环境光传感器180L用于感知环境光亮度。电子设备100可以根据感知的环境光亮度自适应调节显示屏194亮度。环境光传感器180L也可用于拍照时自动调节白平衡。环境光传感器180L还可以与接近光传感器180G配合,检测电子设备100是否在口袋里,以防误触。
指纹传感器180H用于采集指纹。电子设备100可以利用采集的指纹特性实现指纹解锁,访问应用锁,指纹拍照,指纹接听来电等。
温度传感器180J用于检测温度。在一些实施例中,电子设备100利用温度传感器180J检测的温度,执行温度处理策略。例如,当温度传感器180J上报的温度超过阈值,电子设备100执行降低位于温度传感器180J附近的处理器的性能,以便降低功耗实施热保护。在另一些实施例中,当温度低于另一阈值时,电子设备100对电池142加热,以避免低温导致电子设备100异常关机。在其他一些实施例中,当温度低于又一阈值时,电子设备100对电池142的输出电压执行升压,以避免低温导致的异常关机。
触摸传感器180K,也称“触控面板”。触摸传感器180K可以设置于显示屏194,由触摸传感器180K与显示屏194组成触摸屏,也称“触控屏”。触摸传感器180K用于检测作用于其上或附近的触摸操作。触摸传感器可以将检测到的触摸操作传递给应用处理器,以确定触摸事件类型。可以通过显示屏194提供与触摸操作相关的视觉输出。在另一些实施例中,触摸传感器180K也可以设置于电子设备100的表面,与显示屏194所处的位置不同。
按键190包括开机键,音量键等。按键190可以是机械按键。也可以是触摸式按键。电子设备100可以接收按键输入,产生与电子设备100的用户设置以及功能控制有关的键信号输入。
马达191可以产生振动提示。马达191可以用于来电振动提示,也可以用于触摸振动反馈。例如,作用于不同应用(例如拍照,音频播放等)的触摸操作,可以对应不同的振动反馈效果。作用于显示屏194不同区域的触摸操作,马达191也可对应不同的振动反馈效果。不同的应用场景(例如:时间提醒,接收信息,闹钟,游戏等)也可以对应不同的振动反馈效果。触摸振动反馈效果还可以支持自定义。
指示器192可以是指示灯,可以用于指示充电状态,电量变化,也可以用于指示消息,未接来电,通知等。
SIM卡接口195用于连接SIM卡。SIM卡可以通过插入SIM卡接口195,或从SIM卡接口195拔出,实现和电子设备100的接触和分离。电子设备100可以支持1个或N个SIM卡接口,N为大于1的正整数。SIM卡接口195可以支持Nano SIM卡,Micro SIM卡,SIM卡等。同一个SIM卡接口195可以同时插入多张卡。所述多张卡的类型可以相同,也可以不同。SIM卡接口195也可以兼容不同类型的SIM卡。SIM卡接口195也可以兼容外部存储卡。电子设备100通过SIM卡和网络交互,实现通话以及数据通信等功能。
定位装置可以为电子设备100提供地理位置。可以理解的是,该定位装置具体可以是全球定位系统(global positioning system,GPS)或北斗卫星导航系统、俄罗斯GLONASS等定位系统的接收器。定位装置在接收到上述定位系统发送的地理位置后,将该信息发送至处理器110进行处理,或者发送至存储器进行保存。
本申请实施例中,电子设备100可以通过传感器模块180中的各种传感器、按键190、摄像头193、耳机接口170D、麦克风170C等部件获取用户操作,处理器110响应用户操作,执行相应指令的过程中会产生打点数据,产生的打点数据可以保存在内部存储器121中。处理器110可以根据本申请实施例中的多示例学习模型训练方法和训练数据生成方法训练出多示例学习模型,可以根据本申请实施例中的意图识别方法使用该多示例学习模型将打点数据序列划分为各小粒度且其中打点数据意图一致的子序列,确定出各子序列的意图。
本申请的一些实施例中,各方法中的步骤可以由处理器110中的应用处理器单独完成,可以由处理器110中的NPU单独完成,也可以由处理器中的应用处理器和NPU协同完成,也可以由处理器110中的其他处理器共同协同完成,此处不作限定。
接着对图13中的电子设备100的的软件结构进行介绍。
请参阅图14,图14是本发明实施例的电子设备100的软件结构框图。
分层架构将软件分成若干个层,每一层都有清晰的角色和分工。层与层之间通过软件接口通信。在一些实施例中,将Android系统分为四层,从上至下分别为应用程序层,应用程序框架层,安卓运行时(Android runtime)和系统库,以及内核层。
应用程序层可以包括一系列应用程序包。
如图14所示,应用程序包可以包括相机,图库,日历,通话,地图,导航,WLAN,蓝牙,音乐,视频,短信息,图15中所示意图识别决策系统501等应用程序(也可以称为应用(application,App))。
在一个例子中，如图15所示，意图识别决策系统501中可以包含意图识别模块605，该意图识别模块605可以用于识别、存储及管理意图。
在一个例子中,如图15所示,意图识别决策系统501中可以包含动作反馈模块608。该动作反馈模块608中可以包括有上文所描述的多示例学习模型。该多示例学习模型可以基于多示例学习模型训练模块训练得到,其中,多示例学习模型训练模块可以用于执行本申请实施例中的多示例学习模型训练方法。示例性的,多示例学习模型训练模块可以配置于动作反馈模块608中,也可以配置于端侧或云侧,在此不做限定。
在一个例子中,该多示例学习模型训练模块中可以包括有训练数据生成模块,该训练数据生成模块用于执行本申请实施例中的训练数据生成方法。
在一个例子中,该多示例学习模型训练模块可以为独立于动作反馈模块608的另一个单独的模块,此处不作限定。
在一个例子中,该多示例学习模型训练模块中的训练数据生成模块也可以为独立于动作反馈模块608和多示例学习模型训练模块的另一个单独的模块,此处不作限定。
在一个例子中,该意图识别模块605、动作反馈模块608、多示例学习模型训练模块与训练数据生成模块也可以位于该软件构架的其他层级中,例如应用程序框架层、系统库、内核层等,此处不作限定。
应用程序框架层为应用程序层的应用程序提供应用编程接口(application programming interface,API)和编程框架。应用程序框架层包括一些预先定义的函数。
如图14所示,应用程序框架层可以包括窗口管理器,内容提供器,视图系统,电话管理器,资源管理器,通知管理器,本地Profile管理助手(Local Profile Assistant,LPA)等。
窗口管理器用于管理窗口程序。窗口管理器可以获取显示屏大小,判断是否有状态栏,锁定屏幕,截取屏幕等。
内容提供器用来存放和获取数据,并使这些数据可以被应用程序访问。所述数据可以包括视频,图像,音频,拨打和接听的电话,浏览历史和书签,电话簿等。
视图系统包括可视控件,例如显示文字的控件,显示图片的控件等。视图系统可用于构建应用程序。显示界面可以由一个或多个视图组成的。例如,包括短信通知图标的显示界面,可以包括显示文字的视图以及显示图片的视图。
电话管理器用于提供电子设备100的通信功能。例如通话状态的管理(包括接通,挂断等)。
资源管理器为应用程序提供各种资源,比如本地化字符串,图标,图片,布局文件,视频文件等等。
通知管理器使应用程序可以在状态栏中显示通知信息,可以用于传达告知类型的消息,可以短暂停留后自动消失,无需用户交互。比如通知管理器被用于告知下载完成,消息提醒等。通知管理器还可以是以图表或者滚动条文本形式出现在系统顶部状态栏的通知,例如后台运行的应用程序的通知,还可以是以对话界面形式出现在屏幕上的通知。例如在状态栏提示文本信息,发出提示音,电子设备振动,指示灯闪烁等。
安卓运行时（Android Runtime）包括核心库和虚拟机。Android runtime负责安卓系统的调度和管理。
核心库包含两部分:一部分是java语言需要调用的功能函数,另一部分是安卓的核心库。
应用程序层和应用程序框架层运行在虚拟机中。虚拟机将应用程序层和应用程序框架层的java文件执行为二进制文件。虚拟机用于执行对象生命周期的管理,堆栈管理,线程管理,安全和异常的管理,以及垃圾回收等功能。
系统库可以包括多个功能模块。例如:表面管理器(surface manager),媒体库(Media Libraries),三维图形处理库(例如:OpenGL ES),二维图形引擎(例如:SGL)等。
表面管理器用于对显示子系统进行管理,并且为多个应用程序提供了二维(2-Dimensional,2D)和三维(3-Dimensional,3D)图层的融合。
媒体库支持多种常用的音频,视频格式回放和录制,以及静态图像文件等。媒体库可以支持多种音视频编码格式,例如:MPEG4,H.264,MP3,AAC,AMR,JPG,PNG等。
三维图形处理库用于实现3D图形绘图,图像渲染,合成,和图层处理等。
2D图形引擎是2D绘图的绘图引擎。
内核层是硬件和软件之间的层。内核层至少包含显示驱动,摄像头驱动,音频驱动,传感器驱动,虚拟卡驱动。
下面结合捕获拍照场景,示例性说明电子设备100软件以及硬件的工作流程。
当触摸传感器180K接收到触摸操作,相应的硬件中断被发给内核层。内核层将触摸操作加工成原始输入事件(包括触摸坐标,触摸操作的时间戳等信息)。原始输入事件被存储在内核层。应用程序框架层从内核层获取原始输入事件,识别该输入事件所对应的控件。以该触摸操作是触摸单击操作,该单击操作所对应的控件为相机应用图标的控件为例,相机应用调用应用框架层的接口,启动相机应用,进而通过调用内核层启动摄像头驱动,通过摄像头193捕获静态图像或视频。
以上即是对本方案中的电子设备100的硬件结构和软件结构的介绍。接下来,基于上述电子设备100的硬件结构和软件结构,对本方案中涉及的意图识别决策系统进行介绍。如图15所示,为上述意图识别决策系统501的示例性软件结构框图。
意图识别决策系统501用于将外界多模态的输入，如用户操作、环境感知、文本输入、语音输入、视觉输入等，映射为高阶实体，并结合一定时间段内的上下文高阶实体，共同组成实体序列，将此实体序列映射到可扩展意图体系中来获取用户当前时刻的意图，结合已有的领域知识、规则以及可扩展的实体序列，基于统计和逻辑，推理并决策应当在何种设备上响应用户的何种需求，亦即，将此意图映射为动作序列和服务链，并据此反馈到意图体系上，对其做出修正。
具体的,该意图识别决策系统501包括多模态输入模块601,知识库602,实体识别模块603,上下文模块604,意图识别模块605,规则引擎606,决策推理模块607和动作反馈模块608。
其中，多模态输入模块601用于获取各种不同的输入类型输入的数据。例如，可以获取用户在电子设备100上触摸、按压、滑动等用户操作数据；可以获取电子设备100中各种传感器取得的环境感知数据；可以获取用户在电子设备100中搜索文本时的文本输入数据；可以获取电子设备100的麦克风检测到的语音输入数据；可以获取电子设备100中图片、视频、手势、摄像头识别的表情等视觉输入数据等。还可以获取电子设备100能够取得的其他类型的输入，此处不作限定。在一个例子中，多模态输入模块601获取到的数据可以包括打点数据，用户感知数据，等等。
知识库602中包含已有的领域知识,具体可以包括实体识别模块603启动实体识别的各种触发点、各触发点对应的进行实体识别的时间窗格长度、各触发点与多模态输入中输入方式的类型的对应关系、保存的用户的习惯规则、根据实体仓库单元6033中的实体训练出来的实体识别模型,以及各实体之间的关联关系。在一个例子中,知识库602中可以包含知识图谱。
实体识别模块603用于识别、存储并管理实体。实体识别模块603中包含实体提取单元6031、实体管理单元6032和实体仓库单元6033。其中实体提取单元6031用于根据知识库602中存储的实体识别模型,从多模态输入模块601获取的数据中识别出具有特定意义的实体;实体仓库单元6033用于存储实体;实体管理单元6032用于定期更新和动态扩展实体仓库。
作为一种可能的实现方式,实体识别模块603可以从多模态输入的数据中提取特征向量,得到特征向量集合。其中,该特征向量集合中可以包括所有从多模态输入的数据中提取得到的特征向量,该特征向量可以用于表示多模态输入的各个数据的特征。接着,实体识别模块603可以将得到的特征向量集合输入到实体识别模型,得到实体序列。其中,该实体识别模型可以为根据电子设备中存储的实体数据训练得到的特征向量与实体的对应关系,实体数据为实体的存储形式,实体数据至少包括实体的编号及表示该实体的特征向量集合。
上下文模块604用于存储上下文实体。上下文实体是指电子设备识别出的一段时间窗格内的实体序列。上下文模块604中存储的实体序列的数目可以预先设定,也可以根据电子设备的存储容量进行实时控制,此处不作限定。
意图识别模块605用于识别、存储及管理意图。意图识别模块中包含意图映射单元6051、意图管理单元6052和意图仓库单元6053。其中意图映射单元6051用于根据实体序列预测出用户意图,其输入为实体序列,输出为意图;意图仓库单元6053用于存储意图;意图管理单元6052用于定期更新和动态扩展意图仓库单元6053,有些新出现的意图会被补充进意图仓库单元6053中,久未出现的意图则会被从意图仓库单元6053中移除。
在一个例子中,意图识别模块605可以基于预存储的知识图谱确定出多个候选意图,以及从多个候选意图中确定出目标意图,详见下文描述。
在一个例子中,意图识别模块605中可以具有意图识别模型,该意图识别模型可以用于识别出意图。
作为一种可能的实现方式，本方案中，在生成意图识别模型时，可以利用生成式对抗网络的特性，降低在生成器中所生成的模拟数据与原始输入的测试数据之间的偏差，以提升神经网络所生成的模拟数据的数据质量，进而可以将利用生成式对抗网络得到的模拟数据作为预设的训练网络的输入数据的一部分，训练得到预测模型，例如意图识别模型。由于该模拟数据与原始输入的测试数据之间的偏差较小，因此，通过该模拟数据参与训练网络的训练过程，可以提升后续得到的预测模型的预测效果，使得在模拟环境中训练得到较优的预测模型，即得到最优的意图识别模型。对于利用生成式对抗网络的特性进行数据处理的过程，以及基于利用生成式对抗网络的特性得到的模拟数据训练意图识别模型的过程，详见下文描述。
作为另一种可能的实现方式，本方案中，该意图识别模型可以基于联合学习系统得到。该联合学习系统可以包括多个节点设备，每个节点设备中均可以配置有群体粗粒度模型和细粒度模型。在训练得到意图识别模型时，可以先获取细粒度标签与粗粒度标签的映射关系；然后再根据映射关系将训练数据集中的细粒度数据映射为粗粒度数据；接着再将粗粒度数据输入到群体粗粒度模型进行训练，通过多个节点设备的联合学习对群体粗粒度模型进行更新，并将细粒度数据输入到细粒度模型进行训练；最后，组合群体粗粒度模型和细粒度模型以得到联合模型，例如，意图识别模型，联合模型的标记空间映射为细粒度标签，联合模型的输出结果可以用于更新细粒度模型。对于得到联合模型（如意图识别模型）的过程，详见下文描述。
规则引擎606用于提供推理决策的规则。在一些简单场景中,不需要利用数据预测用户意图并为之作出决策,只需根据规则决定该场景下执行何种动作即可。规则引擎606可以预存有常用现有规则,还可以根据知识库602中存储的用户习惯规则对规则进行更新。
在一个例子中,规则引擎606可以从知识库602中获取知识图谱,然后再基于知识图谱预测出用户意图或者该场景下所需执行的动作等。
在一个例子中,规则引擎606中可以具有一个或多个规则,此时,规则引擎606中可以包括规则拓扑图。如图17所示,该规则拓扑图中可以包含根节点(root node),类型节点(type node),模式节点(pattern node),组合节点(merge node),结果节点(consequence node)和激活节点(active node)。下面对各个节点分别进行介绍。
根节点(root node),是输入起始节点,其可以是规则引擎的入口,所有事实对象可以通过该根节点进入到规则引擎中。一个规则引擎中可以包含一个根节点。
类型节点（type node），可以定义事实数据的类型。事实对象中的各个事实从根节点进入后，可以进入类型节点；该类型节点可以进行类型检查，其只让与其类型相匹配的事实到达该节点。其中，类型节点的数量可以由规则中条件部分包含的事实的类型的数量确定。示例性的，当规则拓扑图中包含一条规则时，若该规则的条件部分中包含2个类型的事实，则类型节点为2个；当规则拓扑图中包含多条规则时，若多条规则的条件部分中包含3个类型的事实，则类型节点为3个。例如，一条规则的条件部分为“年龄大于20岁，地点为户外”，另一条规则的条件部分为“时间为上午8点，地点为在家”，则此时总共存在三种类型的事实，分别为“时间”，“年龄”和“地点”，因此，该拓扑图中可以包含3种类型的类型节点。在一个例子中，事实对象在由根节点进入到类型节点时，根节点可以确定事实对象中各个事实的类型，例如基于class类型确定；然后根节点再将各个事实输入到对应的类型节点。例如，若事实对象包括以下事实：日期为12月，时间为上午8点，地点为户外；则该事实对象中包括了两种类型的事实，即时间和地点，其中，“12月，上午8点”这两个事实可以进入到类型为时间的类型节点，“户外”可以进入到类型为地点的类型节点。在一个例子中，事实数据可以为实体，意图等。
模式节点（pattern node），可以存储规则中模式的语义对象，以及确定符合该模式节点对应的模式的事实。例如，模式节点可以表达规则中的一个条件，其所表达的条件是计算机可理解的条件表达式；此外，模式节点还可以表达条件的匹配结果，以及对条件表达式进行计算，并存储计算结果。其中，每个模式节点对应规则的一种模式，例如，规则的条件部分“年龄大于20岁，地点为户外”，则此时在规则拓扑图中可以包含两个模式节点，一个模式节点对应该规则的条件部分中的“年龄大于20岁”，另一个模式节点对应该规则的条件部分中的“地点为户外”。在一个例子中，模式节点中存储规则中模式的语义对象，可以理解为该模式节点中存储了该模式节点对应的规则中模式背后的计算语句，通过该计算语句可以对进入到该模式节点的事实进行判断；模式节点确定符合该模式节点对应的模式的事实，可以理解为该模式节点可以加载其存储的语义对象对进入到该模式节点的事实进行判断，以确定进入到该模式节点的事实是否符合该模式节点对应的模式，例如，模式节点对应的模式为“年龄大于20岁”，则其存储判断年龄是否大于20岁的计算语句，当进入到该模式节点的事实为“年龄为19岁”时，该模式节点可以加载相应的计算语句对“年龄为19岁”这一事实进行判断。
本方案中,模式节点的类型可以包括瞬态模式节点和持久态模式节点两种类型。瞬态模式节点的语义对象可以存储在内存中,持久态模式节点的语义对象可以持久化于文件中。其中,瞬态模式节点对应的模式的事实的数据变化频率高于持久态模式节点对应模式的事实的数据变化频率。示例性的,瞬态模式节点适合依赖数据变化频繁的模式,例如时间、地理位置的变化等;持久态模式节点适合依赖数据变化缓慢的模式,例如年龄、季节的变化等。也即是说,本方案中,根据事实数据变化的特征,模式节点选择性地将语义对象持久化到文件或是加载到内存中常驻,这样即可以实现对于不常访问的模式节点,释放掉冗余内存,同时对于经常访问的节点,不影响其匹配效率,以此达到降低内存的目的。
本方案中，如图18所示，模式节点的数据结构可以用状态表和模式语义索引表示。其中，状态表可以用于缓存模式节点对应的模式的历史匹配信息，模式语义索引可以用于索引获取模式节点的语义对象。在一个例子中，继续参阅图18，历史匹配信息可以包括：模式节点对应的模式的身份标识（即图18中的ID）、模式节点对应的模式的前次匹配结果（即图18中的isMatched）和模式节点对应的事实的数据变化次数（即图18中的modCount）；模式语义索引可以包括内存或文件，其中，当模式语义索引包括内存时，则表示该模式节点为瞬态模式节点，当模式语义索引包括文件时，则表示该模式节点为持久态模式节点。瞬态模式节点的模式语义索引是从内存中索引获取语义对象，持久态模式节点的模式语义索引是从文件中索引获取语义对象。
在一个例子中，前次匹配结果（即图18中的isMatched）可以使用标志位表示，例如，1代表与该模式节点对应的模式相符，0代表与该模式节点对应的模式不相符，即1代表真（true），0代表假（false）；举例来说，模式节点对应的模式为“年龄大于20岁”，若前次输入的事实为“年龄为19岁”，则此时前次匹配结果可以用标志位0表示，若前次输入的事实为“年龄为30岁”，则此时前次匹配结果可以用标志位1表示。
在一个例子中，模式节点对应的事实的数据变化次数（即图18中的modCount），可以理解为该模式节点对应的模式的历史匹配信息中事实的数据变化次数，例如，该模式节点总共加载了4次语义对象，则该模式节点对应的模式的历史匹配信息中事实的数据变化次数为4次。本方案中，当输入到规则引擎中的事实的数据变化次数与模式节点的状态表中记录的事实的数据变化次数不一致时，该模式节点则加载语义对象对该事实进行判断，以及更新其状态表中记录的事实的数据变化次数。举例来说，模式节点的状态表中记录的事实数据变化次数为2次，输入到规则引擎中的事实的数据变化次数为3次，此时两者不符，该模式节点则加载语义对象对当前输入的事实进行判断，此时该模式节点可以将其记录的事实的数据变化次数更新为3次。此外，若输入到规则引擎中的事实的数据变化次数与模式节点的状态表中记录的事实的数据变化次数“一致”时，则可以继续使用上一次的匹配结果，此时不需要更新前次匹配结果，即不需要更新图18中的isMatched；否则需要更新上一次的匹配结果，即更新图18中的isMatched。
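模式节点按事实数据变化次数决定是否重新加载语义对象的流程可示意如下（Python，load_semantic为假定的占位函数，按语义索引从内存或文件中加载语义对象）：

class PatternNode:
    def __init__(self, node_id, semantic_index, persistent=False):
        self.node_id = node_id
        self.semantic_index = semantic_index  # 瞬态节点指向内存，持久态节点指向文件
        self.persistent = persistent          # 是否为持久态模式节点
        self.is_matched = False               # 状态表：前次匹配结果(isMatched)
        self.mod_count = 0                    # 状态表：已记录的事实数据变化次数(modCount)

    def match(self, fact, fact_mod_count, load_semantic):
        if fact_mod_count == self.mod_count:
            return self.is_matched            # 变化次数一致，直接复用前次匹配结果
        predicate = load_semantic(self.semantic_index, self.persistent)
        self.is_matched = predicate(fact)     # 加载语义对象，重新计算匹配结果
        self.mod_count = fact_mod_count       # 更新状态表中记录的变化次数
        return self.is_matched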
在一个例子中，模式节点的状态表中记录的事实的数据变化次数，可以用来判断在重构规则拓扑图时是否调整该模式节点的类型。示例性的，当模式节点的状态表中记录的事实的数据变化次数大于预设次数阈值时，表明该事实的变化频率较快，此时则在重构规则拓扑图时，若重构前该模式节点的类型为瞬态模式节点，则此次重构时将该模式节点的类型继续保持为瞬态模式节点；若重构前该模式节点的类型为持久态模式节点，则此次重构时将该模式节点的类型由持久态模式节点变更为瞬态模式节点。同样的，当模式节点的状态表中记录的事实的数据变化次数小于或等于预设次数阈值时，表明该事实的变化频率较慢，此时则在重构规则拓扑图时，若重构前该模式节点的类型为瞬态模式节点，则此次重构时将该模式节点的类型由瞬态模式节点变更为持久态模式节点；若重构前该模式节点的类型为持久态模式节点，则此次重构时将该模式节点的类型继续保持为持久态模式节点。示例性的，如图19所示，模式节点7的状态表中记录的事实的数据变化次数与预设次数阈值存在差异，且相应的事实的变化频率发生了变化，此时则可以在重构规则拓扑图时变更模式节点7的类型。
举例来说,在中国大部分地区的气候是四季分明的,而季度发生变化的时长往往是3个月,即季度变化频率较低。但在中国的新疆地区昼夜温差往往较大,一天中有时候中午的气温相当于夏季,而晚上的气温又相当于冬季,因此可以理解为该地区的季度变化较为频繁。如果默认是将规则引擎中“季度”对应的模式节点的语义对象存储在文件中,则该规则引擎在中国大部分地区使用时均可以符合要求。但当该规则引擎在中国的新疆地区使用时,则会出现频繁从文件中加载语义对象的情况,导致规则引擎的执行效率较低。因此,该规则引擎在中国的新疆地区重构其内的规则拓扑图时,可以将“季度”对应的模式节点的语义对象由存储在文件中切换为存储在内存中,即切换“季度”对应的模式节点的类型。
在一个例子中,在首次构建规则拓扑图时,可以基于经验值,确定模式节点的类型。例如,当模式节点对应的事实为“年龄”时,由于年龄的变化频率较慢,因此可以将“年龄”这一事实对应的模式节点的类型确定为持久态模式节点,并将语义对象存储在文件中;当模式节点对应的事实为“时间”时,由于时间的变化频率较快,因此可以将“时间”这一事实对应的模式节点的类型确定为瞬态模式节点,并将语义对象存储在内存中。
可以理解的是,本方案中,同一数据类型的不同模式通过链式组合,可以共同构成逻辑“与”关系的组合模式。例如“年龄>22”和“年龄<30”两个模式节点,组合成“22<年龄<30”模式,同理“年龄>22”和“年龄<50”组合成“22<年龄<50”模式,它们共同依赖“年龄>22”模式节点。
组合节点(merge node),可以对一个规则对应的各个模式节点的匹配结果进行组合,以及确定是否触发规则。组合节点至少为一个,每个组合节点均对应一条规则。其中,组合节点综合表达了其所组合的模式的语义信息及逻辑结果。不同数据类型的组合模式通过组合节点,可以合并成某一条规则的条件。例如“22<年龄<30”和“位置为户外”合并成的规则的条件部分为“22<年龄<30,位置为户外”。
可以理解的是,当一个规则对应的各个模式节点的匹配结果均指示匹配成功时,组合节点则可以确定触发该规则。当一个规则对应的各个模式节点中有一个模式节点的匹配结果指示匹配失败时,组合节点则可以确定限制触发该规则,即不触发该规则。
可以理解的是，当一条规则中的各个模式对应的事实的类型均为同一类型时，则该规则对应的组合节点可以与通过链式组合的模式节点的最后一个模式节点对应。此外，当需要删除一条规则时，可以不用直接修改规则拓扑图，而是将该规则对应的组合节点标记为无效状态；之后，在下一次重构规则拓扑图时，再删除该规则。
结果节点(consequence node)，可以存储规则所需执行动作的语义对象，以及在组合节点确定触发规则时加载规则所需执行动作的语义对象。其中，每条规则具有一个结果节点，在规则引擎中的规则拓扑图内结果节点的数量至少为一个，每个结果节点均对应一个组合节点。本方案中，结果节点表达了规则中某一动作具体执行的语句，当规则满足所有条件时，即触发相应的动作。
本方案中,结果节点的类型可以包括瞬态结果节点和持久态结果节点两种类型。瞬态结果节点的语义对象可以存储在内存中,持久态结果节点的语义对象可以持久化于文件中。在一个例子中,结果节点的类型依赖于模式节点的类型;其中,当一条规则中的各个模式对应的模式节点的类型均为瞬态模式节点时,则该规则对应的结果节点的类型为瞬态结果节点,当一条规则中的各个模式节点的类型中存在持久态模式节点时,则该规则的结果节点的类型为持久态结果节点。示例性的,一条规则包括两个模式,这两个模式对应的模式节点的类型均为瞬态模式节点,则该规则对应的结果节点的类型为瞬态结果节点;一条规则包括两个模式,其中一个模式对应的模式节点的类型为瞬态模式节点,另一个模式对应的模式节点的类型为持久态模式节点,则该规则对应的结果节点的类型为持久态结果节点;一条规则包括两个模式,这两个模式对应的模式节点的类型均为持久态模式节点,则该规则对应的结果节点的类型为持久态结果节点。
本方案中,结果节点的数据结构可以包括模式语义索引,该模式语义索引可以用于索引获取结果节点的语义对象。其中,瞬态结果节点的模式语义索引是从内存中索引获取语义对象,持久态结果节点的模式语义索引是从文件中索引获取语义对象。
可以理解的是,本方案中,持久态结果节点对应的规则被触发的频率较低,瞬态结果节点对应的规则的触发频率较高。举例来说,当规则为天气提醒规则时,若每天均需要进行天气提醒,则该规则触发的频率较高,因此可以推知该规则对应的结果节点的类型为瞬态结果节点;当规则为年度总结提醒规则时,由于年度总结往往是一年做一次,因此该规则触发的频率较低,因此可以推知该规则对应的结果节点的类型为持久态结果节点。此外,在重构规则拓扑图时,若重构前后规则对应的模式节点的类型出现变更,则该规则对应的结果节点的类型也可以适应性的进行切换,其中,在切换结果节点的类型时可以参照上文描述的结果节点与模式节点之间的关系。例如,如图19所示,重构规则拓扑图时,模式节点7的类型发生了变化,而模式节点7对应的规则,仅有具有一个模式节点,因此不存在其他模式节点的影响,此时则可以切换该规则对应的结果节点的类型。
激活节点(active node),可以在结果节点加载规则所需执行动作的语义对象后,执行规则对应的动作。例如,当规则为天气提醒规则时,则在该规则被触发后,激活节点可以进行天气提醒。
以上即为对本方案中提及的规则引擎606中涉及的规则拓扑图的相关介绍。接下来基于上文对规则引擎606中规则拓扑图的相关介绍,对该规则拓扑图的创建过程进行描述。
1)创建根节点。
2)解析规则,读取规则中的模式a。
3)检查模式a对应的事实的数据类型,若属于新类型,则在根节点后添加一个类型节点;若不属于新类型,则直接进行下一步。
4)检查模式a对应的模式节点是否存在，若不存在，则在类型节点尾部新增模式节点，根据模式a对应的事实的数据类型，定义该新增的模式节点的类型。例如，模式a为“是否有私家车”，“是否正在驾车”等数据变化较慢或具有互斥性的模式时，可以将模式a对应的模式节点的类型定义为持久态模式节点；模式a为“是否在家”，“是否离家”等地理位置相关变化较频繁的模式时，可以将模式a对应的模式节点的类型定义为瞬态模式节点。
定义新增的模式节点的类型后,即可以根据该模式节点类型生成状态表和对应的语义索引。
5)重复3)和4),直至处理完规则中所有的模式。
6)组合模式节点,若组合的模式节点中存在持久态模式节点,则将对应的结果节点定义为持久态结果节点;若不存在持久态模式节点,则将对应的结果节点定义为瞬态结果节点。可以理解的是,该步骤即为创建组合节点和结果节点的过程。
7)重复2)至6),直至解析编译完所有的规则。
可以理解的是,上述对规则拓扑图的创建过程的描述中的部分或全部内容,可以参考上文有关规则引擎中规则拓扑图的介绍,例如,如何确定模式节点的类型等等,在此就不再一一赘述。
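为便于理解上述1)至7)的创建流程，下面给出一段示意性的Python代码，沿用前文的PatternNode示意类，演示如何逐条解析规则、按事实类型建立类型节点、复用已有模式节点，并依据模式节点类型确定结果节点类型；其中规则的数据结构、字段名以及“按事实类型判断瞬态/持久态”的经验规则均为便于说明而假设的，并非本方案的限定实现。

```python
def build_topology(rules):
    """按上文步骤1)~7)构建规则拓扑图的示意实现。
    每条规则形如 {"name": ..., "patterns": [(事实类型, 操作符, 操作数), ...], "action": ...}。"""
    root = {"type_nodes": {}}                                # 1) 创建根节点
    merges = []                                              # 组合节点(每条规则对应一个)
    for rule in rules:                                       # 7) 逐条解析编译所有规则
        pattern_nodes = []
        for fact_type, op, operand in rule["patterns"]:      # 2)/5) 依次读取规则中的模式
            tnode = root["type_nodes"].setdefault(fact_type, {})  # 3) 新类型则添加类型节点
            key = (op, operand)
            if key not in tnode:                             # 4) 模式节点不存在则新增
                # 依经验：时间/位置等变化频繁的事实定义为瞬态，其余定义为持久态(假设规则)
                transient = fact_type in ("Time", "Location")
                tnode[key] = PatternNode(f"{fact_type}{op}{operand}", op, operand,
                                         transient,
                                         file_path=f"{fact_type}_{op}_{operand}.json")
            pattern_nodes.append(tnode[key])
        # 6) 组合模式节点：存在持久态模式节点时，结果节点定义为持久态结果节点
        result_transient = all(n.transient for n in pattern_nodes)
        merges.append({"rule": rule["name"], "patterns": pattern_nodes,
                       "action": rule["action"], "result_transient": result_transient})
    return root, merges
```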
为便于理解,下面对规则拓扑图的创建过程举例进行说明。
如图20所示,该图示例给出以下3条简易的服务推荐场景规则:
a)路况提醒规则
条件:22<Age<30&&Location==Outdoor
动作:弹窗附近路况,推荐最优出行方式
b)天气提醒规则
条件:7:00am<Time<8:00am&&Location==Home
动作:通知栏弹出天气预报,推荐穿衣指南
c)年度总结提醒规则
条件:22<Age<50&&7:00am<Time<8:00am&&Time==December
动作:负一屏弹出年终总结卡片
在创建这三条规则对应的规则拓扑图过程中，先创建一个根节点（即图20中的root）。然后解析其中一条规则，并读取该规则中的模式。以路况提醒规则为例，在路况提醒规则中包括三个模式，分别为“Age>22”，“Age<30”和“Location==Outdoor”。此时，可以随机或按顺序选取一个模式，如“Age>22”，然后再检测“Age>22”对应的事实的数据类型，若属于新类型，则在根节点后添加一个类型节点，如图20中的“Age”节点。接着，确定“Age>22”对应的模式节点不存在，则创建一个模式节点，即图20中的“Age>22”，并定义该模式节点的类型，年龄事实数据变化的频率较低，故该模式节点的类型为持久态模式节点。之后，即可以生成该模式节点的状态表和语义索引。在遍历完路况提醒规则中的各个模式后，即可以创建组合节点和结果节点。之后，随机或依次对各个规则进行编译，即可以构建出如图20所示的规则拓扑图。
在构建出规则拓扑图后，即可以使用该规则拓扑图。下面结合图20对该规则拓扑图的应用过程进行描述。
以天气提醒规则为例，当用户回到家时，此时事实数据Location发生变化，Location事实数据进入到处理队列，首先到达根节点进行类型判断，再进入到Location节点，同时分别访问Location==Outdoor和Location==Home这两个模式节点。分别比较各个模式节点中记录的事实数据的变化次数与输入至规则拓扑图中的Location这一事实数据的变化次数。当变化次数不一致时，则加载相应的模式节点中的语义对象来更新模式节点中的isMatched值；其中，Location==Outdoor这一模式节点的isMatched值可以更新为False，Location==Home这一模式节点的isMatched值可以更新为True。之后，再对涉及此数据类型模式节点的所有组合节点进行逻辑运算，若其他模式节点此时暂不满足条件，该规则不触发。而当系统时间大于7:00am，且小于8:00am时，Time事实数据变化进入到处理队列，同理运算相关组合节点，此时天气提醒规则的组合节点逻辑条件满足，触发该规则，访问该规则的结果节点，读取表达式语句，执行对应的动作，即进行天气提醒。
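结合上述天气提醒规则的走查过程，下面给出一段示意性的Python代码，沿用前文build_topology的示意结构，演示事实数据变化时从根节点分发到对应类型节点下的各模式节点、再由组合节点做逻辑运算并触发规则的过程（仅为示意，触发后的动作以打印代替激活节点的执行）。

```python
def on_fact_change(root, merges, fact_type, value, mod_count):
    """事实数据变化时的传播示意：先更新相关模式节点，再对组合节点做逻辑运算。"""
    for node in root["type_nodes"].get(fact_type, {}).values():
        node.match(value, mod_count)      # 变化次数不一致时才加载语义对象并更新isMatched
    for m in merges:                      # 组合节点：所有模式节点均匹配成功时才触发规则
        if all(n.is_matched for n in m["patterns"]):
            print("触发规则:", m["rule"], "->", m["action"])
```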
以上即是对本方案中的规则引擎中的规则拓扑图的介绍。接下来,基于上文所描述的规则引擎中规则拓扑图的部分或全部内容,对本申请实施例提供的一种规则引擎的执行方法进行介绍。可以理解的是,该方法是基于上文所描述的规则引擎中规则拓扑图提出,该方法中的部分或全部内容可以参见上文对规则引擎中规则拓扑图的描述。
请参阅图21,图21是本申请实施例提供的一种规则引擎的执行方法的流程示意图。可以理解,该方法可以通过任何具有计算、处理能力的装置、设备、平台、设备集群来执行。如图21所示,该规则引擎的执行方法包括:
步骤S101、确定输入规则引擎中的第一事实数据;根据第一事实数据的第一属性,从内存中获取第一语义对象对第一事实数据进行匹配,第一属性用于表征第一事实数据的变化频率。
本方案中,在使用规则引擎进行决策推理过程中,可以将事实数据输入至规则引擎中。当事实数据输入至规则引擎中后,即可以确定出第一事实数据。示例性的,事实数据可以由图17所示的根节点进入到规则引擎中。在一个例子中,第一事实数据可以为实体,意图等。
进一步地,确定出第一事实数据后,可以根据第一事实数据的第一属性,从内存中获取第一语义对象对第一事实数据进行匹配,第一属性用于表征第一事实数据的变化频率。在一个例子中,第一事实数据可以为时间或位置。在一个例子中,第一属性可以为类型,例如,当第一属性为时间类型时,则表明第一事实数据的变化频率较快。示例性的,该步骤可以由图17中所示的瞬态模式节点执行。
步骤S102、确定输入规则引擎中的第二事实数据;根据第二事实数据的第二属性,从文件中获取第二语义对象对第二事实数据进行匹配,第二属性用于表征第二事实数据的变化频率,其中,第二属性不同于第一属性。
本方案中，在使用规则引擎进行决策推理过程中，可以将事实数据输入至规则引擎中。当事实数据输入至规则引擎中后，即可以确定出第二事实数据。示例性的，事实数据可以由图17所示的根节点进入到规则引擎中。在一个例子中，第二事实数据可以为实体，意图等。
进一步地,确定出第二事实数据后,可以根据第二事实数据的第二属性,从文件中获取第二语义对象对第二事实数据进行匹配,第二属性用于表征第二事实数据的变化频率。在一个例子中,第二事实数据可以为年龄或季节。在一个例子中,第二属性可以为类型,例如,当第二属性为年龄类型时,则表明第二事实数据的变化频率较慢。在一个例子中,第二属性不同于第一属性,例如当第一属性为时间类型时,则第二属性可以为年龄类型。示例性的,该步骤可以由图17中所示的持久态模式节点执行。
步骤S103、根据第一事实数据对应的第一匹配结果和第二事实数据对应的第二匹配结果,确定是否执行第一操作。
本方案中,在得到第一事实数据对应的第一匹配结果和第二事实数据对应的第二匹配结果后,即可以根据第一匹配结果和第二匹配结果,确定是否执行第一操作。在一个例子中,第一操作可以为:提醒天气,提醒路况,提醒用户休息、娱乐或工作,推荐使用手册,或预加载动作或服务。示例性的,该步骤可以由图17中所示的组合节点执行。
进一步地,该方法中涉及的规则引擎可以包括第二节点,此时,该步骤S103可以具体为:当第一匹配结果指示匹配成功,且第二匹配结果指示匹配成功时,则可以从第二节点的语义索引指示的文件中获取第三语义对象,及执行第三语义对象对应的第一操作。示例性的,第二节点可以为图17中所示的持久态结果节点。此外,执行第三语义对象对应的第一操作可以由图17中所示的激活节点执行。
应理解的,上述步骤S101和步骤S102的执行顺序可以变换,本方案并不对此进行限定。例如先执行步骤S102,再执行步骤S101;或者,步骤S101和步骤S102同时执行,等等。
由此，本方案中，基于事实数据的属性，确定从内存或文件中加载语义对象，并基于确定的语义对象匹配事实数据，从而使得可以将规则引擎中的一部分用于匹配事实数据的语义对象存储至内存中，另一部分用于匹配事实数据的语义对象存储在文件中，进而可以释放一些冗余内存，降低了规则引擎运行过程中的内存开销，提升了规则引擎的能力。特别是，当规则引擎布置在对内存使用非常敏感的端侧平台时，基于本方案中的方法可以大幅降低端侧平台内存的开销，极大地提升了规则引擎在端侧平台的运行能力。可以理解的，本方案中提到的规则引擎的执行方法也可以应用在云侧执行，此时基于本方案中的方法则可以大幅降低云侧服务器资源的开销。应理解的是，当规则引擎的能力提升后，在利用该规则引擎进行意图识别、动作决策等时，则可以显著提升意图识别、动作决策等执行效率。尤其是，当输入到规则引擎的数据的输入方式为多模态输入时，输入的数据量较大且类型大多不同，例如有些数据变化较为频繁，而有些数据变化较慢，此时使用本方案中的规则引擎则可以从内存中加载语义对象对变化频繁的数据进行匹配，从文件中加载语义对象对变化较慢的数据进行匹配，从而可以避免变化较慢的数据对应的语义对象持续占用内存的情况，进而降低了规则引擎运行过程中的内存开销，提升了规则引擎的能力，以及提升了规则引擎的执行效率。
在一个例子中，该方法中涉及的规则引擎可以包括第一节点，该第一节点至少包括第一类型节点和第二类型节点，其中，第一类型节点与第一属性相关，第二类型节点与第二属性相关。此时，步骤S101中在确定输入规则引擎中的第一事实数据后，可以根据第一属性对应的第一类型节点的第一语义索引，从第一语义索引指示的内存中获取第一语义对象，及基于第一语义对象对第一事实数据进行匹配。示例性的，第一节点可以为图17中所示的模式节点，第一类型节点可以为图17中所示的瞬态模式节点。
此外,步骤S102在确定输入规则引擎中的第二事实数据后,则可以根据第二属性对应的第二类型节点的第二语义索引,从第二语义索引指示的文件中获取第二语义对象,及基于第二语义对象对第二事实数据进行匹配。示例性的,第二类型节点可以为图17中所示的持久态模式节点。
进一步地，在步骤S101中从第一语义索引指示的内存中获取第一语义对象之前，还可以先确定第一类型节点中记录的第一事实数据的变化次数与输入至规则引擎中的第一事实数据的变化次数不同。示例性的，第一类型节点中记录的第一事实数据的变化次数可以理解为图18中所示的模式节点的状态表中的modCount的值。在一个例子中，当第一类型节点中记录的第一事实数据的变化次数与输入至规则引擎中的第一事实数据的变化次数相同时，则可以使用第一类型节点记录的前次匹配结果作为第一匹配结果。示例性的，第一类型节点记录的前次匹配结果可以理解为图18中所示的模式节点的状态表中的isMatched。
此外，在步骤S102中从第二语义索引指示的文件中获取第二语义对象之前，也可以先确定第二类型节点中记录的第二事实数据的变化次数与输入至规则引擎中的第二事实数据的变化次数不同。示例性的，第二类型节点中记录的第二事实数据的变化次数可以理解为图18中所示的模式节点的状态表中的modCount的值。在一个例子中，当第二类型节点中记录的第二事实数据的变化次数与输入至规则引擎中的第二事实数据的变化次数相同时，则可以使用第二类型节点记录的前次匹配结果作为第二匹配结果。示例性的，第二类型节点记录的前次匹配结果可以理解为图18中所示的模式节点的状态表中的isMatched。
在一个例子中，在重构规则引擎中的规则时，可以基于确定出的第一类型节点中记录的第一事实数据的变化次数，确定是否将第一类型节点切换为第二类型节点。具体地，当第一类型节点中记录的第一事实数据的变化次数小于预设次数阈值时，则表明此时第一事实数据的变化频率较低，此时若将第一类型节点中的语义对象存储在内存中，则存在内存长期被占用的情况，因此，此时可以将第一类型节点切换为第二类型节点。
同样的，在重构规则引擎中的规则时，可以基于确定出的第二类型节点中记录的第二事实数据的变化次数，确定是否将第二类型节点切换为第一类型节点。具体地，当第二类型节点中记录的第二事实数据的变化次数大于预设次数阈值时，则表明此时第二事实数据的变化频率较快，此时若将第二类型节点中的语义对象存储在文件中，则存在语义对象加载效率慢的情况，因此，此时可以将第二类型节点切换为第一类型节点。
以上即是对本方案中的规则引擎的执行方法的介绍。接下来，基于上文所描述的规则引擎中规则拓扑图的部分或全部内容，对本申请实施例提供的一种规则引擎进行介绍。可以理解的是，该规则引擎是基于上文所描述的规则引擎中规则拓扑图提出，该规则引擎所执行的部分或全部内容可以参见上文对规则引擎中规则拓扑图的描述。
请参阅图22,图22是本申请实施例提供的一种规则引擎的结构示意图。如图22所示,该规则引擎包括:第一节点61。该第一节点61至少包括第一类型节点611和第二类型节点612。
其中，第一类型节点611可以用于根据输入规则引擎中的第一事实数据的第一属性，从内存中获取第一语义对象对第一事实数据进行匹配，得到第一匹配结果，第一属性用于表征第一事实数据的变化频率。第二类型节点612可以用于根据输入规则引擎中的第二事实数据的第二属性，从文件中获取第二语义对象对第二事实数据进行匹配，得到第二匹配结果，第二属性用于表征第二事实数据的变化频率，第二属性不同于第一属性。其中，第一匹配结果和第二匹配结果共同用于确定是否执行第一操作。示例性的，第一类型节点611可以为图17中所示的瞬态模式节点，第二类型节点612可以为图17中所示的持久态模式节点。
在一个例子中,第一事实数据包括时间和位置中的至少一项;第二事实数据包括年龄和季节中的至少一项。第一操作包括以下一项或多项:提醒天气,提醒路况,提醒用户休息、娱乐或工作,推荐使用手册,预加载动作或服务。
在一种实现中,第一类型节点611可以具体用于根据第一属性对应的第一语义索引,从第一语义索引指示的内存中获取第一语义对象,及基于第一语义对象对第一事实数据进行匹配。
第二类型节点612可以具体用于根据第二属性对应的第二语义索引,从第二语义索引指 示的文件中获取第二语义对象,及基于第二语义对象对第二事实数据进行匹配。
在一种实现中,第一类型节点611在从内存中获取第一语义对象对第一事实数据进行匹配之前,还可以用于确定第一类型节点611中记录的第一事实数据的变化次数与输入至规则引擎中的第一事实数据的变化次数不同。
在一种实现中,第二类型节点612在从文件中获取第二语义对象对第二事实数据进行匹配之前,还可以用于确定第二类型节点612中记录的第二事实数据的变化次数与输入至规则引擎中的第二事实数据的变化次数不同。
在一种实现中,第一类型节点611还可以用于在第一类型节点611中记录的第一事实数据的变化次数与输入至规则引擎中的第一事实数据的变化次数相同时,使用第一类型节点611记录的前次匹配结果作为第一匹配结果。
在一种实现中,第二类型节点612还可以用于在第二类型节点612中记录的第二事实数据的变化次数与输入至规则引擎中的第二事实数据的变化次数相同时,使用第二类型节点612记录的前次匹配结果作为第二匹配结果。
在一种实现中，该规则引擎还可以包括第二节点62。该第二节点62可以用于当第一匹配结果指示匹配成功，且第二匹配结果指示匹配成功时，从第二节点的语义索引指示的文件中获取第三语义对象，及执行第三语义对象对应的第一操作。示例性的，第二节点62可以为图17中所示的结果节点。
可以理解的是，该规则引擎中还可以包括第三节点，第四节点，第五节点和第六节点。其中，第三节点可以为图17中所示的根节点，第四节点可以为图17中所示的类型节点，第五节点可以为图17中所示的组合节点，第六节点可以为图17中所示的激活节点。其中，第一节点可以为图17中所示的模式节点，第二节点可以为图17中所示的结果节点。
可以理解的是,该规则引擎可以配置于任何具有计算、处理能力的装置、设备、平台、设备集群中。例如,该规则引擎可以配置于包含有处理器和存储器的设备中,其中,该设备可以为终端或服务器。
应当理解的是,上述规则引擎的实现原理和技术效果与上述对规则引擎中的规则拓扑图的描述类似,该规则引擎的工作过程可参考上述对规则引擎中的规则拓扑图中的对应过程,此处不再赘述。
在介绍完规则引擎606后,继续对意图识别决策系统501中的其他模块进行介绍。
继续参阅图15,意图识别决策系统501中的决策推理模块607用于为用户作出决策,即在何种设备上执行何种动作,决策执行的动作大部分为预加载动作或服务。决策推理模块607中可以维护有一个动作序列库,还可以包含有实体序列、意图和动作序列的对应关系。在一些简单场景中,决策推理模块607可以调用规则引擎606中的规则确定执行何种动作,在一些复杂场景中,决策推理模块607根据实体序列、意图和动作序列的对应关系确定在何种设备上执行何种动作。
在一个例子中,决策推理模块607中可以具有动作预测模型,该动作预测模型可以为用户做出决策。示例性的,动作预测模型可以基于上文有关意图识别模块605中意图识别模型的获取方式得到。
动作反馈模块608用于将预测出的动作序列和用户真实执行的动作序列作比较,以对预测结果是否正确做出反馈。动作反馈模块608的输入为决策推理模块607预测出的动作序列, 输出为预测结果和真实结果的比较,二者相同则反馈预测正确,反之反馈预测错误。动作反馈的结果可用于更新实体序列与意图的对应关系,以及实体序列、意图与动作序列的对应关系,例如若预测用户的意图是打开音乐播放器,决策执行的动作为后台预加载QQ音乐,但是用户实际打开的是网易云音乐,则此时动作反馈模块会将其记录下来,用于更新实体序列、意图与动作序列的对应关系。若预测用户的意图是打开音乐播放器,决策执行的动作为后台预加载QQ音乐,但用户实际操作为打开京东,则此时动作反馈模块会将其记录下来,用于更新实体序列与意图的对应关系,以及实体序列、意图与动作序列的对应关系。
在一个例子中,动作反馈模块608中可以包括多示例学习模型(图中未示出)。该多示例学习模型可以用于根据各待处理序列中连续的打点数据属于同一意图的可能性,将各待处理序列中可能不属于同一个意图的连续的打点数据划分到不同的粒度更小的子序列中,得到多个子序列。接着,动作反馈模块608可以按照预设意图规则确定出多个子序列中各个子序列的意图,其中,预设意图规则可以用于根据序列中的打点数据确定序列的意图。动作反馈模块608确定出各个子序列的意图后,即获知到用户真实执行的动作序列,进而将其与预测出的动作序列进行比较,并对预测结果是否正确做出反馈。
在一个例子中,动作反馈模块608中还可以包括多示例学习模型训练模块(图中未示出)。该多示例学习模型训练模块可以执行本方案中的多示例学习模型的训练方法。对于本方案中的多示例学习模型的训练方法详见下文描述。应理解的是,该多示例学习模型训练模块也可以配置于端侧或云侧,在此不做限定。
下面对各模块间的信息交互过程进行描述:
多模态输入模块601获取多种不同输入方式的数据,将获取到的数据发送到实体识别模块603。实体识别模块603中的实体提取单元6031从这些数据中提取特征向量,输入到从知识库602中获取的实体识别模型,输出得到识别出的实体。
由于知识库602中存储的实体识别模型是根据实体仓库单元6033中的实体训练出来的,因此,根据知识库602中的实体识别模型,实体提取单元6031即可以从这些数据中识别出实体仓库单元6033存储有的实体。在一个实体识别的时间窗格内,实体提取单元6031得到识别出的实体后,按照识别出的顺序发送给上下文模块604,由上下文模块604根据接收到的顺序保存为一个实体序列。所有历史接收到的实体按照接收到的顺序保存的实体序列可称为上下文实体。
上下文模块604将上下文实体中最新部分的实体序列(至少包含最近一个实体识别的时间窗格内识别出的实体组成的实体序列)发送给意图识别模块605。
意图识别模块605中的意图映射单元6051根据意图仓库单元6053中保存的实体序列与意图的对应关系,确定该实体序列对应的意图,将上下文模块604发送的实体序列以及意图映射单元6051确定好的意图发送给决策推理模块607。
决策推理模块607得到意图识别模块6051发送的意图和实体序列后,根据存储的实体序列、意图和动作序列的对应关系或从规则引擎606获取到的规则,确定动作序列,并发送给动作反馈模块608。
动作反馈模块608得到决策推理模块607确定的动作序列后,将该动作序列与用户真实执行的动作序列作比较,将比较结果发送至意图识别模块605和决策推理模块607。意图识别模块605根据比较结果更新意图仓库单元6053中存储的实体序列与意图的对应关系,决策 推理模块607根据比较结果更新存储的实体序列、意图与动作序列的对应关系。
以上即是对本方案中的图15所示的意图识别决策系统501的介绍。接下来,基于上文所描述的内容,对意图识别决策系统501中动作反馈模块608中的多示例模型的训练,多示例学习模型的更新过程等进行详细描述。
(1)多示例学习模型的训练方法
图23为本申请实施例中多示例学习模型的训练方法中一个数据流向示意图。图24为本申请实施例中多示例学习模型的训练方法中一个流程示意图。下面结合图23所示的数据流向示意图和图24所示的流程示意图,对本申请实施例中的多示例学习模型的训练方法进行描述:
S1301、电子设备确定初始打点数据序列;
打点数据为电子设备在本地记录的用户日常的操作数据。该初始打点数据序列中可以包括电子设备中出厂预置的打点数据和/或用户使用电子设备产生的打点数据组成。
具体对于打点数据的描述可以参阅上述术语介绍中的(11)打点数据,此处不再赘述。
该初始打点数据序列中的打点数据不需要人工标注,可作为训练数据训练多示例学习模型。
示例性的,图6所示的打点数据序列可以作为一个初始打点数据序列。
S1302、电子设备按照第一预设规则将该初始打点数据序列划分为多个分序列;
该第一预设规则用于将打点数据序列划分为不同的分序列,且一个分序列根据第二预设规则至少可以确定一个明确的意图,该第二预设规则用于确定序列的意图。具体对于第一预设规则和第二预设规则的描述可以参阅上述术语介绍中的(13)第一预设规则、第二预设规则和分序列,此处不再赘述。
示例性的，第一预设规则为：将用户每次从亮屏到息屏一系列连续操作产生的打点数据划分为一个分序列。第二预设规则为：用户息屏前关闭的最后一个使用的应用为用户的意图。可以将图6所示的打点数据序列划分为图7所示的多个分序列：B1、B2、B3。
电子设备可以将该S1302中得到的多个分序列,或S1307中得到的多个子序列,作为多个待处理序列,对该待处理序列进行特征提取训练多示例学习模型,并使用训练后的多示例学习模型将该待处理序列划分为粒度更小的序列,具体的,可以执行如下步骤:
S1303、电子设备确定该多个待处理序列中的示例和示例标签;
电子设备将多个待处理序列中相邻的两条打点数据组成一个示例。将位于同一个待处理序列中的两条打点数据组成的示例的示例标签确定为正,将位于不同待处理序列中的两条打点数据组成的示例的示例标签确定为负。具体的,对示例和示例标签的描述可以参阅上述术语描述中(14)多示例学习模型、示例和示例标签、包和包标签中对示例和示例标签的描述,此处不再赘述。
示例性的,图25为本申请实施例中确定示例和示例标签的一个示例性示意图。如图25所示,由12条打点数据组成的打点数据序列A1划分成了待处理序列B1、B2、B3。
按照多个待处理序列中相邻的两条打点数据组成一个示例,电子设备可以确定该待处理序列中的共11个示例:S1、S2、S3、S4、S5、S6、S7、S8、S9、S10、S11。
按照位于同一个待处理序列中的两条打点数据组成的示例的示例标签确定为正,将位于不同待处理序列中的两条打点数据组成的示例的示例标签确定为负,电子设备可以确定:
由同样位于待处理序列B1中的打点数据组成的示例S1、S2、S3、S4、S5、S6、S7的示例标签为正;
由同样位于待处理序列B2中的打点数据组成的示例S9的示例标签为正;
由同样位于待处理序列B3中的打点数据组成的示例S11的示例标签为正;
由分别位于待处理序列B1和B2中的打点数据组成的示例S8的示例标签为负;
由分别位于待处理序列B2和B3中的打点数据组成的示例S10的示例标签为负。
S1304、电子设备根据多个待处理序列、示例和示例标签,确定包和包标签;
电子设备确定示例和示例标签后,可以按照该示例和示例标签与多个待处理序列的关系,确定包和包标签。将由位于同一个待处理序列中的打点数据组成的示例共同作为一个包,且确定其包标签为正;将由位于一个待处理序列中的最后一个打点数据和与该待处理序列连续的下一个待处理序列中的第一个打点数据组成的示例作为一个包,且确定其包标签为负。具体的,对包和包标签的描述可以参阅上述术语描述中(14)多示例学习模型、示例和示例标签、包和包标签中对包和包标签的描述,此处不再赘述。
示例性的,图26为本申请实施例中确定包和包标签的一个示例性示意图。3个待处理序列B1、B2、B3中的11个示例共构成了5个包:
位于待处理序列B1中的打点数据组成的示例S1、S2、S3、S4、S5、S6、S7共同构成一个包L1,且其包标签为正;
位于待处理序列B2中的打点数据组成的示例S9构成一个包L3,且其包标签为正;
位于待处理序列B3中的打点数据组成的示例S11构成一个包L5,且其包标签为正;
位于待处理序列B1的最后一个打点数据和待处理序列B2的第一个打点数据组成的示例S8构成一个包L2,且其包标签为负;
位于待处理序列B2的最后一个打点数据和待处理序列B3的第一个打点数据组成的示例S10构成一个包L4,且其包标签为负。
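为便于理解示例/示例标签与包/包标签的构造方式，下面给出一段示意性的Python代码：由多个待处理序列生成相邻两条打点数据组成的示例，位于同一序列内的示例标签为正并共同构成正包，跨越序列边界的示例标签为负并单独构成负包。其中函数名与数据表示均为便于说明而假设的。

```python
def build_bags(sequences):
    """按上文规则从多个待处理序列构造示例/示例标签与包/包标签(标签1为正，0为负)。"""
    flat = [d for seq in sequences for d in seq]
    boundaries = set()                      # 跨序列相邻打点数据组成的示例位置(标签为负)
    pos = 0
    for seq in sequences[:-1]:
        pos += len(seq)
        boundaries.add(pos - 1)             # 示例(flat[i], flat[i+1])跨越序列边界
    examples = [(flat[i], flat[i + 1]) for i in range(len(flat) - 1)]
    labels = [0 if i in boundaries else 1 for i in range(len(examples))]

    bags, bag_labels, cur = [], [], []
    for ex, lb in zip(examples, labels):
        if lb == 1:
            cur.append(ex)                  # 同一序列内的示例共同构成一个正包
        else:
            if cur:
                bags.append(cur); bag_labels.append(1)
            bags.append([ex]); bag_labels.append(0)   # 跨边界示例单独构成一个负包
            cur = []
    if cur:
        bags.append(cur); bag_labels.append(1)
    return bags, bag_labels

# 用法示意：3个待处理序列(分别含8条、2条、2条打点数据)，共得到5个包L1~L5
bags, bag_labels = build_bags([list("abcdefgh"), list("ij"), list("kl")])
print(len(bags), bag_labels)   # 5 [1, 0, 1, 0, 1]
```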
S1305、电子设备从该包中提取包的特征向量矩阵。
电子设备可以从包中各示例中提取示例的特征,得到各示例的特征向量;然后将包中各示例的特征向量组成包的特征向量矩阵。具体的,对特征向量和特征向量矩阵的描述可以参阅上述术语描述中(16)打点数据序列包内示例的特征和包的特征向量矩阵的描述,此处不再赘述。
示例性的,图27为本申请实施例中提取包的特征向量矩阵的一个示例性示意图。以提取图26所示示例中得到的包L1的特征向量矩阵为例。包L1中包含示例S1、S2、S3、S4、S5、S6、S7。先分别提取各示例的特征,得到各示例的特征向量。假设各示例中的打点数据为JSON结构体,按如下9个维度提取各示例的特征为例:
(1)示例中第一条打点数据和第二条打点数据的JSON结构体的关键字的总个数;
(2)示例中第一条打点数据和第二条打点数据对应的JSON字符串的总长度
(3)示例中两条打点数据的应用程序包名的特征;
(4)示例中两条打点数据的时间戳的差;
(5)示例中两条打点数据间某些关键字的值是否相同;
(6)示例中第一条打点数据记录的操作的使用时间;
(7)示例中第一条打点数据的使用时间是否小于预设使用时间阈值;
(8)示例中第二条打点数据的使用时间是否大于平均使用时间;
(9)示例中第二条打点数据输入打点数据序列的持续时间是否小于平均持续时间。
可以理解的是,这里示例性的以上述每个特征都是一个维度的数据为例,在实际应用中,有些特征也可以是更多维度的数据,此处不作限定。
从而可以得到各示例S1~S7的9维特征向量，可分别记为x1，x2，…，x7，其中xi=(xi1, xi2, …, xi9)表示示例Si在上述9个维度上的取值（i=1, 2, …, 7）。然后可以将该包L1内7个示例的9维特征向量按行组成包的7*9的特征向量矩阵，得到包L1的特征向量矩阵N1=[x1; x2; …; x7]。
可以理解的是,在实际应用中,可以采用更多或更少的维度提取示例的特征向量,提取的各维度的特征也可以是其他类型,此处不作限定。示例的特征向量以及包的特征向量矩阵的表示和存储方式也可以采用其他的表示和存储方式,此处不作限定。
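结合上述9个维度，下面给出一段示意性的Python特征提取代码，用于从一个示例（两条相邻打点数据）提取特征向量，并将包内各示例的特征向量组成包的特征向量矩阵；其中打点数据的字段名（pkg、ts、use_time、duration、page）及各阈值参数均为便于说明而假设的。

```python
import json
import numpy as np

def example_features(d1, d2, avg_use_time, avg_duration, use_time_threshold):
    """从一个示例(两条JSON字符串形式的相邻打点数据)按上述9个维度提取特征(示意)。"""
    j1, j2 = json.loads(d1), json.loads(d2)
    return np.array([
        len(j1) + len(j2),                                  # (1) 两条打点数据关键字总个数
        len(d1) + len(d2),                                  # (2) JSON字符串总长度
        float(j1.get("pkg") == j2.get("pkg")),              # (3) 应用程序包名的特征
        j2.get("ts", 0) - j1.get("ts", 0),                  # (4) 时间戳的差
        float(j1.get("page") == j2.get("page")),            # (5) 某些关键字的值是否相同
        j1.get("use_time", 0),                              # (6) 第一条打点数据的使用时间
        float(j1.get("use_time", 0) < use_time_threshold),  # (7) 是否小于预设使用时间阈值
        float(j2.get("use_time", 0) > avg_use_time),        # (8) 是否大于平均使用时间
        float(j2.get("duration", 0) < avg_duration),        # (9) 持续时间是否小于平均持续时间
    ], dtype=float)

def bag_feature_matrix(bag, **kw):
    """将包内各示例的9维特征向量按行组成包的特征向量矩阵(示例数*9)。"""
    return np.stack([example_features(d1, d2, **kw) for d1, d2 in bag])
```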
S1306、电子设备将各个包的特征向量矩阵和包标签输入多示例学习模型,得到训练后的多示例学习模型;
多示例学习模型为一种深度学习模型。电子设备得到各个包的特征向量矩阵后,将各包的特征向量矩阵和包标签依次输入多示例学习模型,得到训练后的多示例学习模型。
可以理解的是,可以将本申请实施例中还没有被训练过的多示例学习模型称为预置多示例学习模型。在将初始打点数据序列中提取的训练数据输入多示例学习模型进行训练之前,该多示例学习模型可以为一种预置多示例学习模型。该预置多示例学习模型可以为任一种还未训练过的多示例学习模型,例如ORLR模型,Citation-kNN模型,MI-SVM模型,C4.5-MI模型,BP-MIP模型,Ensemble Learning-MIP模型等,此处不作限定。
示例性的,图28为本申请实施例中训练多示例学习模型的一个示例性示意图。电子设备将从包L1提取出的特征向量矩阵N1和包L1的包标签“正”输入多示例学习模型,接着将从包L2提取的特征向量矩阵N2和包L2的包标签“负”输入多示例学习模型,接着将从包L3提取的特征向量矩阵N3和包L3的包标签“正”输入多示例学习模型,接着将从包L4提取的特征向量矩阵N4和包L4的包标签“负”输入多示例学习模型,接着将从包L5提取的特征向量矩阵N5和包L5的包标签“正”输入多示例学习模型,然后可以得到训练后的多示例学习模型。
S1307、电子设备将该多个待处理序列,输入训练后的多示例学习模型,得到多个子序列;
本申请实施例中,该多示例学习模型用于将各待处理序列划分为更小粒度的序列,该待 处理序列可以为使用该第一预设规则将打点数据序列划分成的分序列,可以为使用该多示例学习模型将该分序列划分成更小粒度后的子序列,也可以为使用该多示例学习模型将该子序列划分成更小粒度后的子序列。
得到训练后的多示例学习模型后,电子设备可以将该多个待处理序列输入该训练后的多示例学习模型,得到多个子序列,该多个子序列的数目大于等于该多个待处理序列的数目。
示例性的,图29为本申请实施例中多示例学习模型将多个待处理序列划分为多个更小粒度的子序列的示例性示意图。将待处理序列B1、B2、B3输入训练后的多示例学习模型后,该训练后的多示例学习模型可以生成子序列Z1、Z2、Z3、Z4,其中,待处理序列B1被划分成了粒度更小的子序列Z1和Z2。
S1308、电子设备确定该训练后的多示例学习模型的损失函数的值;
损失函数是衡量预测模型在能够预测预期结果方面的表现有多好的指标。每种机器学习模型都有其对应的损失函数。模型的预测结果越好,则损失函数的值越小。
电子设备得到训练后的多示例学习模型,并用该训练后的多示例学习模型将多个待处理序列划分为多个子序列后,可以得到该训练后的多示例学习模型的损失函数的值。
示例性的,如图29所示,采用训练后的多示例学习模型将待处理序列B1、B2、B3划分为子序列Z1、Z2、Z3、Z4后,电子设备通过采用的多示例学习模型对应的损失函数计算,确定该训练后的多示例学习模型的损失函数的值为10%。
S1309、电子设备确定该损失函数的值的减小幅度是否小于预设减小幅度;
电子设备得到训练后的多示例学习模型的损失函数的值后,可以确定该损失函数的值的减小幅度是否小于预设减小幅度。
由于在初次运行之前,电子设备还没有确定过该训练后的多示例学习模型的损失函数的值,因此,在电子设备第一次得到该训练后的多示例学习模型的损失函数的值后,可以直接默认确定该损失函数的值的减小幅度不小于预设减小幅度。
当该减小幅度不小于预设减小幅度时,电子设备可以将该多个子序列作为多个待处理序列,执行步骤S1303~S1309。
当该减小幅度小于预设减小幅度时,电子设备可以执行步骤S1310。
示例性的,图30为本申请实施例中多示例学习模型迭代训练的一个示例性示意图。电子设备可以将采用训练后的多示例学习模型将待处理序列B1、B2、B3划分得到的子序列Z1、Z2、Z3、Z4作为新的待处理序列,执行步骤S1303~S1309:
确定示例和示例标签,包和包标签,提取包的特征向量。从而得到7个包,及其相应的特征向量矩阵和包标签:LZ1:NZ1和正;LZ2:NZ2和负;LZ3:NZ3和正;LZ4:NZ4和负;LZ5:NZ5和正;LZ6:NZ6和负;LZ7:NZ7和正。依次输入该训练后的多示例学习模型,从而更新该训练后的多示例学习模型。
图31为本申请实施例中多示例学习模型迭代生成子序列的一个示例性示意图。电子设备可以将上一轮划分得到的子序列，即本轮的待处理序列：Z1、Z2、Z3、Z4输入更新训练后的多示例学习模型，得到子序列Z1、Z2、Z3、Z4。
电子设备确定该更新训练后的多示例学习模型的损失函数的值还是10%。相比上一轮,损失函数的值的减小幅度为0,小于预设减小幅度5%,执行步骤S1310。
可以理解的是,根据打点数据序列中打点数据的特征不同,第一预设规则的不同,对子序列通过更新训练后的多示例学习模型重新划分后,可能得到更多的更小粒度的子序列,也 可能产生与输入相同的子序列,此处不作限定。
可以理解的是，若更新训练后的多示例学习模型将本轮的待处理序列划分为了更多的更小粒度的子序列，且得到的更新训练后的多示例学习模型的损失函数的值相比上一轮得到的训练后的多示例学习模型的损失函数的值的减小幅度不小于预设减小幅度，则可以将得到的多个子序列作为多个待处理序列，再次执行步骤S1303~S1309。直到某一轮损失函数的值的减小幅度小于预设减小幅度，则执行步骤S1310。
可选的，在一些实施例中，在二分类多示例学习模型中，示例的标签只有两个值时，例如只有0和1，或，-1和1等时，上述损失函数可以为交叉熵损失函数，交叉熵损失函数以多示例学习模型计算出的交叉熵作为损失函数的值。可以在确定某一轮训练得到的多示例学习模型计算出的交叉熵相比于上一轮训练得到的多示例学习模型计算出的交叉熵的减小幅度小于预设减小幅度时，确定得到了训练完成的多示例学习模型。
S1310、电子设备确定该训练后的多示例学习模型为训练完成的多示例学习模型。
在确定本轮训练后的多示例学习模型的损失函数的值相比上一轮训练后的多示例学习模型的损失函数的值的减小幅度小于预设减小幅度时，电子设备确定本轮训练后的多示例学习模型为使用该初始打点数据序列训练完成的多示例学习模型。
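为便于理解上述S1303~S1310的迭代训练流程，下面给出一段示意性的Python代码，沿用前文build_bags示意函数；其中mil_model需提供fit（以包及包标签训练）、split（将待处理序列划分为子序列）、loss（返回损失函数的值）三个接口，这些接口名与收敛阈值均为便于说明而假设的。

```python
def train_mil(initial_sequences, mil_model, min_decrease=0.05):
    """多示例学习模型迭代训练流程的示意实现(对应S1303~S1310)。"""
    pending = initial_sequences              # 由第一预设规则划分得到的多个分序列
    prev_loss = None
    while True:
        bags, bag_labels = build_bags(pending)        # S1303~S1304：确定示例/包及标签
        mil_model.fit(bags, bag_labels)               # S1305~S1306：提取特征并训练模型
        pending = mil_model.split(pending)            # S1307：划分为粒度更小的子序列
        loss = mil_model.loss()                       # S1308：确定损失函数的值
        if prev_loss is not None and prev_loss - loss < min_decrease:
            break                                     # S1309：减小幅度小于预设减小幅度
        prev_loss = loss
    return mil_model                                  # S1310：训练完成的多示例学习模型
```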
本申请实施例中,可以直接使用没有进行人工标注的初始打点数据序列对多示例学习模型进行训练,得到可以将打点数据序列划分为更小粒度的多个子序列的训练完成的多示例学习模型,实现了对用户打点数据的自标注。在大量节省了训练意图识别模型进行数据标注的人工成本的同时,使得数据的标注也更加准确,提升了意图识别的准确性。示例性的,当打点数据的输入方式为多模态输入时,由于打点数据的组成可以是多样化的,这使得人工标注训练数据的时间显著增加,而通过本申请实施例中的模型训练方法,则可以显著节省训练意图识别模型进行数据标注的人工成本,以及增加数据标注的准确性,进而提升意图识别的准确性。
(2)多示例学习模型的更新过程
图32为本申请实施例中多示例学习模型的更新过程一个数据流向示意图。图33为本申请实施例中多示例学习模型的更新过程一个流程示意图。下面结合图32所示的数据流向示意图和图33所示的流程示意图,对本申请实施例中的多示例学习模型的更新过程进行描述:
S2501、电子设备确定新增打点数据序列;
电子设备可以将在用户使用该电子设备的过程中,电子设备可以在本地记录用户的操作数据作为打点数据。电子设备可以在新产生的没有作为多示例学习模型的训练数据的打点数据累积达到预设数目阈值时,将这些打点数据组成新增打点数据序列;也可以将预设周期内(例如,每天或每周等)新产生的没有作为多示例学习的训练数据的打点数据组成新增打点数据序列,此处不作限定。
S2502、电子设备将该新增打点数据序列输入多示例学习模型,得到多个子序列;
对于之前已经训练完成的多示例学习模型,这里可以继续在之前训练完成的基础上继续使用新增打点数据训练,更新训练完成的多示例学习模型。这个过程也可以称为对多示例学习模型进行增量训练。
具体的,电子设备可以将该新增打点数据序列输入当前已经训练完成的多示例学习模型,得到多个子序列。具体可以参考步骤S2202,此处不再赘述。
电子设备可以将该S2502得到的多个子序列,或S2507中得到的多个子序列,作为多个待处理序列,对该待处理序列进行特征提取训练多示例学习模型,得到更新训练完成的多示例学习模型,具体的,可以执行如下步骤:
S2503、电子设备确定该多个待处理序列中的示例和示例标签;
S2504、电子设备根据多个待处理序列、示例和示例标签,确定包和包标签;
S2505、电子设备从该包中提取包的特征向量矩阵;
S2506、电子设备将各个包的特征向量矩阵和包标签输入多示例学习模型,得到训练后的多示例学习模型;
S2507、电子设备将该多个待处理序列,输入训练后的多示例学习模型,得到多个子序列;
S2508、电子设备确定该训练后的多示例学习模型的损失函数的值;
S2509、电子设备确定该损失函数的值的减小幅度是否小于预设减小幅度;
步骤S2503~S2509与步骤S1303~S1309类似,可参考对步骤S1303~S1309的描述,此处不再赘述。
S2510、电子设备确定该训练后的多示例学习模型为更新训练完成的多示例学习模型;
在确定本轮训练后的多示例学习模型的损失函数的值相比上一轮训练后的多示例学习模型的损失函数的值的减小幅度小于预设减小幅度时，电子设备确定本轮训练后的多示例学习模型为使用该新增打点数据序列更新训练完成的多示例学习模型。
本申请实施例中,电子设备可以使用新增的打点数据组成新增打点数据序列对多示例学习模型进行更新训练,使得多示例学习模型更符合用户个性化的需求,且划分的子序列更加准确,从而使得意图识别结果更加符合用户期望。
可以理解的是,上面实施例中,多示例学习模型的训练方法和多示例学习模型的更新过程中步骤都可以由电子设备执行。在实际应用中,可选的,电子设备可以将打点数据序列发送给服务器,由服务器进行多示例学习模型训练后,将训练完成或更新训练完成的多示例学习模型发送给电子设备使用,此处不作限定。
示例性的，图34为本申请实施例中多示例学习模型的训练方法一个交互示意图。对于多示例学习模型的训练方法，其过程可以为：
S2601、电子设备确定初始打点数据序列;
与步骤S1301类似,此处不作赘述。
S2602、电子设备将该初始打点数据序列发送给服务器；
S2603、服务器按照第一预设规则将该初始打点数据序列划分为多个分序列;
S2604、服务器确定该多个待处理序列中的示例和示例标签;
S2605、服务器根据多个待处理序列、示例和示例标签,确定包和包标签;
S2606、服务器从该包中提取包的特征向量矩阵;
S2607、服务器将各个包的特征向量矩阵和包标签输入多示例学习模型,得到训练后的多示例学习模型;
S2608、服务器将该多个待处理序列,输入训练后的多示例学习模型,得到多个子序列;
S2609、服务器确定该训练后的多示例学习模型的损失函数的值;
S2610、服务器确定该损失函数的值的减小幅度是否小于预设减小幅度;
S2611、服务器确定该训练后的多示例学习模型为训练完成的多示例学习模型;
步骤S2603~S2611由服务器执行，其执行的具体动作与步骤S1302~S1310中电子设备执行的具体动作类似，此处不作赘述。
S2612、服务器将该训练完成的多示例学习模型发送给电子设备。
本申请实施例中,由服务器完成多示例学习模型的训练工作,节省了电子设备的处理资源,提升了多示例学习模型的训练效率。
示例性的，图35为本申请实施例中多示例学习模型的更新训练过程一个交互示意图。对于多示例学习模型的更新训练，其过程可以为：
S2701、电子设备确定新增打点数据序列;
与步骤S2501类似,此处不作赘述。
S2702、电子设备将该新增打点数据序列发送给服务器;
S2703、服务器将该新增打点数据序列输入多示例学习模型,得到多个子序列;
S2704、服务器确定该多个待处理序列中的示例和示例标签;
S2705、服务器根据多个待处理序列、示例和示例标签,确定包和包标签;
S2706、服务器从该包中提取包的特征向量矩阵;
S2707、服务器将各个包的特征向量矩阵和包标签输入多示例学习模型,得到训练后的多示例学习模型;
S2708、服务器将该多个待处理序列,输入训练后的多示例学习模型,得到多个子序列;
S2709、服务器确定该训练后的多示例学习模型的损失函数的值;
S2710、服务器确定该损失函数的值的减小幅度是否小于预设减小幅度;
S2711、服务器确定该训练后的多示例学习模型为更新训练完成的多示例学习模型;
步骤S2703~S2711由服务器执行，其执行的具体动作与步骤S2502~S2510中电子设备执行的具体动作类似，此处不作赘述。
S2712、服务器将该更新训练完成的多示例学习模型发送给电子设备。
本申请实施例中,由服务器完成多示例学习模型的更新训练工作,节省了电子设备的处理资源,提升了多示例学习模型的更新训练效率。
可以理解的是,在多示例学习模型的更新训练效率提升的同时,可以使得多示例学习模型中的各个参数处于最佳状态,从而使得该多示例学习模型可以准确的确定出打点数据序列对应的子序列,进而可以基于确定出的子序列准确的识别出用户的意图,提升了用户意图识别的准确性。
以上即是对动作反馈模块608中的多示例模型的训练,多示例学习模型的更新过程等的介绍。接下来对意图识别决策系统501中意图识别模块605中的意图识别模型的训练进行介绍。
(1)利用生成式对抗网络的特性,得到意图识别模型
需要说明的是,本方案中利用生成式对抗网络的特性,得到意图识别模型,可以基于图36所示的人工智能框架,以及图37和38所示的应用环境实现。
其中,图36示出一种人工智能主体框架示意图,该主体框架描述了人工智能系统总体工作流程,适用于通用的人工智能领域需求。
下面从“智能信息链”（水平轴）和“IT价值链”（垂直轴）两个维度对上述人工智能主体框架进行阐述。
“智能信息链”反映从数据的获取到处理的一系列过程。举例来说，可以是智能信息感知、智能信息表示与形成、智能推理、智能决策、智能执行与输出的一般过程。在这个过程中，数据经历了“数据—信息—知识—智慧”的凝练过程。
“IT价值链”从人工智能的底层基础设施、信息（提供和处理技术实现）到系统的产业生态过程，反映人工智能为信息技术产业带来的价值。
(a)基础设施
基础设施为人工智能系统提供计算能力支持,实现与外部世界的沟通,并通过基础平台实现支撑。通过传感器与外部沟通;计算能力由智能芯片(CPU、NPU、GPU、ASIC、FPGA等硬件加速芯片)提供;基础平台包括分布式计算框架及网络等相关的平台保障和支持,可以包括云存储和计算、互联互通网络等。
举例来说,传感器和外部沟通获取数据,这些数据提供给基础平台提供的分布式计算系统中的智能芯片进行计算。
(b)数据
基础设施的上一层的数据用于表示人工智能领域的数据来源。数据涉及到图形、图像、语音、文本,还涉及到传统设备的物联网数据,包括已有系统的业务数据以及力、位移、液位、温度、湿度等感知数据。
(c)数据处理
数据处理通常包括数据训练,机器学习,深度学习,搜索,推理,决策等方式。
其中,机器学习和深度学习可以对数据进行符号化和形式化的智能信息建模、抽取、预处理、训练等。
推理是指在计算机或智能系统中,模拟人类的智能推理方式,依据推理控制策略,利用形式化的信息进行机器思维和求解问题的过程,典型的功能是搜索与匹配。
决策是指智能信息经过推理后进行决策的过程,通常提供分类、排序、预测等功能。
(d)通用能力
对数据经过上面提到的数据处理后,进一步基于数据处理的结果可以形成一些通用的能力,比如可以是算法或者一个通用系统,例如,翻译,文本的分析,计算机视觉的处理,语音识别,图像的识别等等。
(e)智能产品及行业应用
智能产品及行业应用指人工智能系统在各领域的产品和应用,是对人工智能整体解决方案的封装,将智能信息决策产品化、实现落地应用,其应用领域主要包括:智能制造、智能交通、智能家居、智能医疗、智能安防、自动驾驶,平安城市,智能终端等。
下面将对本方案中涉及的神经网络的训练过程进行示例性的说明。
参见图37,本发明实施例提供了一种应用环境示意图200,示例性地,本申请实施例所涉及的服务器可以为图37中的执行设备210,客户端可以为图37所示的客户设备240。
数据采集设备260用于采集模拟数据和/或测试数据作为输入数据并存入数据库230，训练设备220基于数据库230中维护的输入数据生成目标模型/规则201。下面将更详细地描述训练设备220如何基于输入数据得到目标模型/规则201。
深度神经网络中的每一层的工作可以用数学表达式y=a(W·x+b)来描述：从物理层面深度神经网络中的每一层的工作可以理解为通过五种对输入空间（输入向量的集合）的操作，完成输入空间到输出空间的变换（即矩阵的行空间到列空间），这五种操作包括：1、升维/降维；2、放大/缩小；3、旋转；4、平移；5、“弯曲”。其中1、2、3的操作由W·x完成，4的操作由+b完成，5的操作则由a()来实现。这里之所以用“空间”二字来表述是因为被分类的对象并不是单个事物，而是一类事物，空间是指这类事物所有个体的集合。其中，W是权重向量，该向量中的每一个值表示该层神经网络中的一个神经元的权重值。该向量W决定着上文所述的输入空间到输出空间的空间变换，即每一层的权重W控制着如何变换空间。训练深度神经网络的目的，也就是最终得到训练好的神经网络的所有层的权重矩阵（由很多层的向量W形成的权重矩阵）。因此，神经网络的训练过程本质上就是学习控制空间变换的方式，更具体的就是学习权重矩阵。
因为希望深度神经网络的输出尽可能的接近真正想要预测的值,所以可以通过比较当前网络的预测值和真正想要的目标值,再根据两者之间的差异情况来更新每一层神经网络的权重向量(当然,在第一次更新之前通常会有初始化的过程,即为深度神经网络中的各层预先配置参数)。比如,如果网络的预测值高了,就调整权重向量让它预测低一些,不断的调整,直到神经网络能够预测出真正想要的目标值。因此,就需要预先定义“如何比较预测值和目标值之间的差异”,这便是损失函数(loss function)或目标函数(objective function),它们是用于衡量预测值和目标值的差异的重要方程。其中,以损失函数举例,损失函数的输出值(loss)越高表示差异越大,那么深度神经网络的训练就变成了尽可能缩小这个loss的过程。
训练设备220得到的目标模型/规则可以应用不同的系统或设备中。在图37中,执行设备210配置有I/O接口212,与外部设备进行数据交互,“用户”可以通过客户设备240向I/O接口212输入数据。
执行设备210可以调用数据存储系统250中的数据、代码等，也可以将数据、指令等存入数据存储系统250中。其中，本申请实施例中的信号检测装置可以包括该执行设备210实现神经网络的处理过程，或者是通过外接该执行设备210以实现神经网络的处理过程，此处不做限定。
计算模块211使用目标模型/规则201对输入的数据进行处理。
最后,I/O接口212将处理结果返回给客户设备240,提供给用户。
更深层地,训练设备220可以针对不同的目标,基于不同的数据生成相应的目标模型/规则201,以给用户提供更佳的结果。
在附图37中所示情况下,用户可以手动指定输入执行设备210中的数据,例如,在I/O接口212提供的界面中操作。另一种情况下,客户设备240可以自动地向I/O接口212输入数据并获得结果,如果客户设备240自动输入数据需要获得用户的授权,用户可以在客户设备240中设置相应权限。用户可以在客户设备240查看执行设备210输出的结果,具体的呈现形式可以是显示、声音、动作等具体方式。客户设备240也可以作为数据采集端将采集到的数据存入数据库230。
值得注意的,附图37仅是本发明实施例提供的一种系统架构的示意图,图中所示设备、器件、模块等之间的位置关系不构成任何限制,例如,在附图37中,数据存储系统250相对执行设备210是外部存储器,在其它情况下,也可以将数据存储系统250置于执行设备210中。
参见图38,本发明实施例提供了另一种应用环境示意图300,示例性地,本申请实施例所涉及的服务器可以为图38中的执行设备310,客户端可以为图38所示的本地设备301及本地设备302。执行设备310由一个或多个服务器实现,可选的,与其它计算设备配合,例如:数据存储、路由器、负载均衡器等设备;执行设备310可以布置在一个物理站点上,或者分布在多个物理站点上。执行设备310可以使用数据存储系统350中的数据,或者调用数 据存储系统350中的程序代码实现相关步骤操作。
用户可以操作各自的用户设备(例如本地设备301和本地设备302)与执行设备310进行交互。每个本地设备可以表示任何计算设备,例如个人计算机、计算机工作站、智能手机、平板电脑、智能摄像头、智能汽车或其他类型蜂窝电话、媒体消费设备、可穿戴设备、机顶盒、游戏机等。
每个用户的本地设备可以通过任何通信机制/通信标准的通信网络与执行设备310进行交互,通信网络可以是广域网、局域网、点对点连接等方式,或它们的任意组合。
在另一种实现中,执行设备310的一个方面或多个方面可以由每个本地设备实现,例如,本地设备301可以为执行设备310提供本地数据或反馈计算结果。
需要注意的，执行设备310的所有功能也可以由本地设备实现。例如，本地设备301实现执行设备310的功能并为自己的用户提供服务，或者为本地设备302的用户提供服务。
目前,在AI领域中,研究人员可以按照不同的需求,通过深度学习的方式得到不同的预测模型,并通过预测模型实现相对应的人工智能应用。以模拟数据生成的应用为例,一般来说,客户端需要预先采集用户的真实数据并发送给服务器,然后服务器经过机器学习的方法进行训练,提取得到真实数据对应的数据特征,然后根据数据特征生成模拟数据。该方法可以应用于前述图37或图38所示应用环境中。
具体来说,通过该方法的传统实现过程可以存在以下两种示例性的实施方案。
一种实现过程中，可以利用存储在客户端设备上的用户数据集来标记训练数据，而无需将用户数据暴露给训练服务器。使用由服务器提供的生成式对抗网络（GAN）和少量有标签数据样本，客户端设备可以基于存储在客户端设备中的用户数据执行半监督学习。然后可以将无标签训练数据单元提供给客户端设备。客户端设备上的已训练模型可以生成由服务器提供的无标签训练数据单元的拟议标签。由客户端设备提供的拟议标签被私有化，以掩蔽拟议标签与提出该标签的用户和/或客户端设备之间的关系。可以在服务器上分析拟议标签集，以确定无标签数据单元最受欢迎的拟议标签。一旦标记了训练数据集中的每个数据单元，则服务器可以使用该训练数据集来训练未训练的机器学习模型或改善预训练模型的准确性。在该实现过程中，存在的缺点至少包括：需要收集有标签的真实数据；且使用众多设备对无标签数据进行拟标注，标注结果是有偏的。然后进行生成对抗网络训练，训练结果不能完全拟合真实数据。
另一种实现过程中,可以用于已有基于深度学习的视频分类模型的数据增强,具体包括以下步骤:1)构建视频各动作类别的动态信息图像;2)利用各类所述动态信息图像分别训练生成相应动作类别动态信息图像的生成对抗网络;3)利用训练好的生成对抗网络生成所需数量的动态信息图像;4)将步骤1)和步骤3)两种方法生成的动态信息图像按比例混合后作为训练数据,对已有基于深度学习的视频分类模型进行训练。在该实现过程中,存在的缺点至少包括:少量真实数据可能是有偏的,构建的生成对抗网络生成器生成的数据也可能是有偏的。
此外,也可以通过生成式对抗网络(generative adversarial networks,GAN)建立的学习框架包括一个神经网络(生成器)试图生成接近真实的数据和另一个网络(判别器)试图区分真实的数据和由生成网络生成的数据。第一阶段,固定判别器,训练生成器,使生成的数据能“骗过”判别器,判别器无法区分真实数据与生成数据;第二阶段固定生成器,训练判别器,提高判别器的鉴别能力,以区分真实数据与生成数据。两个阶段不断循环,生成器网络使用判别器作为损耗函数,并更新其参数以生成看起来更真实的数据,使生成数据无限 接近真实数据。然而,传统基于GAN生成数据方案大多是使用真实环境数据,仅需要考虑原始数据分布与生成数据分发是否一致。
在上述模拟数据生成的应用中,仅考虑真实数据的特征分布来生成模拟数据,而由于参与训练的真实数据是有限的,存在一定的偏差,容易导致所生成的模拟数据也存在相应的偏差,使得所生成的模拟数据质量较差。具体来说,传统方案中都是使用真实环境数据,仅需要考虑生成的数据分布是否与原始数据分布一致,并没有考虑原始数据分布可能与真实数据是有偏差的。然而,在意图识别训练过程中,由于隐私条款等因素,很多业务只能从现网收集到运营打点,而并非原始数据。要想收集比较全的数据,依赖于有限的签约测试(Beta)用户数据,能够收集的数据量有限,且Beta用户的分布往往不能得到保证,与真实现网用户数据分布有很大差异,导致模型训练效果与真实现网使用的效果有比较大的差距。另一方面,由于训练出来的模型发布到现网,通过运营数据再重新进行调整模型参数,整个模型调优、反馈的周期比较长。
也即是说,本方案中可以利用少量、有偏的训练数据,构造无偏意图识别模型。基于有偏的训练数据,以及真实环境的反馈数据,构造能够生成无偏虚拟数据的模拟器,在模拟器上进行训练从而得到无偏模型。
请参阅图39,本申请实施例提供了一种基于神经网络的数据处理方法,包括如下步骤:
S201、将测试数据输入至第一生成器,经过所述第一生成器处理后得到第一模拟数据。
本实施例中,服务器将测试数据作为第一生成器的输入,经过该第一生成器处理后得到测试数据对应的第一模拟数据。
S202、将所述测试数据和所述第一模拟数据输入至所述第一判别器,经过所述第一判别器处理后得到第一判别结果;
本实施例中,服务器将步骤S201中的测试数据和第一模拟数据输入至第一判别器,经过第一判别器处理后得到第一判别结果,其中,第一判别结果用于指示该测试数据和该第一模拟数据之间的差异。
本实施例中,第一判别器可以是神经网络或者是其他机器学习、强化学习模型等,用于判断一条给定数据是测试数据还是虚拟生成的第一模拟数据。通过优化2分类的分类损失(hinge loss,logit loss,mse等),使得第一判别器能完全区分测试数据还是虚拟生成的第一模拟数据。
S203、根据所述第一判别结果更新所述第一生成器的权重系数,得到第二生成器;
本实施例中，服务器根据步骤S202处理得到的第一判别结果更新第一生成器中的权重系数，得到第二生成器。
在一种可能的实现方式中,服务器根据该第一判别结果更新该第一生成器的权重系数,得到第二生成器包括:若满足第二条件,则根据该第一判别结果更新该第一生成器的权重系数,得到第二生成器;其中,该第二条件包括:在该测试数据和该第一模拟数据之间的经验分布度量小于第四预设值时;和/或,在该第一判别器对应的损失函数的取值大于第五预设值时。
本实施例中,服务器可以在满足上述第二条件时再执行根据第一判别结果更新该第一生成器的权重系数的过程,即通过第二条件的限制,在第一判别器的模型效果达到一定条件时,服务器才执行更新第一生成器的权重系数的过程,可以进一步优化更新得到的第二生成器所生成的第二模拟数据的数据质量。
具体地,在该第二条件中,该测试数据和该第一模拟数据之间的经验分布度量小于第四预设值,即使得测试数据和该第一模拟数据之间的经验分布度量最小化。其中,经验分布度量具体可以包括KL散度(KL divergence)、瓦瑟斯坦距离(Wasserstein distance)或者其它的取值实现,此处不做限定。此外,第四预设值的大小可以按照方案实施场景的不同选用不同的取值,例如0.001、0.01或者其它的取值,此处不做限定。
类似的,在该第二条件中,第一判别器对应的损失函数的取值大于第五预设值,即使得第一判别器对应的损失函数的取值最大化。其中,第一判别器的损失函数可以通过铰链损失函数(hinge loss function)、交叉熵损失函数(cross-entropy loss function)、指数损失函数(exponential loss function)或者是通过其它的损失函数对应实现,此处不做限定。此外,第五预设值的大小也可以按照损失函数的不同设置而选用不同的取值,此处不做限定。
在一种可能的实现方式中,在该第二生成器中生成第二模拟数据之前,若不满足该第二条件时,该方法还包括:将该测试数据输入至该第二生成器,经过该第二生成器处理后得到第四模拟数据;将该测试数据和该第四模拟数据输入至该第一判别器,经过该第一判别器处理后得到第三判别结果,该第三判别结果用于指示该测试数据和该第四模拟数据之间的差异;根据该第三判别结果更新该第二生成器的权重系数。
本实施例中,服务器可以在不满足上述第二条件时,执行将测试数据输入至第二生成器,并通过第一判别器的进一步处理得到用于更新第二生成器的第三判别结果,即可以进一步利用生成式对抗网络的特性,对第二生成器的权重系数进行优化。
S204、在所述第二生成器中生成第二模拟数据。
本实施例中,服务器根据步骤S203所更新得到的第二生成器中,生成第二模拟数据。
本实施例中，服务器首先将测试数据输入至第一生成器，经过该第一生成器处理后得到第一模拟数据；然后，服务器将该测试数据和该第一模拟数据输入至该第一判别器，经过该第一判别器处理后得到第一判别结果，该第一判别结果用于指示该测试数据和该第一模拟数据之间的差异；此后，服务器再根据该第一判别结果更新该第一生成器的权重系数，得到第二生成器；最后，服务器在该第二生成器中生成第二模拟数据。其中，服务器通过生成式对抗神经网络中的第一生成器和第一判别器的处理过程，对第一生成器中权重系数进行更新优化以得到第二生成器，利用生成式对抗网络的特性，降低在生成器中所生成的模拟数据与原始输入的测试数据之间的偏差，从而，提升神经网络所生成的模拟数据的数据质量，进而为后续基于该模拟数据训练意图识别模型提供了良好的基础，使得后续训练出的意图识别模型的精准度较高，进而提升了意图识别的准确性。
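为便于理解上述S201~S203中生成器与判别器的交替优化过程，下面给出一段示意性的PyTorch代码：先固定生成器优化判别器的二分类损失（对应第一判别结果），再固定判别器更新生成器权重系数（对应得到第二生成器）。其中网络结构、维度与学习率均为便于说明而假设的，并非本方案的限定实现。

```python
import torch
import torch.nn as nn

# 极简的生成器/判别器示意(维度与网络结构均为假设)
G = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 16))
D = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))
bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)

def train_step(test_batch):
    """S201~S203的一次迭代示意：先优化判别器区分测试数据与第一模拟数据，
    再依据判别结果更新生成器的权重系数。"""
    fake = G(test_batch)                                   # S201：生成第一模拟数据
    # 固定生成器，优化判别器的二分类损失(对应第一判别结果)
    d_loss = bce(D(test_batch), torch.ones(len(test_batch), 1)) + \
             bce(D(fake.detach()), torch.zeros(len(test_batch), 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()     # S202
    # 固定判别器，更新生成器，使生成的模拟数据"骗过"判别器
    g_loss = bce(D(G(test_batch)), torch.ones(len(test_batch), 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()     # S203
    return d_loss.item(), g_loss.item()

train_step(torch.randn(8, 16))   # 用一批(假设的)测试数据特征向量迭代一次
```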
请参阅图41a,本申请实施例提供了一种基于神经网络的数据处理方法,包括如下步骤。
S301、将测试数据输入至第一生成器,经过所述第一生成器处理后得到第一模拟数据;
本实施例中,服务器将测试数据作为第一生成器的输入,经过该第一生成器处理后得到测试数据对应的第一模拟数据。
本实施例中,测试数据可以是使用少量的测试(Beta)数据,可选的,还可以加上人工标注的训练数据,即人工标注数据可以和Beta用户数据合在一起,作为Beta用户数据的扩充得到步骤S201中的测试数据。此处对Beta数据的获取进行示例性说明,其中,Beta用户原始数据格式如图40所示,“手机太亮太费电了”、“让屏幕色温恢复正常”、“使用偏冷显示”、“手机的亮度太暗了”、“不在主页面显示所有图标”是用户使用数据采集设备的语音助手说的语料。(“10(设置)setting”、“5(桌面)smarthome”、“5(时钟)clock”、“5(旅 行助手)tripassistant”)是用户可能想要执行的意图,前面的数字越大,代表该意图越符合用户预期。通过提取语料的word2Vec或n-gram特征,以及一些召回来源、召回类别等特征,可以将原始特征映射为用于训练的数字特征向量,并将该数字特征向量作为测试数据的一种实现。显然,针对不同的应用实现场景,测试数据还可以通过一维或者是多维的张量形式实现,而不仅仅限于向量这一实现,此处不作具体的限定。
S302、将所述测试数据和所述第一模拟数据输入至所述第一判别器,经过所述第一判别器处理后得到第一判别结果,所述第一判别结果用于指示所述测试数据和所述第一模拟数据之间的差异;
本实施例中,服务器将步骤S301中的测试数据和第一模拟数据输入至第一判别器,经过第一判别器处理后得到第一判别结果,其中,第一判别结果用于指示该测试数据和该第一模拟数据之间的差异。
S303、根据所述第一判别结果更新所述第一生成器的权重系数,得到第二生成器;
本实施例中，服务器根据步骤S302处理得到的第一判别结果更新第一生成器中的权重系数，得到第二生成器。
S304、在所述第二生成器中生成第二模拟数据。
本实施例中,服务器根据步骤S303所更新得到的第二生成器中,生成第二模拟数据。
本实施例中,步骤S301至步骤S304的实现过程可以参考前述步骤S201至步骤S204的实现过程,此处不再赘述。
S305、利用第一目标模拟数据输入预设的训练网络,训练得到预测模型。
本实施例中,服务器利用第一目标模拟数据输入预设的训练网络,训练得到预测模型,其中,该第一目标模拟数据包括步骤S304得到的第二模拟数据。可以理解的是,由于第一目标模拟数据的数据质量较高,因此基于该第一目标模拟数据训练出的预测模型的精准度也较高。当该预测模型为意图识别模型时,则该意图识别模型的意图识别的准确性也较高,也即是说,通过图41a中的方法训练得到的意图识别模型可以准确的识别出用户的意图。
在本申请实施例第一方面的一种可能的实现方式中,该预测模型为意图决策模型(如:意图识别模型)。
本实施例中,该方法可以应用于意图决策判别过程中,相对应的,该预测模型在该过程中可以为意图决策模型(如:意图识别模型),从而,提供了该预测模型的一种具体的实现方式,提升方案的可实现性。此外,该预测模型还可以应用于其它的应用场景实现对应的模型,例如该预测模型还可以为感知模型、推理模型或者是其它的模型实现,此处不做限定。
在一种可能的实现方式中,该第一目标模拟数据还包括该测试数据。
本实施例中,服务器输入到预设的训练网络进行训练得到预测模型的输入数据中,该第一目标模拟数据还可以包括测试数据,可以进一步丰富训练网络的输入,使得训练网络可以训练得到更多的数据特征,从而提升预测模型在后续执行预测过程的预测效果。
在一种可能的实现方式中,服务器在该第二生成器中生成第二模拟数据之后,该方法还包括:该服务器利用第一目标模拟数据输入预设的训练网络,训练得到预测模型,该第一目标模拟数据包括该第二模拟数据。
本实施例中,服务器可以利用生成式对抗网络得到的第二生成器所生成的第二模拟数据,作为预设的训练网络的输入数据的一部分,进行训练得到预测模型,由于该第二模拟数据与原始输入的测试数据之间的偏差较小,因此,通过该第二模拟数据参与训练网络的训练过程, 可以提升后续得到的预测模型的预测效果,使得在模拟环境中训练得到较优的预测模型。
S306、将第二目标模拟数据输入所述预测模型,经过所述预测模型处理得到目标预测结果。
本实施例中,服务器将第二目标模拟数据输入步骤S305得到的预测模型,经过该预测模型处理得到目标预测结果,其中,该第二目标模拟数据包括步骤S304得到的第二模拟数据。
在一种可能的实现方式中,该方法还包括:服务器将第二目标模拟数据输入该预测模型,经过该预测模型处理得到目标预测结果,该第二目标模拟数据包括该第二模拟数据。
本实施例中,服务器可以利用生成式对抗网络得到的第二生成器所生成的第二模拟数据,作为预测模型的输入数据的一部分,即得到所生成的模拟数据在预测模型中对应的目标预测结果,解决预测模型中训练数据过少的问题。
S307、向客户端发送所述预测模型;
本实施例中,服务器向客户端发送步骤S305得到的预测模型。
S308、获取用户操作数据;
本实施例中,客户端获取得到用户操作数据。
在一种可能的实现方式中,客户端获取用户操作数据的过程具体包括:客户端响应于用户操作,获取该用户操作对应的初始操作数据;此后,该客户端提取该初始操作数据的数据特征,得到该用户操作数据。
本实施例中,客户端可以通过获取用户操作对应的初始操作数据并进行特征提取的方式,获取得到输入到预测模型中的用户操作数据,提供了客户端获取用户操作数据的一种具体的实现方式,提升方案的可实现性。
S309、将所述用户操作数据输入至所述预测模型,经过训练得到初始预测结果;
本实施例中,客户端将步骤S308得到的用户操作数据输入至步骤S307接收得到的预测模型,经过训练得到初始预测结果。
S310、向所述服务器发送所述初始预测结果,所述初始预测结果用于作为判别器的输入,经过所述判别器的处理得到用于更新生成器权重系数的判别结果。
本实施例中,客户端向所述服务器发送所述初始预测结果,其中,该初始预测结果用于作为判别器的输入,经过判别器的处理得到用于更新生成器权重系数的判别结果;相应的,服务器在步骤S310中,接收所述客户端发送的初始预测结果,所述初始预测结果为所述预测模型对用户操作数据进行训练得到。
S311、将所述目标预测结果和所述初始预测结果输入至第二判别器进行训练,输出第二判别结果;
本实施例中,服务器将步骤S306得到的目标预测结果和步骤S310接收得到的初始预测结果输入至第二判别器进行训练,输出第二判别结果,其中,该第二判别结果用于指示目标预测结果和初始预测结果之间的差异。
本实施例中,第二判别器可以是神经网络或者是其他机器学习、强化学习模型等,用于判断一条给定输出数据是由开发环境虚拟数据使用模型预测产生的目标预测结果还是由现网环境真实数据使用模型预测产生的初始预测结果。通过优化2分类的分类损失(hinge loss,logit loss,mse等),使得第二判别器能完全区分目标预测结果和初始预测结果。
S312、根据所述第二判别结果更新所述第二生成器的权重系数,得到第三生成器;
本实施例中,服务器根据步骤S311得到的第二判别结果更新第二生成器的权重系数,得 到第三生成器。
S313、在所述第三生成器中生成第三模拟数据。
本实施例中,服务器在步骤S312得到的第三生成器中生成第三模拟数据。
在一种可能的实现方式中,该方法还包括:服务器向客户端发送该预测模型;然后,该服务器接收该客户端发送的初始预测结果,该初始预测结果为该预测模型对用户操作数据进行训练得到;此后,服务器将该目标预测结果和该初始预测结果输入至第二判别器进行训练,输出第二判别结果,该第二判别结果用于指示该目标预测结果和该初始预测结果之间的差异;进一步地,该服务器根据该第二判别结果更新该第二生成器的权重系数,得到第三生成器;最后,服务器在该第三生成器中生成第三模拟数据。
本实施例中,服务器可以向客户端发送该预测模型,并接收客户端使用用户操作数据在该预测模型中进行训练得到的初始预测结果,并将通过模拟数据在该预测模型中得到的目标预测结果和该初始预测结果一并作为第二判别器的输入,得到用于更新第二生成器的权重系数,更新第二生成器得到第三生成器,并在该第三生成器中生成第三模拟数据。其中,第三模拟数据为服务器使用第二判别器对第二生成器进行权重系数更新得到的,相比于第二生成器所生成的第二模拟数据,第三模拟数据可以进一步利用生成式对抗网络的特性,实现在第三生成器中所生成的第三模拟数据与原始输入的测试数据之间的偏差的进一步降低,从而,进一步提升神经网络所生成的模拟数据的数据质量,进而为后续基于该模拟数据训练意图识别模型提供了良好的基础,使得后续训练出的意图识别模型的精准度较高,进而提升了意图识别的准确性。
在一种可能的实现方式中,服务器根据该第二判别结果更新该第二生成器的权重系数,得到第三生成器包括:若满足第一条件,则根据该第二判别结果更新该第二生成器的权重系数,得到该第三生成器;其中,该第一条件包括:在该目标预设结果和该初始预测结果之间的经验分布度量小于第一预设值时;和/或,在该第二判别器对应的损失函数的取值大于第二预设值时;和/或,在该预测模型的损失函数小于第三预设值时。
本实施例中,服务器可以在满足上述第一条件时再执行根据第二判别结果更新第二生成器的权重系数的过程,即通过第一条件的限制,在第二判别器和/或预测模型的模型效果达到一定条件时,服务器才执行更新第二生成器的权重系数的过程,可以进一步优化更新得到的第三生成器所生成的第三模拟数据的数据质量。
具体地,在该第一条件中,该目标预设结果和该初始预测结果之间的经验分布度量小于第一预设值,即使得目标预设结果和该初始预测结果之间的经验分布度量最小化。其中,经验分布度量具体可以包括KL散度(KL divergence)、瓦瑟斯坦距离(Wasserstein distance)或者其它的取值实现,此处不做限定。此外,第一预设值的大小可以按照方案实施场景的不同选用不同的取值,例如0.001、0.01或者其它的取值,此处不做限定。
类似的，在该第一条件中，第二判别器对应的损失函数的取值大于第二预设值，即使得第二判别器对应的损失函数的取值最大化。其中，第二判别器的损失函数可以通过铰链损失函数（hinge loss function）、交叉熵损失函数（cross-entropy loss function）、指数损失函数（exponential loss function）或者是通过其它的损失函数对应实现，此处不做限定。此外，第二预设值的大小也可以按照损失函数的不同设置而选用不同的取值，此处不做限定。类似的，在该第一条件中，预测模型对应的损失函数的取值小于第三预设值，即使得预测模型对应的损失函数的取值最小化。其中，预测模型的损失函数可以通过铰链损失函数（hinge loss function）、交叉熵损失函数（cross-entropy loss function）、指数损失函数（exponential loss function）或者是通过其它的损失函数对应实现，此处不做限定。此外，第三预设值的大小也可以按照损失函数的不同设置而选用不同的取值，此处不做限定。
本实施例中,客户端可以根据使用用户操作数据作为服务器所发送的预测模型的输入数据,并训练得到初始预测结果之后,向该服务器发送初始预测结果,其中,该初始预测结果用于作为判别器的输入,经过该判别器的处理得到用于更新生成器权重系数的判别结果,使得服务器可以利用生成式对抗网络的特性,降低在生成器中所生成的模拟数据与原始输入的测试数据之间的偏差,从而,提升神经网络所生成的模拟数据的数据质量,进而为后续基于该模拟数据训练意图识别模型提供了良好的基础,使得后续训练出的意图识别模型的精准度较高,进而提升了意图识别的准确性;此外,由于客户端仅需要向服务器发送用户操作数据对应的初始预测结果,相比于客户端向服务器发送用户操作数据的方式,可以避免用户的隐私泄露,从而提升用户体验。
下面将通过一个具体的实现示例对图39及图41a所涉及的步骤过程进行描述。
如图41b所示,前述实施例中,服务器可以置于开发环境中,客户端可以置于真实(现网)环境中。在图41b中,“生成器”经过多次处理过程,可以分别实现本方案中“第一生成器”、“第二生成器”、“第三生成器”所对应的步骤实现;“训练数据判别器”可以实现本方案中“第一判别器”所对应的步骤实现;“输出数据判别器”可以实现本方案中“第二判别器”所对应的步骤实现;“模型”可以实现本方案中“预测模型”所对应的步骤实现。
基于图41b所示架构,服务器生成无偏虚拟数据、无偏模型以及输出数据的流程可以分为以下六步:
a)开发环境中生成器生成虚拟数据;
b)基于GAN，用训练数据判别器作为损失函数区分Beta数据与生成器生成的虚拟数据，优化二分类的分类损失，使训练数据判别器可以完全区分Beta数据和虚拟数据；
c)优化模拟环境生成器参数,使生成的虚拟数据分布与Beta数据的分布无限接近,从而认为虚拟数据训练模型与Beta数据训练模型的效果是一致的;
d)利用Beta数据和大量生成器生成的虚拟数据来训练模型,并使用虚拟数据通过模型得到预测结果,解决训练数据过少的问题;
e)将开发环境训练的模型下发到端侧真实环境中,利用真实数据通过模型得到预测结果,并将输出结果返回云侧(开发环境);
f)利用输出数据判别器可完全区分虚拟数据的预测结果和真实数据的预测结果；更新模拟环境生成器参数，使虚拟数据的预测结果分布与真实数据的预测结果分布无限接近，即可认为生成器生成的虚拟数据是无偏的，对模型的训练效果与真实数据训练模型的效果是一致的，从而能够生成无偏模型，可以在模拟环境中训练出最优模型，直接在真实环境中使用，缩短模型反馈及调优的周期。
本发明实施例的应用场景可以是已有数据分布和真实数据分布有偏差，而真实数据又不可完全获取的场景。本发明实施例中利用生成对抗网络生成数据，先通过已有的有偏数据构建对抗网络，然后利用生成数据进行模型训练。最后利用真实环境数据的输出构建生成对抗网络，进一步优化对抗网络的数据生成器。从而达到利用有偏数据生成与真实数据分布一致的无偏训练数据的目的。
示例性的，图41b所示架构可以应用的场景包括：用户使用语音助手的打点数据由于隐私安全不能上服务器，使用少量签约Beta用户数据和人工标注数据，通过本申请的一整套流程，生成与真实现网数据分布一致的训练数据，训练数据用于语音助手多意图决策。
在该场景中,具体实施步骤如下:
a)原始数据导出,导出Beta用户语音打点数据和人工标注数据。
b)特征提取,对Beta用户数据和人工标注数据进行数据清洗和处理,将该原始数据映射为原始特征向量。
c)构建生成对抗网络,使用b)中的原始特征向量构建生成对抗网络,优化生成对抗网络的生成器和判别器,使用训练好的生成器产生大量用于模型训练的训练数据。
d)模型训练,使用c)中产生的训练数据在服务器上训练意图决策模型,并将训练好的模型下发到众多客户端。
e)再次训练生成对抗网络,使用客户端模型对真实数据进行意图决策,构建模型输出判别器,利用真实数据模型输出构建的生成对抗网络再次优化生成器。使生成器生成数据和现网真实数据分布一致。
f)模型训练,使用e)中训练好的生成器生成训练数据,在服务器上对模型进行训练。
g)模型预测，使用f)中训练好的模型对语音助手产生的多意图进行决策。
从而,在现网数据不上服务器的情况下,利用本申请的一整套流程生成和现网数据分布一致的大量训练数据,提高了意图决策模块的准确率,带给用户更好的体验。此外,在该实现过程中,真实数据不需要上传到服务器,即真实数据在客户端进行处理,极大的保护了用户的隐私。另一方面,利用少量有偏数据生成和真实数据分布一致的数据。传统方式中提供的少量真实数据是有偏的,直接利用少量真实数据进行生成对抗网络训练,因此生成数据也是有偏的,上述的Beta数据和人工标注数据也是存在一定的偏差,但通过本申请的一整套数据生成流程,生成了与真实数据分布一致的大量数据,进而为后续训练意图识别模型提供了良好的基础,使得后续训练出的意图识别模型的精准度较高,进而提升了意图识别的准确性。
(2)基于联合学习系统得到意图识别模型
需要说明的是，本方案中基于联合学习系统得到意图识别模型，可以基于图36所示的人工智能框架实现。
其中，本申请实施例主要涉及图36中第(c)部分中的机器学习内容，本申请涉及机器学习中的联合学习方法。联合学习是一种分散式的机器学习框架。联合学习与传统的机器学习的不同之处主要在于：传统的机器学习中，训练数据集中在数据库中，训练设备基于数据库中维护的训练数据生成目标模型。而联合学习的训练数据分散在不同的节点设备上，每个节点设备拥有各自的训练数据，各节点之间不会进行数据交换，通过这些节点设备的合作，共同进行机器学习训练。
请参阅图42所示,在本申请提供的联合学习的系统框架中,该系统框架包括多个节点设备和中控设备,多个节点设备与中控设备通信连接,每个节点设备和中控设备可以通过任何通信机制或通信标准的通信网络进行交互,通信网络可以是广域网、局域网、点对点连接等方式,或它们的任意组合。本方案中每个节点设备即是训练数据集的存储设备,又是用于训练模型的执行设备。可选的,每个节点设备又可以是用于采集训练数据的数据采集设备。其中,中控设备用于整合各节点设备上传的参数值(或者梯度,或模型),然后,将整合后的参数值(或者梯度,或模型)下发至各节点设备,从而使得节点设备更新本地的模型。例如, 中控设备下发一个机器学习网络架构(如神经网络)与一组初始化的权重值给各个节点设备。各节点设备收到后,使用本地端的数据对该神经网络进行训练,得到模型参数,然后将该参数上传给中控设备,由中控设备对各个节点设备上传的参数进行整合,将整合之后的参数下发给各节点设备,该整合之后的参数用于更新节点设备的模型。
本方案中,节点设备可以为终端设备(或也称为用户设备)。其中,该终端设备可以表示任何计算设备。例如,该终端设备可以为智能手机、平板电脑、可穿戴设备(如眼镜、手表、耳机等)、个人计算机、计算机工作站、车载终端、无人驾驶中的终端、辅助驾驶中的终端、智能家居中的终端(如音箱,智慧屏,扫地机器人,空调等)等。例如,多个节点设备可以均可以以手机为例。本方案中,节点设备也可以简称为“端侧”。
本方案中,中控设备可以是云端服务器,或者,也可以是服务器,本方案中,该中控设备以云端服务器为例。该中控设备也可以简称为“云侧”。
请参阅图43所示,本申请实施例提供了一种模型训练方法,该方法应用于联合学习系统,系统中包括多个节点设备和中控设备,节点设备的数量并不限定。为了方便说明,本实施例中,该节点设备以3个节点为例进行说明。例如,第一节点设备、第二节点设备和第三节点设备。
步骤401、中控设备获取细粒度标签。
第一种实现方式中,中控设备可接收各节点设备上传的细粒度标签,中控设备可以获取所有节点设备的细粒度标签。
第二种实现方式中,中控设备可以从第三方获取所有的细粒度标签。例如,在以APP名称作为细粒度标签的场景中,中控设备通过爬虫获取APP名称,或者通过搜索应用市场的方式获取全体的细粒度标签(如APP名称)。
步骤402、中控设备根据细粒度标签确定粗粒度标签,及细粒度标签到粗粒度标签的映射关系。
中控设备将所有的细粒度标签进行分类,每一个类别作为一个粗粒度标签,对于细粒度标签进行分类的方法可以是基于领域知识、基于聚类、基于规则、基于词向量等,具体的方法本申请并不限定。
例如,在以APP名称作为细粒度标签的场景中,中控设备可以通过APP的描述信息、APP评论以及领域知识等,对APP进行分类,将APP的类别作为粗粒度标签。基于上述表1,根据领域知识与APP的功能将APP划分为音乐类、视频类、网购类、地图、新闻类五个粗粒度标签,请参阅表2所示:
表2
（表2将各APP名称划分至五个粗粒度标签：音乐、视频、网购、地图、新闻。例如，细粒度标签“QQ音乐”、“网易音乐”、“酷狗音乐”、“咪咕音乐”、“酷我音乐”对应粗粒度标签“音乐”。）
需要说明的是,上述表2中的内容仅是为了方便说明而举的例子,并不造成限定。
中控设备根据对细粒度标签的分类,可以确定细粒度标签到粗粒度标签之间的映射关系。即归属于同一个类别的细粒度标签与该类别对应的粗粒度标签具有映射关系。如上表2所示, 细粒度标签“QQ音乐”、“网易音乐”、“酷狗音乐”、“咪咕音乐”、“酷我音乐”与粗粒度标签“音乐”具有映射关系。
步骤403、各节点设备获取细粒度标签到粗粒度标签的映射关系。
中控设备将如表2所示的细粒度标签与粗粒度标签的映射关系下发到各节点设备,各节点设备接收该细粒度标签到粗粒度标签的映射关系。
步骤404、各节点设备根据映射关系将训练数据集中细粒度数据映射为粗粒度数据。
以第一节点设备为例,第一节点设备根据所述映射关系将训练数据集中细粒度数据映射为粗粒度数据。其中,细粒度数据为细粒度标签对应的数据,粗粒度数据为粗粒度标签对应的数据。该训练数据集中包括多个样本数据,该样本数据为APP的使用数据。例如,第一节点设备的训练数据集中的一个样本数据为:QQ音乐使用数据,在12:05打开QQ音乐。第一节点设备根据细粒度标签到粗粒度标签的映射关系(QQ音乐属于音乐类应用),可将QQ音乐使用数据转化为音乐类使用数据:在12:05打开了音乐类应用。第一节点设备将训练数据集中的每个样本数据根据映射关系进行处理,将细粒度数据映射为粗粒度数据。该样本数据还可以包括用户场景信息,用户状态信息等,如用户场景信息可以为用户在室内还是室外,用户是行走、坐或卧的状态,用户心情(可由心率等一些感知信息得到)等。
同理,第二节点设备根据映射关系将本地的训练数据集中细粒度数据映射为粗粒度数据。第三节点设备根据映射关系将本地的训练数据集中细粒度数据映射为粗粒度数据。第二节点设备和第三节点设备对训练数据集中的细粒度数据的处理方式与第一节点设备的处理方式相同,此处不赘述。
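为便于理解步骤404，下面给出一段示意性的Python代码，演示节点设备如何根据中控设备下发的映射关系将训练数据集中的细粒度数据映射为粗粒度数据；其中映射表内容与样本字段名均为便于说明而假设的。

```python
# 细粒度标签到粗粒度标签的映射关系(由中控设备下发，内容为示意性假设)
fine_to_coarse = {"QQ音乐": "音乐", "网易音乐": "音乐", "酷狗音乐": "音乐", "爱奇艺": "视频"}

def to_coarse(samples):
    """将训练数据集中的细粒度数据映射为粗粒度数据(样本字段名为假设)。"""
    return [{**s, "label": fine_to_coarse[s["label"]]} for s in samples]

print(to_coarse([{"time": "12:05", "label": "QQ音乐"}]))
# [{'time': '12:05', 'label': '音乐'}]，即“在12:05打开了音乐类应用”
```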
步骤405、各节点设备将粗粒度数据输入到群体粗粒度模型进行训练,确定群体粗粒度模型的第一信息;并将所述细粒度数据输入到细粒度模型进行训练。
第一模型可以理解为初始模型,该初始模型包括模型架构(如神经网络)及初始参数,第一模型包括群体粗粒度模型和细粒度模型。
以第一节点设备为例,第一节点设备将粗粒度数据输入到群体粗粒度模型,通过所述群体粗粒度模型对所述粗粒度数据进行学习,得到用于更新所述群体粗粒度模型的第一信息。其中,第一信息可以为梯度、模型参数、或者模型(包括模型架构及参数值)。
第一节点设备通过群体粗粒度模型对粗粒度数据进行学习,例如,该粗粒度数据为:在12:05打开了音乐类应用。
并且,第一节点设备将所述细粒度数据输入到细粒度模型,通过细粒度模型对细粒度数据进行学习,确定细粒度模型的模型参数。例如,该细粒度数据为:在12:05打开了QQ音乐。
第二节点设备和第三节点设备执行的动作与第一节点设备类似,第二节点设备和第三节点设备执行的动作请参阅第一节点设备的说明,此处不赘述。
步骤406、各节点设备将所述第一信息发送至中控设备。
第一节点设备将自身得到的第一信息上传至中控设备;第二节点设备将自身得到的第一信息上传至中控设备;第三节点设备将自身得到的第一信息上传至中控设备。
第一信息可以为梯度、模型参数(如权重)、或者模型(网络架构及模型参数)。第一种实现方式,该第一信息为梯度,第一节点设备根据loss函数计算梯度,然后将梯度发送给中控设备,中控设备将多个节点设备发送的梯度整合后再下发给各个终端设备。各节点设备接收整合后的梯度,再根据整合后的梯度更新各自的群体粗粒度模型的参数值。第二种实现方 式,第一信息为参数,每个节点设备得到各自的参数值,然后,各节点设备可以将各自的参数值发送给中控设备,中控设备将多个节点设备发送的参数值整合后再下发给各个终端设备,每个节点设备接收到整合后的参数值后,更新本地端的群体粗粒度模型。第三种实现方式中,第一信息为模型,每个节点设备也可以将经过本地粗粒度数据训练之后的模型发送至中控设备,中控设备对每个节点设备的模型进行整合,然后将整合之后的模型下发给各终端设备,每个终端设备接收更新之后的模型,在第三种实现方式中,本质上,中控设备也是通过整个各节点设备发送的模型的参数来更新模型的。
本申请实施例中,该第一信息可以以参数值为例进行说明。
步骤407、中控设备接收各个节点设备上传的第一信息,对接收到的所述多个节点设备上传的第一信息进行整合,得到整合后的第二信息;所述第二信息用于更新所述群体粗粒度模型。
第二信息可以为梯度、参数值(如权重值)、或者模型。本实施例中,该第一信息以参数值为例,则第二信息也以参数值为例进行说明。
中控设备接收各个节点上传的参数值，对接收到的多个节点设备上传的参数值进行整合，具体的实现方法并不限定。本申请实施例中，中控设备整合参数的方法可以为加权平均法，例如，中控设备收集到各节点设备上传的参数与训练数据的数据量后，如下述式(1)按照比例计算加权平均值，该平均值W′就是该次计算的结果。
W′=(n1·W1+n2·W2+…+nk·Wk)/(n1+n2+…+nk)　　(1)
其中，k为节点设备的数量，Wk为第k个节点设备训练的一组权重值，nk为第k个节点设备的训练数据的数据量。然后，中控设备将该结果W′传回给各节点设备，这个来回需要多次，使得最后选定的参数可以使模型准确率到达系统预定的要求。
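式(1)的加权平均可以用如下示意性的Python代码实现，按各节点设备训练数据的数据量对其上传的权重值求加权平均（数值仅为示意）：

```python
import numpy as np

def aggregate(weights, sample_counts):
    """式(1)的示意实现：按各节点设备训练数据的数据量加权平均其上传的权重值。"""
    n = sum(sample_counts)
    return sum((n_k / n) * np.asarray(w_k)
               for w_k, n_k in zip(weights, sample_counts))

# 用法示意：3个节点设备分别上传一组权重值，训练数据量分别为100/50/50
w = aggregate([np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])],
              [100, 50, 50])
print(w)   # [2.5 3.5]，即按数据量比例计算得到的W′
```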
步骤408、中控设备将整合后的第二信息下发给各节点设备。
中控设备将整合之后的参数值下发给各个节点设备。例如,中控设备可以广播整合之后的参数,从而使得每个节点设备接收到该整合之后的参数。
例如,该第二信息为参数(如权重)时,第一节点设备根据该参数更新本地的群体粗粒度模型。同理,第二节点设备根据该参数更新本地的群体粗粒度模型。第三节点设备根据该参数更新本地的群体粗粒度模型。
上述步骤406-步骤408用于更新群体粗粒度模型。各节点设备并不会将本地训练数据上传到中控设备,每个节点设备通过本地数据对群体粗粒度模型进行训练,为了达到多个节点设备联合训练的目的,每个节点设备仅将各自的第一信息(如参数)传输至中控设备,以保证各节点设备本地数据的隐私性,中控设备将接收到的各参数值进行整合,将整合之后的参数下发给各个节点设备,各节点设备可以根据中控设备下发的参数对本地的群体粗粒度模型进行更新,即完成一次更新,从而使得本地的群体粗粒度模型具有群体性。
步骤409、各节点设备组合群体粗粒度模型和细粒度模型以得到联合模型,联合模型的标记空间映射为细粒度标签,所述联合模型的输出结果用于更新所述细粒度模型。
第一节点设备组合更新后的群体粗粒度模型和自身的细粒度模型以得到联合模型(如:意图识别模型)。第二节点设备组合群体粗粒度模型和自身的细粒度模型以得到联合模型(如:意图识别模型)。第三节点设备组合群体粗粒度模型和自身的细粒度模型以得到联合模型(如: 意图识别模型)。
需要说明的是,并不限定步骤409的时序,该步骤409可以在步骤405后的任意位置执行。本步骤中,该联合模型可以是初始群体粗粒度模型和初始细粒度模型联合之后的整体模型。随着群体粗粒度模型在训练过程中的不断更新,该联合模型中群体粗粒度模型可以是迭代更新后的模型,细粒度模型可以是每次迭代训练后更新的模型,直到群体粗粒度模型收敛和细粒度模型收敛。群体粗粒度模型和细粒度模型的更新时机不同。群体粗粒度模型是通过多个节点设备联合学习及中控设备协同更新,而细粒度模型是基于loss函数通过联合模型的输出结果进行反向更新。本方案中,每个节点设备中维护群体粗粒度模型和细粒度模型,该群体粗粒度模型和细粒度模型作为一个整体模型,其中,群体粗粒度模型和细粒度模型作为该整体模型中的一个部分进行训练,最后,还需要将这两个模型进行组合,组合成一个整体的模型(即联合模型)。本方案中,对于群体粗粒度模型和细粒度模型的组合方法并不限定,只要保证细粒度模型作为整体模型的一部分即可。
示例性的,请参阅图44a和44b所示,在一个应用场景中,细粒度标签和粗粒度标签以上述表2中的内容为例,对群体粗粒度模型和细粒度模型的组合方式进行说明。本实施例中,可以基于群体粗粒度模型的权重和细粒度模型的权重对两个模型进行组合,将群体粗粒度模型的权重和细粒度模型的权重相加得到整体模型的权重。细粒度标签的权重以该细粒度标签对应的粗粒度标签权重作为基,细粒度标签的权重等效于细粒度模型维护的一个偏移量,整体模型(联合模型)的输出结果映射至个体细粒度标签,使得联合模型输出的结果实现端侧的个性化。每个细粒度标签对应的权重包含群体粗粒度模型的权重以及细粒度模型的权重两部分。
群体粗粒度模型中,(w11,w21)表示对应音乐类标签的权重,(w12,w22)表示对应视频类标签的权重。细粒度标签有“爱奇艺”、“QQ音乐”和“网易音乐”三种。其中,“爱奇艺”对应的粗粒度标签是视频类,“QQ音乐”与“网易音乐”对应的是音乐类。细粒度模型对“爱奇艺”、“QQ音乐”、“网易音乐”分别对应三组权重(w'11,w'21),(w'12,w'22)和(w'13,w'23)。群体粗粒度模型和细粒度模型结合的整体模型中,输出层将输出的结果映射至个体细粒度标签。其中,“爱奇艺”归属于视频类,其对应的粗粒度标签为“视频”标签,其基部分使用视频类标签的权重(w12,w22)。而“QQ音乐”和“网易音乐”归属于音乐类,其对应的粗粒度标签是“音乐”标签,其基部分使用“音乐”标签的权重(w11,w21)。其中,“爱奇艺”对应的权重为(w12+w'11,w22+w'21),“QQ音乐”对应的权重为(w11+w'12,w21+w'22),“网易音乐”对应的权重为(w11+w'13,w21+w'23)。
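为便于理解上述权重组合方式，下面给出一段示意性的Python代码：细粒度标签的权重以其对应粗粒度标签的基权重为基，加上细粒度模型维护的偏移量；其中各权重数值与标签集合均为便于说明而假设的。

```python
import numpy as np

# 群体粗粒度模型的基权重与细粒度模型维护的偏移量(数值均为示意性假设)
coarse_w = {"音乐": np.array([0.4, 0.6]), "视频": np.array([0.3, 0.7])}
fine_offset = {"爱奇艺": np.array([0.01, -0.02]),
               "QQ音乐": np.array([0.05, 0.03]),
               "网易音乐": np.array([-0.04, 0.02])}
fine_to_coarse = {"爱奇艺": "视频", "QQ音乐": "音乐", "网易音乐": "音乐"}

def joint_weight(fine_label):
    """细粒度标签的权重 = 对应粗粒度标签的基权重 + 细粒度模型的偏移量。"""
    return coarse_w[fine_to_coarse[fine_label]] + fine_offset[fine_label]

print(joint_weight("QQ音乐"))   # 相当于上文的(w11+w'12, w21+w'22)
```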
上述步骤406-步骤408为群体粗粒度模型参数更新的步骤，群体粗粒度模型的参数在训练细粒度模型的参数时固定。对于细粒度模型，利用粗粒度模型与细粒度模型的联合模型在端侧采用在线学习或使用mini-batch等方式进行更新。
本申请实施例中,节点设备中训练数据集中样本数据的标记空间为细粒度标签,并引入粗粒度标签,通过粗粒度标签来统一各节点设备的标记空间,从而可以保证在各端侧细粒度任务不统一的情况下,各节点设备可以在粗粒度任务上的统一,多个节点设备也可以进行联合训练。节点设备获取细粒度标签与粗粒度标签的映射关系,然后,根据所述映射关系将训练数据集中的细粒度数据映射为粗粒度数据。节点设备利用粗粒度数据对群体粗粒度模型进行本地训练,并且通过所述多个节点设备的联合学习对所述群体粗粒度模型进行更新,该群体粗粒度模型也可以理解为横向维度上的端侧和云侧协同更新,直到该粗粒度标签收敛,从 而使得粗粒度模型具有群体性特征。并且节点设备利用将细粒度数据输入到所述细粒度模型进行训练,并且基于损失函数通过联合模型输出结果(细粒度标签)进行反向更新细粒度模型,直到该细粒度标签收敛。本方案中的联合模型既兼顾群体性特征,每个节点设备的细粒度模型能将群体粗粒度模型匹配到具体的细粒度标签上,使得联合模型的标记空间为端侧对应的细粒度标签空间,联合模型又兼顾每个节点设备的个体化特征,进而使得当联合模型为意图识别模型时通过该联合模型可以提升意图识别的准确性。
并且,本方案中,由于粗粒度标签(群体粗粒度模型或个体粗粒度模型)和细粒度标签存在层级关系(粗粒度标签为细粒度标签的上一个层级),粗粒度模型学到的知识可以指引细粒度模型。例如,当在一个节点设备中,某细粒度标签首次出现时,可由粗粒度标签对其进行初始化,解决端侧用户冷启动问题。
在一个可选的实现方式中,多个节点设备中的群体粗粒度模型可以同步更新,或者,也可以异步更新,示例性的,多个节点设备中的群体粗粒度模型以异步更新为例进行说明:
10)中控设备广播更新请求。中控设备向系统内的所有节点设备发送群体粗粒度模型更新请求。
20)各节点设备收到请求后,若能参与更新,则向中控设备反馈指示信息,该指示信息用于指示该节点设备能参与更新。
各节点设备接收到更新请求后,评估自身的状态,例如,自身当前的网络情况,电量情况,是否处于空闲状态等。
各节点设备根据自身当前的状态确定是否能参与更新。例如,第一节点设备和第二节点设备当前的网络情况适合更新,并且电量适合更新,且处于空闲状态。第一节点设备和第二节点设备向中控设备反馈能参与更新,而第三节点设备当前的状态不适合进行更新,第三节点设备可以不进行反馈,以节省网络开销。
30)中控设备向目标节点设备下发当前中控设备侧的群体粗粒度模型的参数。该目标节点设备为反馈指示信息的节点设备。
可选地,中控设备可以根据一些策略(例如每个节点设备的网络状态等)从多个节点设备中选择至少一个节点设备,这至少一个节点设备为适合进行模型更新的节点设备,中控设备可以向该节点设备发送中控设备侧的群体粗粒度模型的参数。
40)目标节点设备接收该参数,利用本地粗粒度数据训练群体粗粒度模型,得到梯度。
例如,第一节点设备利用本地粗粒度数据训练群体粗粒度模型,得到第一节点设备对应的梯度变化。第二节点设备利用本地粗粒度数据训练群体粗粒度模型,得到第二节点设备对应的梯度。
50)目标节点设备将计算得到的梯度上传给中控设备。
例如,第一节点设备将自身的梯度上传给中控设备。第二节点设备将自身的梯度上传给中控设备。
60)中控设备接收目标节点设备上传的梯度后,中控设备对目标节点设备上传的梯度进行整合并更新中控设备侧的梯度,获得更新后的参数(如权重)。
可选地,中控设备对梯度整合的方法可以是加权平均,也可以其他的优化算法,可选的,也可以在计算梯度的过程中引入冲量,提高速率,借助上一次的势能来和当前的梯度调节当前的参数,本申请并不具体限定整合的方法。
70)中控设备更新完中控设备侧的参数后,中控设备向所有节点设备广播,该广播用于 通知节点设备当前有新的模型可以更新。
80)各节点设备可根据自身状态(如网络许可、电量许可、手机处于空闲状态),选择模型更新时间,并向中控设备发送请求。
90)中控设备接收对应端侧请求后,向发送请求的节点设备发送更新后的参数,节点设备完成一次更新。
中控设备和节点设备之间的数据传输过程可以采用同态加密或常用加密算法,具体的并不限定。本示例中,多个节点设备中的粗粒度模型可以异步更新,每个节点设备可以根据各自的状态来对粗粒度模型进行更新,直到该粗粒度模型收敛,保证每个节点设备中粗粒度模型每次更新的成功率。
在一个可选的实现方式中,请参阅图45所示,节点设备中还配置个体粗粒度模型。节点设备中的整体模型可由群体粗粒度模型、个体粗粒度模型和细粒度模型构成。群体粗粒度模型能够挖掘群体性的规律,即能够体现多个节点设备群体性的特征。细粒度模型保证模型具有个性化,即体现每个节点设备所属用户的特征。而个体粗粒度模型用于弥合群体粗粒度模型与细粒度模型的差距。
个体粗粒度模型与群体粗粒度模型的相同之处在于：个体粗粒度模型的标记空间为粗粒度标签。通过粗粒度数据对个体粗粒度模型进行训练。
个体粗粒度模型与群体粗粒度模型的更新过程不同,不同之处在于:
首先,对于群体粗粒度模型,在云侧初始化,云侧将初始化模型及初始化参数下发给所有节点设备,实现群体模型初始化。而个体粗粒度模型在端侧初始化。
然后,各节点设备将各自的个体粗粒度模型及模型相关参数上传至中控设备。例如,第一节点设备将自身的个体粗粒度模型及该模型相关参数上传到中控设备。同样的,第二节点设备将自身的个体粗粒度模型及模型相关参数上传到中控设备,第三节点设备将自身的个体粗粒度模型及模型相关参数上传到中控设备,第四节点设备将自身的个体粗粒度模型及模型相关参数上传到中控设备等。
最后,中控设备将接收到的每个节点设备上传的个体粗粒度模型加入到模型池,将相关度高于阈值的个体粗粒度模型进行整合,将整合之后的个体粗粒度模型下发至各节点设备。例如,中控设备将第一节点设备的个体粗粒度模型,第二节点设备的个体粗粒度模型,第三节点设备的个体粗粒度模型及第四节点设备的个体粗粒度模型保存至模型池。需要说明的是,此处为了方便说明,以四个节点设备为例进行说明,而在实际应用中,节点设备的数量并不限定。
中控设备对个体粗粒度模型的集成依赖于模型池中各个体粗粒度模型的相关度。其中,各个体粗粒度模型的相关度可以包括多种方式,具体的方法并不限定。
示例性的,一种实现方式,中控设备可以根据用户画像来判定个体粗粒度模型的相关度。例如,第一节点设备所属用户的用户画像和第二节点设备所属用户的用户画像的相似度高于第一门限,则确定第一节点设备的个体粗粒度模型和第二节点设备的个体粗粒度模型的相似度高于阈值。中控设备将第一节点设备的个体粗粒度模型和第二节点设备的个体粗粒度模型进行集成,将集成之后的个体粗粒度模型下发至第一节点设备和第二节点设备。第一节点设备和第二节点设备完成一次更新。同理,第三节点设备所属用户的用户画像和第四节点设备所属用户的用户画像的相似度高于第一门限,则确定第三节点设备的个体粗粒度模型和第四节点设备的个体粗粒度模型的相似度高于阈值。中控设备将第三节点设备的个体粗粒度模型 和第四节点设备的个体粗粒度模型进行集成,将集成之后的个体粗粒度模型分别下发至第三节点设备和第四节点设备。第三节点设备和第四节点设备完成一次更新。
另一种实现方式,模型相关参数可以是粗粒度标签的分布信息。例如,将粗粒度数据作为个体粗粒度模型的训练样本,个体粗粒度模型的输出为粗粒度标签。如在一个应用场景中,该个体粗粒度模型的输出的粗粒度标签为“音乐”、“视频”和“网购”等,中控设备可以粗粒度标签的分布信息确定模型池中个体粗粒度模型的相关度。例如,第一节点设备和第二节点设备的个体粗粒度模型的输出大多集中在“音乐”这个粗粒度标签,第一节点设备的个体粗粒度模型在“音乐”这个粗粒度标签相对于所有的粗粒度标签的分布高于第二门限,第二节点设备的个体粗粒度模型在“音乐”这个粗粒度标签相对于所有的粗粒度标签的分布也高于第二门限,则中控设备确定第一节点设备的个体粗粒度模型和第二节点设备的个体粗粒度模型的相关度高于阈值。中控设备将第一节点设备的个体粗粒度模型和第二节点设备的个体粗粒度模型进行集成,将集成之后的个体粗粒度模型下发至第一节点设备和第二节点设备。
可以理解的是，每个节点设备将各自的个体粗粒度模型上传到云侧，个体粗粒度模型的更新是纵向的，即将相关度高于阈值的个体粗粒度模型进行集成，然后将集成之后的个体粗粒度模型下发到对应的节点设备。可以理解的是，该个体粗粒度模型的更新为纵向维度上端侧和云侧协同更新。群体粗粒度模型体现了系统内所有节点设备的群体性特征，而个体粗粒度模型相较于群体粗粒度模型，是将部分节点设备的个体粗粒度模型进行集成，能够体现部分节点设备的特征，而细粒度模型体现的是个体化特征，由此可见，个体粗粒度模型弥合群体粗粒度模型与细粒度模型的差距。
本实施例中,将群体粗粒度模型、个体粗粒度模型和细粒度模型组合为一个整体模型。将群体粗粒度模型的权重、个体粗粒度模型的权重和细粒度模型的权重相加得到整体模型的权重。
示例性的,请参阅图44a所示,其中,(w11,w21)表示群体粗粒度模型对应音乐类标签的权重,(w12,w22)表示群体粗粒度模型对应视频类标签的权重。请参阅图46a和46b所示,图46a为个体粗粒度模型的示意图,(w"11,w"21)表示个体粗粒度模型对应音乐类标签的权重,(w"12,w"22)表示个体粗粒度模型对应视频类标签的权重。图46b为整体模型的示意图,如细粒度标签有“爱奇艺”、“QQ音乐”和“网易音乐”三种。其中,“爱奇艺”对应的粗粒度标签是视频类,“QQ音乐”与“网易音乐”对应的粗粒度标签是音乐类。粗粒度标签“爱奇艺”、“QQ音乐”、“网易音乐”分别对应三组权重(w'11,w'21),(w'12,w'22)和(w'13,w'23)。群体粗粒度模型、个体粗粒度模型和细粒度模型结合的整体模型(也称为联合模型)中,输出层将输出结果映射至个体细粒度标签。其中,“爱奇艺”对应的权重为(w12+w"12+w'11,w22+w"22+w'21),“QQ音乐”对应的权重为(w11+w"11+w'12,w21+w"21+w'22),“网易音乐”对应的权重为(w11+w"11+w'13,w21+w"21+w'23)。
本实施例中,群体粗粒度模型、个体粗粒度模型和细粒度模型组合为一个整体模型,群体粗粒度模型能够挖掘群体性的规律,能够为端侧的联合模型提供一个好的初始点。但是存在群体性的规律与个体特征之间的差距巨大的情况,而个体粗粒度模型可以弥合少数情况下群体性与个体性的差距。细粒度模型在粗粒度提供的初始点上实现端侧的个性化。
本申请实施例中,上述模型训练方法,并不限定应用场景,在不同的场景中,训练数据不同而已,例如上述模型的训练方法还可以应用在意图识别,分类等应用场景中。
在一个应用场景中，节点设备以手机为例，该联合模型以APP预测模型为例。该APP预测模型包括3个部分，即群体粗粒度模型，个体粗粒度模型和细粒度模型。每个节点设备中的群体粗粒度模型都是通过这100个节点设备参与联合训练之后得到的，每个节点设备中的群体粗粒度模型的初始模型参数相同，最终训练得到的模型参数也相同，群体粗粒度模型具有群体性。例如，在8:00-9:00通勤时间，大多数用户往往会选择听一些提神醒脑的歌曲，但是对于不同的个体来说，使用的APP可能不相同。也就是说，不同节点设备中的训练数据中细粒度数据可能是不同的。例如，用户A的节点设备A中的样本数据是：8:00打开“酷狗音乐”，而用户B的节点设备B中的样本数据是：8:00打开QQ音乐，由于“酷狗音乐”和“QQ音乐”对应的粗粒度标签都是“音乐”标签，从而实现多个节点设备中粗粒度模型标记空间相同，即实现多个节点设备任务统一，由此，通过100个节点设备联合训练的粗粒度模型具有群体性，即在8:00-9:00可能打开“音乐”类的APP。但是，可能有少部分用户虽然也是听歌，但是并不是通过音乐APP听歌，而是通过视频APP听歌，例如，这少部分用户是爱好健身的用户，可能喜欢边听歌边看视频，那么，在云侧，可以通过用户画像将这部分用户的个体粗粒度模型进行集成，云侧将这部分用户的个体粗粒度模型集成后下发到对应的端侧，那么这部分用户的手机上的个体粗粒度模型弥合群体性与个性化的差异。联合模型输出的结果会映射到细粒度标签，也就是说，包括这三个部分的联合模型输出的结果是每个节点设备下载的APP，粗粒度模型指导细粒度模型，例如，节点设备A音乐类的APP下载的是“酷狗音乐”，而节点设备B下载的音乐类的APP是“QQ音乐”，那么，到具体的节点设备，节点设备A的联合模型输出的预测结果可能是“酷狗音乐”，而节点设备B的联合模型输出的预测结果可能是“QQ音乐”，从而实现不同端侧个性化APP预测。需要说明的是，该场景中仅是为了方便说明，输入特征以时间为例进行说明，并不造成对本申请的限定。该输入特征还可以包括用户相关特征信息，该用户相关特征信息包括用户场景信息，用户状态信息等。其中，用户场景信息可以为用户在室内还是室外等。用户状态信息可以包括用户是行走、坐或卧的状态，用户心情（可由心率等一些感知信息得到）等等。
以上对联合模型的训练方法进行了说明,下面对该联合模型的应用进行说明。
示例性的,本申请实施例提供了一种APP预测方法,该方法应用于节点设备。上述模型训练方法训练得到的联合模型用于APP预测。
首先，节点设备响应用户的操作，该操作可以是与节点设备进行交互的任意操作，例如，用于开启所述节点设备屏幕的操作。如该操作可以是点击屏幕的操作，人脸识别的解锁操作等，或者，该操作可以是语音操作，例如，语音指令等。用户对节点设备有操作，表明用户此时有可能会使用节点设备。
然后,节点设备响应该操作,并确定接收该操作时的时间信息。例如,该第一操作的时刻为8:15。然后,节点设备将所述时间信息输入到应用预测模型,APP预测模型输出预测结果,所述预测结果用于指示目标应用。例如,该目标应用为QQ音乐。
可选的,终端设备还可以确定接收该操作时的用户相关特征信息,该用户相关特征信息包括但不限定于用户场景信息,用户状态信息等,如用户场景信息可以为用户在室内还是室外,用户是行走、坐或卧的状态,用户心情(可由心率等一些感知信息得到)等。
最后,预加载目标应用(QQ音乐)。节点设备通过该APP预测模型预测用户可能会使用哪个APP,而预先加载该APP,节省开启该APP的响应时长,提升用户体验。
接下来对图15所示的意图识别决策系统501中的动作反馈模块608如何识别出用户真实意图（即用户真实的执行动作）进行介绍。
请参阅图47,图47为本申请实施例中意图识别方法的一个数据流向示意图。图48为本申请实施例中意图识别方法的一个流程示意图。下面结合图47所示的数据流向示意图和图48所示的流程示意图,对本申请实施例中的意图识别方法进行描述:
S2201、电子设备确定待识别的打点数据序列。
在用户使用该电子设备的过程中,电子设备可以在本地记录用户的操作数据作为打点数据并组成打点数据序列。当电子设备需要确定这些打点数据的意图时,电子设备可以将这些打点数据序列作为待识别的打点数据序列。在一个例子中,待识别的打点数据序列可以中包括多个数据,多个数据中至少两个数据的输入方式不同。在一个例子中,待识别的打点数据序列可以是在决策推理模块607预测出动作序列后电子设备所记录的数据;其中,该步骤可以由图15所示的意图识别决策系统501中的多模态输入模块601执行。
示例性的,在电子设备启动了意图识别功能后,电子设备可以将新产生的打点数据组成打点数据序列作为待识别的打点数据序列。
S2202、电子设备将该待识别的打点数据序列输入多示例学习模型,得到多个子序列。
该多示例学习模型可以为按照上述多示例学习模型的训练方法训练完成的多示例学习模型,或按照下述多示例学习模型的更新过程更新训练后的多示例学习模型。在一个例子中,该步骤可以由由图15所示的意图识别决策系统501中的动作反馈模块608执行。
该多示例学习模型用于将输入序列划分为更小粒度的序列。
示例性的，图49为本申请实施例中多示例学习模型将输入序列划分为多个子序列的一个示例性示意图。用户早上起床后，通过语音助手打开音乐应用播放了一首歌。然后下楼打开地图导航应用叫了一辆车去公司。途中在车上打开视频应用程序A看了个小视频。在快到公司的时候查询下想要的健康早餐的内容。在此过程中电子设备在本地记录了如图49中(a)所示的用户操作的打点数据，并形成了打点数据序列：【V，唤醒语音助手-执行打开音乐】【A，语音助手拉起音乐应用】【L，返回桌面】【A，打开地图导航应用】【L，返回桌面】【A，打开视频应用程序A】【V，唤醒语音助手-执行打开浏览器】【A，语音助手拉起浏览器应用】【A，搜索关键词“健康早餐”】【A，打开燕麦早餐页面】【L，返回桌面】。
将该打点数据序列作为待识别的打点数据序列输入多示例学习模型后,可以将该输入序列划分为多个粒度更小的子序列:
子序列X1:【V,唤醒语音助手-执行打开音乐】【A,语音助手拉起音乐应用】【L,返回桌面】;
子序列X2:【A,打开地图导航应用】【L,返回桌面】;
子序列X3:【A,打开视频应用程序A】;
子序列X4:【V,唤醒语音助手-执行打开浏览器】【A,语音助手拉起浏览器应用】【A,搜索关键词“健康早餐”】【A,打开燕麦早餐页面】【L,返回桌面】。
在一个例子中,每个子序列中可以包括至少一个实体,多个子序列构成第一实体序列。示例性的,如图8所示,电子设备100将打点数据序列A1输入到多示例学习模型后,可以得到子序列B1,子序列B2,和子序列B3。其中,各个子序列(B1,B2,B3)中均包括了多个实体,以子序列B2为例,其包括的实体为:“打开录音机”,“返回桌面”。
S2203、电子设备按照第二预设规则确定各子序列的意图;
该第二预设规则用于根据各序列中的打点数据确定各序列的意图。电子设备得到多示例学习模型输出的多个子序列后，可以按照该第二预设规则确定各子序列的意图。
示例性的,对于图49中(b)所示的输出的各个子序列,若第二预设规则为序列中最后一个动作为意图。则电子设备可以确定各子序列的意图为:子序列X1的意图为打开音乐应用;子序列X2的意图为打开地图导航;子序列X3的意图为打开视频应用程序A;子序列X4的意图为打开燕麦早餐页面。
本申请实施例中,电子设备可以采用训练好的多示例学习模型,将用户操作产生的打点数据序列作为待识别的打点数据序列划分为粒度更小的多个子序列。再采用第二预设规则确定出各个子序列的意图。由于使用的该多示例学习模型是使用用户自己的打点数据训练出来的,因此该多示例学习模型划分的子序列更符合用户个性化的使用习惯。然后再使用第二预设规则确定各个子序列的意图,使得识别出的意图更准确。
需要说明的是,本方案中基于多示例模型进行意图识别的优势可以包括:
电子设备可以根据第一预设规则将获取的打点数据序列划分为不同的分序列,经过确定示例和示例标签,确定包和包标签,提取特征向量矩阵等过程后,使用提取的特征向量矩阵对多示例学习模型进行训练,得到训练完成的多示例学习模型。在对多示例学习模型的训练过程中,不需要开发人员对作为训练数据的打点数据进行提前标注,电子设备通过该过程可以实现对打点数据的自标注。然后电子设备可以使用该训练完成的多示例学习模型,自动将该打点数据序列或新输入的打点数据序列划分为粒度更小的子序列,根据第二预设规则确定出各子序列的意图。由于训练数据使用的用户自己的打点数据,且不需要开发人员进行人工标注,实现了自标注用户打点数据。又由于训练好的多示例学习模型能够将打点数据序列划分为粒度更小的子序列,再根据第二预设规则确定出各子序列的意图,从而能更准确的识别数据中的意图,进而提升了意图识别的准确性。
下面结合几种其他的意图识别的实现方式,对比说明本申请实施例中基于多示例模型的意图识别方法的优势:
在一种意图识别的实现方式中,定义命名实体为:文本中具有特定意义的实体,如人名、地名等。首先,从用户的查询日志中识别出命名实体和实体类型,并建立命名实体集合。接着,根据命名实体集合把每个查询切分成命名实体e1,e2和实体关系上下文ct,所有切分的结果组成集合。之后,聚合e1,e2和ct,用聚合后的数据训练主题发现模型,并采用变分期望最大算法(expectation-maximization,EM算法)估计主题模型的参数。最后,在预测用户意图时,用训练好的模型估计在两个命名实体e1,e2和实体关系上下文ct的条件下,意图为主题r的概率p(r|e1,e2,ct)。
在这种意图识别的实现方式中,一方面需要收集大量查询文档提取命名实体,且使用的主题发现模型的训练需要大量的训练数据。另一方面,其能识别的意图类别严重依赖于训练集,能识别出的意图有限。
而采用本申请实施例中基于多示例模型的意图识别方法,在有很少的打点数据的情况下就可以对多示例学习模型进行训练,并能准确识别出已经学习到的用户意图。随着打点数据的累积,还可以增量训练,不断优化识别结果。此外,本申请实施例中采用多示例学习模型将打点数据序列划分为更细粒度的子序列后,即可以根据第二预设规则识别出更细粒度子序列对应的意图。识别到的意图不完全依赖于训练集,理论上可以识别无穷多的意图。
在另一种意图识别的实现方式中,采取用上下文信息训练有监督模型从而实现意图识别的方式。具体地,首先获取用户历史查询日志,逐句地从日志中对用户提出的问题进行人工 标注,标注时关注每句对话的上下文。其次,对每一个标注的问题执行特征提取,生成训练语料,使用的特征为问题的位置信息和上文意图分类信息。接着,用有监督方法训练模型,例如使用逻辑回归(logistic regression,LR)。最后,用训练好的有监督模型预测用户的意图。
在这种意图识别的实现方式中,需要开发人员花费大量的时间对每个问题进行人工标注,并且模型是根据群体特征统一训练的,并不能体现用户的差异。
而采用本申请实施例中基于多示例模型的意图识别方法，不使用有监督学习的方法训练模型，而是使用弱监督学习中的多示例学习训练模型。不需要使用人工标注，而是能够对打点数据自标注，节省了大量的标注时间。且训练数据基于每个用户自己的打点数据，从每个用户的打点数据中挖掘有用信息，训练用户自己的多示例学习模型，适用于每个用户。
可以理解的是,该基于多示例学习模型的意图识别方法也可以应用到图15所示的意图识别决策系统501中的其他模块中,在此不做限定。例如,应用到意图识别模块605,决策推理模块607中等等。
接下来对图15所示的意图识别决策系统501中的意图识别模块605如何识别用户意图进行介绍。
(1)基于知识图谱的意图识别
请参阅图50,图50是本申请实施例中一种基于知识图谱的意图识别方法。如图50所示,该意图识别方法可以包括以下步骤501-步骤503。
S501、电子设备获取用户感知数据。
用户感知数据用于表示用户的行为信息,用户感知数据并未明确表明用户的意图。
在具体的实现中,用户感知数据可以包括:传感器采集到的数据、电子设备安装的应用(application,APP)中记录的用户的操作数据。其中,传感器采集到的数据可以包括:用户动作、用户所处位置、当前时间、当前温度、当前湿度等。用户的操作数据可以包括:用户在第一应用中对音乐A的点击操作、用户在第二应用中对视频A的点击操作、用户在第三应用中对商品A的购买操作等。在一个例子中,用户感知数据可以构成电子设备在第一时间段内获取到的第一数据序列;其中,该用户感知数据可以是由图15所示的意图识别决策系统501中的多模态输入模块601获取到的。
在该场景下,电子设备获取用户感知数据的过程为:电子设备的处理器可以接收电子设备的传感器采集到的数据。电子设备的处理器可以周期性的从电子设备安装的各个应用中获取用户的操作数据。
示例性的,电子设备的处理器可以接收GPS发送的用户所处位置数据,例如,用户所处位置可以为:A道路的人行横道。处理器可以接收运动传感器发送的用户动作数据,例如,用户动作可以为:行走。处理器可以通过电子设备内置时钟获取当前时间,例如,当前时间为:2020年8月12号,星期三,8:30。
S502、电子设备根据用户感知数据和存储的知识图谱,确定多个候选意图。
在具体的实现中,电子设备在获取到用户感知数据之后,可以先确定用户感知数据中的实体和实体的描述数据。其中,实体的描述数据可以包括该实体的属性值。之后,电子设备可以根据实体和实体的描述数据,查找存储的知识图谱,以确定用户的状态信息和场景信息。其中,状态信息用于标识该用户的当前状态,场景信息用于标识用户当前所处的环境。最后,电子设备可以根据状态信息、场景信息和候选意图的对应关系,获取确定出的状态信息和场 景信息对应的多个候选意图。其中,状态信息、场景信息和候选意图的对应关系包含在知识图谱中。
示例性的,结合上述步骤501中的例子,假设电子设备获取的感知数据包括:用户位置为A道路的人行横道,用户动作为行走,当前时间为2020年8月12号,星期三,8:30。那么假设将用户动作作为实体,那么电子设备确定的实体为行走,该实体的描述数据为:用户在2020年8月12号,星期三,8:30时在A道路的人行横道上行走。电子设备根据上述行走实体,以及行走实体的描述数据,结合知识图谱中的个人知识:用户周一到周五上班,用户所处位置在家与公司之间等,确定的用户的状态信息为行走状态,场景信息为上班路上。最后,电子设备根据用户的状态信息:行走状态,场景信息:上班路上,确定出来的多个候选意图可以包括:听音乐意图、看新闻意图等。
在一个例子中,实体的描述数据也可以理解为为一个或多个实体,例如,将日期作为实体,那么电子设备确定的实体为2020年8月12号,星期三;将时间作为实体,那么电子设备确定的实体为8:30;将位置作为实体,那么电子设备确定的实体为A道路的人行横道。此时,电子设备由用户感知数据确定出的实体序列即为:行走,2020年8月12号,星期三,8:30,A道路的人行横道。也即是说,该步骤S502可以是先识别出用户感知数据中的实体序列,然后,再根据该实体序列和存储的知识图谱,确定出多个候选意图。在一个例子中,可以由图15所示的意图识别决策系统501中的实体识别模块603对用户感知数据中的实体进行识别。图15所示的意图识别决策系统501中的意图识别模块605可以从知识库602中获取到知识图谱,并基于实体识别模块603识别出的实体和获取到的知识图谱,识别出多个候选意图。
需要说明的是,在本申请实施例中,知识图谱能够提供候选意图的查询接口。在一种可能的实现方式中,知识图谱可以包括:状态信息的查询接口、场景信息的查询接口、候选意图的查询接口。其中,状态信息的查询接口用于将用户感知数据的实体和实体的描述数据输入知识图谱,输出用户的状态信息。场景信息的查询接口用于将用户感知数据的实体和实体的描述数据输入知识图谱,输出用户的场景信息。候选意图的查询接口用于将之前输出的用户的状态信息和场景信息输入知识图谱,输出多个候选意图。在另一种可能的实现方式中,知识图谱可以仅包括:状态信息、场景信息和候选意图的查询接口。其中,状态信息、场景信息和候选意图的查询接口用于将用户感知数据的实体和实体的描述数据输入知识图谱,知识图谱确定用户的状态信息和场景信息,并根据该用户的状态信息和场景信息确定对应的候选意图,最后输出:用户的状态信息和场景信息,以及候选意图。本申请实施例在此对候选意图的查询接口的具体实现不做具体限定。
S503、电子设备采用预设的强化学习算法,从多个候选意图中确定目标意图。
电子设备在确定出多个候选意图后,由于候选意图的数量可能较大,在该情况下,电子设备无法展示全部的候选意图,因此电子设备需要从多个候选意图中确定目标意图。在确定出目标意图后,电子设备会展示该目标意图。在展示目标意图时,一方面需要尽可能展示符合用户真实意图的意图,即展示置信度大的意图,另一方面对于每个意图都需要展示足够多的次数,以获取足够多的反馈,此时出现了探索与利用两难困境。为了解决该困境,电子设备可以采用预设强化学习算法,从多个候选意图中确定目标意图。在一个例子中,该步骤可以由图15所示的意图识别决策系统501中的意图识别模块605执行。
在具体的实现中,电子设备可以先确定与多个候选意图一一对应的意图摇臂。之后,电 子设备可以根据上下文信息(该上下文信息包括:用户感知数据、用户的状态信息、场景信息)、与多个候选意图一一对应的意图摇臂,以及强化学习算法,从多个候选意图中确定目标意图。
可以理解,可以在电子设备中预先存储意图与摇臂的对应关系,每个摇臂包括一组参数,该组参数用于表示一个摇臂模型。
示例性的,上述强化学习算法可以为“使用上下文信息的bandit算法”,“使用上下文信息的bandit算法”可以为基于回报与上下文成线性相关的假设的linear bandit算法,例如,贪婪算法(epsilon-greedy)、LinUCB算法、汤普森采样(Thompson Sampling)算法等。
在该情况下,电子设备可以采用以下三种方式,从多个候选意图中确定目标意图。在具体的实现中,电子设备具体采用以下三种方式中的哪种方式确定目标意图,本申请实施例在此不做限定。
方式1，采用贪婪算法。电子设备可以先随机获取一个(0,1)之间的值a。若0<a<ε，ε为(0,1)之间的超参数，则在与多个候选意图一一对应的意图摇臂中随机选择一个或多个意图摇臂，将该一个或多个意图摇臂对应的意图作为目标意图（即探索）。若a>ε，则根据上下文信息计算各意图摇臂的意图置信度，选择意图置信度最大的一个或多个意图摇臂，并将该一个或多个意图摇臂对应的意图作为目标意图（即利用）。
方式2,采用LinUCB算法。电子设备可以根据上下文信息,以及意图对应的意图摇臂,计算每个意图对应的意图置信度,并通过霍夫丁不等式计算该意图置信度与真实置信度之间的误差,LinUCB算法中该误差服从预设的分布。之后,电子设备可以在与多个候选意图一一对应的意图摇臂中,选择意图置信度与误差的和最大的一个或多个意图摇臂,并将一个或多个意图摇臂对应的意图作为目标意图。
方式3,采用汤普森采样算法。基于贝叶斯理论,认为意图摇臂包括的参数服从预设分布(如该预设分布可以为高斯分布)。在该情况下,电子设备可以对与多个候选意图一一对应的意图摇臂中的每个意图摇臂包括的参数进行采样,根据采样后的参数与上下文信息计算每个意图摇臂的计算结果。之后,电子设备可以选择计算结果最大的一个或多个意图摇臂,并将一个或多个意图摇臂对应的意图作为目标意图。
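为便于理解上述基于意图摇臂的选择过程，下面给出一段示意性的Python代码，以方式2的LinUCB算法为例：每个候选意图对应一个摇臂，依据上下文信息计算“意图置信度+置信上界误差项”并选取最大的摇臂，再根据用户反馈得到的目标值更新摇臂参数。其中特征维度、超参数alpha与上下文向量均为便于说明而假设的。

```python
import numpy as np

class LinUCBArm:
    """LinUCB意图摇臂的示意实现：每个候选意图对应一个摇臂(一组参数)。"""

    def __init__(self, dim, alpha=1.0):
        self.A = np.eye(dim)          # 上下文特征的协方差累计
        self.b = np.zeros(dim)        # 回报(目标值)累计
        self.alpha = alpha            # 控制置信上界宽度的超参数

    def ucb(self, x):
        """意图置信度(线性估计) + 置信上界误差项。"""
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b
        return theta @ x + self.alpha * np.sqrt(x @ A_inv @ x)

    def update(self, x, reward):
        """根据用户反馈得到的目标值更新摇臂参数。"""
        self.A += np.outer(x, x)
        self.b += reward * x

def select_intents(arms, context, k=1):
    """从与多个候选意图一一对应的意图摇臂中选择置信上界最大的k个意图。"""
    scores = {name: arm.ucb(context) for name, arm in arms.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# 用法示意：上下文特征由用户感知数据、状态信息、场景信息编码得到(此处为假设向量)
arms = {"听音乐": LinUCBArm(3), "看新闻": LinUCBArm(3)}
ctx = np.array([1.0, 0.0, 0.5])
target = select_intents(arms, ctx, k=1)[0]
arms[target].update(ctx, reward=1.0)       # 用户点开目标意图，反馈正向目标值
```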
本申请实施例提供的基于知识图谱的意图识别方法,在获取到用于表示用户的行为信息的用户感知数据后,可以根据用户感知数据和存储的知识图谱,确定多个候选意图,并采用预设的强化学习算法,从多个候选意图中确定目标意图。这样,由于用户感知数据仅表示用户的行为信息,并未表明用户的意图,实现了意图识别装置在用户未表明自身意图的情况下,主动识别用户意图,从而提高了用户体验。示例性的,当用户感知数据的输入方式为多模态输入时,则可以在主动基于多模态输入的数据识别用户的意图,使得在用户无感的情况下就可以确定出用户的意图,提升了用户体验。
可选的,在本申请实施例中,电子设备在确定出目标意图之后,可以向用户展示该目标意图。具体的,基于图50,如图51所示,本申请实施例提供的意图识别方法还可以包括以下步骤504-步骤506。
S504、电子设备根据用户感知数据、状态信息、场景信息、目标意图对应的意图摇臂,确定目标意图对应的意图置信度。
其中,意图置信度用于表示目标意图与真实意图的预测符合程度。通常,意图置信度越高,表明目标意图与真实意图的预测符合程度越大,即目标意图贴近真实意图的可能性越大。
在具体的实现中,电子设备在从多个候选意图中确定出目标意图之后,可以确定目标意图对应的意图置信度。该目标意图的数量由对应的业务场景决定,可以为一个或多个。本申请实施例在此以确定一个目标意图对应的意图置信度为例进行说明。在一个例子中,该步骤可以由图15所示的意图识别决策系统501中的意图识别模块605执行。
电子设备可以使用“使用上下文信息的bandit算法”确定目标意图对应的意图置信度。“使用上下文信息的bandit算法”可以是基于回报与上下文成线性相关的假设的linear bandit算法,例如,贪婪算法、LinUCB算法、汤普森采样算法等。“使用上下文信息的bandit算法”也可以是提取深度特征的neural bandit算法或使用policy gradient实现基于梯度更新的bandit算法。
S505、电子设备根据意图置信度,确定展示目标意图使用的目标交互模式。
其中,目标交互模式可以为:消息提示框、通知、锁屏卡片、情景智能卡片或动画指引等。在一个例子中,该步骤可以由图15所示的意图识别决策系统501中的决策推理模块607执行。
在本申请实施例中,意图置信度不同,表明对应的目标意图与真实意图的符合程度不同。又由于电子设备与用户的交互模式多种多样,因此,对于不同意图置信度对应的目标意图,电子设备可以使用不同的交互模式。具体的,电子设备在确定出目标意图对应的意图置信度之后,可以在预存的多个置信区间中,确定意图置信度所属的目标置信度区间。其中,一个置信区间对应一个等级的交互模式,一个等级的交互模式包括一个或多个交互模式。然后,电子设备可以根据目标意图对应的业务,从目标置信区间对应的等级的交互模式中确定目标交互模式。
可以理解,在本申请实施例中,可以在电子设备中预先存储置信区间,以及与置信区间对应的等级的交互模式,一个等级的交互模式包括一个或多个交互模式。具体过程为:可以先采用规则设计、用户调研、感知模型分析等方式,来得到不同的交互模式对用户体验的影响力,以及不同的交互模式的提示能力。然后,根据交互模式对用户体验的影响力,以及交互模式的提示能力,来设置置信区间,以及与置信区间对应的等级的交互模式。
其中，上述设置置信区间，以及对应的交互模式通常遵循的规则是：当目标意图对应的意图置信度较低时，表明该目标意图贴近真实意图的可能性较低，此时需要选取对用户体验的影响力较小、提示能力较弱的交互模式，如消息提示框、通知等交互模式。当目标意图对应的意图置信度较高时，表明该目标意图贴近真实意图的可能性较大，此时需要选取对用户体验的影响力较大、提示能力较强的交互模式，如锁屏卡片、情景智能卡片、动画指引等交互模式。且，可以预先设置意图置信度的最低阈值，当目标意图对应的意图置信度低于该最低阈值时，表明该目标意图基本与真实意图不符，此时需要将该目标意图只在设备内使用而不展示给用户。
需要说明的是,在本申请实施例中,交互模式可以为:图形、语音、动作等方式的交互。其中,图形交互可以包括消息提示、通知、卡片、动画等多种交互形式。本申请实施例在此对交互模式的实现方式不做具体限定。
示例性的,假设置信区间,以及与置信区间对应的等级的交互模式如表3所示。
表3
置信区间 置信区间对应的等级的交互模式
[a,b) A等级交互模式包括:消息提示框、通知
[b,c) B等级交互模式包括:锁屏卡片、情景智能卡片、动画引导
[0,a) C等级交互模式包括:机内使用
表3中,a<b<c。由表3可知,置信度越大,对应的交互模式对用户体验的影响力越大,交互模式的提示能力越强。也就是说,B等级交互模式对用户体验的影响力>A等级交互模式对用户体验的影响力>C等级交互模式对用户体验的影响力(该影响力为零)。B等级交互模式的提示能力>A等级交互模式的提示能力>C等级交互模式的提示能力(该提示能力为零)。
假设a=0.2,b=0.6,c=0.9,那么在用户刚从家里出发的场景下,结合知识图谱中的用户历史数据,该用户通常的出行方式为打车或开车,偶尔步行,假设最终确定出的目标意图包括:打车意图、自驾意图和步行意图。其中,打车意图对应的意图置信度为0.3,自驾意图对应的意图置信度为0.8,步行意图对应的意图置信度为0.1。那么电子设备可以结合打车业务,确定打车意图使用的目标交互模式为通知,以通知用户打开某打车应用。电子设备可以结合驾驶业务,确定自驾意图使用的目标交互模式为锁屏卡片。电子设备不会显示步行意图。
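结合表3，下面给出一段示意性的Python代码，演示根据目标意图的意图置信度所属的置信区间选择相应等级的交互模式（阈值a、b沿用上例的假设取值）：

```python
def choose_interaction(confidence, a=0.2, b=0.6):
    """按表3将意图置信度映射到交互模式等级的示意实现(阈值为假设值)。"""
    if confidence < a:
        return "C等级：机内使用(不展示给用户)"
    elif confidence < b:
        return "A等级：消息提示框/通知"
    else:
        return "B等级：锁屏卡片/情景智能卡片/动画引导"

print(choose_interaction(0.3))   # 打车意图(置信度0.3) -> A等级：通知
print(choose_interaction(0.8))   # 自驾意图(置信度0.8) -> B等级：锁屏卡片
print(choose_interaction(0.1))   # 步行意图(置信度0.1) -> C等级：不展示
```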
S506、电子设备利用目标交互模式,展示目标意图的内容。
电子设备可以利用目标交互模式,根据目标意图对应的业务,获取并展示目标意图的内容。在一个例子中,该步骤可以由图15所示的意图识别决策系统501中的决策推理模块607执行。
例如,结合步骤505中的例子,电子设备在确定出打车意图使用的交互模式为通知后,可以在通知栏处展示一条通知消息,该通知消息包括的内容为“10:00打开打车应用”。如图52中的(A)所示,假设电子设备当前显示的页面为主屏幕页面,则电子设备可以在主屏幕页面的顶部位置显示该通知消息,在一段时间后,电子设备结束该通知消息的显示。之后,在用户从屏幕顶部进行由上至下的滑动操作后,电子设备可以显示一通知页面,该通知页面中包括通知消息,如图52中的(B)所示。
再例如,结合步骤505中的例子,假设电子设备当前显示的页面为锁屏页面,那么电子设备在确定出自驾意图使用的交互模式为锁屏卡片后,可以在锁屏页面显示一个锁屏卡片,该锁屏卡片可以用于指示用户打开地图导航的应用,或者推荐用户可能喜欢的音乐等。例如,该锁屏卡片包括的内容可以为“打开地图导航的应用、歌曲名称A和歌曲名称B”,如图53所示。
不同于现有技术中的仅依赖置信度来展示意图,即展示意图置信度大于阈值的意图,本申请实施例能够根据置信区间,以及置信区间对应的等级的交互模式,来选择展示目标意图的目标交互模式,减轻了展示低置信度的意图导致降低用户体验的问题。
可选的,在本申请实施例中,电子设备在利用目标交互模式,展示目标意图的内容之后,可以接收用户的反馈操作,并利用该反馈操作,更新知识图谱,以及强化学习算法中的一些参数。具体的,基于图51,如图54所示,本申请实施例提供的意图识别方法还可以包括以下步骤507-509。
S507、电子设备在利用目标交互模式,展示目标意图的内容的预设时间段内,识别对目标意图的目标操作。
电子设备以开始展示目标意图的内容为起始时间,在预设时间段内接收用户对目标意图的目标操作,并识别该目标操作。在一个例子中,该步骤可以由图15所示的意图识别决策系统501中的多模态输入模块601执行。
示例性的,该目标操作可以为点开操作,也可以为关闭操作,还可以为忽视操作,即未接收到用户对目标意图的任何操作,还可以为忽视但打开与目标意图的内容相关的内容的操作,即未接收到用户对目标意图的操作,但是接收到用户打开与目标意图的内容相关的内容的操作,如打开与目标意图的内容相关的应用、打开与目标意图的内容相关的网页。本申请实施例在此对目标操作的具体形式不做具体限制。
例如,结合图52中的(B),假设电子设备在通知页面显示一通知消息,那么用户可以通过点开操作,例如点击该通知消息,来打开打车应用,如图55所示。用户可以通过关闭操作,例如向左滑动该通知消息,或者向左滑动该通知消息后,电子设备显示该通知消息的部分内容,并在该通知消息的关联位置显示清除控件,用户点击该清除控件,来关闭该通知消息,如图56所示,为用户点击该清除控件。用户也可以忽视该通知消息,即不对该通知消息进行任何操作,但是用户可以点击主屏幕页面中的打车应用,如图57所示,为用户点击打车应用。
S508、电子设备根据目标操作和预设规则,确定目标操作对应的目标值。
其中,目标值用于表示目标意图与真实意图的实际符合程度。目标操作不同,对应的目标值不同。在具体的实现中,该目标值可以为奖励值,也可以为惩罚值。可以预先定义,目标值越大,表明目标意图与真实意图的实际符合程度越大。或者,也可以预先定义,目标值越小,表明目标意图与真实意图的实际符合程度越大。在一个例子中,该步骤可以由图15所示的意图识别决策系统501中的动作反馈模块608执行。
可以理解,该预设规则可以是预先设计好的规则,也可以是预设的函数,还可以是预设的模型。本申请实施例在此对预设规则的形式不做具体限定。
S509、电子设备根据目标值,更新多个候选意图,并更新强化学习算法中用于确定目标意图的参数。
可选的,在本申请实施例中,在目标值越大,表明目标意图与真实意图的实际符合程度越大的情况下,电子设备根据目标值,更新知识图谱中的多个候选意图的具体过程为:电子设备可以在确定目标值小于预设阈值的情况下,或者在确定目标值小于预设阈值的次数等于预设次数的情况下,删除上述步骤502中多个候选意图中的目标值对应的目标意图。当然,电子设备也可以根据在知识图谱中实时记录的用户的操作数据,在确定增加了新的意图时,可以在多个候选意图中增加新的意图。在一个例子中,该步骤可以由图15所示的意图识别决策系统501中的意图识别模块605执行,即意图识别模块605可以基于动作反馈模块608反馈的信息更新强化学习算法中用于确定目标意图的参数。
可以理解，在场景未变，对应的候选意图发生变化的情况下，电子设备需要重新确定候选意图对应的意图摇臂，从而构成摇臂集合。或者，在出现新场景的情况下，电子设备只需要确定对应的候选意图，并确定候选意图对应的意图摇臂，从而构成摇臂集合。
由于现有技术中的摇臂集合是固定的,包含电子设备预存的全部意图摇臂。但是,本申请实施例中,实现了摇臂集合随着候选意图改变而改变,从而实现了用户兴趣转移与意图变化的快速支持,提高了用户体验,以及提升了意图识别的准确性。
示例性的，如图58所示，假设电子设备中预存有用户的四个意图，以及每个意图对应的意图摇臂。四个意图分别为：看新闻意图、看视频意图、听音乐意图和导航意图，分别对应的四个意图摇臂为：看新闻摇臂、看视频摇臂、听音乐摇臂和导航摇臂。且假设用户的状态信息为静止状态，场景信息为乘坐公交车。与静止状态、乘坐公交车对应的候选意图为：听音乐意图、看新闻意图和看视频意图。如果电子设备从候选意图中确定出的目标意图为看视频意图，并在展示该看视频意图的预设时间段内，识别到用户的忽视操作，从而得到看视频意图对应的目标值。在该情况下，如果目标值小于预设阈值，则电子设备可以删除与静止状态、乘坐公交车对应的候选意图中的看视频意图。且在该情况下，如果电子设备实时在知识图谱中记录了用户打开导航的数据，则电子设备可以在与静止状态、乘坐公交车对应的候选意图中增加导航意图。此时，知识图谱中更新后的、与静止状态、乘坐公交车对应的候选意图为：听音乐意图、看新闻意图和导航意图。
现有技术中,电子设备在展示意图之后,仅考虑用户是否点击该意图,但是在实际应用中用户的反馈可能包含除是否点击外的其他操作,因此导致分析得到的反馈不准确。在本申请实施例中,通过考虑预设时间段内的反馈操作,该反馈操作的类型较多,并利用不同的反馈操作能够得到不同的目标值,这样增加了反馈信息的准确度,从而为后续更新强化学习算法中的各个参数奠定了基础,进而提升了意图识别的准确度。
(2)基于预先建立的意图识别模型识别意图
本方案中，图15所示的意图识别决策系统501中的意图识别模块605可以将实体识别模块603识别出的实体输入到意图识别模型中，以识别出用户的意图。
以上即是对本方案中所涉及的电子设备的硬件结构、软件结构、意图识别决策系统等的相关介绍。为便于理解，下面举例介绍本方案中的意图识别过程。
实施例1:
下面结合上述示例性电子设备100的软硬件结构,对本申请实施例中意图识别方法进行具体描述,如图59所示,为本申请实施例中意图识别方法一个流程示意图:
S801、响应于第一触发,电子设备在第一时间段内获取第一数据序列。
该第一触发可以为电子设备100中可触发实体识别的任一个触发。可以理解的是,电子设备100中预存了可以触发实体识别的各种触发条件,当满足某个触发条件时,则触发获取相应时间窗格长度内相应输入类型的第一数据序列。本方案中,第一数据序列可以包括多个数据。其中,多个数据中至少两个数据的输入方式不同,也即是说,这些数据的输入方式是多模态的。例如,其中一个数据的输入方式为触控操作的输入,另一个数据的输入方式为传感器数据的输入,又一个数据的输入方式为文本数据的输入等等。
不同的触发条件可以包括被动场景变化的触发,例如,检测到从室外到室内时触发、检测到环境温度高于35度时触发、检测到环境噪声高于50分贝时触发、检测到到达交通站点时触发、检测到移动速度高于100km/h时触发、检测到局域网中有新的智能设备接入时触发等等;也可以包括用户主动操作的触发,例如,检测到用户连接wifi时触发、检测到用户打开相机时触发、检测到用户关闭闹钟时触发等等,此处不做限定。
不同的触发条件触发后,所对应的实体识别的时间窗格的长度,以及对多模态输入中哪几种输入类型的数据进行实体识别均为预先设置:
例如，可以设置其中一个触发条件为从室外到室内，该触发对应的实体识别的时间窗格为30秒，该触发对应的多模态输入的类型为用户操作输入、环境感知输入、文本输入、语音输入。再如，可以设置另一个触发条件为打开音乐播放器，该触发对应的实体识别的时间窗格为20秒，该触发对应的多模态输入的类型为用户操作输入、文本输入、语音输入。具体的，不同的触发条件对应的时间窗格的长度与多模态输入的类型，可根据实际情况和需求而定，此处不作限定。
S802、电子设备根据该第一数据序列,确定第一实体序列。
本方案中,电子设备100获取到第一数据序列后,可以对第一数据序列中的数据进行识别,得到第一实体序列。
在一个例子中,电子设备100从第一数据序列中确定第一实体序列时,可以从第一数据序列中提取特征向量,得到第一特征向量集合。其中,第一特征向量集合中可以包括所有从第一数据序列中提取得到的特征向量,该特征向量可以用于表示第一数据序列中数据的特征。电子设备100在得到第一特征向量集合后,可以将第一特征向量集合输入到实体识别模型,得到第一实体序列。
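该过程可以用如下草图示意（Python示意，其中的特征提取函数与模型接口均为假设，并非实体识别模型的真实实现）：

```python
def recognize_entities(first_data_sequence, extract_feature, entity_model):
    """从第一数据序列确定第一实体序列的示意流程。"""
    # 1. 从第一数据序列中提取特征向量，得到第一特征向量集合
    first_feature_set = [extract_feature(data) for data in first_data_sequence]
    # 2. 将第一特征向量集合输入实体识别模型，得到第一实体序列
    return entity_model.predict(first_feature_set)
```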
在一个例子中，实体识别模型可以为实体识别模块603中的实体提取单元6031。若实体识别模块603的实体仓库单元6033中已预先存储了某些实体，由于实体的存储方式中包含有表示该实体的特征向量，可以理解为实体仓库单元6033中存储了特征向量与实体的对应关系，而知识库602中又含有根据实体仓库单元6033中存储的实体训练出来的实体识别模型，因此电子设备能将实体仓库单元6033中已预先存储的这些实体识别出来；若某些特征向量在实体仓库单元6033中没有预先存储与其对应的实体，则会将这些特征向量存储起来，以供后续检测是否能提取出新的实体。电子设备100的实体仓库单元6033中已预先存储了大部分日常常用的需要进行识别的实体，能将这些实体识别出来。对实体的描述，具体可参阅上述术语描述中的实体部分，此处不再赘述。
可以理解的是,知识库602中的数据可以存储于电子设备100中,也可以存储在云端服务器中以便于多用户共享且实时更新相关现有领域知识,此处不作限定。
可选的，本申请的一些实施例中，在触发实体识别后，若多模态输入模块601中某个输入已能明确地确定其意图，则可以不再执行后续步骤，直接根据确定的意图进行决策推理，执行相应的动作。例如，若用户打开语音助手为实体识别的一个触发，当用户对语音助手说"现在使用QQ音乐播放歌曲1"时，则可直接执行该动作，不再需要执行后续步骤；当用户对语音助手说"放歌"时，则意图不明确，需要根据多模态输入进行实体识别组成实体序列，继续执行后续步骤。
在一个例子中,第一实体序列可以为一个实体序列,该第一实体序列中至少包括在该第一时间窗格内多模态输入中识别出的实体及顺序。此外得到第一时间窗格内识别出的实体及顺序后,该实体序列可以与电子设备100的上下文模块604中存储的在此前的实体识别过程中识别出来的实体序列,共同组成第一实体序列。对实体序列的描述,具体可以参阅上述术语描述中的实体序列部分,此处不再赘述。
S803、电子设备确定该第一实体序列对应的第一意图;
作为一种可能的实现方式,根据电子设备100中存储的实体序列与意图的对应关系,电子设备可以确定该第一实体序列对应的第一意图。其中,第一意图为一个意图,第一意图可以用于确定动作序列。
其中,实体序列与意图的对应关系的表现形式可以为一种函数或一组函数,其可以包括模型类的函数,例如深度学习的模型、线性回归模型等,也可以包括规则类函数,例如,预先设置好的什么样的实体序列对应什么样的意图的规则函数。不管其表现形式如何,该实体序列与意图的对应关系均预先存储在电子设备中,比如可以存储在意图仓库单元6053中,且根据确定的实体序列的输入,能得到确定的意图的输出,其具体的表现形式,此处不作限定。
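以规则类函数为例，实体序列与意图的对应关系可以示意如下（Python示意，规则内容取自下文实施例2中的例子，仅为假设性草图）：

```python
# 预先设置好的"什么样的实体序列对应什么样的意图"的规则
ENTITY_TO_INTENT_RULES = {
    ("上午6点", "智能水壶"): "烧水",
}

def rule_based_intent(entity_sequence):
    """根据预先存储的实体序列与意图的对应关系，输出确定的意图。"""
    return ENTITY_TO_INTENT_RULES.get(tuple(entity_sequence))  # 无匹配时返回None
```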
该实体序列与意图的对应关系可以为电子设备厂商预先设置的,可以为第三方数据服务商根据其获取的大数据提取出来的,可以为根据多用户共享的实体序列数据与意图数据训练出来的,也可以为仅根据用户自己的电子设备获取到的实体序列数据与用户标注的意图训练出来的,此处不做限定。
可以理解的是,该实体序列与意图的对应关系可以基于电子设备识别出的实体以及动作反馈模块608的反馈结果进行匹配更新,也可以周期性的从云端下载最新的对应关系数据进行更新,此处不作限定。
作为另一种可能的实现方式,电子设备100可以将第一实体序列输入到意图识别模型,得到第一意图。其中,该意图识别模型可以为根据对应的实体序列与意图的数据训练得到的实体序列与意图的对应关系。示例性的,电子设备100在确定第一实体序列后,可以加载或调用厂商放置于云服务器中的共享的意图识别模型,输入第一实体序列,输出第一意图。其中,若将加载的该意图识别模型存储在电子设备中,当有新的实体序列需要识别其意图时,则电子设备可以直接使用加载的该意图识别模型,也可以继续直接调用云服务器中共享的最新的意图识别模型,此处不作限定。在一个例子中,该意图识别模型可以由图41a中所示的模型训练方法训练得到,也可以由图43所示的模型训练方法训练得到。
作为又一种可能的实现方式，电子设备100可以根据第一实体序列中的实体，和存储的知识图谱，确定出多个候选意图。然后，电子设备100再采用预设的强化学习算法，从多个候选意图中确定出第一意图。示例性的，电子设备100可以根据第一实体序列中的实体，查找存储的知识图谱，确定用户的状态信息和场景信息。其中，状态信息可以用于标识该用户的当前状态，场景信息可以用于标识用户当前所处的环境。最后，电子设备100可以根据状态信息、场景信息和候选意图的对应关系，获取到确定出的状态信息和场景信息对应的多个候选意图。其中，状态信息、场景信息和候选意图的对应关系包含在知识图谱中。
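示例性的，采用强化学习算法从多个候选意图中选取第一意图的过程，可以用一个UCB风格的摇臂选择草图示意（Python示意，其中的算法细节为多臂摇臂问题的常见做法，并非本申请限定的具体算法）：

```python
import math

def select_intent(candidates, counts, values, total):
    """在候选意图对应的意图摇臂中权衡探索与利用，选出第一意图。
    counts[i]: 意图i被选中的次数；values[i]: 意图i历史目标值的均值。"""
    best_intent, best_score = None, float("-inf")
    for intent in candidates:
        if counts.get(intent, 0) == 0:
            return intent  # 未尝试过的摇臂优先探索
        score = values[intent] + math.sqrt(2 * math.log(total) / counts[intent])
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent
```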
S804、电子设备至少根据该第一意图和第一实体序列,确定第一动作序列;
电子设备可以根据实体序列、意图与动作序列的对应关系、该第一意图以及该第一实体序列,确定第一动作序列,该第一动作序列为一个动作序列,该第一动作序列中包括第一待执行动作。
实体序列、意图与动作序列的对应关系的表现形式可以有很多种,可以为一种函数或一组函数,其可以包括模型类的函数,例如深度学习的模型、线性回归模型等,也可以包括规则类函数,例如,预先设置好的什么样的实体序列和意图对应什么样的动作序列的规则函数。
示例性的,该实体序列、意图与动作序列的对应关系可以为一个训练好的动作预测模型,训练该动作预测模型时,可以将大量的【实体序列、意图、动作序列】数据输入模型进行训练,训练完成后,通过输入实体序列,即可得到意图与对应的动作序列。
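示例性的，训练该动作预测模型时的数据组织方式可以示意如下（Python示意，其中的训练接口fit_one为便于说明而假设的示例接口）：

```python
# 训练样本形如【实体序列、意图、动作序列】
train_data = [
    (["上午6点", "智能水壶"], "烧水",
     [("智能水壶", "6点10分启动", "温度65度")]),
    # ... 大量此类数据
]

def train_action_model(model, train_data):
    """将【实体序列、意图、动作序列】数据输入模型进行训练。"""
    for entity_seq, intent, action_seq in train_data:
        model.fit_one(inputs=(entity_seq, intent), target=action_seq)
    return model
```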
该动作预测模型可以为由电子设备厂商获取大量用户数据训练完成后共享给用户的，可以为第三方数据服务商根据其获取的大数据训练完成后发布给用户的，可以为根据多个用户共享的数据训练完成后共享使用的，也可以为仅根据用户自己的电子设备获取到的实体序列数据与用户标注的意图和动作序列训练出来的，此处不作限定。
示例性的，当电子设备确定第一意图和第一实体序列时，可以加载或调用厂商训练完成后放置于云服务器中的共享动作预测模型，输入第一意图和第一实体序列，输出第一动作序列。若将加载的该动作预测模型存储在电子设备中，当有新的实体序列需要识别其意图时，则电子设备可以直接使用加载的该动作预测模型，也可以继续直接调用云服务器中共享的最新的动作预测模型，此处不作限定。
在一个例子中,该动作预测模型可以由图41a中所示的模型训练方法训练得到,也可以由图43所示的模型训练方法训练得到。
可以理解的是,该实体序列、意图与动作序列的对应关系可以存储在电子设备中,也可以存储在云端服务器中以便于多用户共享和更新,此处不作限定。
在一些简单场景中,除了根据该实体序列、意图与动作序列的对应关系来确定动作序列,电子设备还可以根据规则引擎606提供的规则来确定动作序列。例如,若当前识别得到的实体序列为【上午8点】【智能水壶】,识别到的意图为烧开水,而规则引擎606存储的规则中有一条【上午8点10分烧水,水温40度】,则电子设备可以不再使用存储的实体序列、意图与动作序列的对应关系,例如动作预测模型来预测该实体序列和意图对应的动作序列,而是直接根据该规则,生成【1,智能水壶,上午8点10分烧水,水温40度】的动作序列。在一个例子中,可以将实体序列和意图输入到规则引擎606中,并将规则引擎606的输出结果作为动作序列。示例性的,规则引擎606可以基于图21所示的方法确定动作序列。
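这种"规则引擎优先、对应关系（如动作预测模型）兜底"的决策过程可以示意如下（Python示意，接口均为假设）：

```python
def decide_action_sequence(entity_seq, intent, rule_engine, action_model):
    """确定动作序列：规则引擎中有匹配规则时直接使用，否则使用动作预测模型。"""
    actions = rule_engine.match(entity_seq, intent)
    if actions is not None:
        return actions  # 如【1，智能水壶，上午8点10分烧水，水温40度】
    return action_model.predict(entity_seq, intent)
```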
确定的一个动作序列中可以有多个待执行动作,也可以只有一个待执行动作,此处不作限定。一个动作序列中的多个待执行动作可以需要同一个设备执行,也可以需要不同的设备执行。
S805、电子设备发送第一指令给该第一待执行动作对应的第一设备,指示该第一设备执行该第一待执行动作。
待执行动作可以包括启动特定目标应用程序/服务或执行预设目标操作以自动化完成操作、后台加载特定目标应用程序以提升打开该应用程序时的响应速度、无线连接特定目标设备以便于操作分布式场景下其他设备、发送通知消息以提醒用户等等电子设备可以执行的各种动作或服务,此处不作限定。
电子设备按照第一动作序列中各待执行动作对应的设备,发送指令给各待执行动作对应的设备,使其执行待执行动作中的动作/服务。
可以理解的是,若待执行动作对应的设备为电子设备自己,则电子设备可以直接执行该待执行动作中的动作/服务。
例如，若电子设备根据实体序列和意图决策推理确定出的一个动作序列为【1、电子设备、打开音乐播放器】，【2、车载设备、打开蓝牙】，【3、车载设备、蓝牙连接电子设备】，【4、电子设备、播放音乐播放器列表中的音乐】，则电子设备执行【1、电子设备、打开音乐播放器】，【4、电子设备、播放音乐播放器列表中的音乐】这两个待执行动作，将执行【2、车载设备、打开蓝牙】，【3、车载设备、蓝牙连接电子设备】这两个待执行动作的指令发送到车载设备，由车载设备执行打开蓝牙和蓝牙连接电子设备的动作。
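示例性的，按设备分发待执行动作的过程可以示意如下（Python示意，其中动作序列的元素形式与设备通信接口均为假设）：

```python
def dispatch_actions(action_sequence, self_device_id, send_instruction, execute):
    """按各待执行动作对应的设备分发指令；若对应设备为电子设备自己则直接执行。"""
    for device_id, action in action_sequence:
        if device_id == self_device_id:
            execute(action)                      # 如：电子设备打开音乐播放器
        else:
            send_instruction(device_id, action)  # 如：指示车载设备打开蓝牙
```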
上述实施例中，电子设备响应第一触发后，对第一时间窗格内多模态输入进行识别，得到第一实体序列，据此来预测用户意图。由于用户一段时间内的连续行为和设备状态变化会反映事件发生的潜在逻辑，相比于现有的仅根据用户当前时刻的单模态输入获取的信息来预测意图，采用多模态上下文输入信息，可以挖掘出大量数据中的隐含关联信息，为预测其意图提供了更充足的依据，提升了意图识别的准确性。
在预测得到意图后,根据该第一实体序列和第一意图决策推理得到需要执行的第一动作序列,发送指令给第一动作序列中各待执行动作对应的设备,指示各设备执行相应的待执行动作,为用户精准的提供了他所需要的响应或服务的决策,提升了用户体验。
可以理解的是，本方案中，电子设备可以在获取到第一数据序列后，基于第一数据序列确定用户的第一意图，以及基于该第一意图确定出第一待执行动作。在一个例子中，基于第一数据序列确定用户的第一意图，可以是将第一数据序列输入到意图识别模型中，由意图识别模型识别到第一意图，也可以是如上文所描述的先确定出第一实体序列，再由第一实体序列确定第一意图，在此不做限定。在一个例子中，电子设备基于第一意图确定出第一待执行动作，可以是将第一意图输入到动作预测模型中得到第一待执行动作，也可以是如上文所描述的基于第一实体序列和第一意图得到第一待执行动作，在此不做限定。
实施例2:
如图60所示,为一个多设备互联的分布式场景的示意图。多个智能设备,例如台灯、智能音响、空调、空气净化器、电视、电灯、体脂称等智能设备均可以通过路由器与手机互联,手机与智能手表以及汽车可以通过蓝牙互联,形成一个多设备互联的分布式场景。
下面以一具体应用场景为例,结合图59所示意图识别方法,对本申请实施例中的意图识别方法进行具体的示例性描述:
除了知识库中预置保存的实体识别的触发点、触发点对应的时间窗格、触发点对应的多模态输入方式的类型外，手机还可以根据获取到的用户对手机及与该手机互联的智能设备的日常使用数据，新增用户习惯规则、实体识别的触发点、触发点对应的时间窗格到知识库中。
例如:手机根据从联网的智能水壶获取到的启动记录,确定每天上午6点10分,用户会烧一壶温度为65度的水。手机将【用户上午6点10分使用智能水壶烧水,温度65度】的用户习惯规则添加到知识库602中,并在知识库602中添加一个触发点为时间触发:每天上午6点,同时添加该触发点对应的时间窗格为10分钟。
当手机确定时间为上午6点,手机根据从知识库602取得的触发点和该触发点对应的时间窗格,触发实体识别。触发点为上午6点,时间窗格为10分钟。
手机按照图59所示方法中的步骤S801和步骤S802,在这10分钟内,对从不同输入方式中获取到的数据进行实体识别:手机从时钟应用获取当前时间信息数据,从互联的路由器中获取联网的智能设备信息数据,从获取到的数据中提取特征向量,将这些特征向量输入从知识库602取得的实体识别模型中。手机中出厂预置的实体仓库单元6033中采用【实体编号、实体名称、特征向量集合】的方式存储有时间实体和常见的智能设备实体,因此知识库602中根据该实体仓库单元6033中的实体训练出来的实体识别模型根据输入的特征向量,能识别出来实体:上午6点,智能水壶。手机将识别出的这2个实体组成实体序列:【上午6点】【智能水壶】。
手机按照图59所示方法中的步骤S803,将该实体序列:【上午6点】【智能水壶】输入厂商预先存储在意图仓库单元6053中的意图识别模型(一种实体序列与意图的对应关系的表现形式)中,得到输出的意图:烧水。
手机按照图59所示方法中的步骤S804,确定该实体序列【上午6点】【智能水壶】与意图烧水在规则引擎606中有与其匹配的规则,不需要使用厂商根据所有用户数据训练出来的动作预测模型来预测其动作序列,可以直接调用规则引擎606根据知识库602中用户习惯规则【用户上午6点10分使用智能水壶烧水,温度65度】更新而来的规则【上午6点10分使用智能水壶烧水,温度65度】,确定动作序列,其中包括一个待执行动作:【1、智能水壶、6点10分启动、温度65度】。
手机按照图59所示方法中的步骤S805,确定该待执行动作【1、智能水壶、6点10分启动、温度65度】对应的设备为智能水壶,发送包含温度控制的定时启动指令给该智能水壶。智能水壶收到该包含温度控制的定时启动指令后,在6点10分定时启动,自动接水、烧水,并在检测到温度达到65度时开始保温。
再如,手机根据音乐播放器的启动和播放记录,确定每天上午8点至8点10分,用户会打开音乐播放器播放歌曲。手机将【用户8点开始听歌】的用户习惯规则添加到知识库中,并在知识库中添加一个触发点为时间触发:上午7点40分,添加该触发点对应的时间窗格为20分钟。
当手机确定时间为上午7点40分，手机根据从知识库602取得的触发点和该触发点对应的时间窗格，触发实体识别。触发点为上午7点40分，时间窗格为20分钟。
手机按照图59所示方法中的步骤S801和步骤S802，在这20分钟内，从日历应用中获取时间信息数据，从用户信息中获取家庭住址数据，从GPS获取定位数据，从与该手机互联的路由器获取联网智能设备状态信息数据，从手机历史应用记录中获取用户在7点至8点使用应用程序的记录数据，从获取到的数据中提取特征向量，将这些特征向量输入从知识库602取得的实体识别模型中。手机中出厂预置的实体仓库单元6033中采用【实体编号、实体名称、特征向量集合】的方式存储有时间实体、常见地址实体、常见的智能设备实体、应用程序实体等，因此知识库602中根据该实体仓库单元6033中的实体训练出来的实体识别模型根据输入的特征向量，能识别出来实体：7点40，休息日；地点：家；可用设备：手机、智能音箱；应用习惯：QQ音乐，微信，支付宝，抖音。手机将这些实体组成实体序列：【当前时间：7点40】【休息日】【地点：家】【可用设备：手机、智能音箱】【应用习惯：QQ音乐，微信，支付宝，抖音】。
手机按照图59所示方法中的步骤S803，将该实体序列：【当前时间：7点40】【休息日】【地点：家】【可用设备：手机、智能音箱】【应用习惯：QQ音乐，微信，支付宝，抖音】输入厂商预先存储在意图仓库单元6053中的意图识别模型中，得到输出的意图：听歌。
手机按照图59所示方法中的步骤S804,确定该实体序列【当前时间:7点40】【休息日】【地点:家】【可用设备:手机、智能音箱】【应用习惯:QQ音乐,微信,支付宝,抖音】与意图听歌在规则引擎606中有与其匹配的规则,不需要使用厂商根据所有用户数据训练出来的动作预测模型来预测其动作序列,可以直接调用规则引擎606根据知识库602中用户习惯规则【用户8点开始听歌】更新而来的规则【上午8点,使用可用播放设备和使用频率最高的歌曲播放应用程序播放歌曲】来确定动作序列,其中包括2个待执行动作:【1、手机、预加载QQ音乐】【2、手机、预加载音频隔空投送服务】。
手机按照图59所示方法中的步骤S805,确定该待执行动作【1、手机、预加载QQ音乐】【2、手机、预加载音频隔空投送服务】对应的设备均为手机,预加载QQ音乐并预加载音频隔空投送服务。当用户点击QQ音乐应用程序时,由于已经预先加载好了,手机即可迅速启动该QQ音乐播放器。当用户点击播放一首歌曲后,想要使用联网的智能音箱播放该歌曲,点击音频隔空投送控件时,由于已经预先加载好了音频隔空投送服务,手机可以迅速将播放器正在播放的音频投送到智能音箱进行播放。
知识库602中由厂商预设存储有一个实体识别的触发点：进入地库环境，以及该触发点对应的时间窗格：30分钟。
当手机检测到环境声音分贝数降低，温度降低且GPS定位处于地库位置时，判断用户进入了地库环境，根据从知识库602取得的触发点和该触发点对应的时间窗格，触发实体识别。触发点为：进入地库环境，时间窗格为30分钟。
手机按照图59所示方法中的步骤S801和步骤S802,在这30分钟内,对从不同输入方式中获取到的数据进行实体识别:手机从GPS中获取位置数据,从无线连接模块获取蓝牙连接信息数据,从获取到的数据中提取特征向量,将这些特征向量输入从知识库602取得的实体识别模型中。手机中出厂预置的实体仓库单元6033中采用【实体编号、实体名称、特征向量集合】的方式存储有常见位置实体、无线连接模块实体以及距离实体,因此知识库602中根据该实体仓库单元6033中的实体训练出来的实体识别模型根据输入的特征向量,能识别出来实体:位置:停车场,蓝牙:连接上了车载蓝牙。手机将识别出的这2个实体与此前识别出的上下文实体,组成实体序列:【当前时间:7点40】【休息日】【地点:家】【可用设备:手机、智能音箱】【应用习惯:QQ音乐,微信,支付宝,抖音】【位置:停车场】【蓝牙:连接上了车载蓝牙】。
手机按照图59所示方法中的步骤S803,将该实体序列:【当前时间:7点40】【休息日】【地点:家】【可用设备:手机、智能音箱】【应用习惯:QQ音乐,微信,支付宝,抖音】【位置:停车场】【蓝牙:连接上了车载蓝牙】输入厂商预先存储在意图仓库单元6053中的意图识别模型中,得到输出的意图:上车。
手机按照图59所示方法中的步骤S804,确定该实体序列【当前时间:7点40】【休息日】【地点:家】【可用设备:手机、智能音箱】【应用习惯:QQ音乐,微信,支付宝,抖音】【位置:停车场】【蓝牙:连接上了车载蓝牙】与意图上车在规则引擎606中没有与其匹配的规则,使用厂商根据所有用户数据训练出来的动作预测模型来预测其动作序列。将实体序列【当前时间:7点40】【休息日】【地点:家】【可用设备:手机、智能音箱】【应用习惯:QQ音乐,微信,支付宝,抖音】【位置:停车场】【蓝牙:连接上了车载蓝牙】与意图上车输入决策推理模块607中存储的动作预测模型,得到动作序列输出,其中包括两个待执行动作:【1、汽车控制设备、唤醒】【2、车载播放器、继续播放手机播放器中的歌曲】。
手机按照图59所示方法中的步骤S805,确定该待执行动作【1、汽车控制设备、唤醒】对应的设备为汽车控制设备,发送唤醒指令给该汽车控制设备。汽车控制设备收到该唤醒指令后,唤醒汽车中所有电子设备。确定该待执行动作【2、车载播放器、继续播放手机播放器中的歌曲】对应的设备为车载播放器,发送继续播放指令给该车载播放器,车载播放器收到该继续播放指令后,基于蓝牙连接继续播放手机播放器中的歌曲。
知识库602中存储有用户从网络上共享下载的一个实体识别的触发点:车启动,以及该触发点对应的时间窗格:车启动直到车停止。
当手机从汽车控制设备获取到车启动的信息时,根据从知识库602取得的触发点和该触发点对应的时间窗格,触发实体识别。触发点为:车启动,时间窗格为:车启动直到车停止。
手机按照图59所示方法中的步骤S801和步骤S802，在车启动后，对从不同输入方式中获取到的数据进行实体识别：从互联的车载系统获取汽车当前状态数据，从速度传感器中获取当前速度信息数据，从互联的车载摄像头中获取拍摄的视频数据，从互联的智能手表中获取心率数据，从获取到的数据中提取特征向量，将这些特征向量输入从知识库602取得的实体识别模型中。手机中出厂预置的实体仓库单元6033中采用【实体编号、实体名称、特征向量集合】的方式存储有汽车状态实体、速度实体、常见人物面部特征实体以及心率实体，因此知识库602中根据该实体仓库单元6033中的实体训练出来的实体识别模型根据输入的特征向量，能识别出来实体：汽车状态：行驶中，时速120km/h，用户双目无神，用户心率低于平均值。手机将识别出的这些实体与此前识别出的上下文实体，组成实体序列：【当前时间：7点40】【休息日】【地点：家】【可用设备：手机、智能音箱】【应用习惯：QQ音乐，微信，支付宝，抖音】【位置：停车场】【蓝牙：连接上了车载蓝牙】【汽车状态：行驶中】【时速120km/h】【用户双目无神】【用户心率低于平均值】。
手机按照图59所示方法中的步骤S803,将该实体序列:【当前时间:7点40】【休息日】【地点:家】【可用设备:手机、智能音箱】【应用习惯:QQ音乐,微信,支付宝,抖音】【位置:停车场】【蓝牙:连接上了车载蓝牙】【汽车状态:行驶中】【时速120km/h】【用户双目无神】【用户心率低于平均值】输入用户从网上共享下载的第三方数据服务商提供的意图识别模型中,得到输出的意图:振作用户精神。
手机按照图59所示方法中的步骤S804,确定该实体序列【当前时间:7点40】【休息日】【地点:家】【可用设备:手机、智能音箱】【应用习惯:QQ音乐,微信,支付宝,抖音】【位置:停车场】【蓝牙:连接上了车载蓝牙】【汽车状态:行驶中】【时速120km/h】【用户双目无神】【用户心率低于平均值】与意图振作用户精神在规则引擎606中没有与其匹配的规则,使用用户默认设置的从网上共享下载的第三方数据服务商提供的动作预测模型来预测其动作序列。将实体序列【当前时间:7点40】【休息日】【地点:家】【可用设备:手机、智能音箱】【应用习惯:QQ音乐,微信,支付宝,抖音】【位置:停车场】【蓝牙:连接上了车载蓝牙】【汽车状态:行驶中】【时速120km/h】【用户双目无神】【用户心率低于平均值】与意图振作用户精神,输入用户默认设置的从网上共享下载的第三方数据服务商提供的动作预测模型,得到动作序列输出,其中包括三个待执行动作:【1、汽车控制设备、打开换气系统】【2、汽车控制设备、调低空调温度】【3、汽车控制设备、播放安全警示】。
手机按照图59所示方法中的步骤S805,确定该待执行动作【1、汽车控制设备、打开换气系统】【2、汽车控制设备、调低空调温度】【3、汽车控制设备、播放安全警示】对应的设备均为汽车控制设备,发送打开换气系统、调低空调温度与播放安全警示的指令给该汽车控制设备。汽车控制设备收到该指令后,自动控制打开换气系统,使车内氧气充足,将温度适当调低,使用户神志清醒,并播放安全警示,提醒用户当前状态有风险,保证行车安全。
实施例3:
上面实施例中，实体提取单元6031能从多模态输入模块601获取的数据中提取出特征向量，实体仓库单元6033中存储有预设好的常见的实体与特征向量集合的对应关系，因此若提取出来的特征向量集合在实体仓库单元6033中有与其对应的实体存储，则能将这些实体识别出来。若某些特征向量集合在实体仓库中没有对应的实体存储，则无法将其识别为实体。
进一步的,电子设备还可以检测实体仓库单元与实体序列,将出现频率超出预设第一频率阈值的异常特征向量集合确定为新的实体,添加到实体仓库单元中。
如图61所示，为本申请实施例中实体扩展的一个信息流示意图。电子设备中还可以包含异常检测模块1101，该异常检测模块1101可以通过对实体仓库单元6033和实体序列的检测，将经常出现的异常特征向量集合确定为新的实体存储到实体仓库单元6033中，从而对实体仓库单元6033中存储的实体进行扩展。
具体的，实体提取单元6031可以从多模态输入模块601获取的数据中提取出特征向量，可以将其中不能识别为实体的特征向量集合也存储到实体仓库单元6033中。若某些还不能识别为实体的特征向量集合与其他可识别为实体的特征向量集合的区分度超出预设区分阈值，则异常检测模块1101可以认为这样的特征向量集合为异常特征向量集合。若异常检测模块1101检测到某异常特征向量集合在短期内反复出现，例如出现频率超出预设频率阈值，则将其判定为一个新的、以前从未出现过的实体，将其补充进实体仓库单元中。将异常特征向量集合补充进实体仓库时，可以为其分配一个实体编号。
例如，若以前实体仓库单元6033中只存储有帽子、女孩、牛仔裤这三个实体，这三个实体在实体仓库单元6033中的存储形式为：【1234，帽子，特征向量集合1】、【1235，女孩，特征向量集合2】、【1236，牛仔裤，特征向量集合3】，因此在实体识别时，仅能识别出这三个实体。但在某个时间段内，一个新的特征向量集合4在实体识别时反复出现，超出了预设第一频率阈值（1次/天）。初次出现时，由于该新的特征向量集合4无法识别为实体，且其与已有实体对应的特征向量集合1、2、3的区分度超出了预设区分阈值，因此电子设备将其判定为异常特征向量集合。当其反复出现，出现频率超出预设第一频率阈值时，电子设备将该特征向量集合4确定为新的实体，为其分配一个实体编号，保存【1237，特征向量集合4】到实体仓库单元6033中。虽然此时电子设备不知道这个新的实体的实体名为鞋子，但是经过实体仓库单元6033的自动扩展，其在实体识别时已经能识别出来该实体，并用于后续的意图预测。
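上述实体扩展过程可以用如下草图示意（Python示意，其中区分度度量函数与各阈值均为假设）：

```python
from collections import Counter

class EntityExpander:
    """异常检测模块1101进行实体扩展的示意逻辑。"""
    def __init__(self, distinct_threshold=0.8, freq_threshold=3):
        self.distinct_threshold = distinct_threshold  # 预设区分阈值
        self.freq_threshold = freq_threshold          # 预设第一频率阈值
        self.counts = Counter()

    def observe(self, feature_set, warehouse, distance):
        """feature_set: 无法识别为实体的特征向量集合（可哈希表示）；
        warehouse: {实体编号: 特征向量集合}；distance: 区分度度量函数。"""
        # 与所有已有实体的区分度均超出预设区分阈值 -> 判定为异常特征向量集合
        if all(distance(feature_set, fs) > self.distinct_threshold
               for fs in warehouse.values()):
            self.counts[feature_set] += 1
            # 短期内反复出现、出现频率超出预设第一频率阈值 -> 确定为新实体并入库
            if self.counts[feature_set] >= self.freq_threshold:
                new_id = max(warehouse, default=0) + 1  # 为其分配一个实体编号
                warehouse[new_id] = feature_set
```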
上面实施例中，意图仓库单元6053中存储有预设好的常见意图，且这些意图建立了与实体序列的对应关系。但随着用户的使用，可能需要有体现用户新需求的新意图。
进一步的,电子设备还可以将检测到的出现频率超出预设第二频率阈值的异常动作确定为新的意图,添加到意图仓库单元中。
如图62所示，为本申请实施例中意图扩展的一个信息流示意图。电子设备中的异常检测模块1101可以实时检测实体仓库单元6033、意图仓库单元6053、异常检测模块1101中的动作序列库、当前产生的实体序列、意图与动作序列，如发现用户的某个动作此前未出现过，且不同于已有意图对应的动作序列中的动作，则判定其为一个异常动作，将其存入缓存。如果该异常动作在短期内反复出现，例如出现频率超出预设第二频率阈值，则将其判定为一个新的以前未出现过的意图，将其补充到现有的意图仓库单元，从而对意图仓库单元中的现有意图进行扩展，并根据检测到这个异常动作之前的实体序列，更新意图识别模型，建立实体序列与该新意图的对应关系。
例如，若由于用户长期晚上加班，此前有一个实体序列与意图的对应关系：实体序列：【工作日】【晚上11点】【公司】到意图：滴滴企业版打车回家（公司付费）。但由于用户这段时间经常不加班了，电子设备检测出来的实体序列成为【工作日】【晚上6点】【公司】，并在检测到这个实体序列后，检测到用户经常会打开普通滴滴打车（自费）。则电子设备会将普通滴滴打车（自费）作为一个新的意图存储到意图仓库中，并建立与实体序列【工作日】【晚上6点】【公司】的对应关系。
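相应的意图扩展逻辑可以示意如下（Python示意，阈值与数据结构均为假设）：

```python
from collections import Counter

class IntentExpander:
    """异常检测模块1101进行意图扩展的示意逻辑。"""
    def __init__(self, freq_threshold=3):
        self.freq_threshold = freq_threshold  # 预设第二频率阈值
        self.anomaly_actions = Counter()

    def observe(self, action, preceding_entity_seq, known_actions, intent_warehouse):
        """action: 当前检测到的用户动作；known_actions: 已有意图对应的动作集合。"""
        if action in known_actions:
            return  # 不是异常动作
        self.anomaly_actions[action] += 1  # 异常动作存入缓存并计数
        if self.anomaly_actions[action] >= self.freq_threshold:
            # 判定为新意图，补充进意图仓库，并建立与此前实体序列的对应关系
            intent_warehouse[tuple(preceding_entity_seq)] = action
```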
请参阅图63，为本申请实施例中的电子设备1200的另一实施例，其包括：
输入装置1201、输出装置1202、处理器1203和存储器1204（其中电子设备1200中的处理器1203的数量可以为一个或多个，图63中以一个处理器1203为例）。在本申请的一些实施例中，输入装置1201、输出装置1202、处理器1203和存储器1204可通过总线或其它方式连接，其中，图63中以通过总线连接为例。
其中，处理器1203通过调用存储器1204存储的操作指令，用于执行上述实施例中的意图识别方法。在一个例子中，处理器1203可以为图13中的处理器110。
需要说明的是，本方案中对图15中所示的意图识别决策系统501中的一个或多个模块的改进均可以达到提升意图识别的准确性的目的。例如，对意图识别模块605中意图识别模型的改进可以提升意图识别的准确性；对决策推理模块607中动作预测模型的改进可以提升确定出的待执行动作的准确性，从而可以基于用户的反馈准确地更新意图识别模块605中的意图识别模型，进而提升意图识别模块605中的意图识别模型的意图识别的准确性；对动作反馈模块608中多示例学习模型的改进可以准确地确定出打点数据的子序列，从而提升意图识别的准确性，进而可以根据动作反馈模块608的反馈信息更新意图识别模块605中的意图识别模型，进而提升意图识别模块605中的意图识别模型的意图识别的准确性。
可以理解的是，对图15中所示的意图识别决策系统501中的任意多个模块的组合改进也可以达到提升意图识别的准确性的目的。例如，同时对决策推理模块607和动作反馈模块608进行改进，则可以提升两者确定的结果的准确性，而在两者确定出的结果的准确性均提高的情况下，意图识别模块605接收到的反馈数据的质量也将提高，从而可以精准地更新意图识别模块605中的意图识别模型，进而提升意图识别模块605中的意图识别模型的意图识别的准确性。
以上所述实施例仅用以说明本申请的技术方案，而非对其限制；尽管参照前述实施例对本申请进行了详细的说明，本领域的普通技术人员应当理解：其依然可以对前述各实施例所记载的技术方案进行修改，或者对其中部分技术特征进行等同替换；而这些修改或者替换，并不使相应技术方案的本质脱离本申请各实施例技术方案的范围。
上述实施例中所用,根据上下文,术语“当…时”可以被解释为意思是“如果…”或“在…后”或“响应于确定…”或“响应于检测到…”。类似地,根据上下文,短语“在确定…时”或“如果检测到(所陈述的条件或事件)”可以被解释为意思是“如果确定…”或“响应于确定…”或“在检测到(所陈述的条件或事件)时”或“响应于检测到(所陈述的条件或事件)”。
在上述实施例中，可以全部或部分地通过软件、硬件、固件或者其任意组合来实现。当使用软件实现时，可以全部或部分地以计算机程序产品的形式实现。所述计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行所述计算机程序指令时，全部或部分地产生按照本申请实施例所述的流程或功能。所述计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。所述计算机指令可以存储在计算机可读存储介质中，或者从一个计算机可读存储介质向另一个计算机可读存储介质传输，例如，所述计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线（例如同轴电缆、光纤、数字用户线）或无线（例如红外、无线、微波等）方式向另一个网站站点、计算机、服务器或数据中心进行传输。所述计算机可读存储介质可以是计算机能够存取的任何可用介质或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质（例如软盘、硬盘、磁带）、光介质（例如DVD）、或者半导体介质（例如固态硬盘）等。
本领域普通技术人员可以理解实现上述实施例方法中的全部或部分流程,该流程可以由计算机程序来指令相关的硬件完成,该程序可存储于计算机可读取存储介质中,该程序在执行时,可包括如上述各方法实施例的流程。而前述的存储介质包括:ROM或随机存储记忆体RAM、磁碟或者光盘等各种可存储程序代码的介质。

Claims (24)

  1. 一种意图识别方法,其特征在于,所述方法包括:
    第一电子设备确定第一触发;
    响应于所述第一触发,所述第一电子设备在第一时间段内获取第一数据序列,所述第一数据序列包括多个数据,所述多个数据中至少两个数据的输入方式不同;
    所述第一电子设备根据所述第一数据序列,确定用户的第一意图;
    所述第一电子设备根据所述第一意图,确定第一待执行动作。
  2. 根据权利要求1所述的方法，其特征在于，所述第一电子设备根据所述第一数据序列，确定用户的第一意图，包括：
    所述第一电子设备根据所述第一数据序列,确定第一实体序列,所述第一实体序列包括至少一个实体,所述实体为现实世界中客观存在的并可以相互区分的对象、事物或动作;
    所述第一电子设备根据所述第一实体序列,确定所述第一意图,其中,所述第一意图用于确定动作序列。
  3. 根据权利要求2所述的方法，其特征在于，所述第一电子设备根据所述第一意图，确定第一待执行动作，包括：
    所述第一电子设备根据所述第一实体序列和所述第一意图,确定第一动作序列,所述第一动作序列包括所述第一待执行动作;
    在所述第一电子设备确定第一待执行动作之后,还包括:
    所述第一电子设备执行所述第一待执行动作。
  4. 根据权利要求3所述的方法,其特征在于,所述第一待执行动作中包含设备标识与待执行动作,所述第一电子设备执行所述第一待执行动作,具体包括:
    所述第一电子设备确定所述第一待执行动作中的设备标识是否为所述第一电子设备的设备标识;
    当确定所述第一待执行动作中的设备标识为所述第一电子设备的设备标识时,所述第一电子设备执行所述第一待执行动作;
    否则,所述第一电子设备发送第一指令给所述第一待执行动作中设备标识对应的第二电子设备,所述第一指令用于指示所述第二电子设备执行所述第一待执行动作。
  5. 根据权利要求2-4任一所述的方法,其特征在于,所述方法还包括:
    所述第一电子设备将出现频率超出预设第一频率阈值的异常特征向量集合确定为新的实体,其中,所述异常特征向量集合为在实体识别时,与可识别为实体的特征向量集合的区分度超出预设区分阈值的无法识别为实体的特征向量集合。
  6. 根据权利要求2-5任一所述的方法,其特征在于,所述方法还包括:
    所述第一电子设备将出现频率超出预设第二频率阈值的异常动作确定为新的意图,其中,所述异常动作为未出现过的且不在已有意图对应的动作序列中的动作;
    所述第一电子设备根据所述异常动作出现前识别到的实体序列,建立所述新的意图与实体序列之间的对应关系。
  7. 根据权利要求2-6任一所述的方法,其特征在于,所述第一电子设备根据所述第一数据序列,确定第一实体序列,具体包括:
    所述第一电子设备从所述第一数据序列中提取特征向量，得到第一特征向量集合，所述第一特征向量集合中包括所有从所述第一数据序列中提取得到的特征向量，所述特征向量用于表示所述第一数据序列的特征；
    所述第一电子设备将所述第一特征向量集合输入实体识别模型,得到所述第一实体序列,所述实体识别模型为根据所述第一电子设备中存储的实体数据训练得到的特征向量与实体的对应关系,所述实体数据为所述实体的存储形式,所述实体数据至少包括实体的编号及表示该实体的特征向量集合。
  8. 根据权利要求3-6任一所述的方法,其特征在于,所述第一电子设备根据所述第一实体序列,确定第一意图,具体包括:
    所述第一电子设备根据所述第一实体序列和存储的知识图谱,确定多个候选意图;
    所述第一电子设备采用预设的强化学习算法,从所述多个候选意图中确定所述第一意图。
  9. 根据权利要求8所述的方法，其特征在于，所述第一电子设备根据所述第一实体序列和存储的知识图谱，确定多个候选意图，具体包括：
    根据所述第一实体序列和所述知识图谱,确定所述用户的状态信息和场景信息;所述状态信息用于表示所述用户的当前状态,所述场景信息用于表示所述用户当前所处的环境;
    根据状态信息、场景信息和候选意图的对应关系,确定所述状态信息和所述场景信息对应的所述多个候选意图。
  10. 根据权利要求9所述的方法,其特征在于,所述采用预设的强化学习算法,从所述多个候选意图中确定所述第一意图,包括:
    确定与所述多个候选意图一一对应的意图摇臂;
    根据所述第一实体序列、所述状态信息、所述场景信息、与所述多个候选意图一一对应的意图摇臂,以及所述强化学习算法,从所述多个候选意图中确定所述第一意图。
  11. 根据权利要求3-6任一所述的方法,其特征在于,所述第一电子设备根据所述第一实体序列,确定第一意图,具体包括:
    所述第一电子设备将所述第一实体序列输入意图识别模型,得到所述第一意图,所述意图识别模型为根据对应的实体序列与意图的数据训练得到的实体序列与意图的对应关系。
  12. 根据权利要求11所述的方法,其特征在于,所述第一电子设备将所述第一实体序列输入意图识别模型之前,还包括:
    所述第一电子设备将测试数据输入至第一生成器,经过所述第一生成器处理后得到第一模拟数据;
    所述第一电子设备将所述测试数据和所述第一模拟数据输入至第一判别器,经过所述第一判别器处理后得到第一判别结果,所述第一判别结果用于指示所述测试数据和所述第一模拟数据之间的差异;
    所述第一电子设备根据所述第一判别结果更新所述第一生成器的权重系数,得到第二生成器;
    所述第一电子设备在所述第二生成器中生成第二模拟数据;
    所述第一电子设备将第一目标模拟数据输入预设的训练网络,训练得到所述意图识别模型,所述第一目标模拟数据包括所述第二模拟数据。
  13. 根据权利要求11所述的方法,其特征在于,所述第一电子设备中配置有群体粗粒度模型和细粒度模型;
    所述第一电子设备将所述第一实体序列输入意图识别模型之前,还包括:
    所述第一电子设备获取细粒度标签与粗粒度标签的映射关系;
    所述第一电子设备根据所述映射关系将训练数据集中的细粒度数据映射为粗粒度数据;
    所述第一电子设备将所述粗粒度数据输入到所述群体粗粒度模型进行训练,通过多个节点设备的联合学习对所述群体粗粒度模型进行更新,并将所述细粒度数据输入到所述细粒度模型进行训练,其中,所述多个节点设备中包括所述第一电子设备;
    所述第一电子设备组合所述群体粗粒度模型和所述细粒度模型得到所述意图识别模型,所述意图识别模型的标记空间映射为细粒度标签,所述意图识别模型的输出结果用于更新所述细粒度模型。
  14. 根据权利要求13所述的方法,其特征在于,所述第一电子设备中还配置有个体粗粒度模型,所述个体粗粒度模型的标记空间映射为粗粒度标签;
    所述第一电子设备组合所述群体粗粒度模型和所述细粒度模型得到所述意图识别模型,包括:
    所述第一电子设备组合所述群体粗粒度模型、个体粗粒度模型和所述细粒度模型以得到所述意图识别模型。
  15. 根据权利要求11-14任一所述的方法,其特征在于,所述第一电子设备执行所述第一待执行动作之后,还包括:
    所述第一电子设备确定待识别的打点数据序列,所述待识别的打点数据序列由打点数据组成,所述打点数据包括所述第一电子设备记录的用户的操作数据和/或所述第一电子设备对用户操作的响应数据;
    所述第一电子设备将所述待识别的打点数据序列输入多示例学习模型,得到多个子序列;所述多示例学习模型为已采用所述第一电子设备中的打点数据序列训练过的多示例学习模型;
    所述第一电子设备按照预设意图规则确定第一子序列的意图,所述第一子序列为所述多个子序列中的一个子序列,所述预设意图规则用于根据序列中的打点数据确定序列的意图;
    所述第一电子设备基于确定出的多个子序列的意图,更新所述意图识别模型。
  16. 根据权利要求3-6任一所述的方法,其特征在于,所述第一电子设备根据所述第一实体序列和所述第一意图,确定第一动作序列,具体包括:
    所述第一电子设备将所述第一实体序列和所述第一意图输入动作预测模型,得到所述第一动作序列,所述动作预测模型为根据对应的实体序列、意图与动作序列的数据训练得到的实体序列、意图与动作序列的对应关系。
  17. 根据权利要求3-6任一所述的方法,其特征在于,所述第一电子设备根据所述第一实体序列和所述第一意图,确定第一动作序列,具体包括:
    所述第一电子设备将所述第一实体序列和所述第一意图输入规则引擎,得到所述第一动作序列,所述规则引擎中包含根据用户使用习惯或使用场景设定的实体序列、意图与动作序列的对应关系。
  18. 根据权利要求17所述的方法,其特征在于,所述规则引擎包括:第一节点,所述第一节点至少包括第一类型节点和第二类型节点;
    所述第一类型节点,用于根据输入所述规则引擎中的第一实体的第一属性,从内存中获取第一语义对象对所述第一实体进行匹配,得到第一匹配结果,所述第一属性用于表征所述第一实体的变化频率;
    所述第二类型节点,用于根据输入所述规则引擎中的第二实体的第二属性,从文件中获取第二语义对象对所述第二实体进行匹配,得到第二匹配结果,所述第二属性用于表征所述第二实体的变化频率,所述第二属性不同于所述第一属性;
    其中,所述第一匹配结果和所述第二匹配结果共同用于确定是否执行所述第一待执行动作。
  19. 根据权利要求1-18任一所述的方法,其特征在于,所述第一时间段与所述第一触发具有对应关系。
  20. 根据权利要求1-19任一所述的方法,其特征在于,所述第一数据序列由所述第一电子设备从触控操作的输入、传感数据的输入、文本数据的输入、语音数据的输入、视频数据的输入以及与所述第一电子设备互联的智能设备的传输数据的输入中至少两种输入方式得到;
    所述第一待执行动作包括启动目标应用程序、启动目标服务、后台加载目标应用程序、无线连接目标设备、发送通知消息中一种动作或服务。
  21. 一种电子设备,其特征在于,包括:
    至少一个存储器,用于存储程序;
    至少一个处理器,用于执行所述存储器存储的程序,当所述存储器存储的程序被执行时,所述处理器用于执行如权利要求1-20任一所述的方法。
  22. 一种计算机存储介质,所述计算机存储介质中存储有指令,当所述指令在计算机上运行时,使得计算机执行如权利要求1-20任一所述的方法。
  23. 一种包含指令的计算机程序产品,当所述指令在计算机上运行时,使得所述计算机执行如权利要求1-20任一所述的方法。
  24. 一种规则引擎的执行装置,其特征在于,所述装置运行计算机程序指令,以执行如权利要求1-20任一所述的方法。