WO2024078210A1 - Memo reminding method and apparatus, and terminal and storage medium - Google Patents

Memo reminding method and apparatus, and terminal and storage medium

Info

Publication number
WO2024078210A1
Authority
WO
WIPO (PCT)
Prior art keywords
memo
content
information
reminder
key information
Prior art date
Application number
PCT/CN2023/117394
Other languages
French (fr)
Chinese (zh)
Inventor
曾理
王立中
米岚
Original Assignee
Oppo广东移动通信有限公司
Priority date
Filing date
Publication date
Application filed by Oppo广东移动通信有限公司
Publication of WO2024078210A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q10/109 Time management, e.g. calendars, reminders, meetings or time accounting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34 Browsing; Visualisation therefor
    • G06F16/345 Summarisation for human users
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/41 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition

Definitions

  • the embodiments of the present application relate to the field of human-computer interaction technology, and more particularly to a memo reminder method, device, terminal and storage medium.
  • the terminal records the text or image input by the user as a memo, and the memo content is limited to a single modality, which limits the flexibility of memo recording and reduces the efficiency of human-computer interaction.
  • the embodiment of the present application provides a memo reminder method, device, terminal and storage medium.
  • the technical solution is as follows:
  • an embodiment of the present application provides a memo reminder method, which is executed by a terminal and includes:
  • a memo reminder is performed based on the memo content.
  • an embodiment of the present application provides a memo reminder device, the device comprising:
  • the acquisition module is used to obtain the memo content when there is a need for memo recording
  • An information extraction module configured to extract information from the memo content based on the content mode corresponding to the memo content to obtain key information, wherein different information extraction methods are used in different content modes;
  • a storage module used for associating and storing the memo content and the key information
  • the memo reminder module is used to make a memo reminder based on the memo content when it is determined based on the key information that a memo reminder triggering condition is met.
  • an embodiment of the present application provides a terminal, which includes a processor and a memory; the memory stores at least one program, and the at least one program is used to be executed by the processor to implement the memo reminder method as described in the above aspect.
  • an embodiment of the present application provides a computer-readable storage medium, wherein the storage medium stores at least one program, and the at least one program is used to be executed by a processor to implement the memo reminder method as described in the above aspects.
  • an embodiment of the present application provides a computer program product, which includes computer instructions, and the computer instructions are stored in a computer-readable storage medium.
  • a processor of a computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device executes the memo reminder method provided in the above aspect.
  • FIG. 1 shows a block diagram of a terminal provided by an exemplary embodiment of the present application.
  • FIG. 2 shows a flow chart of a memo reminder method provided by an exemplary embodiment of the present application.
  • FIG. 3 is a schematic diagram showing a memo reminder method according to an exemplary embodiment of the present application.
  • FIG. 4 shows a schematic diagram of actively acquiring memo content provided by an exemplary embodiment of the present application.
  • FIG. 5 is a schematic diagram showing a method of obtaining an active trigger memo provided by an exemplary embodiment of the present application.
  • FIG. 6 shows a schematic diagram of obtaining a passive trigger memo provided by an exemplary embodiment of the present application.
  • FIG. 7 shows exemplary visual-modality memo content of the present application.
  • FIG. 8 is a schematic diagram showing an active trigger reminder provided by an exemplary embodiment of the present application.
  • FIG. 9 shows a flowchart of a passive trigger reminder provided by an exemplary embodiment of the present application.
  • FIG. 10 is a schematic diagram showing a memo combining spatiotemporal information provided by an exemplary embodiment of the present application.
  • FIG. 11 is a schematic diagram showing a method of acquiring extended content provided by an exemplary embodiment of the present application.
  • FIG. 12 shows a structural block diagram of a memo reminder device provided in one embodiment of the present application.
  • A and/or B can mean: A exists alone, A and B exist at the same time, or B exists alone.
  • the character "/" generally indicates that the related objects are in an "or" relationship.
  • FIG. 1 shows a block diagram of a terminal provided by an exemplary embodiment of the present application.
  • the terminal 100 may include one or more of the following components: a processor 110 and a memory 120 .
  • the processor 110 may include one or more processing cores.
  • the processor 110 uses various interfaces and lines to connect the parts of the terminal 100, and executes the functions of the terminal 100 and processes data by running or executing instructions, programs, code sets or instruction sets stored in the memory 120 and by calling data stored in the memory 120.
  • the processor 110 can be implemented in at least one hardware form of digital signal processing (DSP), field programmable gate array (FPGA), and programmable logic array (PLA).
  • the processor 110 can integrate one or a combination of a central processing unit (CPU), a graphics processing unit (GPU), a neural network processor (NPU), and a modem.
  • the CPU mainly processes the operating system, user interface, and application programs; the GPU is responsible for rendering and drawing the content to be displayed on the touch display; the NPU is used to implement artificial intelligence (AI) functions; and the modem is used to process wireless communications. It is understandable that the above-mentioned modem may not be integrated into the processor 110, but may be implemented by a separate chip.
  • the memory 120 may include a random access memory (RAM) or a read-only memory (ROM).
  • the memory 120 includes a non-transitory computer-readable storage medium.
  • the memory 120 may be used to store instructions, programs, codes, code sets, or instruction sets.
  • the memory 120 may include a program storage area and a data storage area, wherein the program storage area may store instructions for implementing an operating system, instructions for at least one function (such as a touch function, a sound playback function, an image playback function, etc.), instructions for implementing the following method embodiments, etc.; the data storage area may store data created through use of the terminal 100 (such as audio data and a phone book), etc.
  • the terminal may further include a display screen 130 and a microphone 140.
  • the display screen 130 is a component for displaying images.
  • the display screen 130 may be a built-in screen of the terminal, such as the screen of a smart phone, or an external screen of the terminal, such as the external display of a personal computer.
  • the display screen 130 has a touch function in addition to the image display function, that is, the display content can be controlled by touching and clicking the display screen 130.
  • the microphone 140 is a component for collecting external sounds.
  • the terminal 100 supports the user to perform human-computer interaction in the memo reminder scenario through voice commands, and the microphone 140 can be used to collect user voice audio information for memo reminder.
  • the structure of the terminal 100 shown in the above figure does not limit the terminal, and the terminal may include more or fewer components than shown, combine certain components, or arrange the components differently.
  • the terminal 100 may also include a camera component, a speaker, a radio frequency circuit, an input unit, sensors (such as acceleration sensors, angular velocity sensors, and light sensors), audio circuits, Wi-Fi modules, power supplies, Bluetooth modules and other components, which are not described here.
  • FIG. 2 shows a flow chart of a memo reminder method provided by an exemplary embodiment of the present application.
  • the method may include the following steps.
  • Step 201 when there is a need to record a memo, obtain the memo content.
  • the terminal obtains the corresponding memo content, wherein the memo content can be one of text, image, video, audio or a combination of multiple thereof, that is, the memo content can be multimodal.
  • the memo content will be described below in conjunction with some application scenarios.
  • in the scenario of web browsing, the memo content can be digital content such as online products, news reports, e-books, and URLs (Uniform Resource Locators); in the scenario of convenient travel, the memo content can be map navigation data and screenshots indicating the route, and can also include the real physical coordinates of attractions, restaurants, etc., as well as official websites, introduction pictures and audio, and guide text; in the scenario of convenient life, the memo content can be schedule information, itinerary information, note text, etc.
  • the need for memo recording may be that the terminal receives a voice instruction for memo recording, and accordingly, the terminal obtains the memo content indicated by the voice instruction for memo recording.
  • the terminal performs ASR (Automatic Speech Recognition) processing on the voice audio information contained in the received voice instruction for memo recording, determines the memo content indicated by the voice instruction for memo recording, and further performs corresponding acquisition operations based on the memo content modality. For example, in the case where the memo content is voice input text, the terminal extracts the corresponding text from the text obtained by ASR processing as the memo content.
  • the terminal obtains the corresponding memo content by taking a screenshot, shooting or saving a picture, and when the user instructs to take a memo for the current web page, the terminal obtains the web page link corresponding to the web page as the memo content.
  • in response to the user's wake-up voice, the terminal obtains the user's memo recording voice command "Remind me to catch this flight", and determines through ASR and other processing that the memo content indicated by the voice command is flight information and itinerary information. The terminal then obtains the currently displayed flight information picture through a screenshot tool, and obtains the flight information description text as the itinerary text.
  • the above flight information picture and itinerary text are both memo contents.
  • the need for memo recording may also be that the terminal obtains user behavior information and the user behavior information meets the memo conditions, and accordingly, the terminal determines the memo content based on the user behavior information.
  • in work and life, the terminal actively understands user behavior based on the perceived user voice, operation behavior, action behavior, scenario information, etc., and actively obtains the memo content based on its judgment of the user's needs.
  • the memo can be completed without the user waking up the device, avoiding the user forgetting to perform the memo operation and missing information, and bringing a natural and seamless interactive experience to the user.
  • in a scenario where the terminal perceives, based on the motion sensor and positioning component, that the user is jogging outdoors, the terminal actively understands the user's action behavior and situational information, determines that the user needs to store the current jogging route, and then automatically obtains the route information and coordinate information as memo content.
  • when obtaining the memo content, if the terminal needs to use permission-restricted sensors such as cameras and recorders, it can turn on the sensor and obtain the memo content based on the user's memo recording voice command, or it can actively ask for the user's permission by displaying a reminder pop-up window, and then turn on the sensor and obtain the memo content after the user's positive feedback.
  • Step 202 based on the content mode corresponding to the memo content, extract information from the memo content to obtain key information, wherein the information extraction method is different under different content modes.
  • the key information may include core summaries such as the attributes and themes of the memo content, as well as memo temporal and spatial information such as timestamps and climate, and information such as the intent type and triggering method determined based on the analysis of the memo content.
  • the terminal extracts information of a single modality for memo contents of different modalities respectively, obtains sub-key information of the memo content of each modality, and then merges them to obtain key information.
  • modalities such as website text + screenshot, user command text + photo, song audio + title text, etc.
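The per-modality extraction and merging described for step 202 can be sketched as a dispatch table. This is a minimal illustration, not the application's implementation: the extractor functions, the key names (`text.tokens`, `visual.source`, `audio.source`), and the file name are hypothetical placeholders standing in for the NLP, image recognition, and audio recognition processing.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical per-modality extractors. Real ones would run NLP, image
# recognition, or audio recognition; these only return placeholder fields.
def extract_text(payload: str) -> Dict[str, object]:
    return {"text.tokens": payload.lower().split()}

def extract_visual(payload: str) -> Dict[str, object]:
    return {"visual.source": payload}

def extract_audio(payload: str) -> Dict[str, object]:
    return {"audio.source": payload}

# One extraction method per content modality.
EXTRACTORS: Dict[str, Callable[[str], Dict[str, object]]] = {
    "text": extract_text,
    "visual": extract_visual,
    "audio": extract_audio,
}

def extract_key_information(parts: List[Tuple[str, str]]) -> Dict[str, object]:
    """Run the extractor matching each part's modality, then merge the results."""
    merged: Dict[str, object] = {}
    for modality, payload in parts:
        extractor = EXTRACTORS.get(modality)
        if extractor is None:
            continue  # unknown modality contributes no sub-key information
        merged.update(extractor(payload))
    return merged

# A memo combining a user command text with a screenshot (hypothetical file name).
key_info = extract_key_information([
    ("text", "Remind me to catch this flight"),
    ("visual", "flight_screenshot.png"),
])
```

Prefixing each key with its modality keeps sub-key information from different modalities from overwriting each other when merged.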
  • Step 203 store the memo content and key information in association with each other.
  • the key information obtained in step 202 includes different data forms, such as text, entity relationship pairs, image region coordinates, pixel values, spectrograms, timestamps, temperature and humidity, altitude, longitude and latitude, etc. Key information in different data forms corresponds to different dimensions of the memo content.
  • the terminal constructs the obtained key information into a semi-structured key-value data structure, such as a dictionary or a hash map.
  • Table 1 illustrates the basic forms of storing key information through semi-structured data by enumeration.
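As a rough illustration of storing memo content and key information in association, the sketch below uses a Python dictionary as the semi-structured key-value structure. The field names (`entities`, `relations`, `intent_type`, `trigger_mode`), the memo id, and the in-memory store are assumptions made for this example; they are not taken from Table 1, and the intent and trigger values are assumed for illustration.

```python
import json
from typing import Any, Dict

# Hypothetical in-memory store: memo id -> associated record.
memo_store: Dict[str, Dict[str, Any]] = {}

def store_memo(memo_id: str, content: Dict[str, Any],
               key_information: Dict[str, Any]) -> None:
    """Keep the memo content and its key information under one record."""
    memo_store[memo_id] = {"content": content, "key_information": key_information}

store_memo(
    "memo-001",
    content={"text": "I put the key in the desk drawer"},
    key_information={
        # Field names and values below are illustrative assumptions.
        "entities": [
            {"type": "item", "value": "key"},
            {"type": "location", "value": "desk drawer"},
        ],
        "relations": [["key", "location", "desk drawer"]],
        "intent_type": "memo",
        "trigger_mode": "passive",
    },
)

# Semi-structured records serialize directly to JSON for persistence.
serialized = json.dumps(memo_store["memo-001"], ensure_ascii=False)
```

Because the record is schema-free, key information of other data forms (timestamps, coordinates, image regions) can be added as further keys without changing the store.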
  • Step 204 When it is determined based on the key information that the memo reminder triggering condition is met, a memo reminder is made based on the memo content.
  • the memo reminder method can be an active reminder, that is, actively making a memo reminder based on satisfying the memo reminder trigger condition, wherein the memo reminder method can be to display a reminder message in a message notification box, or to emit a reminder audio or to combine vibration, etc., which is not limited in this application.
  • the terminal can also make a memo reminder in a passive reminder manner, that is, the terminal uses the user's query input as the memo reminder trigger condition, and only makes a memo reminder based on the corresponding memo content after obtaining the user's query input.
  • the feedback form of the terminal's memo reminder can be the original file stored in the associated storage, such as the user's original voice, or the memo information obtained by the terminal processing the original file, such as text-to-speech or a visual notification displayed on the terminal.
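The trigger check in step 204 can be sketched as follows. This is a simplified illustration under stated assumptions: the key names `trigger_mode` and `active_conditions`, the `observed` dictionary of currently sensed conditions, and the rule that any non-empty query triggers a passive reminder are all hypothetical. A real passive reminder would also need to match the query against the stored memo.

```python
from typing import Optional

def should_remind(key_information: dict, observed: dict,
                  user_query: Optional[str] = None) -> bool:
    """Return True when the memo reminder trigger condition is met."""
    mode = key_information.get("trigger_mode")
    if mode == "active":
        # Active reminder: every stored condition must match what is observed.
        conditions = key_information.get("active_conditions", {})
        return bool(conditions) and all(
            observed.get(name) == value for name, value in conditions.items()
        )
    if mode == "passive":
        # Passive reminder: only a user query input can trigger it.
        return user_query is not None and user_query.strip() != ""
    return False

# Hypothetical key information records.
active_memo = {"trigger_mode": "active", "active_conditions": {"traffic": "congestion"}}
passive_memo = {"trigger_mode": "passive"}

fires_on_congestion = should_remind(active_memo, {"traffic": "congestion"})
fires_on_clear_road = should_remind(active_memo, {"traffic": "clear"})
fires_on_query = should_remind(passive_memo, {}, user_query="Where did I save that skirt?")
fires_without_query = should_remind(passive_memo, {})
```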
  • based on the memo recording voice command input by the user, the terminal obtains multi-modal and multi-dimensional information such as text, vision, hearing, situational information, and time and space information through a variety of sensors to form the memo content; being compatible with multi-modal data improves the richness of the memo content.
  • the embodiment of the present application determines the intention type of the memo content through information extraction, and then automatically determines the triggering method of the memo reminder, and performs a memo reminder when the memo reminder triggering conditions are met; the embodiment of the present application expands support for the input and output of multi-modal information, improves the way to help users remember, and thus improves the efficiency and quality of human-computer interaction.
  • information is extracted from the memo content to obtain key information, including:
  • the content modality corresponding to the memo content includes a text modality
  • natural language processing is performed on the memo content to obtain key information of the text
  • the content modality corresponding to the memo content includes a visual modality, performing image recognition processing on the memo content to obtain key image information;
  • audio recognition processing is performed on the memo content to obtain audio key information.
  • Performing natural language processing on the memo content to obtain key text information includes at least one of the following methods:
  • a causal inference analysis is performed on the memo content to obtain trigger mode information, wherein the trigger mode includes active triggering and passive triggering, and when the trigger mode indicated by the trigger mode information is active triggering, the trigger mode information includes active triggering conditions.
  • the key information includes text key information, and the text key information includes the trigger mode information;
  • a memo reminder is made based on the memo content, including:
  • an active memo reminder is performed based on the memo content
  • a passive memo reminder is performed based on the memo content.
  • image recognition processing is performed on the memo content to obtain key image information, including:
  • the memo content is a picture
  • optical character recognition is performed on the picture to obtain picture text
  • image natural language description processing is performed on the picture to obtain picture description text
  • the memo content is a video
  • video understanding is performed on the video to obtain a video description text
  • the method further includes:
  • Natural language processing is performed on at least one of the picture text, the picture description text, and the video description text to obtain text key information.
  • audio recognition and extraction are performed on the memo content to obtain audio key information, including:
  • the method further includes:
  • the method further includes:
  • the associated storage of the memo content and key information includes:
  • the memo content, key information and extended content are stored in an associated manner.
  • obtaining extended content corresponding to the memo content includes:
  • the content mode of the memo content is a text mode and the amount of information of the key information is less than an information amount threshold, obtaining at least one of auditory extended content or visual extended content;
  • a memo reminder is performed based on the memo content, including:
  • a memo reminder is performed based on the memo content and the extended content.
  • obtaining extended content corresponding to the memo content includes:
  • a memo reminder is made based on the memo content, including:
  • a memo reminder is made based on the memo content
  • the method further includes:
  • a memo reminder is performed based on the extended content.
  • the method further comprises:
  • the memo content and key information are stored in association, including:
  • the memo content, key information and time and space information are stored in association;
  • a memo reminder is made based on the memo content, including:
  • a memo reminder is made based on the memo content.
  • after extracting information from the memo content based on the content modality corresponding to the memo content and obtaining key information, the method further includes:
  • the memo content and key information vector are stored in association
  • a memo reminder is made based on the memo content, including:
  • vector encoding is performed on the reminder instruction to obtain an instruction vector, including:
  • the key information of the instruction is vectorized and encoded to obtain an instruction vector.
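The text here stores a key information vector with the memo content and encodes the reminder instruction as an instruction vector, but does not spell out how the two are compared. One plausible reading, offered only as an assumption, is nearest-neighbor matching by cosine similarity. The sketch below uses a bag-of-words `Counter` as a stand-in encoder; a real system would more likely use learned embeddings, and the memo ids are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Tuple

def encode(text: str) -> Counter:
    """Stand-in vector encoder: bag-of-words term counts."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def best_match(instruction: str, stored: Dict[str, Counter]) -> Tuple[str, float]:
    """Return the memo id whose key information vector is closest to the instruction."""
    query = encode(instruction)
    return max(((memo_id, cosine(query, vec)) for memo_id, vec in stored.items()),
               key=lambda pair: pair[1])

# Hypothetical stored key information vectors.
stored_vectors = {
    "memo-key": encode("key location desk drawer"),
    "memo-flight": encode("flight chengdu beijing march"),
}
memo_id, score = best_match("where is my key", stored_vectors)
```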
  • obtaining the memo content includes:
  • the memo content is determined based on the user behavior information.
  • after the memo reminder is made based on the memo content when it is determined based on the key information that the memo reminder triggering condition is met, the method further includes:
  • the memo content is deleted.
  • the information that users need to remember is not limited to text information, but also includes a large amount of visual information such as image information, video information, and auditory information such as music information and voice information. That is, when users use the terminal to help remember, the information content that needs to be stored is often multimodal.
  • the terminal supports the acquisition of multimodal memo content, and further, based on the memo content of different content modes, the terminal adopts a corresponding key information extraction method to determine the key information in the memo content for storage, so as to improve the human-computer interaction efficiency in the memo reminder scenario.
  • the method for extracting information from the memo content can be a combination of any one or more of the following:
  • natural language processing (NLP)
  • the terminal needs to understand the meaning of the memo content through NLU (Natural Language Understanding).
  • the terminal first performs Named Entity Recognition (NER) on the memo content to obtain entity information.
  • the predefined entity types may include time, location, name, object, and may also include currency, organization, etc. This application does not limit this.
  • the terminal performs entity recognition and annotation on the memo text based on the entity type.
  • for example, NER annotates a memo text as: meet [Name] Li Ming at [Location] Zhongshan Park at [Time] 3 o'clock tomorrow. It should be noted that the embodiments of this application do not limit how NER is implemented.
  • the terminal performs entity and relation extraction (ERE) on the memo content and obtains entity relationship information.
  • based on the NER results for the memo content, the terminal performs entity extraction and relationship extraction, reducing the memo content to its core entity relationships for text analysis. For example, for the memo content "I put the key in the desk drawer", NER yields: I put [item] key in [location] desk drawer. From this result, entity relationship extraction yields the entity relationship information: key [location] desk drawer.
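The step from NER output to entity relationship information can be illustrated with a toy rule. The sketch assumes the NER stage has already produced typed entities; the `Entity` type and the pairing rule (every item paired with every location) are hypothetical simplifications, since the application does not limit how NER or ERE is implemented.

```python
from typing import List, NamedTuple, Tuple

class Entity(NamedTuple):
    type: str  # predefined entity type, e.g. "item" or "location"
    text: str  # surface text of the entity

def extract_location_relations(entities: List[Entity]) -> List[Tuple[str, str, str]]:
    """Pair every item entity with every location entity as (item, 'location', place)."""
    items = [e.text for e in entities if e.type == "item"]
    places = [e.text for e in entities if e.type == "location"]
    return [(item, "location", place) for item in items for place in places]

# NER result for "I put the key in the desk drawer", written out by hand.
ner_result = [Entity("item", "key"), Entity("location", "desk drawer")]
relations = extract_location_relations(ner_result)
```

For the example sentence this reduces the memo to the single triple `("key", "location", "desk drawer")`, which is the form a later query such as "where is my key" can be matched against.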
  • the terminal can also perform text summarization on the memo content to obtain the subject information.
  • the subject information can be an extractive summary (extractive summarization) or a generative summary (abstractive summarization), which is not limited in this application.
  • the terminal On the basis of determining the text content of the memo, the terminal further performs text intent recognition on the memo content to obtain the intent type, which is used to characterize the intention of recording the memo.
  • for different types of memo content, users have different storage intentions. For example, when the memo content is flight information, the user expects to receive a terminal reminder at the corresponding time, and when the memo content is product shopping information, the user expects to query it in the future.
  • the terminal can determine the intent type of the memo content based on text classification (TC) technology. The intent type can include schedule, reminder, and memo. Classifying the memo content into different intent types provides a basis for the terminal to subsequently determine the memo reminder method.
  • the terminal can determine, based on the memo content, that the user expects to be prompted when the corresponding conditions are met, and therefore determines that the intent type is reminder; based on the memo recording voice command "Please remember this flight for me", the terminal obtains the memo content "The XXX flight from Chengdu to Beijing you purchased on March 1st 8:45-11:20 has been issued" and determines through text intent recognition that its intent type is schedule; based on the memo content "My power-on password is XXX", the terminal determines that its intent type is memo.
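To make the three intent types concrete, the sketch below classifies memo text with a keyword lookup. This is a deliberately crude stand-in for the text classification technology mentioned above, which would normally be a trained classifier; the keyword lists and the congestion example sentence are assumptions made for illustration.

```python
# Hypothetical keyword lists; a trained text classifier would replace them.
INTENT_KEYWORDS = {
    "schedule": ("flight", "meeting", "appointment"),
    "reminder": ("remind", "notify", "tell me when"),
}

def classify_intent(text: str) -> str:
    """Return 'schedule', 'reminder', or the fallback intent type 'memo'."""
    lowered = text.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return intent
    return "memo"

flight_intent = classify_intent(
    "The XXX flight from Chengdu to Beijing you purchased on March 1st "
    "8:45-11:20 has been issued"
)
password_intent = classify_intent("My power-on password is XXX")
# Hypothetical reminder-style memo text.
congestion_intent = classify_intent("Remind me if the road ahead is congested")
```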
  • the terminal obtains the trigger mode information by performing causal inference (CI) analysis on the memo content, that is, the key information includes the text key information, and the text key information includes the trigger mode information.
  • the trigger mode includes active triggering and passive triggering, and when the trigger mode indicated by the trigger mode information is active triggering, the trigger mode information includes active triggering conditions.
  • the trigger mode information includes at least the trigger mode, and the trigger mode corresponds to the intention type of the memo content.
  • the terminal determines that the intent type of the memo content is reminder, then determines that its trigger mode is active triggering, and extracts the traffic condition entity "congestion" from the memo content as the trigger mode information. As shown in FIG. 6, based on the memo recording voice instruction "Help me collect this skirt", the terminal obtains the product link corresponding to the product as the memo content; because the intent type of the memo content is memo, the terminal determines that its trigger mode is passive triggering, and performs a memo reminder based on the memo content only when the user enters a reminder instruction.
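The correspondence between intent type and trigger mode in these examples can be written as a small mapping. The reminder and memo cases follow the examples above; treating the schedule type as actively triggered is an assumption, and the function and key names are hypothetical.

```python
from typing import Dict, Optional

def build_trigger_mode_info(intent_type: str,
                            active_conditions: Optional[Dict[str, str]] = None) -> dict:
    """Derive trigger mode information from the intent type of the memo content."""
    # Assumption: schedule-type memos are triggered actively, like reminders.
    if intent_type in ("schedule", "reminder"):
        return {
            "trigger_mode": "active",
            # Active triggering carries the conditions that fire the reminder.
            "active_conditions": dict(active_conditions or {}),
        }
    # Memo-type content waits for the user's reminder instruction.
    return {"trigger_mode": "passive"}

congestion_info = build_trigger_mode_info("reminder", {"traffic": "congestion"})
skirt_info = build_trigger_mode_info("memo")
```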
  • the above-mentioned various NLP processing such as NER, ERE, CI, etc.
  • the processing results obtained based on the above-mentioned processing constitute key information.
  • the processing result of a certain NLP processing may be empty, which has no effect on the memo reminder process.
  • the terminal performs natural language processing on at least one of the image text, the image description text, and the video description text to obtain key text information.
  • the terminal performs natural language processing on the above text information in a manner that can be a combination of any one or more sub-methods in method 1.
  • the terminal performs optical character recognition (OCR) on the picture to obtain the picture text.
  • the terminal can convert the text symbols therein into text information through OCR technology, and can further extract text key information based on the obtained text information to obtain the key information contained in the image.
  • the terminal performs image natural language description processing (image captioning, IC) on the picture to obtain the picture description text.
  • the terminal converts the image into natural language describing the image content so as to determine the information contained in the image.
  • the terminal obtains the following text description through IC technology: a boy with a travel bag is traveling. Further, the terminal extracts the key information of the text as described in method 1 and determines that the theme of the picture is "travel".
  • the terminal can perform optical character recognition to obtain the text in the picture, and also perform natural language description processing to further enrich the key information corresponding to the picture. It should also be noted that for the picture, the terminal can use visual grounding (VG) technology to locate the description subject in the picture based on the picture and the picture description text, and obtain the location area information of the picture subject. For example, in the case where the memo content is the picture shown in Figure 7, based on the picture and the text description, the terminal determines the regional location information of the subject "boy" in the picture as part of the target result through VG technology.
  • the terminal performs video understanding on the video to obtain a video description text.
  • the technologies adopted for video understanding may include but are not limited to video scene recognition, video action understanding, and video event understanding.
  • Through video understanding, the terminal expresses the video content in the form of natural language text, that is, the information contained in the memo content is reflected through the video description text.
  • the terminal can process the video description text by extracting key information from the text to further clarify the video information, so as to facilitate the subsequent memo reminder to the user based on the memo content.
  • the terminal performs automatic speech recognition on the memo content to obtain an audio text. That is, when the user inputs the memo content in the form of a memo recording voice command, the terminal converts the voice information into a natural language text, that is, an audio text, through ASR, and further, the terminal performs natural language processing on the audio text to obtain text key information.
  • the natural language processing method can be any one or more combinations of the processing methods in method 1.
  • the terminal extracts audio features from the memo content to obtain an audio fingerprint.
  • the terminal can extract digital features from a segment of audio through audio fingerprinting technology and represent them through identifiers, thereby obtaining information contained in the audio memo content.
  • the terminal can also calculate the spectrogram of the audio file, that is, how the frequency content of the audio varies over time. Based on the spectrogram or audio fingerprint, the terminal lets users use audio (such as a hummed tune) as a reminder instruction to query the memo content.
  • the terminal can be triggered actively or passively for reminder.
  • the intent type corresponding to the memo content belongs to the schedule type or reminder type
  • the trigger mode information of the memo content indicates that the trigger mode is active triggering.
  • the terminal performs an active memo reminder based on the memo content.
  • in the related art, the terminal only actively reminds the user based on the time information and location information in the memo content.
  • the memo reminder trigger condition may include any one or more combinations of all the entity information in the memo content, that is, the terminal can use the time, location, climate, etc. in the entity information as active trigger conditions, or can use events, traffic conditions, etc. in the entity information as active trigger conditions, which increases the richness of the reminder scenarios and improves the human-computer interaction experience.
  • the terminal determines that the intent category of the memo content is a reminder and uses active triggering to make the memo reminder. The trigger mode information includes an active trigger condition [traffic conditions], namely "Changjiang Road is congested"; when the terminal detects that the traffic conditions meet the active trigger condition, it actively makes the memo reminder.
  • the terminal can make a memo reminder to an object other than the user who performs the memo operation. That is, compared with the prior art in which the terminal only makes a memo reminder to the user who uses the terminal, in the embodiment of the present application the user can add memo reminder object information to the memo content. For example, as shown in FIG. 8, the user inputs the memo content "Remind mom to take medicine at 9 o'clock in the evening" through the memo recording voice command. Through information extraction, the terminal can then determine that the memo reminder object indicated by the key information is the mother, and it actively reminds the mother when the active trigger condition "9 o'clock in the evening" is met.
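
The active-trigger check described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `Memo` class, the `due_active_memos` function, the exact-match comparison, and the memo text for the traffic example are all assumptions made for the sake of the example.

```python
from dataclasses import dataclass, field


@dataclass
class Memo:
    content: str
    trigger_mode: str  # "active" or "passive"
    # Active trigger conditions taken from entity information,
    # e.g. {"time": "21:00"} or {"traffic": "..."} (hypothetical layout).
    active_conditions: dict = field(default_factory=dict)
    remind_target: str = "user"  # the memo reminder object


def due_active_memos(memos, observed):
    """Return active memos whose every trigger condition matches the observed state."""
    due = []
    for memo in memos:
        if memo.trigger_mode != "active" or not memo.active_conditions:
            continue
        if all(observed.get(k) == v for k, v in memo.active_conditions.items()):
            due.append(memo)
    return due


memos = [
    Memo("Take medicine", "active", {"time": "21:00"}, remind_target="mom"),
    # The memo text below is hypothetical; the source only gives the condition.
    Memo("Leave early", "active", {"traffic": "Changjiang Road is congested"}),
    Memo("I put my keys in the desk drawer", "passive"),
]

for memo in due_active_memos(memos, {"time": "21:00", "traffic": "clear"}):
    print(f"Remind {memo.remind_target}: {memo.content}")
```

A real terminal would presumably compare conditions with tolerance (time windows, geofences) rather than by strict equality; equality just keeps the sketch short.
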
  • the intent type corresponding to the memo content belongs to the schedule category or the memo category, and the trigger mode information of the memo content indicates that the trigger mode is passive triggering.
  • the terminal performs a passive memo reminder based on the memo content.
  • the memo reminder triggering condition corresponding to the passive trigger can also be multimodal, that is, the user can query based on the text corresponding to the voice command, or based on multimodal or multimodal combination reminder instructions such as picture information, audio information, web page links, etc.
  • the terminal can enrich the freedom of user query operations by supporting multimodal queries, so that users can complete queries in a way that is easy to express, which fits the user's intuitive feelings, and in the case where the memo content mode is the same as the reminder instruction mode, the memo reminder efficiency can be improved and the human-computer interaction experience can be improved.
  • the user can retrieve the corresponding song audio in the memo content by humming a tune (audio information), or query the corresponding product information in the memo content with a clothing picture (image information).
  • the terminal starts in response to the user's wake-up command, then retrieves and returns results based on the reminder instruction input by the user.
  • Because the reminder instruction can be multimodal, the terminal extracts information from it in the same way it processes the memo content, obtains the instruction key information, and then constructs semi-structured data such as a dictionary to store that key information.
  • the information extraction and associated storage method are the same as those in the above embodiment, and will not be repeated here.
  • the terminal compares and matches the key information of the instruction with the key information corresponding to the memo content, and feeds back the key information with the highest relevance to the user as the query result to complete the memo reminder.
  • the relevance can be text similarity, image similarity, spectrogram similarity, etc. Because the matching is carried out between two key information dictionaries, each of which contains multiple key-value pairs, the number of similar values in the two dictionaries is also taken into account when determining similarity.
  • the terminal determines that the instruction key information is "key", matches it against the key information dictionaries corresponding to the memo content to find a dictionary whose attributes include "key", and returns the original data of that dictionary, "I put my keys in the desk drawer", to the user.
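
The dictionary matching described above can be sketched as follows, under stated assumptions: similarity is reduced to a count of identical values, and the record layout (`original`, `key_info`) and function names are hypothetical. A real system would use text, image, or spectrogram similarity instead of exact equality.

```python
def count_shared_values(instruction_info, memo_info):
    """Count instruction values that also appear among the memo's values."""
    memo_values = {str(v).lower() for v in memo_info.values()}
    return sum(1 for v in instruction_info.values() if str(v).lower() in memo_values)


def best_match(instruction_info, memo_store):
    """Return the original memo sharing the most values, or None if nothing matches."""
    best, best_score = None, 0
    for memo in memo_store:
        score = count_shared_values(instruction_info, memo["key_info"])
        if score > best_score:
            best, best_score = memo, score
    return best["original"] if best else None


memo_store = [
    {"original": "I put my keys in the desk drawer",
     "key_info": {"item": "key", "location": "desk drawer"}},
    {"original": "Remind mom to take medicine at 9 o'clock in the evening",
     "key_info": {"object": "mom", "event": "take medicine", "time": "21:00"}},
]

# Instruction key information extracted from "Where did I put my keys?"
print(best_match({"item": "key"}, memo_store))
```
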
  • the terminal when there is a need for memo recording, the terminal obtains spatiotemporal information, which is used to characterize the time and space state when the memo is recorded.
  • the spatiotemporal information may include the timestamp, current location, altitude, temperature and humidity, and climate information corresponding to the memo recording need.
  • Spatiotemporal information is important information for enriching the memo content, which can improve query support, provide query tags for subsequent users to query based on the memo content, and improve the efficiency of memo reminders.
  • the terminal associates and stores the memo content and key information with the spatiotemporal information. Accordingly, when spatiotemporal information is present in the memo content, the terminal determines that the memo reminder triggering condition is met based on the key information and spatiotemporal information, and performs a memo reminder based on the memo content. For example, as shown in FIG. 10, based on the memo content of "The Summer Resort is really spectacular", the terminal obtains the spatiotemporal information such as the timestamp, temperature and humidity, and altitude of the memo time while obtaining the scenic spot photos, location coordinates, videos, and text descriptions, and constructs a key information dictionary for associated storage through information extraction.
  • When the reminder instruction "Where did I go to play when it was the hottest last year" is obtained, the terminal extracts the instruction key information through information extraction: [time] last year, [temperature] hottest, [theme] travel. Having determined from the spatiotemporal information stored with the memo content that the memo reminder triggering condition is met, the terminal returns the text "The Summer Resort is really beautiful" together with memo content such as pictures and videos to the user. Acquiring spatiotemporal information as part of the memo content not only makes it easier for the user to query the memo content with a reminder instruction, but also improves the accuracy of the feedback result.
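
To illustrate how stored spatiotemporal information can answer such a query, the sketch below filters memos by year and theme and picks the one with the highest recorded temperature. The field names, the dates and temperatures, the second memo, and the resolution of "last year" to a concrete year are all hypothetical.

```python
from datetime import datetime

# Each record pairs the original memo with spatiotemporal tags (hypothetical values).
memos = [
    {"original": "The Summer Resort is really beautiful", "theme": "travel",
     "timestamp": datetime(2023, 7, 20, 14, 0), "temperature_c": 35.0},
    {"original": "Spring outing by the lake", "theme": "travel",
     "timestamp": datetime(2023, 4, 2, 10, 0), "temperature_c": 18.0},
    {"original": "I put my keys in the desk drawer", "theme": "item",
     "timestamp": datetime(2023, 7, 21, 9, 0), "temperature_c": 36.0},
]


def hottest_memo(memos, year, theme):
    """Return the memo of the given theme and year with the highest temperature."""
    candidates = [m for m in memos
                  if m["timestamp"].year == year and m["theme"] == theme]
    return max(candidates, key=lambda m: m["temperature_c"], default=None)


# [time] last year -> 2023 (assumed), [temperature] hottest, [theme] travel
result = hottest_memo(memos, year=2023, theme="travel")
print(result["original"] if result else "No matching memo")
```
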
  • the user When using the memo recording voice command to store the memo content, the user usually inputs the memo content in the form of daily communication that the user is accustomed to.
  • the input content is often limited to a single mode and incomplete, so the memo content may fail to meet the user's memo needs, which affects the subsequent memo reminder experience.
  • Because the terminal supports multimodal memo content, in addition to extracting information from the memo content input by the user, the terminal obtains the extended content corresponding to the memo content and enriches the memo content with it, so as to improve the user's human-computer interaction experience in the memo reminder scenario.
  • the terminal obtains at least one of the auditory extended content or the visual extended content.
  • the memo content input by the user only includes text mode
  • the terminal extracts information from the text memo content to obtain key information.
  • the amount of key information is less than the information threshold, that is, when the number of values in the key-value pair is small
  • the reminder instructions that the user can use to query the corresponding memo content are limited, and thus there is a situation where the user cannot obtain the required memo content based on the key information dictionary.
  • the terminal obtains the extended content corresponding to the memo content based on the recording scenario, wherein the extended content can be visual extended content such as a web page snapshot, a screenshot, or the pixels of a target area, or auditory extended content such as ambient sound and music, providing richer information for memo reminders.
  • when obtaining multimodal extended content, the terminal needs to use sensors that involve user privacy, such as cameras and recorders. Therefore, in the scenario of obtaining extended content, the terminal can be set to ask the user first and collect the corresponding information only after obtaining positive feedback. For example, the user says through a memo recording voice command "please help me write down the delicious stir-fried dishes of Xiao Ming's home cooking." Because the memo content carries little information, the terminal can prompt the user, by voice or on screen, to take a photo of the store sign, thereby obtaining visual extended content and enriching the memo information. Optionally, the terminal can also be set to automatically turn on the sensor in response to user instructions.
  • the extended content may also include location information, etc.
  • the user indicates in the memo recording voice command "Please write down for me how delicious the stir-fried dishes of Xiao Ming's home cooking are", and the terminal obtains the text modal memo content.
  • the terminal can correspondingly obtain the location coordinates of "Xiao Ming's Home Cooking" and the user's travel routes as extended content.
  • the terminal stores the memo content, key information and extended content in an associated manner. Since the extended content can be multimodal, the terminal can still construct semi-structured data in the form of key-value pairs to store the extended content in an associated manner. Table 2 lists the methods of storing the extended content in an associated manner.
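
A minimal sketch of the threshold check and the associated key-value storage follows. The threshold value, the function names, the record layout, and the file name and coordinates in the example are all assumptions for illustration; the source does not specify them.

```python
INFO_THRESHOLD = 3  # assumed value; the source does not give a number


def needs_extended_content(modalities, key_info, threshold=INFO_THRESHOLD):
    """True when the memo is text-only and carries too few key-information values."""
    return set(modalities) == {"text"} and len(key_info) < threshold


def build_record(content, key_info, extended=None):
    """Store memo content, key information and extended content in association."""
    return {"content": content, "key_info": dict(key_info),
            "extended": dict(extended or {})}


content = "Please write down for me how delicious the stir-fried dishes of Xiao Ming's home cooking are"
key_info = {"place": "Xiao Ming's Home Cooking", "dish": "stir-fried dishes"}

extended = {}
if needs_extended_content({"text"}, key_info):
    # Collected only after the user agrees (hypothetical file name and coordinates).
    extended = {"visual": "store_sign.jpg", "location": (31.23, 121.47)}

record = build_record(content, key_info, extended)
print(sorted(record["extended"]))
```
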
  • a memo reminder is performed based on the memo content and the extended content.
  • the memo reminder method is the same as the above embodiment and will not be described in detail here.
  • web links such as product links in shopping applications or public account content links have a certain timeliness.
  • the terminal can enrich the information related to the memo content by obtaining extended content to avoid the situation where the query is fruitless.
  • the terminal when the memo content is time-sensitive, the terminal obtains the extended content corresponding to the memo content.
  • the terminal first extracts information from the obtained memo content, and when it is determined that the content modality corresponding to the memo content is time-sensitive, for example, the memo content is a URL, an online product, etc., the terminal obtains the corresponding extended content based on the key information.
  • the terminal obtains the user's memo recording voice command "help me write down this skirt" in response to the user's wake-up voice, processes the command through NLP and other methods, and determines that the memo content indicated by the command is the product information that the user is browsing. The terminal then obtains the current product link, captures the current product image with a screenshot tool (or saves the product image from the page if the image can be saved), and obtains the current product introduction text "Spring and Autumn New Dress".
  • the above product link, product image, and description text are all the memo contents obtained by the terminal.
  • the terminal performs a memo reminder based on the memo content.
  • the memo reminder method is the same as the above embodiment and will not be described in detail here.
  • the terminal when it is determined based on the key information that the memo reminder triggering conditions are met and the memo content is invalid, the terminal performs a memo reminder based on the extended content.
  • the terminal uses the extended content or the processed extended content as feedback to perform a memo reminder. For example, when the user enters the reminder instruction "view the collected dresses" and the online product link stored as the memo content is no longer valid, the terminal, having stored the picture corresponding to the link as extended content, can return the picture to the user to complete the memo reminder.
  • the terminal can perform an online search based on the web page screenshots, snapshots, etc. in the extended content and return similar results from that search to the user to complete the memo reminder. This avoids the situation where the user searches for the memo content but obtains no results, and improves the human-computer interaction experience.
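
The fallback from invalid memo content to stored extended content can be sketched as below. The `link_is_valid` callable stands in for whatever validity check a terminal would actually perform, and the URL and file name are hypothetical placeholders.

```python
def build_feedback(record, link_is_valid):
    """Prefer the memo content; fall back to extended content if it is no longer valid."""
    content = record["content"]
    if link_is_valid(content):
        return {"source": "memo_content", "payload": content}
    extended = record.get("extended")
    if extended:
        return {"source": "extended_content", "payload": extended}
    return None  # nothing usable is stored


record = {
    "content": "https://shop.example/item/123",  # hypothetical product link
    "extended": {"screenshot": "dress.png",      # hypothetical file name
                 "description": "Spring and Autumn New Dress"},
}

# Simulate an expired product link.
feedback = build_feedback(record, link_is_valid=lambda url: False)
print(feedback["source"])
```
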
  • the terminal can achieve unified representation of memo content of different content modes through multimodal fusion, thereby improving the efficiency of memo reminders.
  • the terminal vectorizes the key information to obtain the key information vector.
  • the method of vectorization encoding can be the multimodality fusion technology (MFT) in deep learning.
  • the terminal uses deep neural networks (DNN) and multimodal pre-training models and other technologies to convert multimodal key information in semi-structured data into vectors in high-dimensional space, realize the unified representation of key information, and provide convenience for users to obtain the memo content through passive triggering, that is, through reminder instructions.
  • the terminal stores the memo content and the key information vector in association.
  • the terminal can store the key information vector in the form of a key-value pair to obtain a key information dictionary, where the keyword can be a feature vector and the value is the key information vector obtained by vectorization encoding.
  • the terminal when the terminal receives the reminder instruction, it vectorizes and encodes the reminder instruction to obtain an instruction vector.
  • the terminal when receiving a reminder instruction, extracts information from the reminder instruction based on the content mode corresponding to the reminder instruction to obtain key instruction information.
  • the way in which the terminal extracts information from the reminder instruction is the same as the above-mentioned way of extracting information from the memo content, which will not be repeated here.
  • the terminal When obtaining the key information of the instruction, the terminal vectorizes the key information of the instruction to obtain an instruction vector.
  • the way in which the terminal vectorizes the key information of the instruction is the same as the above-mentioned embodiment, which will not be repeated here.
  • the terminal can compare the instruction vector with the key information vector in the key information dictionary, that is, calculate the cosine distance between the instruction vector and the key information vector in the high-dimensional space.
  • the terminal determines that the memo reminder trigger condition is met, and then feeds back the corresponding memo content to the user.
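
The vector comparison can be sketched with plain Python as below. The sketch computes cosine similarity (one minus the cosine distance mentioned above) and treats the trigger condition as "similarity at or above a threshold"; the threshold value, the toy three-dimensional vectors, and the function names are assumptions, since real key information vectors would come from a multimodal encoder.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(instruction_vec, store, threshold=0.8):
    """Return the memo whose key information vector is closest, if close enough."""
    best, best_sim = None, -1.0
    for item in store:
        sim = cosine_similarity(instruction_vec, item["vector"])
        if sim > best_sim:
            best, best_sim = item, sim
    if best is None or best_sim < threshold:
        return None
    return best["content"]


store = [
    {"content": "I put my keys in the desk drawer", "vector": [0.9, 0.1, 0.0]},
    {"content": "Spring and Autumn New Dress", "vector": [0.0, 0.2, 0.9]},
]
print(retrieve([1.0, 0.0, 0.0], store))
```
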
  • the terminal deletes the memo content in response to the memo deletion instruction. For example, the user performs a passive trigger through the reminder instruction "Where did I put my keys?", and the terminal gives the user feedback "I put them in the desk drawer” based on the key information. If the user believes that the memo content can be deleted, the user can voice input "Delete this memo" to notify the terminal to delete it.
  • the terminal can also use the timeliness information in the key information to remind the user to delete expired information, thereby saving storage space. For example, for flight information stored as memo content, once the flight time has passed, the terminal reminds the user that the flight has expired and actively prompts the user to delete the memo content.
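
A minimal sketch of the expiry check follows, assuming each memo carries an optional `expires_at` field derived from its timeliness information. The field name, the dates, and the confirmation flow are hypothetical.

```python
from datetime import datetime


def expired_memos(memos, now):
    """Return memos whose timeliness information shows they have expired."""
    return [m for m in memos
            if m.get("expires_at") is not None and m["expires_at"] < now]


def delete_if_confirmed(memos, memo, user_agrees):
    """Delete a memo only after positive feedback from the user."""
    if user_agrees:
        return [m for m in memos if m is not memo]
    return memos


memos = [
    {"content": "Flight information", "expires_at": datetime(2023, 9, 1, 8, 30)},
    {"content": "I put my keys in the desk drawer", "expires_at": None},
]

for memo in expired_memos(memos, now=datetime(2023, 9, 2)):
    print(f"'{memo['content']}' has expired. Delete it?")
    memos = delete_if_confirmed(memos, memo, user_agrees=True)
```
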
  • By determining the intent type of the memo content, the terminal assigns different trigger mode information to the memo content.
  • For memo content whose trigger mode information indicates active triggering, the memo content often loses its storage value once the terminal has actively completed the memo reminder. Accordingly, the terminal can remind the user to delete the memo after completing the memo reminder, and delete the memo content in response to the user's positive feedback.
  • the information (including but not limited to user device information, user personal information, etc.), data (including but not limited to data used for analysis, stored data, displayed data, etc.) and signals involved in this application are all authorized by the user or fully authorized by all parties, and the collection, use and processing of relevant data must comply with the relevant laws, regulations and standards of the relevant countries and regions.
  • the voice, itinerary information, geographic location, etc. involved in this application are all obtained with full authorization.
  • FIG. 12 shows a structural block diagram of a memo reminder device provided by an exemplary embodiment of the present application, the device comprising:
  • the acquisition module 1201 is used to acquire the memo content when there is a memo recording requirement
  • An information extraction module 1202 is used to extract information from the memo content based on the content mode corresponding to the memo content to obtain key information, wherein different information extraction methods are used in different content modes;
  • the storage module 1203 is used to store the memo content and the key information in association
  • the memo reminder module 1204 is configured to make a memo reminder based on the memo content when it is determined based on the key information that a memo reminder triggering condition is met.
  • the information extraction module 1202 is further used to:
  • the content modality corresponding to the memo content includes a text modality, performing natural language processing on the memo content to obtain text key information;
  • the content modality corresponding to the memo content includes a visual modality, performing image recognition processing on the memo content to obtain image key information;
  • the content modality corresponding to the memo content includes an auditory modality
  • audio recognition processing is performed on the memo content to obtain audio key information.
  • the information extraction module 1202 is further used to:
  • perform causal inference analysis on the memo content to obtain trigger mode information, wherein the trigger mode includes active triggering and passive triggering, and when the trigger mode indicated by the trigger mode information is active triggering, the trigger mode information contains active triggering conditions.
  • the memo reminder module 1204 is further used to:
  • an active memo reminder is performed based on the memo content
  • a passive memo reminder is performed based on the memo content.
  • the information extraction module 1202 is further used to:
  • the memo content is a picture
  • the memo content is a video
  • Natural language processing is performed on at least one of the picture text, the picture description text, and the video description text to obtain the text key information.
  • the information extraction module 1202 is further used to:
  • the acquisition module 1201 is further used to:
  • the storage module 1203 is further used for:
  • the memo content, the key information and the extended content are stored in association.
  • the acquisition module 1201 is further used to:
  • the content mode of the memo content is a text mode and the information amount of the key information is less than an information amount threshold, obtaining at least one of auditory extended content or visual extended content;
  • the memo reminder module 1204 is also used for:
  • a memo reminder is performed based on the memo content and the extended content.
  • the acquisition module 1201 is further used to:
  • the memo reminder module 1204 is also used for:
  • a memo reminder is made based on the memo content
  • a memo reminder is performed based on the extended content.
  • the acquisition module 1201 is further used to:
  • when there is a need for memo recording, obtaining spatiotemporal information, wherein the spatiotemporal information is used to represent the time and space state when the memo is recorded;
  • the storage module 1203 is further used for:
  • the memo reminder module 1204 is also used for:
  • a memo reminder is performed based on the memo content.
  • the device further includes an encoding module, configured to perform vector encoding on the key information to obtain a key information vector;
  • the storage module 1203 is further used for:
  • the encoding module is further used for:
  • the memo reminder module 1204 is also used for:
  • the information extraction module 1202 is further used to:
  • the encoding module is further used for:
  • the key information of the instruction is vectorized and encoded to obtain the instruction vector.
  • the acquisition module 1201 is further used to:
  • the memo content is determined based on the user behavior information.
  • the device further comprises a deletion module, configured to delete the memo content in response to a memo deletion instruction.
  • the terminal uses an acquisition module to acquire multimodal and multidimensional information such as text, vision, hearing, situational information, and spatiotemporal information through a variety of sensors to form the memo content. While being compatible with multimodal data, it improves the richness of the memo content.
  • the embodiment of the present application performs all-round information extraction through the information extraction module and associates and stores it through the storage module.
  • the terminal automatically determines the triggering method of the memo reminder based on the determination of the intention type of the memo content. For passively triggered memo content, when the user passively triggers the memo reminder through the reminder instruction, it can also be triggered through multimodal input.
  • the embodiment of the present application improves the way to help users remember by expanding the support for the input and output of multimodal information, thereby improving the efficiency and quality of human-computer interaction.
  • the embodiment of the present application further provides a computer-readable storage medium, which stores at least one program, and the at least one program is used to be executed by a processor to implement the memo reminder method as described in the above embodiment.
  • the embodiment of the present application provides a computer program product or a computer program, which includes a computer instruction stored in a computer-readable storage medium.
  • the processor of the computer device reads the computer instruction from the computer-readable storage medium, and the processor executes the computer instruction, so that the computer device executes the memo reminder method provided in the above embodiment.
  • Computer-readable media include computer storage media and communication media, wherein the communication media include any media that facilitates the transmission of a computer program from one place to another.
  • the storage medium can be any available medium that a general or special-purpose computer can access.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Databases & Information Systems (AREA)
  • Human Resources & Organizations (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Multimedia (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Software Systems (AREA)
  • Economics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the present application belong to the technical field of human-computer interaction. Disclosed are a memo reminding method and apparatus, and a terminal and a storage medium. The method comprises: when there is a memo record requirement, acquiring memo content (201); on the basis of a content mode corresponding to the memo content, performing information extraction on the memo content, so as to obtain key information (202); storing the memo content and the key information in an associated manner (203); and when it is determined on the basis of the key information that a memo reminding trigger condition is met, performing memo reminding on the basis of the memo content (204). Memory information required by a user often has multi-content modes, and therefore by means of the present solution, information extraction is performed regarding different content modes, and key information is stored, thereby enriching memo content and also improving the efficiency and quality of memo reminding.

Description

Memo reminder method, device, terminal and storage medium
This application claims priority to Chinese patent application No. 202211249686.9, filed on October 12, 2022, and entitled "Memo Reminder Method, Device, Terminal and Storage Medium", the entire contents of which are incorporated by reference into this application.
Technical Field
The embodiments of the present application relate to the field of human-computer interaction technology, and more particularly to a memo reminder method, device, terminal and storage medium.
Background
As society develops, people's work and life grow richer and the amount of information they need to process keeps increasing, which brings a heavy memory burden. As digital technology has likewise developed rapidly, people are increasingly accustomed to using the powerful storage functions of smart terminals to assist memory and improve efficiency in work and life.
In the related art, the terminal records text or images input by the user as a memo. The memo content is limited to a single modality, which limits the flexibility of the user's memos and affects the efficiency of human-computer interaction.
Summary of the invention
The embodiments of the present application provide a memo reminder method, device, terminal and storage medium. The technical solution is as follows:
In one aspect, an embodiment of the present application provides a memo reminder method, which is executed by a terminal and includes:
when there is a need for memo recording, obtaining memo content;
extracting information from the memo content based on the content modality corresponding to the memo content to obtain key information, wherein different information extraction methods are used for different content modalities;
storing the memo content and the key information in association with each other;
when it is determined based on the key information that a memo reminder triggering condition is met, performing a memo reminder based on the memo content.
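
The four steps above can be sketched end to end as follows. This is only an illustration under stated assumptions: the extractor functions are placeholders for the NLP, image recognition and audio recognition described elsewhere in the document, and the `MemoStore` class, its method names, and the file name are hypothetical.

```python
# Placeholder extractors: real ones would wrap NLP, OCR or image captioning, and ASR.
def extract_text(content):
    return {"text": content.lower()}


def extract_image(content):
    return {"image_ref": content}


def extract_audio(content):
    return {"audio_ref": content}


# A different information extraction method is used for each content modality.
EXTRACTORS = {"text": extract_text, "image": extract_image, "audio": extract_audio}


class MemoStore:
    def __init__(self):
        self.records = []

    def record(self, content, modality):
        """Steps 1 to 3: take memo content, extract key information, store both together."""
        key_info = EXTRACTORS[modality](content)
        self.records.append({"content": content, "key_info": key_info})

    def remind(self, trigger):
        """Step 4: return memo content whose key information meets the trigger condition."""
        return [r["content"] for r in self.records if trigger(r["key_info"])]


store = MemoStore()
store.record("I put my keys in the desk drawer", "text")
store.record("dress_screenshot.png", "image")  # hypothetical file name
print(store.remind(lambda info: "keys" in info.get("text", "")))
```
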
另一方面,本申请实施例提供了一种备忘提醒装置,所述装置包括:On the other hand, an embodiment of the present application provides a memo reminder device, the device comprising:
获取模块,用于在存在备忘记录需求的情况下,获取备忘内容;The acquisition module is used to obtain the memo content when there is a need for memo recording;
信息抽取模块,用于基于所述备忘内容对应的内容模态,对所述备忘内容进行信息抽取,得到关键信息,其中,不同内容模态下进行信息抽取的方式不同;An information extraction module, configured to extract information from the memo content based on the content mode corresponding to the memo content to obtain key information, wherein different information extraction methods are used in different content modes;
存储模块,用于对所述备忘内容以及所述关键信息进行关联存储;A storage module, used for associating and storing the memo content and the key information;
备忘提醒模块,用于在基于所述关键信息确定满足备忘提醒触发条件的情况下,基于所述备忘内容进行备忘提醒。The memo reminder module is used to make a memo reminder based on the memo content when it is determined based on the key information that a memo reminder triggering condition is met.
In another aspect, an embodiment of the present application provides a terminal, which includes a processor and a memory; the memory stores at least one program, and the at least one program is executed by the processor to implement the memo reminder method described in the above aspect.
In another aspect, an embodiment of the present application provides a computer-readable storage medium, which stores at least one program, and the at least one program is executed by a processor to implement the memo reminder method described in the above aspect.
In another aspect, an embodiment of the present application provides a computer program product, which includes computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, so that the computer device performs the memo reminder method provided in the above aspect.
附图说明BRIEF DESCRIPTION OF THE DRAWINGS
图1示出了本申请一个示例性实施例提供的终端的结构方框图;FIG1 shows a block diagram of a terminal provided by an exemplary embodiment of the present application;
图2示出了本申请一个示例性实施例提供的备忘提醒方法的流程图;FIG2 shows a flow chart of a memo reminder method provided by an exemplary embodiment of the present application;
图3示出了本申请一个示例性实施例示出的备忘提醒方法的示意图;FIG3 is a schematic diagram showing a memo reminder method according to an exemplary embodiment of the present application;
图4示出了本申请一个示例性实施例提供的主动获取备忘内容的示意图; FIG4 shows a schematic diagram of actively acquiring memo content provided by an exemplary embodiment of the present application;
图5示出了本申请一个示例性实施例提供的获取主动触发备忘的示意图;FIG5 is a schematic diagram showing a method of obtaining an active trigger memo provided by an exemplary embodiment of the present application;
图6示出了本申请一个示例性实施例提供的获取被动触发备忘的示意图;FIG6 shows a schematic diagram of obtaining a passive trigger memo provided by an exemplary embodiment of the present application;
图7是本申请一个示例性的视觉模态备忘内容;FIG. 7 is an exemplary visual modal memo content of the present application;
图8示出了本申请一个示例性实施例提供的主动触发提醒的示意图;FIG8 is a schematic diagram showing an active trigger reminder provided by an exemplary embodiment of the present application;
图9示出了本申请一个示例性实施例提供的被动触发提醒的流程图;FIG9 shows a flowchart of a passive trigger reminder provided by an exemplary embodiment of the present application;
图10示出了本申请一个示例性实施例提供的结合时空信息备忘的示意图;FIG10 is a schematic diagram showing a memo combining spatiotemporal information provided by an exemplary embodiment of the present application;
图11示出了本申请一个示例性实施例提供的获取扩展内容的示意图;FIG11 is a schematic diagram showing a method of acquiring extended content provided by an exemplary embodiment of the present application;
图12示出了本申请一个实施例提供的备忘提醒装置的结构框图。FIG. 12 shows a structural block diagram of a memo reminder device provided in one embodiment of the present application.
Detailed Description
为使本申请的目的、技术方案和优点更加清楚,下面将结合附图对本申请实施方式作进一步地详细描述。In order to make the objectives, technical solutions and advantages of the present application more clear, the implementation methods of the present application will be further described in detail below with reference to the accompanying drawings.
在本文中提及的“多个”是指两个或两个以上。“和/或”,描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。字符“/”一般表示前后关联对象是一种“或”的关系。The term "multiple" as used herein refers to two or more than two. "And/or" describes the relationship between related objects, indicating that three relationships can exist. For example, A and/or B can mean: A exists alone, A and B exist at the same time, and B exists alone. The character "/" generally indicates that the related objects are in an "or" relationship.
请参考图1,其示出了本申请一个示例性实施例提供的终端的结构方框图。终端100可以包括一个或多个如下部件:处理器110、存储器120。Please refer to FIG1 , which shows a block diagram of a terminal provided by an exemplary embodiment of the present application. The terminal 100 may include one or more of the following components: a processor 110 and a memory 120 .
处理器110可以包括一个或者多个处理核心。处理器110利用各种接口和线路连接整个终端100内的各个部分,通过运行或执行存储在存储器120内的指令、程序、代码集或指令集,以及调用存储在存储器120内的数据,执行计算机设备100的各种功能和处理数据。可选地,处理器110可以采用数字信号处理(Digital Signal Processing,DSP)、现场可编程门阵列(Field Programmable Gate Array,FPGA)、可编程逻辑阵列(Programmable Logic Array,PLA)中的至少一种硬件形式来实现。处理器110可集成中央处理器(Central Processing Unit,CPU)、图像处理器(Graphics Processing Unit,GPU)、神经网络处理器(Neural-network Processing Unit,NPU)和调制解调器等中的一种或几种的组合。其中,CPU主要处理操作系统、用户界面和应用程序等;GPU用于负责触摸显示屏所需要显示的内容的渲染和绘制;NPU用于实现人工智能(Artificial Intelligence,AI)功能;调制解调器用于处理无线通信。可以理解的是,上述调制解调器也可以不集成到处理器110中,单独通过一块芯片进行实现。The processor 110 may include one or more processing cores. The processor 110 uses various interfaces and lines to connect various parts within the entire terminal 100, and executes various functions of the computer device 100 and processes data by running or executing instructions, programs, code sets or instruction sets stored in the memory 120, and calling data stored in the memory 120. Optionally, the processor 110 can be implemented in at least one hardware form of digital signal processing (DSP), field programmable gate array (FPGA), and programmable logic array (PLA). The processor 110 can integrate one or a combination of a central processing unit (CPU), a graphics processing unit (GPU), a neural network processor (NPU), and a modem. Among them, the CPU mainly processes the operating system, user interface, and application programs; the GPU is responsible for rendering and drawing the content to be displayed on the touch display; the NPU is used to implement artificial intelligence (AI) functions; and the modem is used to process wireless communications. It is understandable that the above-mentioned modem may not be integrated into the processor 110, but may be implemented by a separate chip.
存储器120可以包括随机存储器(Random Access Memory,RAM),也可以包括只读存储器(Read-Only Memory,ROM)。可选地,该存储器120包括非瞬时性计算机可读介质(non-transitory computer-readable storage medium)。存储器120可用于存储指令、程序、代码、代码集或指令集。存储器120可包括存储程序区和存储数据区,其中,存储程序区可存储用于实现操作系统的指令、用于至少一个功能的指令(比如触控功能、声音播放功能、图像播放功能等)、用于实现下述各个方法实施例的指令等;存储数据区可存储根据计算机设备100的使用所创建的数据(比如音频数据、电话本)等。The memory 120 may include a random access memory (RAM) or a read-only memory (ROM). Optionally, the memory 120 includes a non-transitory computer-readable storage medium. The memory 120 may be used to store instructions, programs, codes, code sets, or instruction sets. The memory 120 may include a program storage area and a data storage area, wherein the program storage area may store instructions for implementing an operating system, instructions for at least one function (such as a touch function, a sound playback function, an image playback function, etc.), instructions for implementing the following various method embodiments, etc.; the data storage area may store data created according to the use of the computer device 100 (such as audio data, a phone book), etc.
In some embodiments, the terminal may further include a display screen 130 and a microphone 140. The display screen 130 is a component for displaying images. It may be a built-in screen of the terminal, such as the screen of a smartphone, or an external screen of the terminal, such as an external monitor of a personal computer.
在一些实施例中,显示屏130除了图像显示功能外,还具有触控功能,即通过触摸点击显示屏130可以实现对显示内容的控制。In some embodiments, the display screen 130 has a touch function in addition to the image display function, that is, the display content can be controlled by touching and clicking the display screen 130.
The microphone 140 is a component for collecting external sound. In the embodiments of the present application, the terminal 100 lets the user interact with it by voice instruction in the memo reminder scenario, and the microphone 140 may be used to collect the user's voice audio for memo reminders.
In addition, those skilled in the art will understand that the structure of the terminal 100 shown in the above figure does not limit the computer device; the computer device may include more or fewer components than illustrated, combine certain components, or arrange the components differently. For example, the terminal 100 may further include a camera assembly, a speaker, a radio frequency circuit, an input unit, sensors (such as an acceleration sensor, an angular velocity sensor, and a light sensor), an audio circuit, a Wi-Fi module, a power supply, a Bluetooth module, and other components, which are not described in detail here.
请参考图2,其示出了本申请一个示例性实施例提供的备忘提醒方法的流程图。该方法可以包括如下步骤。Please refer to Figure 2, which shows a flow chart of a memo reminder method provided by an exemplary embodiment of the present application. The method may include the following steps.
步骤201,在存在备忘记录需求的情况下,获取备忘内容。Step 201, when there is a need to record a memo, obtain the memo content.
In response to the user's memo operation, the terminal obtains the corresponding memo content. The memo content may be any one of text, image, video, and audio, or a combination of several of them; that is, the memo content may be multimodal. The memo content is described below with reference to some application scenarios. As shown in FIG. 3, in a web browsing scenario, the memo content may be digital content such as online products, news reports, e-books, and URLs (Uniform Resource Locators); in a travel scenario, the memo content may be map navigation data or screenshots indicating a route, and may also include the coordinates of real physical-world entities such as attractions and restaurants, official websites, introductory pictures and audio, and travel guide text; in an everyday-life scenario, the memo content may be schedule information, itinerary information, and note text.
In a possible implementation, a need for memo recording exists when the terminal receives a memo recording voice instruction; accordingly, the terminal obtains the memo content indicated by that instruction. The terminal applies automatic speech recognition (ASR) to the voice audio contained in the received instruction, determines the memo content it indicates, and then performs the acquisition operation appropriate to the modality of that content. For example, when the memo content is voice-input text, the terminal extracts the relevant text from the ASR output as the memo content; when the instruction indicates that the memo content is a picture, the terminal obtains it by taking a screenshot, taking a photo, or saving the picture; and when the user asks to make a memo of the current web page, the terminal obtains the link of that web page as the memo content.
示意性的,在基于备忘记录语音指令进行备忘提醒的情况下,响应于用户的唤起语音获取用户的备忘记录语音指令“提醒我赶这班飞机”,终端通过ASR等处理方式,确定该备忘记录语音指令的备忘内容为航班信息以及行程信息,进而终端通过截图工具获取当前显示的航班信息图片,并获取航班信息描述文本作为行程文本,上述航班信息图片以及行程文本均为备忘内容。Illustratively, in the case of a memo reminder based on a memo recording voice command, the terminal obtains the user's memo recording voice command "Remind me to catch this flight" in response to the user's awakening voice, and the terminal determines that the memo content of the memo recording voice command is flight information and itinerary information through ASR and other processing methods. Then, the terminal obtains the currently displayed flight information picture through a screenshot tool, and obtains the flight information description text as the itinerary text. The above flight information picture and itinerary text are both memo contents.
Optionally, a need for memo recording may also exist when the terminal obtains user behavior information that meets a memo condition; accordingly, the terminal determines the memo content based on the user behavior information. In work and daily life, the terminal proactively interprets user behavior from the perceived user voice, operation behavior, motion behavior, and situational information, and proactively obtains memo content based on its judgment of the user's needs. The memo can be completed without the user waking the device, which prevents information from being lost when the user forgets to make a memo and gives the user a natural, unobtrusive interaction experience.
示意性的,如图4所示,在终端基于运动传感器以及定位组件感知得到用户正在进行户外慢跑的场景下,终端对用户的动作行为以及情景信息进行主动理解,并判断用户具有存储当前慢跑线路的需求,进而自动获取线路信息以及坐标信息作为备忘内容。Schematically, as shown in FIG4 , in a scenario where the terminal perceives based on the motion sensor and positioning component that the user is jogging outdoors, the terminal actively understands the user's action behavior and situational information, and determines that the user has the need to store the current jogging route, and then automatically obtains the route information and coordinate information as memo content.
需要说明的是,为保护用户隐私权,在获取备忘内容时,若需要终端使用相机、录音机等设有权限限制的传感器,终端可以基于用户的备忘记录语音指令触发传感器开启并获取备忘内容,也可以通过显示提醒弹窗等方式,主动询问获取用户的权限许可,并基于用户的正向反馈开启传感器并获取备忘内容。It should be noted that in order to protect the privacy of users, when obtaining the memo content, if the terminal needs to use sensors with permission restrictions such as cameras and recorders, the terminal can trigger the sensor to turn on and obtain the memo content based on the user's memo record voice command, or it can actively ask for the user's permission by displaying a reminder pop-up window, and turn on the sensor and obtain the memo content based on the user's positive feedback.
步骤202,基于备忘内容对应的内容模态,对备忘内容进行信息抽取,得到关键信息,其中,不同内容模态下进行信息抽取的方式不同。Step 202, based on the content mode corresponding to the memo content, extract information from the memo content to obtain key information, wherein the information extraction method is different under different content modes.
其中,关键信息可以包括有备忘内容的属性、主题等核心提要,还可以包括有时间戳、气候等备忘时空信息,以及基于对备忘内容分析确定的意图类型、触发方式等信息。Among them, the key information may include core summaries such as the attributes and themes of the memo content, as well as memo temporal and spatial information such as timestamps and climate, and information such as the intent type and triggering method determined based on the analysis of the memo content.
When the memo content input by the user combines multiple modalities, for example website text plus a screenshot, user instruction text plus a photo, or song audio plus title text, the terminal performs single-modality information extraction on the memo content of each modality separately, obtains sub-key information for each modality, and then merges the sub-key information into the key information.
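The per-modality extraction and merge described above can be sketched as follows. This is a minimal illustration only: the three extractor functions are hypothetical stand-ins for the natural language processing, image recognition, and audio recognition steps, and the modality names and result keys are assumptions, not terms defined by the application.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins for the real extractors; each returns sub-key information.
def extract_text(content: str) -> Dict[str, str]:
    return {"text_summary": content}

def extract_image(content: str) -> Dict[str, str]:
    return {"image_caption": f"caption of {content}"}

def extract_audio(content: str) -> Dict[str, str]:
    return {"audio_fingerprint": f"fp:{content}"}

# One extraction method per content modality (assumed modality names).
EXTRACTORS: Dict[str, Callable[[str], Dict[str, str]]] = {
    "text": extract_text,
    "visual": extract_image,
    "auditory": extract_audio,
}

def extract_key_info(parts: List[Tuple[str, str]]) -> Dict[str, str]:
    """Run the extractor matching each part's modality and merge the sub-results."""
    merged: Dict[str, str] = {}
    for modality, content in parts:
        merged.update(EXTRACTORS[modality](content))
    return merged

# A memo made of instruction text plus a screenshot.
key_info = extract_key_info([
    ("text", "Remind me to catch this flight"),
    ("visual", "flight.png"),
])
```

Keeping the extractors in a lookup table means a new modality can be supported by registering one more function, without touching the merge step.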
步骤203,对备忘内容以及关键信息进行关联存储。 Step 203: store the memo content and key information in association with each other.
其中,步骤202中得到的关键信息包含有不同的数据形式,例如文本、实体关系对、图像区域坐标、像素值、语谱图、时间戳、温湿度、海拔、经纬度等。不同数据形式的关键信息对应于备忘内容的不同维度。The key information obtained in step 202 includes different data forms, such as text, entity relationship pairs, image region coordinates, pixel values, spectrograms, timestamps, temperature and humidity, altitude, longitude and latitude, etc. Key information in different data forms corresponds to different dimensions of the memo content.
在一种可能的实施方式中,如图3所示,基于关键信息与备忘内容间的对应关系,终端将所得关键信息构造成一种半结构化的Key-value(键值对)数据结构,例如字典、Hashmap(哈希表)等,表一通过列举的方式说明了通过半结构化数据对关键信息进行存储的基本形式。In a possible implementation, as shown in FIG3 , based on the correspondence between the key information and the memo content, the terminal constructs the obtained key information into a semi-structured Key-value data structure, such as a dictionary, a Hashmap, etc. Table 1 illustrates the basic forms of storing key information through semi-structured data by enumeration.
Table 1
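As an illustration of a semi-structured key-value record, the sketch below stores a reference to the memo content together with its key information and round-trips the record through JSON. It does not reproduce Table 1; every field name and value here is a hypothetical example.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict

@dataclass
class MemoRecord:
    """Memo content stored in association with its key information."""
    content_ref: str                      # where the original memo content lives
    key_info: Dict[str, Any] = field(default_factory=dict)

# Hypothetical key information for a flight reminder memo.
record = MemoRecord(
    content_ref="memo/0001.png",
    key_info={
        "modality": ["text", "visual"],
        "topic": "flight itinerary",
        "intent_type": "schedule",
        "trigger": {"mode": "active", "condition": "2 hours before departure"},
        "timestamp": "2023-09-07T10:15:00",
    },
)

# A dictionary serializes directly to JSON and back without a fixed schema.
serialized = json.dumps(asdict(record), ensure_ascii=False)
restored = json.loads(serialized)
```

Because the value side of each pair can itself be a list or a nested dictionary, key information in different data forms (text, coordinates, timestamps) can sit in one record.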
步骤204,在基于关键信息确定满足备忘提醒触发条件的情况下,基于备忘内容进行备忘提醒。Step 204: When it is determined based on the key information that the memo reminder triggering condition is met, a memo reminder is made based on the memo content.
Depending on the intent type corresponding to the memo content, the reminder may be active, meaning that the terminal proactively issues the memo reminder once the memo reminder trigger condition is met. The reminder may be presented as a message in a notification box, as a reminder sound, or in combination with vibration, which is not limited in this application. The terminal may also remind passively: it treats the user's query input as the memo reminder trigger condition and issues a reminder based on the corresponding memo content only after receiving that query.
终端进行备忘提醒的反馈形式可以是关联存储的原始文件,例如用户的原始语音,也可以是终端对原始文件进行处理得到的备忘信息,例如文本合成语音或是显示于终端的视觉通知等。The feedback form of the terminal's memo reminder can be the original file stored in the associated storage, such as the user's original voice, or the memo information obtained by the terminal processing the original file, such as text-to-speech or a visual notification displayed on the terminal.
综上所述,本申请实施例中,基于用户输入的备忘记录语音指令,终端通过多种传感器获取文本、视觉、听觉、情景信息、时空信息等多模态多维度信息构成备忘内容,在兼容多模态数据的同时,提高了备忘内容的丰富度。在获取到备忘内容的基础上,本申请实施例通过信息抽取确定备忘内容的意图类型,进而自动判断备忘提醒的触发方式,并在满足备忘提醒触发条件的情况下进行备忘提醒;本申请实施例扩展支持多模态信息的输入与输出,改良了帮助用户记忆的方式,进而提高了人机交互的效率和质量。In summary, in the embodiment of the present application, based on the memo recording voice command input by the user, the terminal obtains multi-modal and multi-dimensional information such as text, vision, hearing, situational information, and time and space information through a variety of sensors to form the memo content, while being compatible with multi-modal data, the richness of the memo content is improved. On the basis of obtaining the memo content, the embodiment of the present application determines the intention type of the memo content through information extraction, and then automatically determines the triggering method of the memo reminder, and performs a memo reminder when the memo reminder triggering conditions are met; the embodiment of the present application expands support for the input and output of multi-modal information, improves the way to help users remember, and thus improves the efficiency and quality of human-computer interaction.
在一些实施例中,基于备忘内容对应的内容模态,对备忘内容进行信息抽取,得到关键信息,包括:In some embodiments, based on the content modality corresponding to the memo content, information is extracted from the memo content to obtain key information, including:
在备忘内容对应的内容模态包括文本模态的情况下,对备忘内容进行自然语言处理,得到文本关键信息;When the content modality corresponding to the memo content includes a text modality, natural language processing is performed on the memo content to obtain key information of the text;
在备忘内容对应的内容模态包括视觉模态的情况下,对备忘内容进行图像识别处理,得到图像关键信息;When the content modality corresponding to the memo content includes a visual modality, performing image recognition processing on the memo content to obtain key image information;
在一些实施例中,在备忘内容对应的内容模态包括听觉模态的情况下,对备忘内容进行音频识别处理,得到音频关键信息。In some embodiments, when the content modality corresponding to the memo content includes an auditory modality, audio recognition processing is performed on the memo content to obtain audio key information.
对所述备忘内容进行自然语言处理,得到文本关键信息,包括如下至少一种方式:Performing natural language processing on the memo content to obtain key text information includes at least one of the following methods:
对备忘内容进行命名实体识别,得到实体信息;Perform named entity recognition on the memo content to obtain entity information;
对备忘内容进行实体关系抽取,得到实体关系信息;Extract entity relationships from the memo content to obtain entity relationship information;
对备忘内容进行主题摘要抽取,得到主题信息;Extract the subject summary of the memo content to obtain the subject information;
对备忘内容进行文本意图识别,得到意图类型,意图类型用于表征进行备忘记录的意图; Perform text intent recognition on the memo content to obtain the intent type, which is used to characterize the intent of recording the memo;
对备忘内容进行因果推断分析,得到触发方式信息,其中,触发方式包括主动触发和被动触发,且在触发方式信息指示的触发方式为主动触发的情况下,触发方式信息中包含主动触发条件。A causal inference analysis is performed on the memo content to obtain trigger mode information, wherein the trigger mode includes active triggering and passive triggering, and when the trigger mode indicated by the trigger mode information is active triggering, the trigger mode information includes active triggering conditions.
在一些实施例中,关键信息包含文本关键信息,且文本关键信息中包含所述触发方式信息;In some embodiments, the key information includes text key information, and the text key information includes the trigger mode information;
在基于关键信息确定满足备忘提醒触发条件的情况下,基于备忘内容进行备忘提醒,包括:When the memo reminder triggering condition is met based on the key information, a memo reminder is made based on the memo content, including:
在触发方式信息指示的触发方式为主动触发,且满足主动触发条件的情况下,基于备忘内容进行主动备忘提醒;When the trigger mode indicated by the trigger mode information is active triggering and the active triggering conditions are met, an active memo reminder is performed based on the memo content;
在触发方式信息指示的触发方式为被动触发,且关键信息与提醒指令匹配的情况下,基于备忘内容进行被动备忘提醒。When the trigger mode indicated by the trigger mode information is a passive trigger and the key information matches the reminder instruction, a passive memo reminder is performed based on the memo content.
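The two trigger branches above can be sketched as a single decision function. This is a simplified illustration under stated assumptions: the active trigger condition is reduced to a due time, and matching between key information and a reminder instruction is reduced to keyword overlap; the `Memo` class and its fields are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Set

@dataclass
class Memo:
    content: str
    keywords: Set[str]                 # simplified key information
    trigger_mode: str                  # "active" or "passive"
    due: Optional[datetime] = None     # assumed form of the active trigger condition

def should_remind(memo: Memo, now: datetime, query: Optional[str] = None) -> bool:
    """Decide whether the memo reminder trigger condition is met."""
    if memo.trigger_mode == "active":
        # Active: remind once the stored condition holds, with no user input needed.
        return memo.due is not None and now >= memo.due
    # Passive: remind only when a reminder instruction matches the key information.
    if query is None:
        return False
    return bool(memo.keywords & set(query.lower().split()))

flight = Memo("Catch flight", {"flight"}, "active", due=datetime(2024, 1, 1, 8, 0))
keys = Memo("Key is in the desk drawer", {"key", "keys"}, "passive")
```

For example, `should_remind(keys, datetime(2024, 1, 1), "where are my keys")` takes the passive branch and matches on the word "keys".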
在一些实施例中,对备忘内容进行图像识别处理,得到图像关键信息,包括:In some embodiments, image recognition processing is performed on the memo content to obtain key image information, including:
在备忘内容为图片的情况下,对图片进行光学字符识别,得到图片文本;和/或,对图片进行图像自然语言描述处理,得到图片描述文本;When the memo content is a picture, optical character recognition is performed on the picture to obtain picture text; and/or, image natural language description processing is performed on the picture to obtain picture description text;
在备忘内容为视频的情况下,对视频进行视频理解,得到视频描述文本;When the memo content is a video, the video is understood to obtain a video description text;
该方法还包括:The method further includes:
对图片文本、图片描述文本以及视频描述文本中的至少一种进行自然语言处理,得到文本关键信息。Natural language processing is performed on at least one of the picture text, the picture description text, and the video description text to obtain text key information.
在一些实施例中,对备忘内容进行音频识别提取,得到音频关键信息,包括:In some embodiments, audio recognition and extraction are performed on the memo content to obtain audio key information, including:
对备忘内容进行自动语音识别,得到音频文本;和/或,对备忘内容进行音频特征提取,得到音频指纹;Performing automatic speech recognition on the memo content to obtain an audio text; and/or, performing audio feature extraction on the memo content to obtain an audio fingerprint;
该方法还包括:The method further includes:
对音频文本进行自然语言处理,得到文本关键信息。Perform natural language processing on audio text to obtain key information of the text.
在一些实施例中,获取备忘内容之后,方法还包括:In some embodiments, after obtaining the memo content, the method further includes:
获取备忘内容对应的扩展内容;Get the extended content corresponding to the memo content;
所述对备忘内容以及关键信息进行关联存储,包括:The associated storage of the memo content and key information includes:
对备忘内容、关键信息以及扩展内容进行关联存储。The memo content, key information and extended content are stored in an associated manner.
在一些实施例中,获取备忘内容对应的扩展内容,包括:In some embodiments, obtaining extended content corresponding to the memo content includes:
在备忘内容的内容模态为文本模态,且关键信息的信息量小于信息量阈值的情况下,获取听觉扩展内容或视觉扩展内容中的至少一种;When the content mode of the memo content is a text mode and the amount of information of the key information is less than an information amount threshold, obtaining at least one of auditory extended content or visual extended content;
在基于所述关键信息确定满足备忘提醒触发条件的情况下,基于备忘内容进行备忘提醒,包括:When it is determined based on the key information that the memo reminder triggering condition is met, a memo reminder is performed based on the memo content, including:
在基于关键信息确定满足备忘提醒触发条件的情况下,基于备忘内容以及扩展内容进行备忘提醒。When it is determined based on the key information that the memo reminder triggering condition is met, a memo reminder is performed based on the memo content and the extended content.
在一些实施例中,获取备忘内容对应的扩展内容,包括:In some embodiments, obtaining extended content corresponding to the memo content includes:
在备忘内容具有时效的情况下,获取备忘内容对应的扩展内容;When the memo content is time-limited, obtain the extended content corresponding to the memo content;
在基于关键信息确定满足备忘提醒触发条件的情况下,基于备忘内容进行备忘提醒,包括:When it is determined based on the key information that the memo reminder triggering condition is met, a memo reminder is made based on the memo content, including:
在基于关键信息确定满足备忘提醒触发条件,且备忘内容有效的情况下,基于备忘内容进行备忘提醒;When it is determined based on the key information that the memo reminder triggering condition is met and the memo content is valid, a memo reminder is made based on the memo content;
该方法还包括:The method further includes:
在基于关键信息确定满足备忘提醒触发条件,且备忘内容失效的情况下,基于扩展内容进行备忘提醒。When it is determined based on the key information that the memo reminder triggering condition is met and the memo content is invalid, a memo reminder is performed based on the extended content.
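The fallback from time-limited memo content to its extended content can be sketched as below. The text does not say how validity is determined, so this sketch assumes an expiry timestamp; the function name and parameters are hypothetical.

```python
from datetime import datetime
from typing import Optional

def pick_reminder_payload(
    memo_content: str,
    extended_content: str,
    expires_at: Optional[datetime],
    now: datetime,
) -> str:
    """Return the memo content while it is valid, otherwise the extended content."""
    if expires_at is None or now <= expires_at:
        return memo_content
    return extended_content

# Example: a web link that may stop working, backed up by a saved screenshot.
payload = pick_reminder_payload(
    memo_content="https://example.com/sale",
    extended_content="screenshots/sale.png",
    expires_at=datetime(2024, 1, 10),
    now=datetime(2024, 1, 12),
)
```

In this example the link has passed its assumed expiry, so the reminder is built from the screenshot instead.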
在一些实施例中,该方法还包括:In some embodiments, the method further comprises:
When there is a need for memo recording, obtain spatiotemporal information, which represents the temporal and spatial state at the time the memo is recorded;
对所述备忘内容以及关键信息进行关联存储,包括:The memo content and key information are stored in association, including:
对备忘内容、关键信息以及时空信息进行关联存储;The memo content, key information and time and space information are stored in association;
在基于关键信息确定满足备忘提醒触发条件的情况下,基于备忘内容进行备忘提醒,包括:When it is determined based on the key information that the memo reminder triggering condition is met, a memo reminder is made based on the memo content, including:
在基于关键信息以及时空信息确定满足备忘提醒触发条件的情况下,基于备忘内容进行备忘提醒。When it is determined that the memo reminder triggering condition is met based on the key information and the time-space information, a memo reminder is made based on the memo content.
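One way spatiotemporal information could feed into the trigger decision is sketched below. The concrete condition is an assumption made for illustration (the user is again near the place where the memo was recorded, after a minimum time has passed); the text does not specify it. Distance uses the standard haversine formula with a mean Earth radius of 6371 km.

```python
import math
from datetime import datetime, timedelta
from typing import Tuple

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def spatiotemporal_trigger(
    recorded_at: datetime,
    recorded_pos: Tuple[float, float],
    now: datetime,
    current_pos: Tuple[float, float],
    radius_km: float = 0.5,                      # assumed proximity threshold
    min_gap: timedelta = timedelta(hours=1),     # assumed minimum elapsed time
) -> bool:
    """True when the user is near the recorded place after at least min_gap."""
    near = haversine_km(*recorded_pos, *current_pos) <= radius_km
    return near and (now - recorded_at) >= min_gap
```

The radius and time gap are tuning parameters chosen arbitrarily here; a real system would combine this check with the key information rather than use it alone.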
在一些实施例中,基于备忘内容对应的内容模态,对备忘内容进行信息抽取,得到关键信息之后,方法还包括:In some embodiments, after extracting information from the memo content based on the content modality corresponding to the memo content and obtaining key information, the method further includes:
对关键信息进行向量化编码,得到关键信息向量;Vectorize and encode the key information to obtain a key information vector;
对备忘内容以及关键信息向量进行关联存储;The memo content and key information vector are stored in association;
在基于关键信息确定满足备忘提醒触发条件的情况下,基于备忘内容进行备忘提醒,包括:When it is determined based on the key information that the memo reminder triggering condition is met, a memo reminder is made based on the memo content, including:
在接收到提醒指令的情况下,对提醒指令进行向量化编码,得到指令向量;When a reminder instruction is received, vector encoding is performed on the reminder instruction to obtain an instruction vector;
在指令向量与所述关键信息向量的向量相似度大于阈值的情况下,确定满足备忘提醒触发条件,并基于备忘内容进行备忘提醒。When the vector similarity between the instruction vector and the key information vector is greater than a threshold, it is determined that a memo reminder triggering condition is met, and a memo reminder is performed based on the memo content.
在一些实施例中,在接收到提醒指令的情况下,对提醒指令进行向量化编码,得到指令向量,包括:In some embodiments, when a reminder instruction is received, vector encoding is performed on the reminder instruction to obtain an instruction vector, including:
在接收到提醒指令的情况下,基于提醒指令对应的内容模态,对提醒指令进行信息抽取,得到指令关键信息;When a reminder instruction is received, information of the reminder instruction is extracted based on the content mode corresponding to the reminder instruction to obtain key information of the instruction;
对指令关键信息进行向量化编码,得到指令向量。The key information of the instruction is vectorized and encoded to obtain an instruction vector.
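The vector matching step can be sketched with a toy encoder. The text does not name an encoding model, so this sketch assumes a bag-of-words count vector and cosine similarity; the threshold value is also an assumption.

```python
import math
from collections import Counter

def encode(text: str) -> Counter:
    """Toy vectorization: word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

THRESHOLD = 0.2  # assumed similarity threshold

# Key information vector stored with the memo, and an incoming reminder instruction.
key_info_vector = encode("key location desk drawer")
instruction_vector = encode("where is my key")

similarity = cosine(instruction_vector, key_info_vector)
triggered = similarity > THRESHOLD
```

Here the two vectors share one of four words each, giving a similarity of 0.25, which exceeds the assumed threshold. A learned sentence embedding would match paraphrases that share no words, which word counts cannot do.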
在一些实施例中,在存在备忘记录需求的情况下,获取备忘内容,包括:In some embodiments, when there is a need to record a memo, obtaining the memo content includes:
在接收到备忘记录语音指令的情况下,获取备忘记录语音指令所指示的备忘内容;或者,When receiving a memo recording voice instruction, obtaining the memo content indicated by the memo recording voice instruction; or,
在获取到用户行为信息,且用户行为信息满足备忘条件的情况下,基于用户行为信息确定备忘内容。When the user behavior information is obtained and the user behavior information meets the memo condition, the memo content is determined based on the user behavior information.
在一些实施例中,在基于关键信息确定满足备忘提醒触发条件的情况下,基于备忘内容进行备忘提醒之后,该方法还包括:In some embodiments, when it is determined based on the key information that the memo reminder triggering condition is met, after the memo reminder is made based on the memo content, the method further includes:
响应于备忘删除指令,删除备忘内容。In response to the memo deletion instruction, the memo content is deleted.
在日常生活中,用户需要记忆的信息不仅仅局限于文本信息,更包含有大量的图像信息、视频信息等视觉信息,以及音乐信息、语音信息等听觉信息,也即,用户在利用终端帮助记忆时,需要存储的信息内容往往是多模态的。相较于现有技术在备忘过程中时仅支持单一模态输入,本申请实施例中,终端支持获取多模态备忘内容,并且进一步的,基于不同内容模态的备忘内容,终端采取相应的关键信息提取方式,以确定备忘内容中的关键信息进行存储,便于提高备忘提醒场景下的人机交互效率。如图3所示,关于对备忘内容进行信息提取的方法,可以是以下任意一种或多种的结合:In daily life, the information that users need to remember is not limited to text information, but also includes a large amount of visual information such as image information, video information, and auditory information such as music information and voice information. That is, when users use the terminal to help remember, the information content that needs to be stored is often multimodal. Compared with the prior art that only supports single modal input during the memo process, in the embodiment of the present application, the terminal supports the acquisition of multimodal memo content, and further, based on the memo content of different content modes, the terminal adopts a corresponding key information extraction method to determine the key information in the memo content for storage, so as to improve the human-computer interaction efficiency in the memo reminder scenario. As shown in Figure 3, the method for extracting information from the memo content can be a combination of any one or more of the following:
1、在备忘内容对应的内容模态包括文本模态的情况下,对备忘内容进行自然语言处理(Natural Language Processing,NLP),得到文本关键信息。1. When the content modality corresponding to the memo content includes text modality, natural language processing (NLP) is performed on the memo content to obtain key information of the text.
In one possible implementation, as shown in FIG. 3, the terminal needs to understand the meaning of the memo content through natural language understanding (NLU). The terminal first performs named entity recognition (NER) on the memo content to obtain entity information. The predefined entity types may include time, location, person name, and item, and may also include currency, organization, and the like, which is not limited in this application. When memo content in the text modality is obtained, the terminal recognizes and labels the entities in the memo text according to these entity types. For example, for the memo content "Meet Li Ming at Zhongshan Park at three o'clock tomorrow", the recognition result obtained through NER is: [time] three o'clock tomorrow, meet [person name] Li Ming at [location] Zhongshan Park. It should be noted that the embodiments of this application do not limit the method used to implement NER.
Further, the terminal performs entity and relation extraction (ERE) on the memo content to obtain entity relationship information. Based on the named entity recognition result obtained through NER, the terminal performs entity extraction and relation extraction, reducing the memo content to its core entity relationships for text analysis. For example, for the memo content "I put the key in the desk drawer", the recognition result obtained through NER is: I put the [item] key in the [location] desk drawer. Based on this recognition result, the terminal performs entity relation extraction and obtains the entity relationship information: key [location] desk drawer.
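By way of illustration only, the NER and relation extraction steps for the "key in the desk drawer" example might be sketched as follows. The sketch substitutes a small hand-written lookup table (`GAZETTEER`) for a trained NER model, and the function names are hypothetical; the application itself does not limit how NER or ERE is implemented.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical lookup table: surface form -> entity type.
# A real terminal would use a trained NER model instead.
GAZETTEER: Dict[str, str] = {
    "key": "item",
    "desk drawer": "location",
}


def tag_entities(text: str) -> List[Tuple[str, str]]:
    """Return (entity type, surface form) pairs in order of appearance."""
    lowered = text.lower()
    hits = []
    for surface, entity_type in GAZETTEER.items():
        position = lowered.find(surface)
        if position >= 0:
            hits.append((position, entity_type, surface))
    return [(entity_type, surface) for _, entity_type, surface in sorted(hits)]


def extract_location_relation(
    entities: List[Tuple[str, str]]
) -> Optional[Tuple[str, str, str]]:
    """Pair the first item with the first location, if both are present."""
    item = next((s for t, s in entities if t == "item"), None)
    place = next((s for t, s in entities if t == "location"), None)
    if item is None or place is None:
        return None
    return (item, "location", place)


entities = tag_entities("I put the key in the desk drawer")
relation = extract_location_relation(entities)
print(entities)
print(relation)
```

The resulting triple plays the role of the entity relationship "key [location] desk drawer" and can be stored as one entry of the key information.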
Correspondingly, the terminal may also perform topic summarization (text summarization) on the memo content to obtain topic information. The summary may be obtained in different ways: the topic information may be an extractive summary or an abstractive summary, which is not limited in this application. Illustratively, for the memo content "Have dinner with Zhang San tomorrow evening", the terminal can determine through topic summarization that the topic of the text is "having dinner".
Having determined the text content of the memo, the terminal further performs text intent recognition on the memo content to obtain an intent type, which characterizes the user's intention in recording the memo. Users have different storage intentions for different kinds of memo content. For example, when the memo content is flight information, the user expects to receive a reminder from the terminal at the corresponding time, whereas when the memo content is product shopping information, the user expects to be able to look it up later. The terminal may determine the intent type of the memo content using text classification (TC) techniques. The intent types may include schedule, reminder, and memo. Classifying the memo content into one of these intent types gives the terminal a basis for subsequently determining the memo reminder mode.
Illustratively, when the memo content is "Remember to call Mom to take her medicine at 9 p.m.", the memo content contains a conditional prompt, so the terminal can infer that the user expects to be prompted when the corresponding condition is met, and determines that the intent type is reminder. When the memo recording voice command is "Help me note down this flight", the terminal obtains the memo content "Your XXX flight from Chengdu to Beijing, 8:45-11:20 on March 1, has been ticketed" and determines through text intent recognition that the intent type is schedule. When the memo content is "My power-on password is XXX", the terminal determines that the intent type is memo.
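As a rough sketch of the intent recognition described above, the three intent types can be assigned with simple cue patterns. The patterns and the function name `classify_intent` are assumptions made for illustration; the application refers to text classification techniques generally and does not prescribe keyword rules.

```python
import re

# Hypothetical cue patterns, checked in order.
# A real terminal would use a trained text classifier.
INTENT_RULES = [
    ("reminder", re.compile(r"\b(remember to|remind|don't forget)\b", re.I)),
    ("schedule", re.compile(r"\b(flight|train|meeting|appointment)\b", re.I)),
]


def classify_intent(text: str) -> str:
    """Return 'reminder', 'schedule', or the fallback type 'memo'."""
    for intent, pattern in INTENT_RULES:
        if pattern.search(text):
            return intent
    return "memo"


for sample in (
    "Remember to call Mom to take her medicine at 9 p.m.",
    "Your XXX flight from Chengdu to Beijing has been ticketed",
    "My power-on password is XXX",
):
    print(classify_intent(sample), "<-", sample)
```

Falling back to "memo" when no cue matches mirrors the idea that content without a condition or a scheduled event is simply stored for later lookup.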
The terminal performs causal inference (CI) analysis on the memo content to obtain trigger mode information; that is, the key information includes text key information, and the text key information includes the trigger mode information. The trigger modes include active triggering and passive triggering, and when the trigger mode indicated by the trigger mode information is active triggering, the trigger mode information contains an active triggering condition. The trigger mode information includes at least the trigger mode, and the trigger mode corresponds to the intent type of the memo content. Illustratively, as shown in FIG. 5, for the memo content "If Changjiang Road is congested, remind me to ride my electric bike to work", the terminal determines that the intent type is reminder, determines accordingly that the trigger mode is active triggering, and extracts the traffic condition entity "congested" from the memo content as trigger mode information. As shown in FIG. 6, when the memo recording voice command is "Help me save this dress", the terminal obtains the product link and related data for the product as the memo content; because the intent type of this memo content is memo, the terminal determines that the trigger mode is passive triggering and performs a memo reminder based on the memo content only when the user inputs a reminder instruction.
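One possible shape for the trigger mode information is sketched below. The `TriggerInfo` class, its field names, and the rule that a reminder or schedule intent with an extracted condition becomes an active trigger are assumptions for illustration; the causal inference step that actually extracts the condition is not modeled here.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class TriggerInfo:
    mode: str                                   # "active" or "passive"
    condition: Optional[Dict[str, str]] = None  # present only for active triggers


def build_trigger_info(
    intent: str, condition: Optional[Dict[str, str]] = None
) -> TriggerInfo:
    """Map an intent type plus an extracted condition to trigger mode information."""
    if intent in ("reminder", "schedule") and condition:
        return TriggerInfo("active", condition)
    # Memo-type content, or content without a condition, waits for a reminder instruction.
    return TriggerInfo("passive")


traffic_memo = build_trigger_info("reminder", {"traffic:Changjiang Road": "congested"})
dress_memo = build_trigger_info("memo")
print(traffic_memo)
print(dress_memo)
```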
It should be noted that the NLP processes described above, such as NER, ERE, and CI, are carried out simultaneously while the terminal extracts key information from the memo content, and their results together constitute the key information. In one possible case, the result of a given NLP process may be empty, which does not affect the memo reminder process.
2. When the content modality of the memo content includes a visual modality, image recognition processing is performed on the memo content to obtain image key information. Further, the terminal performs natural language processing on at least one of the picture text, the picture description text, and the video description text to obtain text key information. The natural language processing applied to this text may be any one of the sub-methods of method 1 or a combination of them.
When the memo content is a picture, in one possible implementation the terminal performs optical character recognition (OCR) on the picture to obtain picture text. For pictures consisting mainly of text, the terminal can use OCR to convert the characters and symbols in the picture into text, and can then extract text key information from that text to obtain the key information contained in the image.
In another possible implementation, the terminal performs image captioning (IC), that is, natural language description of the image, to obtain picture description text. For pictures consisting mainly of pictorial content, the terminal converts the image into natural language describing its content in order to determine the information the image contains. Illustratively, when the memo content is the picture shown in FIG. 7, the terminal obtains the following text description through IC: a boy carrying a travel backpack is traveling. The terminal then extracts text key information from this text as described in method 1 and determines that the topic of the picture is "travel".
It should be noted that, for the same picture, the terminal may both perform OCR to obtain the text in the picture and perform image captioning, further enriching the key information corresponding to the picture. It should also be noted that the terminal may use visual grounding (VG) to locate the described subject in the picture based on the picture and its description text, and thereby obtain position region information for the subject. For example, when the memo content is the picture shown in FIG. 7, the terminal uses VG, based on the picture and its text description, to determine the region position information of the subject "boy" in the picture as part of the target result.
When the memo content is a video, the terminal performs video understanding on the video to obtain video description text. The techniques used for video understanding may include, but are not limited to, video scene recognition, video action understanding, and video event understanding. Through video understanding, the terminal expresses the video content as natural language text, so that the video description text reflects the information contained in the memo content. The terminal may then process the video description text through text key information extraction to further clarify the video information, which facilitates subsequently reminding the user based on the memo content.
3. When the content modality of the memo content includes an auditory modality, audio recognition processing is performed on the memo content to obtain audio key information.
In one possible implementation, the terminal performs automatic speech recognition (ASR) on the memo content to obtain audio text. That is, when the user inputs the memo content by means of a memo recording voice command, the terminal uses ASR to convert the speech into natural language text, namely the audio text, and then performs natural language processing on the audio text to obtain text key information. The natural language processing may be any one of the processing methods in method 1 or a combination of them.
In another possible implementation, the terminal performs audio feature extraction on the memo content to obtain an audio fingerprint. Using audio fingerprinting technology, the terminal can extract the digital features of a piece of audio and represent them by an identifier, thereby obtaining the information contained in audio memo content. Optionally, the terminal may also compute a spectrogram of the audio file, that is, a representation of how the frequency content of the audio varies over time. Based on the spectrogram or the audio fingerprint, the terminal allows the user to use audio (for example, a hummed tune) as a reminder instruction to query the memo content.
Depending on the trigger mode information in the text key information, the terminal may perform the memo reminder by active triggering or by passive triggering.
In one possible implementation, the intent type of the memo content is schedule or reminder, and the trigger mode indicated by its trigger mode information is active triggering. When the active triggering condition is met, the terminal performs an active memo reminder based on the memo content.
In the related art, the terminal actively reminds the user based solely on time information or location information in the memo content. In the embodiments of the present application, by contrast, the memo reminder triggering condition may include any one item, or a combination of items, of the entity information in the memo content. That is, the terminal may use time, location, weather, and the like in the entity information as active triggering conditions, and may also use events, traffic conditions, and the like as active triggering conditions, which enriches the range of reminder scenarios and improves the human-computer interaction experience.
Illustratively, when the memo content is "If Changjiang Road is congested, remind me to ride my electric bike to work", the terminal determines through intent recognition and causal inference that the intent type of the memo content is reminder and that the memo reminder is to be performed by active triggering. The trigger mode information contains the active triggering condition [traffic condition], namely "Changjiang Road is congested". When the terminal detects that the road traffic conditions meet the active triggering condition, it actively performs the memo reminder.
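The active triggering check in this example can be sketched as a comparison between each stored condition and the terminal's current context. The context keys (such as `"traffic:Changjiang Road"`), the memo record layout, and the function names are hypothetical; how the terminal actually senses traffic, time, or weather is outside the scope of the sketch.

```python
from typing import Dict, List


def condition_met(condition: Dict[str, str], context: Dict[str, str]) -> bool:
    """True when every entry of a non-empty condition matches the current context."""
    return bool(condition) and all(context.get(k) == v for k, v in condition.items())


def due_reminders(memos: List[dict], context: Dict[str, str]) -> List[str]:
    """Return the content of every active-trigger memo whose condition is met."""
    return [
        memo["content"]
        for memo in memos
        if memo["mode"] == "active" and condition_met(memo["condition"], context)
    ]


memos = [
    {
        "content": "Ride the electric bike to work",
        "mode": "active",
        "condition": {"traffic:Changjiang Road": "congested"},
    },
    {"content": "My power-on password is XXX", "mode": "passive", "condition": {}},
]

congested = {"traffic:Changjiang Road": "congested", "time": "08:00"}
clear = {"traffic:Changjiang Road": "clear", "time": "08:00"}
print(due_reminders(memos, congested))
print(due_reminders(memos, clear))
```

Because a condition is a dictionary, several entities (for example time and traffic together) can be combined into one active triggering condition without changing the check.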
Optionally, the terminal may perform a memo reminder for someone other than the user who performed the memo operation. That is, whereas in the prior art the terminal reminds only the user of the terminal, in the embodiments of the present application the user may add memo reminder target information to the memo content. For example, as shown in FIG. 8, the user inputs the memo content "Remind Mom to take her medicine at 9 p.m." through a memo recording voice command. Through information extraction, the terminal determines that the memo reminder target indicated by the key information is the user's "Mom", and actively reminds her when the active triggering condition "9 p.m." is met.
In another possible implementation, the intent type of the memo content is schedule or memo, and the trigger mode indicated by its trigger mode information is passive triggering. When the key information matches a reminder instruction, the terminal performs a passive memo reminder based on the memo content.
The memo reminder triggering condition for passive triggering may also be multimodal. That is, the user may query using the text corresponding to a voice command, or using a reminder instruction that is multimodal or combines several modalities, such as picture information, audio information, or a web page link. Because the stored content is semi-structured data composed of multimodal information, supporting multimodal queries gives the user more freedom in query operations, so that the user can complete a query in whatever form is easiest to express, in line with the user's intuition. Moreover, when the modality of the memo content is the same as that of the reminder instruction, memo reminder efficiency and the human-computer interaction experience are improved. Illustratively, the user may retrieve the audio of a song from the memo content by humming a tune (audio information), or may look up the corresponding product information in the memo content using a picture of a garment (image information).
As shown in FIG. 9, for passively triggered memo content, the terminal starts in response to the user's wake-up instruction, then retrieves and returns results based on the reminder instruction input by the user. Because the reminder instruction may be multimodal, the terminal processes it in the same way as memo content: it extracts information from the reminder instruction to obtain instruction key information, and then builds semi-structured data, such as a dictionary, to store the instruction key information. The information extraction and associative storage are the same as in the above embodiments and are not repeated here.
Once the instruction key information has been determined, the terminal compares and matches it against the key information corresponding to the memo content, and returns the most relevant key information to the user as the query result, completing the memo reminder. The basis for comparing relevance depends on the modality of the key information: relevance may be text similarity, image similarity, spectrogram similarity, and so on. The matching is performed between two key information dictionaries, each of which contains multiple key-value pairs, so the number of similar values in the two dictionaries must be taken into account when determining similarity.
Illustratively, for the reminder instruction "Where did I put my key?", the terminal determines that the instruction key information is "key". The terminal then matches against the key information dictionaries corresponding to the memo content, finds the dictionary whose attributes contain "key", and returns to the user the original data of that dictionary: "I put the key in the desk drawer".
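A minimal sketch of matching between key information dictionaries, assuming relevance is simply the number of query values that also occur among a stored memo's values (exact, case-insensitive match). The record layout and the names `match_score` and `best_match` are hypothetical, and a real terminal would substitute modality-specific similarity measures.

```python
from typing import Dict, List, Optional


def match_score(query: Dict[str, str], stored: Dict[str, str]) -> int:
    """Count query values that also appear among the stored values."""
    stored_values = {str(v).lower() for v in stored.values()}
    return sum(1 for v in query.values() if str(v).lower() in stored_values)


def best_match(query: Dict[str, str], memos: List[dict]) -> Optional[dict]:
    """Return the memo with the most matching values, or None if nothing matches."""
    scored = [(match_score(query, memo["key_info"]), memo) for memo in memos]
    score, memo = max(scored, key=lambda pair: pair[0], default=(0, None))
    return memo if score > 0 else None


memos = [
    {
        "content": "I put the key in the desk drawer",
        "key_info": {"item": "key", "location": "desk drawer"},
    },
    {"content": "My power-on password is XXX", "key_info": {"topic": "password"}},
]

# Instruction key information for "Where did I put my key?"
result = best_match({"item": "key"}, memos)
print(result["content"] if result else "no match")
```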
In one possible implementation, when there is a need to record a memo, the terminal obtains spatiotemporal information, which characterizes the time and spatial state at the moment the memo is recorded. The spatiotemporal information may include the timestamp corresponding to the memo recording need, the current location, altitude, temperature and humidity, weather information, and the like. Spatiotemporal information is important for enriching the memo content: it broadens the range of supported queries, provides query tags for later queries based on the memo content, and improves memo reminder efficiency.
Further, when spatiotemporal information exists, the terminal stores the memo content, the key information, and the spatiotemporal information in association with one another. Correspondingly, when spatiotemporal information exists in the memo content, the terminal performs a memo reminder based on the memo content upon determining, based on the key information and the spatiotemporal information, that the memo reminder triggering condition is met. For example, as shown in FIG. 10, for the memo content "The Summer Resort is truly magnificent", the terminal obtains the scenic spot photos, location coordinates, video, and text description and, at the same time, obtains spatiotemporal information such as the timestamp, temperature and humidity, and altitude at the moment of recording, and builds a key information dictionary through information extraction for associative storage. When the reminder instruction "Where did I go when it was hottest last year?" is received, the terminal extracts the instruction key information: [time] last year, [temperature] hottest, [topic] travel. Because the spatiotemporal information in the memo content satisfies the memo reminder triggering condition, the terminal returns the text "The Summer Resort is truly magnificent" together with the pictures, video, and other memo content. Acquiring spatiotemporal information as part of the memo content makes it easier for the user to query memo content with a reminder instruction and also improves the accuracy of the returned result.
When storing memo content by means of a memo recording voice command, users usually phrase the memo content in the conversational form they are used to, and the input is often brief and incomplete. As a result, the memo content may fail to meet the user's memo needs, which degrades the later memo reminder experience. In the embodiments of the present application, because the terminal supports multimodal memo content, the terminal, in addition to extracting information from the memo content input by the user, obtains extended content corresponding to the memo content and uses it to enrich the memo content, thereby improving the user's human-computer interaction experience in memo reminder scenarios.
When the content modality of the memo content is the text modality and the information amount of the key information is less than an information amount threshold, the terminal obtains at least one of auditory extended content and visual extended content.
In one possible implementation, the memo content input by the user includes only the text modality, and the terminal extracts information from the text memo content to obtain the key information. When the information amount of the key information is less than the information amount threshold, that is, when the key-value pairs contain only a small number of values, the reminder instructions the user can use to query the memo content in a passive triggering scenario are restricted, and the user may be unable to retrieve the desired memo content from the key information dictionary. In the embodiments of the present application, the terminal obtains extended content corresponding to the memo content based on the recording scenario. The extended content may be visual extended content such as web page snapshots, screenshots, and target-region pixels, or auditory extended content such as ambient sound and music, providing richer information for the memo reminder.
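The threshold check described above might look like the following sketch. It assumes that the "information amount" is the number of non-empty values in the key information dictionary and uses a made-up threshold of 3; the application does not define either the measure or the threshold value.

```python
from typing import Dict, Set

INFO_THRESHOLD = 3  # hypothetical value, chosen only for this example


def info_amount(key_info: Dict[str, object]) -> int:
    """Count the non-empty values in a key information dictionary."""
    return sum(1 for value in key_info.values() if value not in (None, "", [], {}))


def needs_extension(
    modalities: Set[str], key_info: Dict[str, object], threshold: int = INFO_THRESHOLD
) -> bool:
    """True for text-only memo content whose key information is below the threshold."""
    return modalities == {"text"} and info_amount(key_info) < threshold


sparse = {"topic": "stir-fry", "place": "Xiao Ming's Home Cooking", "time": ""}
print(needs_extension({"text"}, sparse))           # text only, two usable values
print(needs_extension({"text", "image"}, sparse))  # already multimodal
```

When the check returns true, the terminal would go on to ask the user for permission before collecting visual or auditory extended content.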
It should be noted that acquiring multimodal extended content requires sensors that involve user privacy, such as the camera and the sound recorder. Therefore, in scenarios where extended content is acquired, the terminal may be configured to ask the user first and to collect the corresponding information only after receiving the user's consent. For example, the user gives the memo recording voice command "Note down for me that the stir-fry at this Xiao Ming's Home Cooking restaurant is delicious". Because the memo content carries little information, the terminal may prompt the user, by voice or on screen, to take a photo of the restaurant's signboard, thereby obtaining visual extended content and enriching the memo information. Optionally, the terminal may also be configured to turn on the sensors automatically in response to a user instruction.
Optionally, the extended content may also include location information and the like. For example, when the user gives the memo recording voice command "Note down for me that the stir-fry at this Xiao Ming's Home Cooking restaurant is delicious", the terminal obtains memo content in the text modality. To enrich the memo information, while prompting the user to photograph the restaurant to obtain visual extended content, the terminal may also obtain the location coordinates of "Xiao Ming's Home Cooking" and the user's route to and from it as extended content.
Further, the terminal stores the memo content, the key information, and the extended content in association with one another. Because the extended content may be multimodal, the terminal may again build semi-structured data in the form of key-value pairs to store the extended content associatively. Table 2 illustrates, by way of example, how the terminal performs this associative storage.
Table 2
Correspondingly, when it is determined based on the key information that the memo reminder triggering condition is met, the memo reminder is performed based on the memo content and the extended content. The manner of the memo reminder is the same as in the above embodiments and is not repeated here.
In practice, web links such as product links in shopping applications or links to official-account content are time-sensitive. For example, once a merchant delists a product, the product link the user stored as memo content becomes invalid, so in later queries the user cannot obtain the product information from the memo content, which harms the user experience. In the embodiments of the present application, the terminal can enrich the information related to the memo content by obtaining extended content, so as to avoid queries that return nothing.
In one possible implementation, when the memo content is time-sensitive, the terminal obtains extended content corresponding to the memo content. The terminal first extracts information from the obtained memo content. When it determines that the content modality of the memo content is time-sensitive, for example when the memo content is a URL or an online product, the terminal obtains the corresponding extended content based on the key information.
Illustratively, as shown in FIG. 11, in the case where the memo reminder is based on a memo recording voice command, the terminal, in response to the user's wake-up utterance, obtains the user's memo recording voice command "Help me note down this dress". Through NLP processing and similar means, the terminal determines that the memo content indicated by the voice command is the product information the user is currently browsing. The terminal then obtains the current product link and captures the current product picture with a screenshot tool or, if the picture can be saved, obtains the product picture from the page, and also obtains the product's introductory text "New spring and autumn dress". The product link, product picture, and description text are all memo content obtained by the terminal.
Further, when it is determined based on the key information that the memo reminder triggering condition is met and the memo content is still valid, the terminal performs the memo reminder based on the memo content. The manner of the memo reminder is the same as in the above embodiments and is not repeated here.
Correspondingly, when it is determined based on the key information that the memo reminder triggering condition is met but the memo content has become invalid, the terminal performs the memo reminder based on the extended content. If the memo reminder can be accomplished with the extended content alone, the terminal returns the extended content, or a processed version of it, as the reminder. For example, when the user inputs the reminder instruction "Show me the dresses I saved" and the corresponding online product link stored as memo content has become invalid, the terminal, which has stored the picture corresponding to that link as extended content, can return the picture to the user to complete the memo reminder.
可选的,在基于扩展内容无法实现备忘提醒的情况下,终端可以基于扩展内容中的网页截图、快照等,进行在线搜索,并向用户反馈在线搜索得到的相似结果完成备忘提醒,避免在出现用户针对备忘内容进行检索而无法获得检索结果的情况,提升人机交互体验。Optionally, when the memo reminder cannot be realized based on the extended content, the terminal can perform an online search based on the web page screenshots, snapshots, etc. in the extended content, and feedback similar results obtained from the online search to the user to complete the memo reminder, thereby avoiding the situation where the user searches for the memo content but cannot obtain the search results, thereby improving the human-computer interaction experience.
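The fallback order described above (valid memo content first, then extended content, then an online search seeded by the extended content) can be illustrated with a minimal sketch. The names `Memo`, `choose_reminder`, and `search_similar`, and the two boolean flags, are hypothetical and are not taken from the application; how validity and sufficiency are actually judged is left open here.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Memo:
    content: str                                        # e.g. an online product link
    extended: List[str] = field(default_factory=list)   # e.g. saved screenshots or snapshots


def choose_reminder(
    memo: Memo,
    content_valid: bool,
    extended_sufficient: bool,
    search_similar: Callable[[List[str]], List[str]],
) -> Tuple[str, List[str]]:
    """Pick what to show once the trigger condition is already met.

    Returns (source, payload), where source is "memo", "extended" or "search".
    """
    if content_valid:
        # The original memo content still works: remind with it directly.
        return "memo", [memo.content]
    if extended_sufficient:
        # The memo content has expired, but the stored extended content is enough.
        return "extended", list(memo.extended)
    # Otherwise search online using the extended content as the query (assumed helper).
    return "search", search_similar(memo.extended)
```

A caller would supply its own validity check (for example, whether a link still resolves) and its own search helper; both are assumptions of this sketch.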
When memo reminders are performed on multimodal memo content, the data of different modalities is heterogeneous. As a result, when semi-structured data is built from multimodal memo content for associated storage, the data takes complex and variable forms, which reduces input and output efficiency during the memo reminder process. In the embodiments of this application, the terminal can use multimodal fusion to give memo content of different content modalities a unified representation, improving memo reminder efficiency.
In the input stage of the memo content, the terminal performs vectorized encoding on the key information to obtain a key information vector. The vectorized encoding may use Multimodality Fusion Technology (MFT) from deep learning: using Deep Neural Networks (DNN), multimodal pre-trained models, and similar techniques, the terminal converts the multimodal key information in the semi-structured data into vectors in a high-dimensional space. This gives the key information a unified representation and makes it convenient for the user to obtain memo content through passive triggering, that is, through a reminder instruction.
Further, once the key information vector is obtained, the terminal stores the memo content and the key information vector in association. The terminal may store the key information vectors as key-value pairs to obtain a key information dictionary, in which the key may be a feature vector and the value is the key information vector obtained through vectorized encoding.
Correspondingly, in the memo reminder stage, when the terminal receives a reminder instruction, it performs vectorized encoding on the reminder instruction to obtain an instruction vector.
In one possible implementation, when a reminder instruction is received, the terminal performs information extraction on the reminder instruction based on the content modality corresponding to the reminder instruction, obtaining instruction key information. The terminal extracts information from the reminder instruction in the same way as it extracts information from memo content, which is not repeated here. Once the instruction key information is obtained, the terminal performs vectorized encoding on it to obtain the instruction vector. The vectorized encoding is performed in the same way as in the foregoing embodiments and is not repeated here.
Further, when the vector similarity between the instruction vector and a key information vector is greater than a threshold, the terminal determines that the memo reminder trigger condition is met and performs the memo reminder based on the memo content. Once the instruction vector is determined, the terminal may compare it with the key information vectors in the key information dictionary, that is, compute the cosine distance between the instruction vector and each key information vector in the high-dimensional space. When the vector similarity characterized by the cosine distance is greater than the vector similarity threshold, the terminal determines that the memo reminder trigger condition is met and returns the corresponding memo content to the user.
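The matching step can be sketched as follows. This is only an illustration under stated assumptions: the vectors are toy two-dimensional values rather than the output of a multimodal encoder, the store is a plain list of (key information vector, memo content) pairs, and the threshold value 0.8 and the names `cosine_similarity` and `match_memos` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_memos(
    instruction_vec: Sequence[float],
    store: List[Tuple[Sequence[float], str]],
    threshold: float = 0.8,  # assumed value; the application does not specify one
) -> List[str]:
    """Return memo contents whose similarity to the instruction exceeds the threshold."""
    scored = [(cosine_similarity(instruction_vec, vec), memo) for vec, memo in store]
    hits = [(score, memo) for score, memo in scored if score > threshold]
    hits.sort(key=lambda pair: pair[0], reverse=True)  # most similar first
    return [memo for _, memo in hits]


store = [
    ([1.0, 0.0], "keys are in the desk drawer"),
    ([0.0, 1.0], "flight departs at 9 am"),
]
print(match_memos([0.9, 0.1], store))
```

With real encoders the store would more likely be a vector index than a list, but the threshold comparison would have the same shape.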
It should be noted that, after a passive trigger has been completed based on the memo content, if the user considers that the memo content no longer needs to be kept, the terminal deletes the memo content in response to a memo deletion instruction. For example, the user performs a passive trigger with the reminder instruction "Where did I put my keys?", and the terminal, based on the key information, replies "They are in the desk drawer". If the user considers that the memo content can be deleted, the user says "Delete this memo" to instruct the terminal to delete it.
Optionally, the terminal may also issue a deletion reminder based on the time-validity information in the key information, prompting the user to delete expired information and save storage space. For example, for flight information stored as memo content, once the flight time has passed, the terminal notifies the user that the flight has expired and proactively prompts the user to delete the memo content.
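A minimal sketch of the expiry check behind such a deletion reminder is shown below. It assumes each memo's key information carries an optional expiry time; the field name `expires_at` and the function name `memos_to_prompt_for_deletion` are hypothetical.

```python
from datetime import datetime
from typing import Dict, List


def memos_to_prompt_for_deletion(memos: List[Dict], now: datetime) -> List[Dict]:
    """Return memos whose (assumed) 'expires_at' time has already passed.

    Memos without time-validity information are never proposed for deletion.
    """
    return [
        memo
        for memo in memos
        if memo.get("expires_at") is not None and memo["expires_at"] < now
    ]
```

The terminal would only prompt for these memos; actual deletion would still wait for the user's confirmation.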
Optionally, because the embodiments of this application assign different trigger mode information to memo content by determining its intent type, memo content whose trigger mode information indicates active triggering often loses its storage value once the terminal has proactively brought it up and completed the memo reminder. Accordingly, the terminal may issue a deletion reminder to the user after completing the memo reminder, and delete the memo content in response to the user's positive feedback.
It should be noted that the information (including but not limited to user device information and user personal information), data (including but not limited to data used for analysis, stored data, and displayed data), and signals involved in this application are all authorized by the user or fully authorized by all parties, and the collection, use, and processing of the relevant data must comply with the relevant laws, regulations, and standards of the relevant countries and regions. For example, the voice, itinerary information, geographic location, and so on involved in this application are all obtained with full authorization.
Refer to FIG. 12, which shows a structural block diagram of a memo reminder apparatus provided by an exemplary embodiment of this application. The apparatus includes:
an acquisition module 1201, configured to acquire memo content when a memo recording requirement exists;
an information extraction module 1202, configured to perform information extraction on the memo content based on the content modality corresponding to the memo content to obtain key information, where information extraction is performed in different ways for different content modalities;
a storage module 1203, configured to store the memo content and the key information in association; and
a memo reminder module 1204, configured to perform a memo reminder based on the memo content when it is determined based on the key information that a memo reminder trigger condition is met.
Optionally, the information extraction module 1202 is further configured to:
when the content modality corresponding to the memo content includes a text modality, perform natural language processing on the memo content to obtain text key information;
when the content modality corresponding to the memo content includes a visual modality, perform image recognition processing on the memo content to obtain image key information; and
when the content modality corresponding to the memo content includes an auditory modality, perform audio recognition processing on the memo content to obtain audio key information.
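The per-modality dispatch performed by the information extraction module can be sketched as a lookup from modality name to extractor. This is an illustrative assumption about structure only: the modality labels, the `extractors` mapping, and the function name `extract_key_info` are hypothetical, and the real extractors (NLP, image recognition, audio recognition) are represented by whatever callables the caller registers.

```python
from typing import Any, Callable, Dict, Iterable

Extractor = Callable[[Any], Dict[str, Any]]


def extract_key_info(
    content: Any,
    modalities: Iterable[str],
    extractors: Dict[str, Extractor],
) -> Dict[str, Dict[str, Any]]:
    """Run one extractor per content modality and collect the results.

    A memo may involve several modalities, so each one present is processed.
    """
    key_info: Dict[str, Dict[str, Any]] = {}
    for modality in modalities:
        extractor = extractors.get(modality)
        if extractor is None:
            raise ValueError(f"no extractor registered for modality {modality!r}")
        key_info[modality] = extractor(content)
    return key_info
```

Registering extractors in a mapping keeps the dispatch independent of how each modality is actually processed, which matches the statement that extraction differs by modality.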
Optionally, when performing natural language processing on the memo content to obtain the text key information, the information extraction module 1202 is further configured to:
perform named entity recognition on the memo content to obtain entity information;
perform entity relationship extraction on the memo content to obtain entity relationship information;
perform topic summary extraction on the memo content to obtain topic information;
perform text intent recognition on the memo content to obtain an intent type, where the intent type characterizes the intent behind the memo record; and
perform causal inference analysis on the memo content to obtain trigger mode information, where the trigger modes include active triggering and passive triggering, and when the trigger mode indicated by the trigger mode information is active triggering, the trigger mode information contains an active trigger condition.
Optionally, when the key information includes the text key information and the text key information includes the trigger mode information, the memo reminder module 1204 is further configured to:
when the trigger mode indicated by the trigger mode information is active triggering and the active trigger condition is met, perform an active memo reminder based on the memo content; and
when the trigger mode indicated by the trigger mode information is passive triggering and the key information matches a reminder instruction, perform a passive memo reminder based on the memo content.
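The two trigger paths can be sketched as a single decision function. The representation below is an assumption made for illustration: `TriggerInfo`, the `"active"`/`"passive"` labels, and modeling the active trigger condition as a predicate over a context dictionary are all hypothetical, and whether the key information matches a reminder instruction is passed in as a precomputed flag.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class TriggerInfo:
    mode: str  # "active" or "passive" (assumed labels)
    # Present only for active triggering, e.g. a time or location check.
    active_condition: Optional[Callable[[Dict[str, Any]], bool]] = None


def should_remind(
    trigger: TriggerInfo,
    context: Dict[str, Any],
    instruction_matches: bool = False,
) -> bool:
    """Decide whether a memo reminder should fire for one memo."""
    if trigger.mode == "active":
        # Active: the terminal reminds on its own once the stored condition holds.
        if trigger.active_condition is None:
            return False
        return bool(trigger.active_condition(context))
    if trigger.mode == "passive":
        # Passive: remind only when a user's reminder instruction matches the key information.
        return instruction_matches
    raise ValueError(f"unknown trigger mode: {trigger.mode!r}")
```

For example, a memo such as "remind me to take my medicine at 9" might be stored with an active condition on the current hour, while "my keys are in the desk drawer" would be stored as passive.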
Optionally, when performing image recognition processing on the memo content to obtain the image key information, the information extraction module 1202 is further configured to:
when the memo content is a picture, perform optical character recognition on the picture to obtain picture text; and/or perform image natural language description processing on the picture to obtain picture description text;
when the memo content is a video, perform video understanding on the video to obtain video description text; and
perform natural language processing on at least one of the picture text, the picture description text, and the video description text to obtain the text key information.
Optionally, when performing audio recognition and extraction on the memo content to obtain the audio key information, the information extraction module 1202 is further configured to:
perform automatic speech recognition on the memo content to obtain audio text; and/or perform audio feature extraction on the memo content to obtain an audio fingerprint; and
perform natural language processing on the audio text to obtain the text key information.
Optionally, the acquisition module 1201 is further configured to:
acquire extended content corresponding to the memo content;
and the storage module 1203 is further configured to:
store the memo content, the key information, and the extended content in association.
Optionally, the acquisition module 1201 is further configured to:
when the content modality of the memo content is the text modality and the information amount of the key information is less than an information amount threshold, acquire at least one of auditory extended content or visual extended content;
and the memo reminder module 1204 is further configured to:
when it is determined based on the key information that the memo reminder trigger condition is met, perform the memo reminder based on the memo content and the extended content.
Optionally, the acquisition module 1201 is further configured to:
when the memo content is time-limited, acquire the extended content corresponding to the memo content;
and the memo reminder module 1204 is further configured to:
when it is determined based on the key information that the memo reminder trigger condition is met and the memo content is valid, perform the memo reminder based on the memo content; and
when it is determined based on the key information that the memo reminder trigger condition is met and the memo content has become invalid, perform the memo reminder based on the extended content.
Optionally, the acquisition module 1201 is further configured to:
when a memo recording requirement exists, acquire spatiotemporal information, where the spatiotemporal information characterizes the time and spatial state at the moment the memo is recorded;
the storage module 1203 is further configured to:
store the memo content, the key information, and the spatiotemporal information in association;
and the memo reminder module 1204 is further configured to:
when it is determined based on the key information and the spatiotemporal information that the memo reminder trigger condition is met, perform the memo reminder based on the memo content.
Optionally, the apparatus further includes an encoding module, configured to perform vectorized encoding on the key information to obtain a key information vector;
the storage module 1203 is further configured to:
store the memo content and the key information vector in association;
the encoding module is further configured to:
when a reminder instruction is received, perform vectorized encoding on the reminder instruction to obtain an instruction vector;
and the memo reminder module 1204 is further configured to:
when the vector similarity between the instruction vector and the key information vector is greater than a threshold, determine that the memo reminder trigger condition is met, and perform the memo reminder based on the memo content.
Optionally, when vectorized encoding is performed on the reminder instruction to obtain the instruction vector, the information extraction module 1202 is further configured to:
when the reminder instruction is received, perform information extraction on the reminder instruction based on the content modality corresponding to the reminder instruction to obtain instruction key information;
and the encoding module is further configured to:
perform vectorized encoding on the instruction key information to obtain the instruction vector.
Optionally, the acquisition module 1201 is further configured to:
when a memo-recording voice instruction is received, acquire the memo content indicated by the memo-recording voice instruction; or
when user behavior information is acquired and the user behavior information satisfies a memo condition, determine the memo content based on the user behavior information.
Optionally, the apparatus further includes a deletion module, configured to delete the memo content in response to a memo deletion instruction.
In summary, in the embodiments of this application, the terminal uses the acquisition module and a variety of sensors to acquire multimodal, multidimensional information (text, visual, auditory, situational, spatiotemporal, and so on) that makes up the memo content, which makes the memo content richer while remaining compatible with multimodal data. To process the memo content, the embodiments perform comprehensive information extraction through the information extraction module and associated storage through the storage module. Having determined the intent type of the memo content, the terminal automatically decides how the memo reminder is to be triggered. For passively triggered memo content, the user can likewise trigger the memo reminder through multimodal input when issuing a reminder instruction. By extending support to multimodal input and output, the embodiments improve the way users are helped to remember things and thereby improve the efficiency and quality of human-computer interaction.
An embodiment of this application further provides a computer-readable storage medium storing at least one program, where the at least one program is to be executed by a processor to implement the memo reminder method described in the foregoing embodiments.
An embodiment of this application provides a computer program product or computer program that includes computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the memo reminder method provided in the foregoing embodiments.
Those skilled in the art will appreciate that, in one or more of the foregoing examples, the functions described in the embodiments of this application may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored in a computer-readable medium or transmitted as one or more instructions or code on a computer-readable medium. Computer-readable media include computer storage media and communication media, where communication media include any medium that facilitates the transfer of a computer program from one place to another. A storage medium may be any available medium that a general-purpose or special-purpose computer can access.
The foregoing describes only optional embodiments of this application and is not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of this application shall fall within its scope of protection.

Claims (18)

  1. A memo reminder method, the method being performed by a terminal and comprising:
    acquiring memo content when a memo recording requirement exists;
    performing information extraction on the memo content based on a content modality corresponding to the memo content to obtain key information, wherein information extraction is performed in different ways for different content modalities;
    storing the memo content and the key information in association; and
    performing a memo reminder based on the memo content when it is determined based on the key information that a memo reminder trigger condition is met.
  2. The method according to claim 1, wherein the performing information extraction on the memo content based on the content modality corresponding to the memo content to obtain key information comprises:
    when the content modality corresponding to the memo content includes a text modality, performing natural language processing on the memo content to obtain text key information;
    when the content modality corresponding to the memo content includes a visual modality, performing image recognition processing on the memo content to obtain image key information; and
    when the content modality corresponding to the memo content includes an auditory modality, performing audio recognition processing on the memo content to obtain audio key information.
  3. The method according to claim 2, wherein the performing natural language processing on the memo content to obtain text key information comprises at least one of the following:
    performing named entity recognition on the memo content to obtain entity information;
    performing entity relationship extraction on the memo content to obtain entity relationship information;
    performing topic summary extraction on the memo content to obtain topic information;
    performing text intent recognition on the memo content to obtain an intent type, wherein the intent type characterizes the intent behind the memo record; and
    performing causal inference analysis on the memo content to obtain trigger mode information, wherein the trigger modes include active triggering and passive triggering, and when the trigger mode indicated by the trigger mode information is active triggering, the trigger mode information contains an active trigger condition.
  4. The method according to claim 3, wherein the key information includes the text key information, and the text key information includes the trigger mode information;
    the performing a memo reminder based on the memo content when it is determined based on the key information that a memo reminder trigger condition is met comprises:
    when the trigger mode indicated by the trigger mode information is active triggering and the active trigger condition is met, performing an active memo reminder based on the memo content; and
    when the trigger mode indicated by the trigger mode information is passive triggering and the key information matches a reminder instruction, performing a passive memo reminder based on the memo content.
  5. The method according to claim 2, wherein the performing image recognition processing on the memo content to obtain image key information comprises:
    when the memo content is a picture, performing optical character recognition on the picture to obtain picture text; and/or performing image natural language description processing on the picture to obtain picture description text; and
    when the memo content is a video, performing video understanding on the video to obtain video description text;
    and the method further comprises:
    performing natural language processing on at least one of the picture text, the picture description text, and the video description text to obtain the text key information.
  6. The method according to claim 2, wherein the performing audio recognition and extraction on the memo content to obtain audio key information comprises:
    performing automatic speech recognition on the memo content to obtain audio text; and/or performing audio feature extraction on the memo content to obtain an audio fingerprint;
    and the method further comprises:
    performing natural language processing on the audio text to obtain the text key information.
  7. The method according to claim 1, wherein after the acquiring memo content, the method further comprises:
    acquiring extended content corresponding to the memo content;
    and the storing the memo content and the key information in association comprises:
    storing the memo content, the key information, and the extended content in association.
  8. The method according to claim 7, wherein the acquiring extended content corresponding to the memo content comprises:
    when the content modality of the memo content is a text modality and the information amount of the key information is less than an information amount threshold, acquiring at least one of auditory extended content or visual extended content;
    and the performing a memo reminder based on the memo content when it is determined based on the key information that a memo reminder trigger condition is met comprises:
    when it is determined based on the key information that the memo reminder trigger condition is met, performing the memo reminder based on the memo content and the extended content.
  9. The method according to claim 7, wherein the acquiring extended content corresponding to the memo content comprises:
    when the memo content is time-limited, acquiring the extended content corresponding to the memo content;
    the performing a memo reminder based on the memo content when it is determined based on the key information that a memo reminder trigger condition is met comprises:
    when it is determined based on the key information that the memo reminder trigger condition is met and the memo content is valid, performing the memo reminder based on the memo content;
    and the method further comprises:
    when it is determined based on the key information that the memo reminder trigger condition is met and the memo content has become invalid, performing the memo reminder based on the extended content.
  10. The method according to claim 1, wherein the method further comprises:
    when a memo recording requirement exists, acquiring spatiotemporal information, wherein the spatiotemporal information characterizes the time and spatial state at the moment the memo is recorded;
    the storing the memo content and the key information in association comprises:
    storing the memo content, the key information, and the spatiotemporal information in association;
    and the performing a memo reminder based on the memo content when it is determined based on the key information that a memo reminder trigger condition is met comprises:
    when it is determined based on the key information and the spatiotemporal information that the memo reminder trigger condition is met, performing the memo reminder based on the memo content.
  11. The method according to claim 1, wherein after the performing information extraction on the memo content based on the content modality corresponding to the memo content to obtain key information, the method further comprises:
    performing vectorized encoding on the key information to obtain a key information vector; and
    storing the memo content and the key information vector in association;
    and the performing a memo reminder based on the memo content when it is determined based on the key information that a memo reminder trigger condition is met comprises:
    when a reminder instruction is received, performing vectorized encoding on the reminder instruction to obtain an instruction vector; and
    when the vector similarity between the instruction vector and the key information vector is greater than a threshold, determining that the memo reminder trigger condition is met, and performing the memo reminder based on the memo content.
  12. The method according to claim 11, wherein the performing vectorized encoding on the reminder instruction to obtain an instruction vector when a reminder instruction is received comprises:
    when the reminder instruction is received, performing information extraction on the reminder instruction based on a content modality corresponding to the reminder instruction to obtain instruction key information; and
    performing vectorized encoding on the instruction key information to obtain the instruction vector.
  13. The method according to claim 1, wherein the acquiring memo content when there is a memo recording requirement comprises:
    upon receiving a memo recording voice instruction, acquiring the memo content indicated by the memo recording voice instruction; or
    upon acquiring user behavior information that satisfies a memo condition, determining the memo content based on the user behavior information.
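
The two acquisition paths of claim 13 can be illustrated with a small sketch. The wake phrases in `VOICE_PREFIXES`, the `BehaviorEvent` type, and the set of memo-worthy actions are hypothetical placeholders; the claim does not specify how a voice instruction is recognized or what the memo condition is.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical phrases that mark an utterance as a memo recording voice instruction.
VOICE_PREFIXES = ("remember that ", "note that ")

# Hypothetical memo condition: only these observed actions are worth recording.
MEMO_WORTHY_ACTIONS = {"park_car", "put_down_object"}


@dataclass
class BehaviorEvent:
    action: str       # machine-readable label of the observed user behavior
    description: str  # human-readable summary of what was observed


def memo_from_voice(utterance: str) -> Optional[str]:
    """Path 1: return the memo content indicated by a voice instruction, if any."""
    text = utterance.strip()
    lowered = text.lower()
    for prefix in VOICE_PREFIXES:
        if lowered.startswith(prefix):
            return text[len(prefix):]
    return None


def memo_from_behavior(event: BehaviorEvent) -> Optional[str]:
    """Path 2: derive memo content from behavior that satisfies the memo condition."""
    if event.action in MEMO_WORTHY_ACTIONS:
        return event.description
    return None
```

Both functions return `None` when no memo should be recorded, so a caller can try the voice path first and fall back to the behavior path.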
  14. The method according to claim 1, wherein, after the performing a memo reminder based on the memo content when it is determined, based on the key information, that a memo reminder trigger condition is met, the method further comprises:
    deleting the memo content in response to a memo deletion instruction.
  15. A memo reminder apparatus, comprising:
    an acquisition module, configured to acquire memo content when there is a memo recording requirement;
    an information extraction module, configured to extract information from the memo content based on a content modality corresponding to the memo content to obtain key information, wherein information is extracted in different manners for different content modalities;
    a storage module, configured to store the memo content and the key information in association with each other; and
    a memo reminder module, configured to perform a memo reminder based on the memo content when it is determined, based on the key information, that a memo reminder trigger condition is met.
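
The four modules of claim 15 map naturally onto four methods of one class, with the modality-specific extraction handled by a dispatch table. The sketch below is illustrative only: the class name `MemoReminderApparatus`, the two placeholder extractors, and the "cue appears in the key information" trigger rule are assumptions, and the image extractor presumes that an upstream recognizer has already turned the image into comma-separated labels.

```python
def extract_text(content):
    """Placeholder text extractor: keep words longer than three characters."""
    words = (w.strip(".,").lower() for w in content.split())
    return [w for w in words if len(w) > 3]


def extract_image(content):
    """Placeholder image extractor: content is assumed to be recognizer labels."""
    return [label.strip().lower() for label in content.split(",") if label.strip()]


class MemoReminderApparatus:
    def __init__(self):
        # Different content modalities use different extraction methods.
        self.extractors = {"text": extract_text, "image": extract_image}
        self.store = []  # list of (memo_content, key_info) pairs

    def acquire(self, content, modality, recording_needed):
        """Acquisition module: only acquire when a recording requirement exists."""
        return (content, modality) if recording_needed else None

    def extract(self, content, modality):
        """Information extraction module: dispatch on the content modality."""
        return self.extractors[modality](content)

    def save(self, content, key_info):
        """Storage module: store content and key information in association."""
        self.store.append((content, key_info))

    def remind(self, cue):
        """Memo reminder module: assumed trigger is the cue appearing in the key info."""
        return [content for content, key_info in self.store if cue.lower() in key_info]


apparatus = MemoReminderApparatus()
content, modality = apparatus.acquire("Umbrella left at the office.", "text", True)
apparatus.save(content, apparatus.extract(content, modality))
print(apparatus.remind("Office"))
```

Keeping the extractors in a dictionary means a new modality such as audio can be supported by registering one more function, without touching the other modules.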
  16. A terminal, comprising a processor and a memory, wherein the memory stores at least one program, and the at least one program is configured to be executed by the processor to implement the memo reminder method according to any one of claims 1 to 14.
  17. A computer-readable storage medium, storing at least one program, wherein the at least one program is configured to be executed by a processor to implement the memo reminder method according to any one of claims 1 to 14.
  18. A computer program product, comprising computer instructions stored in a computer-readable storage medium, wherein a processor of a computer device reads the computer instructions from the computer-readable storage medium and executes the computer instructions, so that the computer device implements the memo reminder method according to any one of claims 1 to 14.
PCT/CN2023/117394 2022-10-12 2023-09-07 Memo reminding method and apparatus, and terminal and storage medium WO2024078210A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211249686.9A CN115526602A (en) 2022-10-12 2022-10-12 Memo reminding method, device, terminal and storage medium
CN202211249686.9 2022-10-12

Publications (1)

Publication Number Publication Date
WO2024078210A1 (en) 2024-04-18

Family

ID=84701934

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/117394 WO2024078210A1 (en) 2022-10-12 2023-09-07 Memo reminding method and apparatus, and terminal and storage medium

Country Status (2)

Country Link
CN (1) CN115526602A (en)
WO (1) WO2024078210A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115526602A (en) * 2022-10-12 2022-12-27 Oppo广东移动通信有限公司 Memo reminding method, device, terminal and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6564263B1 (en) * 1998-12-04 2003-05-13 International Business Machines Corporation Multimedia content description framework
CN101937529A (en) * 2010-07-30 2011-01-05 中山大学 Digital household memorandum reminder
CN106933807A (en) * 2017-03-20 2017-07-07 北京光年无限科技有限公司 Memorandum event-prompting method and system
CN109862190A (en) * 2019-03-26 2019-06-07 努比亚技术有限公司 Control method, device, mobile terminal and the storage medium of terminal message memorandum
CN111260899A (en) * 2020-02-27 2020-06-09 上海萃钛智能科技有限公司 Intelligent visual capture reminding AR device, reminding system and reminding method
CN115526602A (en) * 2022-10-12 2022-12-27 Oppo广东移动通信有限公司 Memo reminding method, device, terminal and storage medium

Also Published As

Publication number Publication date
CN115526602A (en) 2022-12-27

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23876415

Country of ref document: EP

Kind code of ref document: A1