CN113157966B - Display method and device and electronic equipment - Google Patents

Display method and device and electronic equipment

Info

Publication number
CN113157966B
CN113157966B
Authority
CN
China
Prior art keywords
target
message
segment
display
voice message
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110276230.0A
Other languages
Chinese (zh)
Other versions
CN113157966A (en)
Inventor
史振杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Vivo Mobile Communication Co Ltd
Original Assignee
Vivo Mobile Communication Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Vivo Mobile Communication Co Ltd
Priority to CN202110276230.0A
Publication of CN113157966A
Application granted
Publication of CN113157966B
Current legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval of audio data
    • G06F16/68 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/686 Retrieval using information manually generated, e.g. tags, keywords, comments, title or artist information, time, location or usage information, user ratings
    • G06F16/63 Querying
    • G06F16/638 Presentation of query results

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The present application discloses a display method, a display apparatus, and an electronic device, and belongs to the field of communication technologies. The method can solve the problems of cumbersome steps and long time consumption when searching chat information. The method includes: receiving a first input of a user; identifying, in response to the first input, user intention information corresponding to a first message; determining, from a target voice message, a target segment matching the user intention information; and displaying the target segment. The method and apparatus are applicable to conversation scenarios.

Description

Display method and device and electronic equipment
Technical Field
The present application relates to the field of communication technologies, and in particular to a display method, a display apparatus, and an electronic device.
Background
With the development of electronic technology, users can communicate through social applications on electronic devices anytime and anywhere. Social applications currently support two forms of chat message, text and voice, and voice messaging has become an important form of social chat because it conveys the speaker's emotion more faithfully and is convenient to use.
In the related art, when a chat message received by an electronic device is a voice message and the user wants to quickly obtain a required piece of information from it, the user first needs to convert the voice message into text through the electronic device and then search the converted text for the required information. If the voice message is long, both the conversion and the subsequent search can consume a great deal of time.
As a result, the overall process of searching chat information involves cumbersome steps and takes a long time.
Disclosure of Invention
Embodiments of the present application aim to provide a display method, a display apparatus, and an electronic device, which can solve the problems of cumbersome steps and long time consumption when searching chat information.
To solve the above technical problems, the present application is implemented as follows:
In a first aspect, an embodiment of the present application provides a display method, including: receiving a first input of a user; identifying, in response to the first input, user intention information corresponding to a first message; determining, from a target voice message, a target segment matching the user intention information; and displaying the target segment.
In a second aspect, an embodiment of the present application provides a display apparatus, including a receiving module, an identifying module, a determining module, and a display module, where: the receiving module is configured to receive a first input of a user; the identifying module is configured to identify, in response to the first input received by the receiving module, user intention information corresponding to a first message; the determining module is configured to determine, from a target voice message, a target segment matching the user intention information identified by the identifying module; and the display module is configured to display the target segment determined by the determining module.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor, a memory, and a program or instructions stored in the memory and executable on the processor, where the program or instructions, when executed by the processor, implement the steps of the method according to the first aspect.
In a fourth aspect, an embodiment of the present application provides a readable storage medium storing a program or instructions that, when executed by a processor, implement the steps of the method according to the first aspect.
In a fifth aspect, an embodiment of the present application provides a chip, including a processor and a communication interface coupled to the processor, where the processor is configured to run a program or instructions to implement the method according to the first aspect.
In a sixth aspect, embodiments of the present application provide a computer program product stored on a non-volatile storage medium, the program product being executable by at least one processor to implement the method according to the first aspect.
In the embodiments of the present application, the electronic device can receive a first input of a user, identify user intention information corresponding to a first message, determine, from a target voice message, a target segment matching the user intention information, and display the target segment. In this way, when the user wants to obtain required information from a voice message, the electronic device can obtain the user intention information and, according to the user intention, quickly and accurately locate and display the chat information the user needs within a long voice message, which improves the efficiency of searching chat information, simplifies the search steps, and saves search time.
Drawings
FIG. 1 is a flow chart of a display method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of an interface to which a display method according to an embodiment of the present application is applied;
FIG. 3 is a second schematic diagram of an interface applied by a display method according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of a display device according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
fig. 6 is a second schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The following clearly and completely describes the technical solutions in the embodiments of the present application with reference to the accompanying drawings. Apparently, the described embodiments are some rather than all of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort shall fall within the protection scope of the present application.
The terms "first", "second", and the like in the specification and claims are used to distinguish between similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that the data used in this way are interchangeable where appropriate, so that the embodiments of the present application can be implemented in orders other than those illustrated or described herein. Objects distinguished by "first", "second", and the like are usually of one type, and the number of objects is not limited; for example, there may be one or more first objects. In addition, in the specification and claims, "and/or" indicates at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the associated objects.
The display method provided by the embodiment of the application is described in detail below through specific embodiments and application scenes thereof with reference to the accompanying drawings.
The display method provided by the embodiment of the application can be applied to the session scene.
Take a conversation in a chat APP as an example. A plurality of messages are displayed in the conversation interface, among which one is a received voice message A with a duration of 40 s, and the message preceding voice message A is a sent text message B whose content is "Where will we have the meeting?". If the user wants to learn the required location information from voice message A, the user has to long-press the voice message, tap the displayed "convert to text" option to convert it into a text message, and then find the required location information (for example, "conference room 503") in that text. The steps for viewing key information in a message are therefore cumbersome and time-consuming.
In the embodiments of the present application, if the user wants to learn the location information from voice message A, the electronic device can, in response to a first input of the user, identify that the user intention information corresponding to text message B is "query location", determine from voice message A a message segment matching the "query location" user intention, and display that segment to the user. In this way, the electronic device can quickly and accurately locate and display the chat information the user needs within a long voice message according to the user's actual requirements, which improves the efficiency of searching chat information, simplifies the search steps, and saves search time.
An embodiment of the present application provides a display method that can be applied to an electronic device. Fig. 1 is a flowchart of the display method provided by an embodiment of the present application. As shown in Fig. 1, the display method may include the following steps 101 to 104:
step 101: the display device receives a first input from a user.
In the embodiment of the present application, the first input is used to trigger the electronic device to execute the identification operation on the first message.
Optionally, in the embodiments of the present application, the first input may be the user's input on the first message.
Illustratively, the first input may be a touch input of the user or another feasible input, which is not limited in the embodiments of the present application. Further, the first input may be a click input, a slide input, a press input, or the like. The click input may consist of any number of clicks, and the slide input may be a slide in any direction, for example upward, downward, leftward, or rightward, which is not limited in the embodiments of the present application.
Step 102: the display device responds to the first input and identifies user intention information corresponding to the first message.
In the embodiment of the present application, the first message may be a session message. The first message may be a session message in a session interface (e.g., a session window) of the target application. Further, the target application may be a social APP, a shopping APP, an office APP, or an entertainment APP, which is not limited in the embodiment of the present application.
The first message may be a voice message or a text message.
In the embodiment of the application, the user intention information is used for representing the user intention corresponding to the message content of the first message. For example, the message content of the first message may correspond to at least one user intention.
Example 1: assume that the message content of the first message is "Where are you having the meeting?"; the user intention information corresponding to the first message is then "query location".
Example 2: assume that the message content of the first message is "Where and at what time is your meeting?"; the user intention information corresponding to the first message is then "query time and location".
Example 3: assume that the message content of the first message is "Who are you going out to eat with tomorrow?"; the user intention information corresponding to the first message is then "query person".
For example, the display device may determine the intention type of the user intention represented by the user intention information according to the information-type attribute corresponding to the user intention information. Further, the information-type attribute may include at least one of: time, location, person, event, and the like.
For example, in connection with Example 2 above, the user intention information is "query time and location", so the information-type attributes are "time" and "location", and the intention types characterizing the user intention are time and location; in connection with Example 3 above, the user intention information is "query person", the information-type attribute is "person", and the intention type characterizing the user intention is person.
Optionally, in the embodiment of the present application, the display device may perform semantic analysis on the first message to obtain user intention information corresponding to the first message.
Step 103: the display device determines a target segment from the target voice message that matches the user intent information.
In the embodiment of the present application, the target voice message may be a session message. The target voice message may be a session message in a session interface (e.g., a session window) of the target application. Further, the target application may be a social APP, a shopping APP, an office APP, or an entertainment APP, which is not limited in the embodiment of the present application.
In the embodiments of the present application, the target voice message may include at least one segment.
It should be noted that the above segments are message segments in the target voice message.
In the embodiments of the present application, the display device can select, from the message segments of the target voice message, one or more message segments matching the user intention information, based on the user intention indicated by the user intention information corresponding to the first message.
Step 104: the display device displays the target segment.
In the embodiment of the application, the display device may display the target segment in the target area. The target area is, for example, a display area of an application interface where the target voice message is located, or the target area is a new display area, for example, a floating window.
For example, the display device may display the target segment in a predetermined style. Further, the predetermined style includes any one of the following: floating display, highlighted display, dynamic display, and the like.
For example, the display device may display the target segment at a predetermined position. Further, the predetermined position may be a display area around the target voice message. For example, the target segment is displayed in a region below the message region of the target voice message.
Take the case where the target voice message is a session message as an example. As shown in (A) of Fig. 2, a voice message 1 and a text message 2 are displayed in session interface A, and the content of text message 2 is "Where are you having the meeting?". After the user drags voice message 1 onto text message 2, as shown in (B) of Fig. 2, the display device displays, below voice message 1 in session interface A, the text corresponding to the message segment in voice message 1 that matches the user intention information of text message 2, namely "meeting room 504, Tower B".
In this way, the display device can intuitively present the key information the user needs in the voice message, improving the human-computer interaction effect.
In the display method provided by the embodiments of the present application, the electronic device can receive a first input of the user, identify user intention information corresponding to a first message, determine, from a target voice message, a target segment matching the user intention information, and display the target segment. In this way, when the user wants to obtain required information from a voice message, the user intention information can be obtained and, according to the user intention, the chat information the user needs can be quickly and accurately located within a long voice message and displayed, which improves the efficiency of searching chat information, simplifies the search steps, and saves search time.
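To make the flow concrete, the following Python sketch strings steps 101 to 104 together on toy data. It is illustrative only: every function, data structure, and cue table in it is an assumption made for the example, not the patent's implementation.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    keyword: str    # key information carried by this segment (assumed precomputed)
    text: str       # transcribed text of the segment
    start_s: float  # start offset in the audio, in seconds
    end_s: float    # end offset in the audio, in seconds

# Toy cue-word lexicon standing in for real semantic analysis (step 102).
CUES = {"where": "location", "when": "time", "who": "person"}

def identify_intent(first_message: str) -> str:
    """Map the first message to an information-type attribute."""
    msg = first_message.lower()
    return next((attr for cue, attr in CUES.items() if cue in msg), "event")

def keyword_type(keyword: str) -> str:
    """Hypothetical feature-type lookup for a segment's keyword."""
    return "location" if "room" in keyword else "time"

def find_target_segment(segments, intent):
    """Step 103: return the first segment whose keyword type matches the intent."""
    return next((s for s in segments if keyword_type(s.keyword) == intent), None)

# Steps 101-104 on toy data: receive the input, identify the intent, match, display.
segments = [
    Segment("9:00 tomorrow", "at 9:00 tomorrow", 0.0, 2.1),
    Segment("meeting room 504, Tower B", "in meeting room 504, Tower B", 2.1, 4.8),
]
target = find_target_segment(segments, identify_intent("Where are we having the meeting?"))
print(target.text if target else "no match")  # -> in meeting room 504, Tower B
```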
Optionally, in the embodiment of the present application, the display device may also receive the first input of the user, and directly acquire and display all the segments in the target voice message, so as to intuitively display all the key information in the target voice message to the user.
Optionally, in an embodiment of the present application, the process of step 104 may include the following step 104a:
step 104a: the display device displays the segment identification corresponding to the target segment, and plays the target segment, or displays the text information corresponding to the target segment.
In one example, in a case where a target segment matching user intention information is determined from a target voice message, the display apparatus may play only the target segment without displaying text information corresponding to the target segment.
In another example, in a case where a target segment matching the user intention information is determined from the target voice message, the display apparatus may both play the target segment and display the text information corresponding to it.
In still another example, in a case where a target segment matching the user intention information is determined from the target voice message, the display device may not play the target segment and may display only text information corresponding to the target segment.
Further optionally, in the embodiments of the present application, in a case where the display device displays the text information corresponding to the target segment, the display device may receive a second input of the user on that text information and perform a target operation. Further, the target operation includes any one of the following:
(1) intercepting the voice segment corresponding to the target segment and displaying a segment identifier of the voice segment;
(2) using the text content corresponding to the text information as a search keyword in the electronic device or a server, and performing a search operation;
(3) performing an editing operation (for example, copying) on the text information.
In this way, the user can conveniently perform subsequent operations on the key information, which improves human-computer interaction performance.
Optionally, in an embodiment of the present application, the first input is an input for dragging the target voice message to the first message.
For example, the process of identifying the user intention information corresponding to the first message in the step 102 may include the following steps 102a and 102b:
step 102a: and the display device performs semantic analysis on the first message to obtain a semantic analysis result.
Step 102b: and the display device generates user intention information corresponding to the first message according to the semantic analysis result.
In one example, a user may trigger a display device to hover display a target voice message by a pressing operation (e.g., a long press) of the target voice message, and then drag the target voice message to a first message.
In another example, the user may trigger the display device to copy the target voice message by a pressing operation on the target voice message, hover-display the copied target voice message, and then drag the copied target voice message to the first message.
It should be noted that dragging the target voice message to the first message means dragging the target voice message onto the first message itself, or dragging it into a first area, where the first area may be a display area of a predetermined size corresponding to the first message.
It can be understood that dragging the target voice message onto the first message means that the overlap between the target voice message and the first message exceeds a threshold.
For example, the display device may perform semantic analysis on the first message to obtain semantic analysis information (i.e., the semantic analysis result) corresponding to the first message, and then generate the user intention information corresponding to the first message according to the word features of the word units indicated by the semantic analysis information and the sentence features of those word units in the first message. For example, if the message content of the first message is "Where are you having the meeting?", the user intention information "query meeting location" can be generated based on the word features and sentence features in the semantic analysis information of the message.
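A minimal sketch of steps 102a and 102b, assuming the semantic analysis result takes the form of (word, part-of-speech) pairs; the tag names and the cue table are assumptions made for illustration:

```python
# Hypothetical interrogative-word table: cue word -> information-type attribute.
INTERROGATIVE_ATTRIBUTES = {"where": "location", "when": "time", "who": "person"}

def intent_from_parse(parsed):
    """Combine the interrogative word's attribute with the topic noun, e.g.
    [("where", "ADV"), ..., ("meeting", "NOUN")] -> "query meeting location"."""
    attribute = next((INTERROGATIVE_ATTRIBUTES[w] for w, _ in parsed
                      if w in INTERROGATIVE_ATTRIBUTES), None)
    topic = next((w for w, pos in parsed if pos == "NOUN"), "")
    if attribute is None:
        return "unknown"
    return f"query {topic} {attribute}".replace("  ", " ").strip()

print(intent_from_parse([("where", "ADV"), ("are", "VERB"), ("you", "PRON"),
                         ("having", "VERB"), ("the", "DET"), ("meeting", "NOUN")]))
# -> query meeting location
```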
Optionally, in the embodiment of the present application, the semantic information corresponding to the target voice message includes N keywords, where N is a positive integer, and each keyword corresponds to one message segment in the target voice message.
Illustratively, the step 103 may include the following steps 103a and 103b:
step 103a: and determining a target keyword matched with the user intention information from the N keywords.
Step 103b: and determining the segment corresponding to the target keyword in the target voice message as a target segment.
For example, the display device may split the target voice message into a plurality of message segments according to the message content of the target voice message; in particular, the splitting may preferentially follow pauses in the target voice message.
Further, the display device may split the target voice message into the message segments corresponding to each keyword according to at least one keyword in the semantic information corresponding to the target voice message.
Illustratively, the N keywords may be words with particular part-of-speech features (e.g., nouns) in the target voice message. For example, the N keywords included in the semantic information corresponding to the target voice message may include any of the following: person names, places, organizations, events, and the like.
It should be noted that, in general, words with particular part-of-speech features (such as nouns) in a message are more likely to carry its key information. Therefore, the keywords included in the semantic information corresponding to the target voice message can be regarded as the key information of the target voice message.
Optionally, in the embodiments of the present application, the display device may perform semantic analysis on the target voice message to obtain an analysis result (i.e., the above semantic information), and then determine the N keywords based on the analysis result.
For example, the display device may directly perform semantic analysis on the target voice message, or perform semantic analysis on a text message corresponding to the target voice message.
For example, the semantic parsing may include word segmentation, or named entity recognition.
In one example, where the semantic analysis is word segmentation, the display device may segment the text corresponding to the target voice message by a character-matching or comprehension-based method to obtain the word features of the word units in the text (i.e., the above semantic information), and then extract the keywords in the text based on those word features.
In another example, where the semantic analysis is named-entity recognition, the display device may obtain the sentence features of the target voice message (e.g., the part-of-speech features of the preceding and following words, sentence-position features, and dependency features), and then extract the named entities (i.e., the keywords) from the target voice message based on those sentence features (i.e., the above semantic information).
For example, assuming that the message content of the target voice message includes "we will have a meeting at 9:00 tomorrow in meeting room 504, Tower B", the display device may obtain, by either of the two approaches above, the N keywords corresponding to the target voice message, which are "9:00 tomorrow", "meeting room 504, Tower B", and "meeting".
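As one concrete realization of the named-entity route, the sketch below uses spaCy as an off-the-shelf recognizer; the library choice and the example entity labels are assumptions for illustration, not the patent's implementation.

```python
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes the small English model is installed

def extract_keywords(transcript: str):
    """Return (keyword, feature_type) pairs extracted from the transcript."""
    doc = nlp(transcript)
    return [(ent.text, ent.label_) for ent in doc.ents]

keywords = extract_keywords(
    "We will have a meeting at 9:00 tomorrow in meeting room 504, Tower B.")
print(keywords)  # e.g. [('9:00', 'TIME'), ('tomorrow', 'DATE'), ('504', 'CARDINAL'), ...]
```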
For example, the display device may select, from the N keywords corresponding to the target voice message, one or more target keywords matching the user intention information, based on the user intention indicated by the user intention information corresponding to the first message.
For example, the display device may input the user intention information and the N keywords into a target neural network model to obtain the target keyword that best matches the user intention information. Further, the target neural network model is trained on a sample set whose inputs are a large number of pieces of intention information, each paired with its corresponding keywords, and whose expected output is the keyword that best matches each piece of intention information.
For example, assume that the semantic information of voice message A (i.e., the target voice message) is "going on a business trip to Beijing tomorrow" and that its keywords are "tomorrow" and "Beijing". If the user intention information is "query location", the target keyword matching the user intention information is determined from these keywords to be "Beijing".
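In place of the trained matching model, a minimal rule-based stand-in can convey the idea: map each keyword's feature type to an intention attribute and pick the first keyword whose type matches the user intention. The label table below is an assumption for the example, not part of the patent.

```python
# Hypothetical mapping from NER feature types to intention attributes.
LABEL_TO_ATTRIBUTE = {
    "GPE": "location", "LOC": "location", "FAC": "location",
    "DATE": "time", "TIME": "time",
    "PERSON": "person",
}

def match_keyword(keywords, intent_attribute):
    """keywords: (text, feature_type) pairs; returns the best-matching text or None."""
    return next((text for text, label in keywords
                 if LABEL_TO_ATTRIBUTE.get(label) == intent_attribute), None)

print(match_keyword([("tomorrow", "DATE"), ("Beijing", "GPE")], "location"))  # -> Beijing
```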
For example, in the case of determining the target keyword, the display device may determine, from the message segments of the target voice message, the segment corresponding to the target keyword based on the correspondence between the keyword and the segment of the target voice message, and take the segment as the target segment.
Further alternatively, in an embodiment of the present application, the process of step 103 may include the following step A1:
step A1: the display device extracts a target segment containing a target keyword from the target voice message.
For example, the target segment may be the voice segment of the target voice message that contains the target keyword.
For example, in the case of determining the target keyword from the target voice message, the display device may match the target keyword against the message content of the target voice message, determine the message segment that contains the target keyword, and then extract that message segment from the target voice message as the target segment.
For example, when a plurality of target keywords are determined from the target voice message, each target keyword may correspond to one target message segment, or several target keywords may correspond to a single target message segment.
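A minimal sketch of step A1, assuming the speech recognizer supplies word-level timestamps (an assumption; the patent does not prescribe how segments are located in the audio):

```python
def extract_segment(words, target_keyword, window=3):
    """words: list of (word, start_s, end_s) tuples in transcript order.
    Returns the (start_s, end_s) audio span of a small window around the
    keyword as the target segment, or None if the keyword is absent."""
    for i, (word, _, _) in enumerate(words):
        if target_keyword in word:
            lo = max(0, i - window)
            hi = min(len(words) - 1, i + window)
            return words[lo][1], words[hi][2]
    return None

words = [("we", 0.0, 0.2), ("meet", 0.2, 0.5), ("in", 0.5, 0.6),
         ("room-504", 0.6, 1.1), ("tomorrow", 1.1, 1.6)]
print(extract_segment(words, "504"))  # -> (0.0, 1.6)
```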
Further alternatively, in the embodiment of the present application, the step A1 may be replaced by the following step A2:
step A2: the display device generates a target segment containing the target keyword.
For example, the display device may automatically generate the target segment, through a neural network algorithm, based on the target keywords extracted from the target voice message. For example, if the target keywords are "we", "meeting room 504", and "meeting", the generated target segment may be "we are meeting in meeting room 504".
It can be understood that, when the user intention information is a location query, the target segment may be "meeting room 504" or "meeting in meeting room 504"; that is, target segments containing the location keyword may be selected at different lengths. As illustrated above, displaying "meeting room 504" is more precise, while displaying "meeting in meeting room 504" is more complete.
Optionally, in an embodiment of the present application, the target segment includes a first target segment and a second target segment, where a feature type of a first keyword corresponding to the first target segment is different from a feature type of a second keyword corresponding to the second target segment.
For example, the process of step 104 may include the following step 104a1:
step 104a1: sequentially playing the first target segment and the second target segment; or displaying the first target segment in a first display mode and displaying the second target segment in a second display mode.
By way of example, the feature type corresponding to a keyword may include any of the following: person name, place name, time expression, organization name, event name, and the like, and can be determined according to actual requirements; the embodiments of the present application impose no limitation on this.
For example, when the target voice message contains a plurality of target segments matching the user intention information and the keyword types (i.e., the feature types) of their corresponding keywords differ, the display device may play the target segments sequentially at predetermined intervals; for example, after the first target segment is played, the device pauses for 2 seconds before playing the second target segment, so that the user can distinguish the two target segments.
The first display mode and the second display mode are different from each other; further, the two display modes may differ in color, brightness, or transparency.
For example, when the target voice message contains a plurality of target segments matching the user intention information and the keyword types (i.e., the feature types) of their corresponding keywords differ, the display device may display each target segment in a different display mode, so as to display target segments of different keyword types distinguishably.
Further, the display device may preset the display mode corresponding to each keyword type and display each target segment in the display mode corresponding to its keyword type. For example, assuming the display modes preset for keywords of the "person name" type and the "place name" type are red text and green text respectively, the display device displays the target segment corresponding to a "person name" keyword in red text and the target segment corresponding to a "place name" keyword in green text.
Take the case where the target voice message is a session message as an example. As shown in (a) of Fig. 3, a voice message 1 is displayed in session interface A. After the user clicks voice message 1, as shown in (b) of Fig. 3, the display device sequentially displays in session interface A the text corresponding to the target segments "me and Min", "meeting room 403", and "meeting", rendering them in red, yellow, and blue fonts according to the preset styles corresponding to the feature types of the keywords.
For example, the display device may add a corresponding label to each of the N keywords according to its feature type, where the label indicates the feature type of the keyword. For example, assuming the labels corresponding to keywords of the "person name" type and the "place name" type are "person name" and "place name" respectively, the display device adds the "person name" label to keywords of the "person name" type and the "place name" label to keywords of the "place name" type.
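The per-type styling rule can be sketched as a preset lookup table consulted at render time; the colors and type names below are illustrative assumptions, not values fixed by the patent.

```python
# Hypothetical preset table: keyword feature type -> display style (color).
STYLE_BY_TYPE = {"person name": "red", "place name": "green", "time": "yellow"}

def render(labeled_segments):
    """labeled_segments: (text, feature_type) pairs; yields (text, color) to draw."""
    for text, feature_type in labeled_segments:
        yield text, STYLE_BY_TYPE.get(feature_type, "default")

for text, color in render([("me and Min", "person name"),
                           ("meeting room 403", "place name")]):
    print(f"{color}: {text}")
# -> red: me and Min
# -> green: meeting room 403
```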
Optionally, in an embodiment of the present application, the target voice message and the first message are messages in the same session.
For example, the step 104 may include the following step 104b:
step 104b: and the display device displays the target fragment in a session interface of the session.
The session may be, for example, a session in any application with session functionality.
For example, the target voice message and the first message may be messages in the same session interface, where the session interface is the one corresponding to the session. Further, the target voice message may be a session message received by the electronic device in the session, and the first message may be a session message sent by the electronic device in the session.
By way of example, the manner in which the display device may present the target segment in the session interface may include any of the following:
(1) The display means displays the target segment in a target display area in the session interface, for example, in a specific display area in the upper left corner of the session interface;
(2) The display device displays the target segment at the position of the target voice message in the session interface;
(3) The display means displays the target segment in a floating window in the session interface.
In this way, when a user views a voice message in a session, the key information the user needs can be quickly obtained based on the context of the session in which the voice message is located, and displayed in the session interface where the voice message is located, which greatly improves the convenience of viewing key information in messages.
It should be noted that, in the display method provided by the embodiments of the present application, the execution body may be a display device, or a control module in the display device for executing the display method. In the embodiments of the present application, a display device executing the display method is taken as an example to describe the display device provided by the embodiments of the present application.
An embodiment of the present application provides a display device, as shown in fig. 4, the display device 400 includes: a receiving module 401, an identifying module 402, a determining module 403 and a presenting module 404, wherein: the receiving module 401 is configured to receive a first input of a user; the identifying module 402 is configured to identify user intention information corresponding to a first message in response to the first input received by the receiving module 401; the determining module 403 is configured to determine, from a target voice message, a target segment that matches the user intention information identified by the identifying module 402; the displaying module 404 is configured to display the target segment determined by the determining module 403.
Optionally, in the embodiment of the present application, the display module 404 is specifically configured to play the target segment, or display text information corresponding to the target segment.
Optionally, in an embodiment of the present application, the first input is an input for dragging the target voice message to the first message; the identification module 402 is specifically configured to perform semantic analysis on the first message to obtain a semantic analysis result; and generating user intention information corresponding to the first message according to the semantic analysis result.
Optionally, in the embodiments of the present application, the semantic information corresponding to the target voice message includes N keywords, each keyword corresponds to one message segment in the target voice message, and N is a positive integer; the determining module 403 is specifically configured to determine, from the N keywords, a target keyword matching the user intention information identified by the identifying module 402, and to determine the segment corresponding to the target keyword in the target voice message as the target segment.
Optionally, in an embodiment of the present application, the target segment includes a first target segment and a second target segment; the feature type of the first keyword corresponding to the first target segment is different from the feature type of the second keyword corresponding to the second target segment;
the display module 404 is specifically configured to sequentially display the first target segment and the target segment, or display the first target segment in a first display mode and display the second target segment in a second display mode.
Optionally, in the embodiment of the present application, the target voice message and the first message are messages in the same session; the displaying module 404 is specifically configured to display the target segment determined by the determining module 403 in a session interface of the session.
In the display apparatus provided by the embodiments of the present application, the electronic device can receive a first input of the user, identify the user intention information corresponding to the first message, determine, from the semantic information corresponding to the target voice message, the target segment matching the user intention information, and display the target segment. In this way, when the user wants to obtain required information from a voice message, the user intention information can be obtained and, according to the user intention, the chat information the user needs can be quickly and accurately located within a long voice message and displayed, which improves the efficiency of searching chat information, simplifies the search steps, and saves search time.
The display device in the embodiment of the application can be a device, and can also be a component, an integrated circuit or a chip in a terminal. The device may be a mobile electronic device or a non-mobile electronic device. By way of example, the mobile electronic device may be a cell phone, tablet computer, notebook computer, palm computer, vehicle mounted electronic device, wearable device, ultra-mobile personal computer (ultra-mobile personal computer, UMPC), netbook or personal digital assistant (personal digital assistant, PDA), etc., and the non-mobile electronic device may be a server, network attached storage (Network Attached Storage, NAS), personal computer (personal computer, PC), television (TV), teller machine or self-service machine, etc., and embodiments of the present application are not limited in particular.
The display device in the embodiments of the present application may be a device having an operating system. The operating system may be an Android operating system, an iOS operating system, or another possible operating system, which is not specifically limited in the embodiments of the present application.
The display device provided in the embodiment of the present application can implement each process implemented by the embodiments of the methods of fig. 1 to 3, and in order to avoid repetition, a description is omitted here.
Optionally, as shown in fig. 5, an embodiment of the present application further provides an electronic device 700, including a processor 701, a memory 702, and a program or an instruction stored in the memory 702 and capable of running on the processor 701, where the program or the instruction implements each process of the above-mentioned embodiment of the display method when executed by the processor 701, and the process can achieve the same technical effect, so that repetition is avoided, and no further description is given here.
The electronic device in the embodiment of the application includes the mobile electronic device and the non-mobile electronic device.
Fig. 6 is a schematic diagram of a hardware structure of an electronic device implementing an embodiment of the present application.
The electronic device 100 includes, but is not limited to: radio frequency unit 101, network module 102, audio output unit 103, input unit 104, sensor 105, display unit 106, user input unit 107, interface unit 108, memory 109, and processor 110.
Those skilled in the art will appreciate that the electronic device 100 may further include a power source (e.g., a battery) for powering the various components, and that the power source may be logically coupled to the processor 110 via a power management system to perform functions such as managing charging, discharging, and power consumption via the power management system. The electronic device structure shown in fig. 6 does not constitute a limitation of the electronic device, and the electronic device may include more or less components than shown, or may combine certain components, or may be arranged in different components, which are not described in detail herein.
The user input unit 107 is configured to receive a first input of a user; the processor 110 is configured to identify, in response to the first input received by the user input unit 107, user intention information corresponding to a first message, and to determine, from a target voice message, a target segment matching the user intention information; the audio output unit 103 is configured to play the target segment determined by the processor 110, and the display unit 106 is configured to display text information corresponding to the target segment determined by the processor 110.
Optionally, in the embodiment of the present application, the audio output unit 103 is specifically configured to play the target segment, and the display unit 106 is specifically configured to display text information corresponding to the target segment.
Optionally, in an embodiment of the present application, the first input is an input for dragging the target voice message to the first message; the processor 110 is specifically configured to perform semantic analysis on the first message to obtain a semantic analysis result; and generating user intention information corresponding to the first message according to the semantic analysis result.
Optionally, in the embodiment of the present application, the semantic information corresponding to the target voice message includes N keywords; each keyword corresponds to a message segment in the target voice message; n is a positive integer; the processor 110 is specifically configured to determine a target keyword that matches the user intention information from the N keywords; and determining a segment corresponding to the target keyword in the target voice message as a target segment.
Optionally, in an embodiment of the present application, the target segment includes a first target segment and a second target segment; the feature type of the first keyword corresponding to the first target segment is different from the feature type of the second keyword corresponding to the second target segment;
the display module 404 is specifically configured to sequentially display the first target segment and the target segment, or display the first target segment in a first display mode and display the second target segment in a second display mode.
Optionally, in the embodiment of the present application, the target voice message and the first message are messages in the same session; the audio output unit 103 and/or the display unit 106 are specifically configured to display the target segment determined by the processor 110 in a session interface of the session.
In the electronic device provided by the embodiments of the present application, the electronic device can receive a first input of the user, identify the user intention information corresponding to the first message, determine, from the semantic information corresponding to the target voice message, the target segment matching the user intention information, and display the target segment. In this way, when the user wants to obtain required information from a voice message, the user intention information can be obtained and, according to the user intention, the chat information the user needs can be quickly and accurately located within a long voice message and displayed, which improves the efficiency of searching chat information, simplifies the search steps, and saves search time.
It should be appreciated that in embodiments of the present application, the input unit 104 may include a graphics processor (Graphics Processing Unit, GPU) 1041 and a microphone 1042, the graphics processor 1041 processing image data of still pictures or video obtained by an image capturing device (e.g., a camera) in a video capturing mode or an image capturing mode. The display unit 106 may include a display panel 1061, and the display panel 1061 may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like. The user input unit 107 includes a touch panel 1071 and other input devices 1072. The touch panel 1071 is also referred to as a touch screen. The touch panel 1071 may include two parts of a touch detection device and a touch controller. Other input devices 1072 may include, but are not limited to, a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and so forth, which are not described in detail herein. Memory 109 may be used to store software programs as well as various data including, but not limited to, application programs and an operating system. The processor 110 may integrate an application processor that primarily handles operating systems, user interfaces, applications, etc., with a modem processor that primarily handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 110.
The embodiment of the application also provides a readable storage medium, on which a program or an instruction is stored, which when executed by a processor, implements each process of the above-described embodiment of the display method, and can achieve the same technical effects, so that repetition is avoided, and no further description is given here.
Wherein the processor is a processor in the electronic device described in the above embodiment. The readable storage medium includes a computer readable storage medium such as a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a magnetic disk or an optical disk, and the like.
The embodiment of the application further provides a chip, which comprises a processor and a communication interface, wherein the communication interface is coupled with the processor, and the processor is used for running programs or instructions to realize the processes of the embodiment of the display method, and can achieve the same technical effects, so that repetition is avoided, and the description is omitted here.
It should be understood that the chips referred to in the embodiments of the present application may also be referred to as system-on-chip chips, chip systems, or system-on-chip chips, etc.
Embodiments of the present application provide a computer program product stored in a non-volatile storage medium, the program product being executable by at least one processor to implement a method as described in the first aspect.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element. Furthermore, it should be noted that the scope of the methods and apparatus in the embodiments of the present application is not limited to performing the functions in the order shown or discussed, but may also include performing the functions in a substantially simultaneous manner or in an opposite order depending on the functions involved, e.g., the described methods may be performed in an order different from that described, and various steps may be added, omitted, or combined. Additionally, features described with reference to certain examples may be combined in other examples.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the method according to the embodiments of the present application.
The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the above-described embodiments, which are merely illustrative and not restrictive, and many forms may be made by those having ordinary skill in the art without departing from the spirit of the present application and the scope of the claims, which are to be protected by the present application.

Claims (10)

1. A display method, the method comprising:
displaying a session interface of a target session, wherein the session interface comprises a first message and a target voice message, the first message and the target voice message are messages sent by different users in the target session, and the target voice message is a reply message to the first message;
receiving a first input of a user to the first message, wherein the first input is input for dragging the target voice message to the first message;
identifying user intention information corresponding to the first message in response to the first input, wherein the user intention information is used for representing user intention corresponding to message content of the first message;
determining a target segment matched with the user intention information from the target voice message;
displaying the target fragment on the session interface;
the semantic information corresponding to the target voice message comprises N keywords; each keyword corresponds to a message segment in the target voice message; n is a positive integer;
the determining a target segment from the target voice message, which is matched with the user intention information, comprises the following steps:
determining target keywords matched with the user intention information from the N keywords;
and determining the segment corresponding to the target keyword in the target voice message as a target segment.
2. The method of claim 1, wherein the displaying the target segment comprises:
playing the target segment, or displaying text information corresponding to the target segment.
3. The method of claim 1, wherein
the identifying the user intention information corresponding to the first message includes:
carrying out semantic analysis on the first message to obtain a semantic analysis result;
and generating user intention information corresponding to the first message according to the semantic analysis result.
4. The method of claim 1, wherein the target segment comprises a first target segment and a second target segment; the feature type of the first keyword corresponding to the first target segment is different from the feature type of the second keyword corresponding to the second target segment;
the displaying the target segment comprises:
sequentially playing the first target segment and the second target segment;
or displaying the first target segment in a first display mode, and displaying the second target segment in a second display mode.
5. A display device, the device comprising: the device comprises a receiving module, an identification module, a determining module and a display module, wherein:
the display module is used for displaying a session interface of a target session, the session interface comprises a first message and a target voice message, the first message and the target voice message are messages sent by different users in the target session, and the target voice message is a reply message to the first message;
the receiving module is used for receiving a first input of a user on the first message, wherein the first input is input for dragging the target voice message to the first message;
the identifying module is used for responding to the first input received by the receiving module and identifying user intention information corresponding to the first message, wherein the user intention information is used for representing user intention corresponding to message content of the first message;
the determining module is used for determining a target segment matched with the user intention information identified by the identifying module from the target voice message;
the display module is further used for displaying the target fragment determined by the determination module;
the semantic information corresponding to the target voice message comprises N keywords; each keyword corresponds to a message segment in the target voice message; n is a positive integer;
the determining module is specifically configured to determine, from the N keywords, a target keyword that matches the user intention information identified by the identifying module; and determining the segment corresponding to the target keyword in the target voice message as a target segment.
6. The apparatus of claim 5, wherein
the display module is specifically configured to play the target segment, or display text information corresponding to the target segment.
7. The apparatus of claim 5, wherein
the identification module is specifically configured to perform semantic analysis on the first message to obtain a semantic analysis result; and generating user intention information corresponding to the first message according to the semantic analysis result.
8. The apparatus of claim 5, wherein the target segment comprises a first target segment and a second target segment, and a feature type of a first keyword corresponding to the first target segment is different from a feature type of a second keyword corresponding to the second target segment;
the display module is specifically configured to sequentially play the first target segment and the second target segment, or to display the first target segment in a first display mode and the second target segment in a second display mode.
9. An electronic device comprising a processor, a memory, and a program or instructions stored in the memory and executable on the processor, wherein the program or instructions, when executed by the processor, implement the steps of the display method of any one of claims 1 to 4.
10. A readable storage medium, wherein the readable storage medium stores a program or instructions which, when executed by a processor, implement the steps of the display method of any one of claims 1 to 4.
CN202110276230.0A 2021-03-15 2021-03-15 Display method and device and electronic equipment Active CN113157966B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110276230.0A CN113157966B (en) 2021-03-15 2021-03-15 Display method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN113157966A CN113157966A (en) 2021-07-23
CN113157966B true CN113157966B (en) 2023-10-31

Family

ID=76887112

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110276230.0A Active CN113157966B (en) 2021-03-15 2021-03-15 Display method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN113157966B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114221923B (en) * 2021-12-16 2024-02-23 维沃移动通信有限公司 Message processing method and device and electronic equipment

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104380375A (en) * 2012-03-08 2015-02-25 脸谱公司 Device for extracting information from a dialog
CN107346318A (en) * 2016-05-06 2017-11-14 腾讯科技(深圳)有限公司 Extract the method and device of voice content
CN110264994A (en) * 2019-07-02 2019-09-20 珠海格力电器股份有限公司 A kind of phoneme synthesizing method, electronic equipment and smart home system
CN110751943A (en) * 2019-11-07 2020-02-04 浙江同花顺智能科技有限公司 Voice emotion recognition method and device and related equipment
CN111240497A (en) * 2020-01-15 2020-06-05 北京搜狗科技发展有限公司 Method and device for inputting through input method and electronic equipment
CN112287162A (en) * 2020-10-27 2021-01-29 维沃移动通信有限公司 Message searching method and device and electronic equipment
CN112347297A (en) * 2019-07-22 2021-02-09 中兴通讯股份有限公司 Voice information processing method and device, storage medium and electronic device
CN112420049A (en) * 2020-11-06 2021-02-26 平安消费金融有限公司 Data processing method, device and storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102228455B1 (en) * 2013-08-05 2021-03-16 삼성전자주식회사 Device and sever for providing a subject of conversation and method for providing the same
US9652451B2 (en) * 2014-05-08 2017-05-16 Marvin Elder Natural language query
US10043517B2 (en) * 2015-12-09 2018-08-07 International Business Machines Corporation Audio-based event interaction analytics
CN110019923A (en) * 2017-07-18 2019-07-16 北京国双科技有限公司 The lookup method and device of speech message
CN111106995B (en) * 2019-12-26 2022-06-24 腾讯科技(深圳)有限公司 Message display method, device, terminal and computer readable storage medium
CN111596818A (en) * 2020-04-24 2020-08-28 维沃移动通信有限公司 Message display method and electronic equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Target speaker speech extraction scheme in a multi-speaker environment; Ye Yulin; Mo Jianhua; Liu Xia; Computer Systems & Applications (Issue 04); full text *

Also Published As

Publication number Publication date
CN113157966A (en) 2021-07-23

Similar Documents

Publication Publication Date Title
CN107527619B (en) Method and device for positioning voice control service
CN111247778A (en) Conversational/multi-turn problem understanding using WEB intelligence
WO2022083750A1 (en) Text display method and apparatus and electronic device
WO2022135474A1 (en) Information recommendation method and apparatus, and electronic device
CN113094143B (en) Cross-application message sending method and device, electronic equipment and readable storage medium
CN112882623B (en) Text processing method and device, electronic equipment and storage medium
CN107918496A (en) It is a kind of to input error correction method and device, a kind of device for being used to input error correction
CN111859119A (en) Information processing method and device
CN111859900B (en) Message display method and device and electronic equipment
WO2022105754A1 (en) Character input method and apparatus, and electronic device
CN114827068A (en) Message sending method and device, electronic equipment and readable storage medium
CN113157966B (en) Display method and device and electronic equipment
CN112383662B (en) Information display method and device and electronic equipment
CN112286617B (en) Operation guidance method and device and electronic equipment
CN113965614A (en) Session creation method and device and electronic equipment
CN112306450A (en) Information processing method and device
CN111708444A (en) Input method, input device and input device
WO2022222821A1 (en) Information display method and apparatus
CN107918606B (en) Method and device for identifying avatar nouns and computer readable storage medium
CN114374663B (en) Message processing method and message processing device
CN113470614B (en) Voice generation method and device and electronic equipment
CN115309487A (en) Display method, display device, electronic equipment and readable storage medium
CN114398127A (en) Message display method and device
CN115061580A (en) Input method, input device, electronic equipment and readable storage medium
CN112417095A (en) Voice message processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant