CN112702638A - Information processing method, device and system and control method of video playing equipment - Google Patents

Information processing method, device and system and control method of video playing equipment

Info

Publication number
CN112702638A
CN112702638A (application CN201911014049.1A)
Authority
CN
China
Prior art keywords
instruction
target
target object
image
information
Prior art date
Legal status
Pending
Application number
CN201911014049.1A
Other languages
Chinese (zh)
Inventor
卓著
王雯
陈谦
徐秋云
雷赟
李亚丽
Current Assignee
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201911014049.1A
Publication of CN112702638A
Legal status: Pending


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/478 Supplemental services, e.g. displaying phone caller identification, shopping application
    • H04N21/47815 Electronic shopping
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41 Structure of client; Structure of client peripherals
    • H04N21/422 Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/42203 Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS] sound input device, e.g. microphone
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41 Structure of client; Structure of client peripherals
    • H04N21/422 Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/42204 User interfaces specially adapted for controlling a client device through a remote control device; Remote control devices therefor

Abstract

The application discloses an information processing method, apparatus and system, and a control method for a video playback device. The method includes: parsing a control instruction received by a device and detecting whether the control instruction contains a preset target instruction; when the control instruction is determined to contain the target instruction, capturing the content currently played by the device to obtain a screenshot image; and determining a target object to be pushed according to the screenshot image. The application solves the technical problem in the related art that a television cannot identify products in the played content and therefore cannot provide product information to the user.

Description

Information processing method, device and system and control method of video playing equipment
Technical Field
The present application relates to the field of smart home appliances, and in particular to an information processing method, apparatus and system, and a control method for a video playback device.
Background
With the continuous improvement of living standards, the way people shop is also changing. When seeing an outfit worn by a celebrity or another appealing item on television, many users go to a shopping website to search for a matching product. However, most product searches cost considerable time and effort and do not suit a modern, fast-paced life. In addition, elderly users may not be able to formulate accurate search keywords: if they spot an appealing item on television, such as a multifunctional kitchen utensil, they may be unable to find it on a shopping website, which degrades the shopping experience.
For the technical problem in the related art that a television cannot identify products in the played content and therefore cannot provide product information to the user, no effective solution has been proposed so far.
Summary of the application
The embodiments of the application provide an information processing method, apparatus and system, and a control method for a video playback device, so as to at least solve the technical problem in the related art that a television cannot identify products in the played content and therefore cannot provide product information to the user.
According to one aspect of the embodiments of the present application, an information processing method is provided, including: parsing a control instruction received by a device, and detecting whether the control instruction contains a preset target instruction; when the control instruction is determined to contain the target instruction, capturing the content currently played by the device to obtain a screenshot image; and determining a target object to be pushed according to the screenshot image.
According to another aspect of the embodiments of the present application, a method for controlling a video playback device is also provided, including: receiving a control instruction containing a target instruction while the device is playing video information; displaying a screenshot image obtained by capturing the content currently played by the device; and displaying a target object to be pushed, where the target object to be pushed is determined at least according to the screenshot image.
According to another aspect of the embodiments of the present application, an information processing apparatus is also provided, including: a parsing module configured to parse a control instruction received by a device and detect whether the control instruction contains a preset target instruction; a capturing module configured to, when the control instruction is determined to contain the target instruction, capture the content currently played by the device to obtain a screenshot image; and a determining module configured to determine a target object to be pushed according to the screenshot image.
According to another aspect of the embodiments of the present application, a storage medium is also provided. The storage medium includes a stored program, and when the program runs, the device on which the storage medium resides is controlled to execute any one of the information processing methods described above.
According to another aspect of the embodiments of the present application, a processor is also provided. The processor is configured to run a program, and the program, when running, executes any one of the information processing methods described above.
According to another aspect of the embodiments of the present application, an information processing system is also provided, including: a processor; and a memory coupled to the processor and configured to provide the processor with instructions for the following processing steps: parsing a control instruction received by a device, and detecting whether the control instruction contains a preset target instruction; when the control instruction is determined to contain the target instruction, capturing the content currently played by the device to obtain a screenshot image; and determining a target object to be pushed according to the screenshot image.
According to another aspect of the embodiments of the present application, a smart speaker with a screen (a "sound box with a screen") is provided, including: a display screen; and a processor that parses a control instruction received by the speaker, detects whether the control instruction contains a preset target instruction, captures the content currently played by the speaker to obtain a screenshot image if it does, and determines a target object to be pushed at least according to the screenshot image.
In the embodiments of the application, a control instruction received by a device is parsed, and whether the control instruction contains a preset target instruction is detected; when the control instruction is determined to contain the target instruction, the content currently played by the device is captured to obtain a screenshot image; and a target object to be pushed is determined according to the screenshot image. Compared with the prior art, this scheme takes a screenshot of the video or image being played on the television based on the user's control instruction and uses image object detection and localization to detect and locate products in the image, providing the user with a convenient way to find and buy the same item. This optimizes the shopping experience and solves the technical problem in the related art that a television cannot identify products in the played content and therefore cannot provide product information to the user.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
fig. 1 is a hardware configuration block diagram of a computer device (or a mobile device) for implementing an information processing method according to embodiment 1 of the present application;
fig. 2 is a schematic diagram of a computer device (or mobile device) as a client terminal according to embodiment 1 of the present application;
fig. 3 is a flowchart of an alternative information processing method according to embodiment 1 of the present application;
fig. 4 is a flowchart of an alternative method for processing a same-style purchase in TV shopping according to embodiment 1 of the present application;
fig. 5 is a flowchart of an alternative control method for a video playback device according to embodiment 2 of the present application;
FIG. 6 is a schematic view of an alternative information processing apparatus according to embodiment 3 of the present application;
fig. 7 is a schematic diagram of an alternative control device of a video playback device according to embodiment 4 of the present application;
FIG. 8 is a block diagram of an alternative computer device according to embodiment 5 of the present application; and
fig. 9 is a schematic diagram of a smart speaker with a screen according to embodiment 7 of the present application.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only partial embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
First, some of the terms appearing in the description of the embodiments of the present application are explained as follows:
Image object detection and localization: scanning images and videos (sequences of images) for targets. It is an image segmentation based on the geometric and statistical features of the targets that combines segmentation and recognition of the targets; its accuracy and real-time performance are important indicators for the whole system.
Automatic Speech Recognition (Automatic Speech Recognition): speech recognition, for short, aims to convert the vocabulary content of human speech into a computer-readable character sequence.
Spoken Language Understanding (Spoken Language Understanding): the goal is to convert the text transcribed by speech recognition into a semantic representation.
Search by Image: given a picture, finding the set of identical or similar pictures in a database.
Television Voice Shopping: completing voice-driven shopping with the help of devices such as a smart television.
Example 1
In accordance with an embodiment of the present application, a method embodiment of information processing is provided. It should be noted that the steps illustrated in the flowchart of the drawings may be performed in a computer system, such as a set of computer-executable instructions, and that, although a logical order is shown in the flowchart, in some cases the steps shown or described may be performed in a different order.
The method provided by the first embodiment of the present application may be executed on a mobile terminal, a computer device, or a similar computing device. Fig. 1 shows a hardware structure block diagram of a computer device (or mobile device) for implementing the information processing method. As shown in fig. 1, the computer device 10 (or mobile device 10) may include one or more processors 102 (shown as 102a, 102b, ..., 102n; the processor 102 may include, but is not limited to, a processing device such as a microprocessor MCU or a programmable logic device FPGA), a memory 104 for storing data, and a transmission module 106 for communication functions. In addition, the device may also include: a display, an input/output interface (I/O interface), a BUS port (which may be included as one of the ports of the I/O interface), a network interface, a power supply, and/or a camera. It will be understood by those skilled in the art that the structure shown in fig. 1 is only an illustration and does not limit the structure of the electronic device. For example, the computer device 10 may also include more or fewer components than shown in fig. 1, or have a different configuration than shown in fig. 1.
It should be noted that the one or more processors 102 and/or other data processing circuitry described above may be referred to generally herein as "data processing circuitry". The data processing circuitry may be embodied in whole or in part in software, hardware, firmware, or any combination thereof. Further, the data processing circuitry may be a single, stand-alone processing module, or incorporated in whole or in part into any of the other elements in the computer device 10 (or mobile device). As referred to in the embodiments of the application, the data processing circuit acts as a processor control (e.g. selection of a variable resistance termination path connected to the interface).
The memory 104 can be used for storing software programs and modules of application software, such as program instructions/data storage devices corresponding to the information processing method in the embodiment of the present application, and the processor 102 executes various functional applications and data processing, i.e., the method for implementing the information processing of the application software, by executing the software programs and modules stored in the memory 104. The memory 104 may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, memory 104 may further include memory located remotely from processor 102, which may be connected to computer device 10 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission module 106 is used to receive or transmit data via a network. Specific examples of such networks may include wireless networks provided by the communications provider of computer device 10. In one example, the transmission module 106 includes a Network adapter (NIC) that can be connected to other Network devices through a base station to communicate with the internet. In one example, the transmission module 106 may be a Radio Frequency (RF) module, which is used for communicating with the internet in a wireless manner.
The display may be, for example, a touch screen type Liquid Crystal Display (LCD) that may enable a user to interact with a user interface of the computer device 10 (or mobile device).
Fig. 1 shows a hardware structure block diagram that may serve as an exemplary block diagram not only of the computer device 10 (or mobile device) described above but also of a server. In an alternative embodiment, fig. 2 shows, in block-diagram form, an embodiment in which the computer device 10 (or mobile device) of fig. 1 is used as a client terminal. As shown in fig. 2, the computer device 10 (or mobile device) may be connected via a data network or electronically to one or more servers 66. In an alternative embodiment, the computer device 10 (or mobile device) may be a mobile computing device or the like. The data network connection may be a local area network connection, a wide area network connection, an internet connection, or another type of data network connection. The computer device 10 (or mobile device) may connect to a network service run by a server (for example, a secure server) or a group of servers. The network service is a network-based user service such as social networking, cloud resources, e-mail, online payment, or other online applications.
Under the above operating environment, the present application provides a method of processing information as shown in fig. 3. Fig. 3 is a flowchart of an alternative method for processing information according to embodiment 1 of the present application. As shown in fig. 3, the method may include the steps of:
step S302, analyzing the control instruction received by the device, and detecting whether the control instruction includes a preset target instruction.
In an alternative, the device may be a television, a computer, a mobile terminal, or the like, and the control instruction may be a voice instruction, a gesture instruction, a remote control instruction, or the like. When the control instruction is a voice instruction, the device running this embodiment may be configured with a voice receiver for picking up the user's voice information; when the control instruction is a gesture instruction, the device may be configured with a front-facing image capture device, such as a camera, for capturing gesture images of the user; when the control instruction is a remote control instruction, the device may be configured with a wireless receiver for receiving a remote control signal.
It should be noted that the target instruction may be a keyword in a voice instruction, such as "I want to buy" or "the same style", and a gesture instruction may be a series of fixed actions pre-stored in the device. Whatever form the control instruction takes, it needs to be matched against the target instruction preset in the device to judge whether the received control instruction meets the condition for starting the next operation, thereby filtering out instructions issued by the user by mistake or issued by an unauthorized third party.
For example, when a user sees their idol on the television, the user may say to the television "I want to buy the VR glasses of that scientist", may make a gesture to freeze the picture, such as forming a circle with the fingers, or may freeze the television picture with a remote controller. After receiving the control instruction, the television matches it against the preset target instruction to judge whether to start the next operation.
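The matching described in step S302 can be pictured with a minimal sketch. The following Python snippet assumes a simple keyword-based matcher over an already transcribed instruction; the keyword list, function name and example phrases are illustrative assumptions rather than part of the patent text.

```python
# Minimal illustrative sketch of step S302: match a (transcribed) control instruction
# against preset target instructions. Keywords and names are assumptions.

PRESET_TARGET_INSTRUCTIONS = ["i want to buy", "the same style", "buy the same item"]

def contains_target_instruction(control_text: str) -> bool:
    """Return True if the control instruction contains any preset target keyword."""
    normalized = control_text.lower().strip()
    return any(keyword in normalized for keyword in PRESET_TARGET_INSTRUCTIONS)

if __name__ == "__main__":
    print(contains_target_instruction("I want to buy the VR glasses of that scientist"))  # True
    print(contains_target_instruction("Turn up the volume"))                              # False
```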
Step S304: when the control instruction is determined to contain the target instruction, capturing the content currently played by the device to obtain a screenshot image.
In the above step, when the device determines that the control instruction contains the preset target instruction, that is, the control instruction meets the preset condition, the device starts a screen capture routine and captures the content it is currently playing to obtain a screenshot image, which is an image containing the relevant product.
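As an illustrative sketch of step S304, the screenshot step can be pictured as pausing playback and grabbing the currently displayed frame. The `player` object and its `pause`/`grab_current_frame` methods below are hypothetical stand-ins for whatever playback interface the device exposes.

```python
# Illustrative sketch of step S304 (not a real smart-TV API): if the target instruction
# is present, freeze playback and grab the displayed frame as the screenshot image.

def capture_screenshot(player, control_text: str):
    if not contains_target_instruction(control_text):   # see the sketch under step S302
        return None
    player.pause()                       # hypothetical call: freeze the current picture
    return player.grab_current_frame()   # hypothetical call: return the displayed frame
```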
Step S306: determining a target object to be pushed at least according to the screenshot image.
In an alternative, the target object to be pushed may be determined by searching by image; the target object may be one or a series of product images containing the products the user wants to buy, which fully or partially cover the current screenshot image in the form of a list.
Searching by image means using a given image to find the set of identical or similar images in a database based on deep learning, which makes image retrieval more convenient.
Note that search engines in the related art generally support searching by image. However, there is no concrete technical solution that combines such a product search with the television, which makes shopping inconvenient for users. To optimize the shopping experience and meet the strong demand of many users for buying the same items seen on celebrities, in TV dramas and in movies, this embodiment provides a same-item shopping function on the smart television driven by a control instruction: it takes a screenshot of the video or image the television is playing, determines the target object to be pushed according to the screenshot image, and thus provides a convenient same-item shopping service for users.
In an alternative embodiment, take a smart television performing the above steps as an example. While watching a TV drama on the smart television, a user notices that the tableware used by the protagonist is exquisite and says the voice control instruction "I want to buy the tableware from this TV drama" to the television. After receiving the voice instruction, the smart television detects that it contains the preset target instruction "I want to buy", takes a screenshot of the currently played content and forwards the screenshot image to a server. The server computes in the background, determines one or more tableware items identical or similar to the tableware in the screenshot image, and returns them to the television in the form of a list overlaid on the currently played picture for the user to select and confirm.
In the above embodiment, the control instruction received by the device is first parsed to detect whether it contains a preset target instruction; when the control instruction is determined to contain the target instruction, the content currently played by the device is captured to obtain a screenshot image; and the target object to be pushed is then determined according to the screenshot image. Compared with the prior art, this scheme takes a screenshot of the video or image being played on the television based on the user's control instruction, uses search by image to determine product images that are the same as or similar to the captured product, and provides the user with a convenient way to buy the same item, thereby optimizing the shopping experience and solving the technical problem in the related art that a television cannot identify products in the played content and therefore cannot provide product information to the user.
Optionally, the control instruction includes voice information, and the step S302 is to analyze the control instruction received by the device, and detect whether the control instruction includes a preset target instruction, and may include the following steps:
step S3021, performing voice recognition on the voice information to obtain text information corresponding to the voice information.
In an alternative, the control command may be a voice command, and the voice command includes voice information.
In the above steps, the device operating this embodiment is configured with a voice receiver, which is used to pick up the voice information of the user, perform voice recognition on the voice information, and convert it into corresponding text information.
In an alternative embodiment, still taking the example of a smart television, the television set may include a sound pickup device, such as a microphone, for collecting control commands of the voice type uttered by the user.
In another optional embodiment, the television may further communicate with the mobile terminal through a communication method such as bluetooth, and the voice-type control instruction may be received by the mobile terminal, and then the control instruction may be sent to the television to perform the subsequent steps.
Speech recognition, also called automatic speech recognition, aims at converting the vocabulary content in the speech of a user into a computer-readable character sequence, i.e. text information. Unlike speaker recognition and speaker verification, the latter attempts to recognize or verify the speaker who uttered the speech rather than the vocabulary content contained therein.
Step S3022, performing semantic analysis on the text information based on the spoken language understanding model to obtain an analysis result, wherein the spoken language understanding model is used for obtaining semantic representation of the voice information based on the text information.
In an alternative, the spoken language understanding model may be a model that converts textual information transcribed by speech recognition into a semantic representation.
It should be noted that what matters is not the literal meaning of individual words in the text information but the semantic information the text conveys, that is, the intention behind the user's utterance, for example "buy the same item". Spoken language understanding is challenging for several reasons, such as speech recognition errors, ambiguity and disfluency, so users should use standard wording as much as possible when issuing voice instructions and avoid colloquialisms.
Step S3023, determining whether the voice information includes a target semantic according to the analysis result, where the target semantic is a semantic representing a target instruction.
In an alternative, the target semantics may be semantic information corresponding to the target instruction.
Note that the above scheme uses voice instructions for television voice shopping, completed with the help of devices such as a smart television. The user can operate the television simply by issuing a voice instruction from the sofa, with no other action required, which greatly improves convenience.
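Steps S3021 to S3023 can be sketched as a two-stage pipeline: speech recognition followed by spoken language understanding. The sketch below uses a placeholder ASR call and a crude keyword-rule SLU purely for illustration; a real system would plug in a trained ASR engine and SLU model, and all names here are assumptions.

```python
# Illustrative sketch of steps S3021-S3023: ASR then a simple rule-based SLU.
import re

def asr_transcribe(audio_bytes: bytes) -> str:
    """Placeholder for an automatic speech recognition engine."""
    raise NotImplementedError("plug in an ASR engine here")

def parse_intent(text: str) -> dict:
    """Map transcribed text to a coarse semantic representation (intent + item slot)."""
    text = text.lower()
    if "i want to buy" in text or "buy the same" in text:
        # crude slot filling: everything after the trigger phrase is the item description
        item = re.sub(r".*?(i want to buy|buy the same)", "", text, count=1).strip()
        return {"intent": "buy_same_item", "item": item or None}
    return {"intent": "other"}

# parse_intent("I want to buy the watch on that star's hand")
# -> {"intent": "buy_same_item", "item": "the watch on that star's hand"}
```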
Optionally, before the step S302 analyzes the control instruction received by the device, an alternative scheme provided in this embodiment may further include the following steps:
step S3011, receiving a trigger instruction, pausing currently played content of the device according to the trigger instruction, and detecting a control instruction.
In an alternative, the triggering instruction is used to pause the content currently played by the device, and the triggering instruction may still be a voice instruction, a gesture instruction, a remote control instruction, or the like.
In an alternative embodiment, to prevent the television from playing past the frame containing the target object, the user may first issue a trigger instruction; after receiving the trigger instruction, the device pauses the currently played content and then detects the control instruction issued subsequently by the user.
Or step S3012, detecting the control command in real time, pausing the content currently played by the device when the control command is detected, and performing analysis on the control command received by the device.
In another optional embodiment, the device may always detect the control instruction without the user sending a trigger instruction in advance, and if the control instruction is detected, further determine whether the control instruction includes a preset target instruction.
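The two detection modes of steps S3011 and S3012 can be contrasted in a short sketch. `player` and `listen_for_instruction` are hypothetical interfaces used only to make the control flow concrete.

```python
# Illustrative sketch of the two optional modes: trigger-then-listen vs. continuous listening.

def triggered_mode(player, listen_for_instruction):
    trigger = listen_for_instruction()          # e.g. a "pause" voice/gesture/remote instruction
    if trigger is not None:
        player.pause()                          # freeze so the target frame is not skipped
        return listen_for_instruction()         # then wait for the actual control instruction
    return None

def realtime_mode(player, listen_for_instruction):
    while True:
        instruction = listen_for_instruction()  # monitored continuously, no trigger needed
        if instruction is not None:
            player.pause()
            return instruction
```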
Optionally, before the target object to be pushed is determined according to the screenshot image in step S306, the alternative provided by this embodiment may further include the following steps:
in step S3031, a determination instruction is detected, where the determination instruction is used to determine the screen capture image.
After the content currently played by the device has been captured to obtain the screenshot image, and before the control instruction received by the device is parsed, the screenshot image may be displayed, and whether the currently displayed screenshot image is the image containing the target object that the user expects can be determined on the basis of the determination instruction.
In an alternative, the determination instruction is used to determine whether the captured screenshot image is the image containing the target object that the user expects, and it may be fed back by the user in the form of "yes" or "no".
Step S3032, if the determination instruction is received, the step of intercepting the content currently played by the equipment to obtain the screen capture image is carried out.
In an alternative, the screen shot image may be an image containing a target object desired by the user. In this case, the user sends a determination instruction, and the device enters a step of capturing the content currently played by the device to obtain a screenshot image.
Step S3033, if a position adjustment instruction is received, adjusting a current playing position of the device according to the position adjustment instruction, where the position adjustment instruction includes a forward instruction or a backward instruction.
In an alternative embodiment, after the user issues the voice instruction "I want to buy the water cup of a certain star", if the television picture plays too fast and has already jumped to the next frame, the captured screenshot image may not contain the target object the user wants. In that case, the user does not send the determination instruction but rewinds by a preset duration through a rewind instruction, and sends the determination instruction once the playback has been adjusted to the right position.
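The confirm-or-adjust loop of steps S3031 to S3033 can be sketched as follows. The interfaces and the 2-second adjustment step are assumptions; the patent only specifies a forward or backward adjustment by some preset amount.

```python
# Illustrative sketch of steps S3031-S3033: confirm the candidate screenshot or adjust
# the playback position and re-capture.

STEP_SECONDS = 2.0  # assumed preset adjustment step

def confirm_screenshot(player, show, wait_for_user_reply):
    while True:
        frame = player.grab_current_frame()
        show(frame)                              # display the candidate screenshot image
        reply = wait_for_user_reply()            # "yes", "back", or "forward"
        if reply == "yes":
            return frame                         # confirmed: go on to determine the target object
        if reply == "back":
            player.seek_relative(-STEP_SECONDS)  # rewind by the preset duration
        elif reply == "forward":
            player.seek_relative(+STEP_SECONDS)  # fast-forward by the preset duration
```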
Optionally, the step S306 of determining the target object to be pushed according to at least the screen capture image may specifically include:
step S3061, triggering a preset application to start.
In an alternative, the preset application may be an application program having a shopping function such as search, payment, and the like.
Step S3062, searching is performed according to the screenshot image through a preset application to obtain a search result, where the search result includes image information and attribute information of the target object.
In an alternative, the method for determining the target object to be pushed may be a graph searching method; the image information may be a picture of the target object, and the attribute information may be a brand, a style, an applicable group, and the like of the target object.
In the above step, the preset application searches according to the screenshot image to obtain a series of search results, which are displayed on the device in the form of a list and include the image information and attribute information of the target object. The user can then select the target product through another control instruction, for example selecting the n-th product, and thereby enter the product's main page.
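One common way to realize the search of step S3062 is nearest-neighbour lookup over image embeddings. The sketch below assumes a deep-learning embedding model and a precomputed catalogue of product-image embeddings; the patent itself only specifies search by image in general terms, so every name here is an assumption.

```python
# Illustrative sketch of steps S3061-S3062: search by image via cosine similarity
# over precomputed product-image embeddings.
import numpy as np

def embed(image: np.ndarray) -> np.ndarray:
    """Placeholder for a deep-learning image embedding model."""
    raise NotImplementedError("plug in an image-embedding model here")

def search_by_image(query_image, catalogue_embeddings, catalogue_items, top_k=5):
    q = embed(query_image)
    q = q / np.linalg.norm(q)
    db = catalogue_embeddings / np.linalg.norm(catalogue_embeddings, axis=1, keepdims=True)
    scores = db @ q                              # cosine similarity against every product image
    best = np.argsort(-scores)[:top_k]
    # each returned item carries the product's image information and attribute information
    return [catalogue_items[i] for i in best]
```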
Note that if multiple products are included in the screenshot image, the search results may also include many other products related to them. Therefore, the control instruction may include product information specifying a product, and before step S306 determines the target object to be pushed at least according to the screenshot image, this embodiment may further include the following step:
Step S3051: locating the specified product in the screenshot image according to its product information to obtain a target area of the screenshot image. In this case, step S306 of determining the target object to be pushed at least according to the screenshot image includes: determining the target object to be pushed according to the target area of the screenshot image.
In an alternative, the product information of the specified product narrows the target down to a specific product and prevents other products from being introduced unnecessarily when the target object to be pushed is determined. For example, the product information may include the product's type, color, user, and so on.
In an alternative, the specified product may be located using image object detection and localization, that is, scanning images and videos (sequences of images) for targets. It is an image segmentation based on the geometric and statistical features of the targets that combines segmentation and recognition of the targets; its accuracy and real-time performance are important indicators for the whole system.
Still taking the user's instruction "I want to buy the water cup of a certain star" as an example: without image object detection and localization, the television would determine the target object to be pushed from the whole screenshot image; with it, the television locks onto only the target area of the specified product and determines the target object to be pushed from that area of the screenshot image. The latter push method is clearly more efficient than the former.
Optionally, the product information of the product specified in step S3051 includes at least one of the following: the product name, the product type, and the product user.
In an alternative, the product name may be a specific brand, for example a thermos cup or a perfume of a certain brand; the product type may be the category of the product, such as water cups, perfumes or glasses; and the product user may be the person or pet who uses the product, such as a certain star or a pet.
Optionally, after determining the target object to be pushed according to at least the screen capture image in step S306, an alternative solution provided in this embodiment may further include step S3063 of displaying the search result, where the step may specifically include:
step S30631, a floating interface is displayed on the current screen of the device.
Step S30632, the search result is displayed in the floating interface.
In an alternative, the floating interface can be partially or completely covered on the current screen, and can also have a specified transparency.
Optionally, after determining the target object to be pushed according to at least the screen capture image in step S306, an alternative solution provided in this embodiment may further include step S3064 of displaying the search result, where the step may specifically include:
step S30641, jump to a display interface corresponding to a preset application, and display the search result through the display interface.
In an alternative, the display interface may be an interface in a preset application, and the preset application may also be an application program having shopping functions such as search and payment.
In the above steps, the television determines the target object to be pushed according to the screen capture image, and actually, the television determines the target object to be pushed through a preset application. Therefore, after the screen capture image is captured by the television, the current picture played by the television can jump to a display interface corresponding to the preset application, and the search result is displayed by the display interface.
Optionally, the search result further includes an identifier of the target object, and after the search result is displayed in step S3063 or step S3064, the alternative scheme provided in this embodiment may further include the following steps:
step S30651, receiving a selection instruction, where the selection instruction includes an identifier of at least one target object.
In step S30652, the target product is determined based on the selection instruction.
In an alternative, the identifier may be attribute information that distinguishes the target object, such as brand, color and size; the selection instruction may be used to determine the target product.
In an alternative embodiment, the display interface of the television displays a plurality of commodity images which are the same as or similar to the target object, and at this time, the user needs to select a commodity which is suitable for the user according to the actual situation of the user, such as a dress of the same style which is suitable for the size of the user, glasses which are suitable for the skin color of the user, and the like.
Optionally, after the step S30652 of determining the target product according to the selection instruction, the alternative scheme provided by this embodiment may further include the following steps:
step S30653, determines whether a predetermined application has been registered.
In step S30654, when a preset application is registered, an information mark for resource distribution is displayed.
In an alternative, being logged in to the preset application indicates that the logged-in user has the right to purchase the product; the login method may be scanning a QR code, entering an account name and password, and so on.
In the above steps, after the user determines the target product or adds it to the shopping cart, the preset application may check whether the user is logged in to the current application. If the user is not logged in, the user is asked to scan a code to log in or register first. If the user is logged in, the television picture jumps directly to an order confirmation page, where the user confirms the delivery address and the amount. After the user confirms the order, the user can pay by scanning a code with payment software or through a bound bank card account, thereby completing the transaction.
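The login check and order flow of steps S30653 and S30654 can be sketched as a simple branch on the login state. All interface names below are assumptions used only to make the flow concrete.

```python
# Illustrative sketch of steps S30653-S30654 and the subsequent order flow.

def checkout(shopping_app, screen, product):
    if not shopping_app.is_logged_in():
        screen.show(shopping_app.login_qr_code())  # scan the code to log in or register
        shopping_app.wait_for_login()
    order = shopping_app.create_order(product)
    screen.show(order.confirmation_page())         # confirm the delivery address and amount
    if order.confirmed():
        order.pay()                                # QR-code payment or bound bank card account
```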
Fig. 4 is a flowchart of an alternative method for processing a same-style purchase in TV shopping according to this embodiment. As shown in fig. 4, the television system that executes the whole processing flow can be divided into four parts: a dialogue understanding system, a television control system, an image object detection and localization system, and a TV shopping system. When the user says "I want to buy the same item", the dialogue understanding system parses the user's shopping intention as "buy the same item", and the screen capture function is then started through the television control system. When the user says "I want to buy the watch on that star's hand", the dialogue understanding system parses the shopping intention as "buy the same item", the person as "that star" and the product name as "watch"; since the user has specified the product information, the image object detection and localization system starts to work and identifies the star in the screenshot image and the watch on the star's hand. The TV shopping system then uses the search-by-image function to search for related same-style products based on the picture of the star and the picture of the watch, and returns the search results to the television picture for display. The user then selects a preferred brand, for example by saying "buy the first one"; the dialogue understanding system parses the intention as "the first one", the product is locked immediately and its main page is entered, and the user further selects the model, size, color and so on that suit them and places the order. If the user has not logged in to the TV shopping system, a code-scanning login page is shown so the user can scan a code to log in; if the user has logged in, an order confirmation page is shown so the user can further confirm the delivery address, the order amount and other consignee information.
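The end-to-end flow of fig. 4 can be summarized by wiring the four parts together. The sketch below reuses the illustrative functions from the earlier sketches in this section plus hypothetical device interfaces; it is a sketch of the flow, not a reference implementation of the patent.

```python
# Illustrative end-to-end sketch of the fig. 4 flow: dialogue understanding, TV control,
# image object detection/localization, and TV shopping.

def handle_voice_shopping(audio, player, shopping_app, screen, detect_objects):
    text = asr_transcribe(audio)                      # dialogue understanding: ASR
    intent = parse_intent(text)                       # dialogue understanding: SLU
    if intent["intent"] != "buy_same_item":
        return
    player.pause()                                    # TV control: freeze and screenshot
    screenshot = player.grab_current_frame()
    region = crop_target_area(screenshot, intent.get("item"), detect_objects)  # detection/localization
    results = shopping_app.search_same_style(region)  # TV shopping: search by image
    screen.show(results)                              # display the same-style candidates for selection
```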
Note that the above embodiments meet the strong demand of many users for buying the same items seen on celebrities, in TV dramas and in movies by providing a voice-interaction-based shopping function on the smart TV. Based on the voice interaction function of the smart television, the video or image being played by the television is captured, products in the image are detected and located with image detection and localization technology, people in the image are recognized with face recognition technology, and the search-by-image function of the TV shopping system then provides users with a convenient buy-the-same-item service.
In the embodiments of the application, a control instruction received by the device is parsed, and whether the control instruction contains a preset target instruction is detected; when the control instruction is determined to contain the target instruction, the content currently played by the device is captured to obtain a screenshot image; and the target object to be pushed is then determined according to the screenshot image. Compared with the prior art, this scheme parses the user's voice instruction with speech recognition and a spoken language understanding model, takes a screenshot of the video or image being played on the television, locks onto the target object with image object detection and localization, determines product images that are the same as or similar to the target object through search by image, and finally provides the user with a payment and purchase service through the preset application.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method of the embodiments of the present application.
Example 2
According to an embodiment of the present application, there is also provided a method for controlling a video playback device, as shown in fig. 5, the method may include the following steps:
step S502, in the process of playing the video information by the device, receiving a control instruction containing a target instruction.
In an alternative, the device may be a television, a computer, a mobile terminal, or the like, and the control instruction may be a voice instruction, a gesture instruction, a remote control instruction, or the like. When the control instruction is a voice instruction, the device running this embodiment may be configured with a voice receiver for picking up the user's voice information; when the control instruction is a gesture instruction, the device may be configured with a front-facing image capture device, such as a camera, for capturing gesture images of the user; when the control instruction is a remote control instruction, the device may be configured with a wireless receiver for receiving a remote control signal.
It should be noted that the target instruction may be a keyword in a voice instruction, such as "I want to buy" or "the same style", and a gesture instruction may be a series of fixed actions pre-stored in the device. Whatever form the control instruction takes, it needs to be matched against the target instruction preset in the device to judge whether the received control instruction meets the condition for starting the next operation, thereby filtering out instructions issued by the user by mistake or issued by an unauthorized third party.
And step S504, displaying a screen capture image obtained by capturing the content currently played by the equipment.
In the above step, when the device determines that the control instruction contains the preset target instruction, that is, the control instruction meets the preset condition, the device starts a screen capture routine, captures the content it is currently playing to obtain a screenshot image, which is an image containing the relevant product, and displays the screenshot image.
Step S506, displaying the target object to be pushed, wherein the target object to be pushed is determined at least according to the screen capture image.
In an alternative, the target object to be pushed may be determined by searching by image; the target object may be one or a series of product images containing the products the user wants to buy, which fully or partially cover the current screenshot image in the form of a list.
Searching by image means using a given image to find the set of identical or similar images in a database based on deep learning, which makes image retrieval more convenient.
Note that search engines in the related art generally support searching by image. However, there is no concrete technical solution that combines such a product search with the television, which makes shopping inconvenient for users. To optimize the shopping experience and meet the strong demand of many users for buying the same items seen on celebrities, in TV dramas and in movies, this embodiment provides a same-item shopping function on the smart television driven by a control instruction: it takes a screenshot of the video or image the television is playing, determines the target object to be pushed according to the screenshot image, and thus provides a convenient same-item shopping service for users.
In the embodiment of the application, a control instruction containing a target instruction is first received while the device is playing video information; a screenshot image obtained by capturing the content currently played by the device is then displayed; and finally the target object to be pushed is displayed, where the target object to be pushed is determined at least according to the screenshot image. Compared with the prior art, this scheme takes a screenshot of the video or image being played on the television based on the user's control instruction, uses search by image to determine product images that are the same as or similar to the captured product, and provides the user with a convenient way to buy the same item, thereby optimizing the shopping experience and solving the technical problem in the related art that a television cannot identify products in the played content and therefore cannot provide product information to the user.
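On the device side, embodiment 2 can be sketched as a thin handler: check the instruction, show the screenshot, then show whatever target objects the processing side returns. `device` and `backend` are hypothetical interfaces.

```python
# Illustrative sketch of the device-side flow of embodiment 2 (steps S502-S506).

def on_control_instruction(device, backend, instruction_text: str):
    if not contains_target_instruction(instruction_text):   # cf. step S5031
        return
    screenshot = device.capture_current_frame()
    device.display(screenshot)                               # step S504: show the screenshot image
    targets = backend.determine_targets(screenshot)          # determined at least from the screenshot
    device.display(targets)                                  # step S506: show the target objects to push
```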
Optionally, before the screen capture image obtained by capturing the content currently played by the device is displayed in step S504, an alternative scheme provided in this embodiment may further include the following steps:
step S5031, determining whether the control command includes a preset target command;
in step S5032, if the determination result is yes, a screenshot image obtained by capturing the content currently played by the device is displayed.
It should be noted that the screenshot image obtained by capturing the content currently played by the device is displayed only when the control instruction contains the preset target instruction; this filters out instructions issued by the user by mistake or issued by an unauthorized third party.
Optionally, the control instruction includes voice information, and the step S5031 determines whether the control instruction includes a preset target instruction, which may specifically include:
step S50311, performing voice recognition on the voice information to obtain text information corresponding to the voice information.
In an alternative, the control command may be a voice command, and the voice command includes voice information.
In the above steps, the device operating this embodiment is configured with a voice receiver, which is used to pick up the voice information of the user, perform voice recognition on the voice information, and convert it into corresponding text information.
Speech recognition, also called automatic speech recognition, aims at converting the vocabulary content in the speech of a user into a computer-readable character sequence, i.e. text information. Unlike speaker recognition and speaker verification, the latter attempts to recognize or verify the speaker who uttered the speech rather than the vocabulary content contained therein.
Step S50312, performing semantic analysis on the text information based on the spoken language understanding model to obtain an analysis result, where the spoken language understanding model is used to obtain semantic representation of the voice information based on the text information.
In an alternative, the spoken language understanding model may be a model that converts textual information transcribed by speech recognition into a semantic representation.
It should be noted that what matters is not the literal meaning of individual words in the text information but the semantic information the text conveys, that is, the intention behind the user's utterance, for example "buy the same item". Spoken language understanding is challenging for several reasons, such as speech recognition errors, ambiguity and disfluency, so users should use standard wording as much as possible when issuing voice instructions and avoid colloquialisms.
Step S50313, determining whether the voice information contains a target semantic according to the analysis result, wherein the target semantic is a semantic representing a target instruction.
In an alternative, the target semantics may be semantic information corresponding to the target instruction.
Note that the above scheme uses voice instructions for television voice shopping, completed with the help of devices such as a smart television. The user can operate the television simply by issuing a voice instruction from the sofa, with no other action required, which greatly improves convenience.
Optionally, before receiving the control instruction containing the target instruction in step S502, the alternative provided in this embodiment may further include the following steps:
step S5011, receiving a trigger instruction, pausing currently played content of the device according to the trigger instruction, and detecting a control instruction.
In an alternative, the triggering instruction is used to pause the content currently played by the device, and the triggering instruction may still be a voice instruction, a gesture instruction, a remote control instruction, or the like.
In the above steps, to prevent the television from playing past the frame containing the target object, the user may first issue a trigger instruction; after receiving the trigger instruction, the device pauses the currently played content and then detects the control instruction issued subsequently by the user.
Or step S5012, detecting the control command in real time, pausing the content currently played by the device when the control command is detected, and performing a step of analyzing the control command received by the device.
In the above steps, the device may always detect the control instruction without the user sending a trigger instruction in advance, and if the control instruction is detected, further determine whether the control instruction includes a preset target instruction.
Optionally, before the target object to be pushed is displayed in step S506, the alternative scheme provided in this embodiment may further include the following steps:
in step S5051, a determination instruction is detected, where the determination instruction is used to determine a screen capture image.
After the content currently played by the device is captured to obtain the screenshot image and before the target object to be pushed is determined, the screenshot image can be displayed, and whether the currently displayed screenshot image is the image expected by the user, i.e. one containing the target object, can be determined based on the determination instruction.
In an alternative, the determination instruction is used to determine whether the captured screenshot image is the image expected by the user, i.e. one containing the target object, and it may be fed back by the user in the form of "yes" or "no".
Step S5052, if the determination instruction is received, the step of capturing the content currently played by the device to obtain a screenshot image is performed.
In an alternative, the screen shot image may be an image containing a target object desired by the user. In this case, the user sends a determination instruction, and the device enters a step of capturing the content currently played by the device to obtain a screenshot image.
In step S5053, if a position adjustment instruction is received, the position currently played by the device is adjusted according to the position adjustment instruction, where the position adjustment instruction includes a forward instruction or a backward instruction.
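The confirm/adjust loop in steps S5051 to S5053 could be sketched as follows; the player API, the instruction names and the 2-second step size are assumptions for illustration, not something specified by this embodiment.

```python
# Sketch of the confirm/adjust loop in steps S5051-S5053. The player methods
# (capture_frame, position, seek), the instruction names and the step size are
# illustrative assumptions.
STEP_SECONDS = 2.0

def confirm_frame(player, next_instruction):
    """Loop until the user confirms the frame, stepping the playback position
    forward or backward on request, then capture the confirmed frame."""
    while True:
        cmd = next_instruction()            # e.g. "CONFIRM", "FORWARD", "BACK"
        if cmd == "CONFIRM":
            return player.capture_frame()   # the screenshot image to search with
        if cmd == "FORWARD":
            player.seek(player.position() + STEP_SECONDS)
        elif cmd == "BACK":
            player.seek(player.position() - STEP_SECONDS)
```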
Optionally, the step S506 of displaying the target object to be pushed may specifically include:
in step S5061, a preset application is triggered to start.
In an alternative, the preset application may be an application program having shopping functions such as search and payment.
Step S5062, searching according to the screenshot image through a preset application to obtain a search result, wherein the search result comprises image information of the target object and attribute information of the target object;
step S5063, the search result is displayed.
In an alternative, the method for determining the target object to be pushed may be a graph searching method; the image information may be a picture of the target object, and the attribute information may be a brand, a style, an applicable group, and the like of the target object.
In the above steps, the preset application searches according to the screenshot image to obtain a series of search results, which are displayed on the device in list form and include the image information and attribute information of the target object. The user can then select the target commodity through a further control instruction, for example selecting the nth commodity, so as to enter that commodity's main page.
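A minimal sketch of steps S5061 to S5063 is given below: search by the screenshot image through the preset application, display the results as a list, and let the user re-select the nth commodity. The SearchResult fields and the search_by_image() backend are illustrative assumptions.

```python
# Sketch of steps S5061-S5063: search by the screenshot image through a preset
# application and show the results as a list for the user to pick from. The
# SearchResult fields and search_by_image() are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class SearchResult:
    image_url: str                                    # image information
    attributes: dict = field(default_factory=dict)    # brand, style, audience, ...

def search_by_image(screenshot) -> list:
    """Stand-in for the preset application's image-search service."""
    raise NotImplementedError("plug in an image-search backend here")

def show_results_and_select(screenshot, display, read_choice) -> SearchResult:
    results = search_by_image(screenshot)
    display([(r.image_url, r.attributes) for r in results])   # list on screen
    n = read_choice()                  # e.g. the user says "the 3rd one" -> 2
    return results[n]                  # re-selected target commodity
```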
It is readily noted that if multiple items are included in the screenshot, many other items related to the multiple items may be included in the search results. Therefore, the control instruction may include commodity information of the specified commodity, and before the target object to be pushed is displayed in step S506, the alternative provided in this embodiment may further include the following steps:
step S5054, positioning is carried out in the screen shot image according to the commodity information of the specified commodity, and a target area of the screen shot image is obtained; the step S506 of determining the target object to be pushed according to at least the screen shot image includes: and determining a target object to be pushed according to the target area of the screen shot image.
In an alternative, the commodity information of the specified commodity may limit the target commodity to a specific one, and prevent unnecessary introduction of other commodities when the target object to be pushed is determined. For example, the commodity information may include the kind, color, user, and the like of the commodity.
In an alternative, the method for locating the specified commodity may be an image target detection and localization method. Image target detection and localization refers to scanning and searching for targets in images and videos (i.e. sequences of images); it performs image segmentation based on the geometric and statistical features of the targets, integrates target segmentation with target recognition, and its accuracy and real-time performance are important indicators for the system as a whole.
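As an illustration of step S5054, the sketch below locates the specified commodity in the screenshot with a generic object detector, filters detections by the commodity information, and keeps only the target area; the detect() placeholder, the Detection fields and the PIL-style crop() call are assumptions, not a prescribed implementation.

```python
# Sketch of step S5054: locate the specified commodity inside the screenshot
# and keep only the target area. detect() is a placeholder for any object
# detection and localization model; the Detection fields are assumptions.
from dataclasses import dataclass

@dataclass
class Detection:
    label: str      # e.g. "cup", "perfume", "glasses"
    box: tuple      # (left, top, right, bottom) in pixels
    score: float    # detector confidence

def detect(image) -> list:
    """Stand-in for an image target detection and localization model."""
    raise NotImplementedError

def target_region(image, commodity_info: dict):
    """Crop the screenshot to the region matching the specified commodity."""
    wanted = commodity_info.get("type", "").lower()
    candidates = [d for d in detect(image) if wanted in d.label.lower()]
    if not candidates:
        return image                 # fall back to the whole screenshot
    best = max(candidates, key=lambda d: d.score)
    return image.crop(best.box)      # assumes a PIL-style crop() on the image
```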
Optionally, the commodity information of the specified commodity in the step S5054 includes at least one of: commodity name, commodity type, and commodity use subject.
In one alternative, the commodity name can be a specific brand, for example a particular brand of thermos cup or of chamomile perfume; the commodity type can be the category of the commodity, such as water cups, perfumes, or glasses; the commodity use subject can be the user of the commodity, such as a celebrity or a pet.
Optionally, the step S506 of displaying the target object to be pushed may specifically include:
step S5064, displaying a floating interface on the current screen of the device.
In step S5065, the image information and the attribute information of the target object are displayed on the floating interface.
In an alternative, the floating interface may partially or completely cover the current screen and may also have a specified transparency.
Optionally, the step S506 of displaying the target object to be pushed may specifically include:
step S5066, jumping to a display interface corresponding to a preset application, and displaying the image information and the attribute information of the target object through the display interface.
In an alternative, the display interface may be an interface in a preset application, and the preset application may also be an application program having shopping functions such as search and payment.
In the above steps, the television determines the target object to be pushed according to the screenshot image; in practice, it does so through a preset application. Therefore, after the television captures the screenshot image, the picture currently played by the television can jump to the display interface corresponding to the preset application, and the search result is displayed on that display interface.
Optionally, after the image information and the attribute information of the target object are displayed in step S5065 or step S5066, the alternative provided in this embodiment may further include the following steps:
step S5067, displaying the target product, wherein the target product is determined according to the selection instruction when the attribute information includes the identifier of the target object, and the selection instruction includes the identifier of at least one target object.
In an alternative, the identifier may be attribute information, such as brand, color, or size, that can distinguish the target product; the selection instruction may be used to determine the target commodity.
Optionally, after the target product is displayed in step S5067, the alternative provided in this embodiment may further include the following steps:
in step S5068, it is determined whether a predetermined application is registered.
In step S5069, when a predetermined application is registered, an information mark for resource distribution is displayed.
In an alternative, the preset application for logging in may indicate that the logged-in user has the right to purchase the commodity; the login mode can be two-dimensional code scanning login, account name password login and the like.
In the above steps, after the user determines the target commodity or adds it to the shopping cart, the preset application may determine whether the user has logged in to the current application. If the user is not logged in, the user is required to scan a code to log in or register first. If the user is logged in, the television picture jumps directly to an order confirmation page, where the user confirms the order address and amount. After the user confirms the order, the user can pay by scanning a code with payment software or through a bound bank card account, thereby completing the transaction.
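A possible shape of this login-and-checkout flow (steps S5068 and S5069) is sketched below; the session, display and QR-code helpers are hypothetical names introduced only to illustrate the order of operations described above.

```python
# Sketch of the login-and-checkout flow in steps S5068/S5069. The session,
# display and QR-code helpers are hypothetical names used only to illustrate
# the order of operations described above.
def checkout(session, display, item):
    if not session.is_logged_in():
        # Not logged in: show a QR code so the user can scan it to log in or register.
        display.show_qr(session.login_qr_code())
        session.wait_for_login()
    # Logged in: jump to order confirmation, then show the payment identifier
    # (the "information identifier for resource circulation").
    order = session.create_order(item)
    display.show_page("order_confirmation", order)
    display.show_qr(session.payment_qr_code(order))
```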
It should be noted that, reference may be made to the relevant description in embodiment 1 for optional or preferred embodiments of this embodiment, but the present invention is not limited to the disclosure in embodiment 1, and is not described herein again.
Example 3
According to an embodiment of the present application, there is also provided an apparatus for processing information, as shown in fig. 6. The apparatus 600 includes: an analysis module 602, an interception module 604, and a first determination module 606.
The analysis module 602 is configured to analyze a control instruction received by the device and detect whether the control instruction includes a preset target instruction; the interception module 604 is configured to, when it is determined that the control instruction includes the target instruction, capture the content currently played by the device to obtain a screenshot image; the first determination module 606 is configured to determine a target object to be pushed according to at least the screenshot image.
Optionally, the control instruction includes voice information, and the analysis module may include: the recognition module is used for carrying out voice recognition on the voice information to obtain text information corresponding to the voice information; the analysis submodule is used for carrying out semantic analysis on the text information based on the spoken language understanding model to obtain an analysis result, wherein the spoken language understanding model is used for obtaining semantic representation of the voice information based on the text information; and the second determining module is used for determining whether the voice information contains target semantics according to the analysis result, wherein the target semantics are the semantics representing the target instruction.
Optionally, the apparatus may further include: the trigger module is used for receiving a trigger instruction before analyzing the control instruction received by the equipment, pausing the content currently played by the equipment according to the trigger instruction and detecting the control instruction; or the first detection module is used for detecting the control instruction in real time, pausing the content currently played by the equipment when the control instruction is detected, and performing the step of analyzing the control instruction received by the equipment.
Optionally, the apparatus may further include: the second detection module is used for detecting a determination instruction before determining a target object to be pushed according to the screen capture image, wherein the determination instruction is used for determining the screen capture image; the intercepting module is used for intercepting the content currently played by the equipment to obtain a screen-shot image if a determination instruction is received; and the adjusting module is used for adjusting the current playing position of the equipment according to the position adjusting instruction if the position adjusting instruction is received, wherein the position adjusting instruction comprises a forward instruction or a backward instruction.
Optionally, the determining module may include: the starting module is used for triggering the preset application to start; and the searching module is used for searching according to the screen capture image through a preset application to obtain a searching result, wherein the searching result comprises the image information and the attribute information of the target object.
Optionally, the control instruction includes commodity information of a specified commodity, and the apparatus may further include: the positioning module is used for positioning in the screen capture image according to the commodity information of the specified commodity before the target object to be pushed is determined at least according to the screen capture image to obtain a target area of the screen capture image; the first determining module may include a first determining sub-module, configured to determine a target object to be pushed according to a target area of the screen capture image.
Optionally, the commodity information of the specified commodity includes at least one of: commodity name, commodity type, and commodity use subject.
Optionally, the apparatus may further include: the floating display module is configured to display a search result after determining a target object to be pushed according to at least the screen capture image, where the floating display module may specifically include: the first display module is used for displaying a floating interface on a current screen of the equipment; and the second display module is used for displaying the search result in the suspension interface.
Optionally, the apparatus may further include: the skip display module is configured to display a search result after determining a target object to be pushed according to at least the screenshot image, where the skip display module may specifically include: and the third display module is used for jumping to a display interface corresponding to the preset application and displaying the search result through the display interface.
Optionally, the search result further includes an identifier of the target object, and the apparatus may further include: the selection module is used for receiving a selection instruction after the search result is displayed, wherein the selection instruction comprises an identifier of at least one target object; and the determining submodule is used for determining the target commodity according to the selection instruction.
Optionally, the apparatus may further include: the judging module is used for judging whether a preset application is logged in or not after the target commodity is determined according to the selection instruction; and the fourth display module is used for displaying the information identifier for resource circulation under the condition that the preset application is logged in.
It should be noted here that the analysis module 602, the interception module 604 and the first determination module 606 correspond to steps S302 to S306 in embodiment 1; the examples and application scenarios implemented by the three modules are the same as those of the corresponding steps, but are not limited to the disclosure in embodiment 1. It should also be noted that the above modules may run, as part of the apparatus, in the computer device 10 provided in embodiment 1.
Example 4
According to an embodiment of the present application, there is also provided an apparatus for controlling a video playback device, as shown in fig. 7, the apparatus 700 includes: a receiving module 702, a first display module 704, and a second display module 706.
The receiving module 702 is configured to receive a control instruction including a target instruction in a process of playing video information by a device; a first display module 704, configured to display a screenshot image obtained by capturing a content currently played by a device; the second display module 706 is configured to display a target object to be pushed, where the target object to be pushed is determined at least according to the screen capture image.
Optionally, the apparatus may further include: the first judgment module is used for judging whether the control instruction contains a preset target instruction or not before displaying a screen capture image obtained by capturing the content currently played by the equipment; and the first display sub-module is used for displaying a screen capture image obtained by capturing the content currently played by the equipment if the judgment result is yes.
Optionally, the control instruction includes voice information, and the first judgment module may include: the recognition module is used for carrying out voice recognition on the voice information to obtain text information corresponding to the voice information; the analysis module is used for carrying out semantic analysis on the text information based on the spoken language understanding model to obtain an analysis result, wherein the spoken language understanding model is used for obtaining semantic representation of the voice information based on the text information; and the determining module is used for determining whether the voice information contains target semantics according to the analysis result, wherein the target semantics are the semantics representing the target instruction.
Optionally, the apparatus may further include: the trigger module is used for receiving a trigger instruction before receiving a control instruction containing a target instruction, pausing the currently played content of the equipment according to the trigger instruction and detecting the control instruction; or the pause module is used for detecting the control instruction in real time, pausing the content currently played by the equipment when the control instruction is detected, and performing the step of analyzing the control instruction received by the equipment.
Optionally, the apparatus may further include: the device comprises a detection module, a display module and a display module, wherein the detection module is used for detecting a determination instruction before a target object to be pushed is displayed, and the determination instruction is used for determining a screen capture image; the screen capture module is used for intercepting the content currently played by the equipment to obtain a screen capture image if a determination instruction is received; and the adjusting module is used for adjusting the current playing position of the equipment according to the position adjusting instruction if the position adjusting instruction is received, wherein the position adjusting instruction comprises a forward instruction or a backward instruction.
Optionally, the second display module may include: the starting module is used for triggering the preset application to start; the searching module is used for searching according to the screenshot image through a preset application to obtain a searching result, wherein the searching result comprises image information of the target object and attribute information of the target object; and the second display submodule is used for displaying the search result.
Optionally, the control instruction includes commodity information of a specified commodity, and the apparatus may further include: the positioning module is used for positioning in the screen capture image according to the commodity information of the specified commodity before the target object to be pushed is displayed to obtain a target area of the screen capture image; the second display module may further include: and the determining submodule is used for determining a target object to be pushed according to the target area of the screen capture image.
Optionally, the commodity information of the specified commodity includes at least one of: commodity name, commodity type, and commodity use subject.
Optionally, the second display module may include: the suspension display module is used for displaying a suspension interface on a current screen of the equipment; and the suspension display submodule is used for displaying the image information and the attribute information of the target object in the suspension interface.
Optionally, the second display module may include: and the skipping module is used for skipping to a display interface corresponding to a preset application and displaying the image information and the attribute information of the target object through the display interface.
Optionally, the apparatus may further include: and the third display module is used for displaying the target commodity, wherein the target commodity is determined according to the selection instruction under the condition that the attribute information comprises the identification of the target object, and the selection instruction comprises the identification of at least one target object.
Optionally, the apparatus may further include: the second judgment module is used for judging whether a preset application is logged in or not after the target commodity is displayed; and the fourth display module is used for displaying the information identifier for resource circulation under the condition that the preset application is logged in.
It should be noted here that the receiving module 702, the first display module 704 and the second display module 706 correspond to steps S202 to S206 in embodiment 2; the examples and application scenarios implemented by the three modules are the same as those of the corresponding steps, but are not limited to the disclosure in embodiment 2. It should also be noted that the above modules may run, as part of the apparatus, in the computer device 10 provided in embodiment 1.
Example 5
Embodiments of the present application may provide a computer device that may be any one of a group of computer devices. Optionally, in this embodiment, the computer device may also be replaced with a terminal device such as a mobile terminal.
Optionally, in this embodiment, the computer device may be located in at least one network device of a plurality of network devices of a computer network.
In this embodiment, the computer device described above may execute program codes of the following steps in the method of information processing of an application program: analyzing a control instruction received by the equipment, and detecting whether the control instruction contains a preset target instruction or not; determining that the control instruction contains a target instruction, and intercepting the content currently played by the equipment to obtain a screen capture image; and determining a target object to be pushed according to the screen capture image.
Optionally, fig. 8 is a block diagram of a computer device according to an embodiment of the present application. As shown in fig. 8, computer device A may include: one or more processors 102 (only one is shown) and a memory 104.
The memory may be used to store software programs and modules, such as the program instructions/modules corresponding to the information processing method and apparatus in the embodiments of the present application; the processor executes various functional applications and data processing, i.e. implements the information processing method, by running the software programs and modules stored in the memory. The memory may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some instances, the memory may further include memory located remotely from the processor, which may be connected to computer device A via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The processor can call the information and the application program stored in the memory through the transmission module to execute the following steps: analyzing a control instruction received by the equipment, and detecting whether the control instruction contains a preset target instruction or not; determining that the control instruction contains a target instruction, and intercepting the content currently played by the equipment to obtain a screen capture image; and determining a target object to be pushed according to the screen capture image.
Optionally, the processor may further execute the program code of the following steps: the control instruction includes voice information, analyzes the control instruction received by the device, detects whether the control instruction contains a preset target instruction, and includes: carrying out voice recognition on the voice information to obtain text information corresponding to the voice information; performing semantic analysis on the text information based on a spoken language understanding model to obtain an analysis result, wherein the spoken language understanding model is used for obtaining semantic representation of the voice information based on the text information; and determining whether the voice information contains target semantics according to the analysis result, wherein the target semantics are semantics representing target instructions.
Optionally, the processor may further execute the program code of the following steps: before analyzing the control instruction received by the device, the method may further include: receiving a trigger instruction, pausing the currently played content of the equipment according to the trigger instruction, and detecting a control instruction; or detecting the control instruction in real time, pausing the content currently played by the equipment when the control instruction is detected, and performing analysis on the control instruction received by the equipment.
Optionally, the processor may further execute the program code of the following steps: before determining the target object to be pushed according to at least the screen capture image, the method may further include: detecting a determination instruction, wherein the determination instruction is used for determining a screen capture image; if a determination instruction is received, a step of intercepting the content currently played by the equipment to obtain a screen capture image is carried out; and if a position adjusting instruction is received, adjusting the current playing position of the equipment according to the position adjusting instruction, wherein the position adjusting instruction comprises a forward instruction or a backward instruction.
Optionally, the processor may further execute the program code of the following steps: determining a target object to be pushed according to at least the screen shot image, comprising: triggering a preset application to start; searching according to the screen capture image through a preset application to obtain a search result, wherein the search result comprises image information and attribute information of the target object.
Optionally, the processor may further execute the program code of the following steps: the control instruction comprises commodity information of a specified commodity, and before the target object to be pushed is determined at least according to the screen capture image, the method can further comprise the following steps: positioning in the screen shot image according to the commodity information of the specified commodity to obtain a target area of the screen shot image; determining a target object to be pushed according to at least the screen shot image comprises: and determining a target object to be pushed according to the target area of the screen shot image.
Optionally, the processor may further execute the program code of the following steps: the commodity information of the specified commodity includes at least one of: commodity name, commodity type, and commodity use subject.
Optionally, the processor may further execute the program code of the following steps: after determining the target object to be pushed according to at least the screen shot image, the method may further include: displaying the search result, wherein the step of displaying the search result comprises: displaying a floating interface on a current screen of the device; and displaying the search result in the suspension interface.
Optionally, the processor may further execute the program code of the following steps: after determining the target object to be pushed according to at least the screen shot image, the method may further include: displaying the search result, wherein the step of displaying the search result comprises: and jumping to a display interface corresponding to a preset application, and displaying the search result through the display interface.
Optionally, the processor may further execute the program code of the following steps: the search result further includes an identification of the target object, and after displaying the search result, the method may further include: receiving a selection instruction, wherein the selection instruction comprises an identifier of at least one target object; and determining the target commodity according to the selection instruction.
Optionally, the processor may further execute the program code of the following steps: after determining the target product according to the selection instruction, the method may further include: judging whether a preset application is logged in; and displaying an information identifier for resource circulation when the preset application is logged in.
It can be understood by those skilled in the art that the structure shown in fig. 8 is only an illustration, and the computer device may also be a terminal device such as a smart phone (e.g., an Android phone, an iOS phone, etc.), a tablet computer, a palmtop computer, a Mobile Internet Device (MID), a PAD, and the like. Fig. 8 does not limit the structure of the electronic device. For example, computer device 10 may also include more or fewer components (e.g., network interfaces, display devices, etc.) than shown in FIG. 8, or have a different configuration than shown in FIG. 8.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing hardware associated with the terminal device, where the program may be stored in a computer-readable storage medium, and the storage medium may include: flash disks, Read-Only memories (ROMs), Random Access Memories (RAMs), magnetic or optical disks, and the like.
Example 6
Embodiments of the present application also provide a storage medium. Optionally, in this embodiment, the storage medium may be configured to store program codes executed by the information processing method provided in the first or second embodiment.
Optionally, in this embodiment, the storage medium may be located in any one of computer devices in a computer device group in a computer network, or in any one of mobile terminals in a mobile terminal group.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps: analyzing a control instruction received by the equipment, and detecting whether the control instruction contains a preset target instruction or not; determining that the control instruction contains a target instruction, and intercepting the content currently played by the equipment to obtain a screen capture image; and determining a target object to be pushed according to the screen capture image.
In the embodiments of the present application, by running the program code stored in the storage medium, the user's voice instruction is analyzed based on speech recognition technology and a spoken language understanding model, the video or image being played on the television is captured, the target object is located using image target detection and localization technology, product images identical or similar to the target object are determined using search-by-image technology, and a payment and purchase service is finally provided for the user through a preset application. This greatly facilitates the user, provides a convenient and fast purchase service, achieves the purpose of optimizing the shopping experience, and solves the technical problem in the related art that a television cannot recognize a product in the played content and therefore cannot provide product information to the user.
Example 7
According to an embodiment of the present application, there is also provided an information processing system, including:
a processor; and
a memory coupled to the processor for providing instructions to the processor for processing the following processing steps: analyzing a control instruction received by the equipment, and detecting whether the control instruction contains a preset target instruction or not; determining that the control instruction contains a target instruction, and intercepting the content currently played by the equipment to obtain a screen capture image; and determining a target object to be pushed according to the screen capture image.
It should be noted that, reference may be made to the relevant description in embodiment 1 for alternative or preferred embodiments of this embodiment, and details are not described here again.
Example 8
According to an embodiment of the present application, there is also provided a sound box with a screen, as shown in fig. 9, the sound box includes:
a display screen 90;
and the processor 92 is used for analyzing the control instruction received by the sound box, detecting whether the control instruction contains a preset target instruction, intercepting the currently played content of the sound box to obtain a screen capture image if the control instruction contains the target instruction, and determining a target object to be pushed at least according to the screen capture image.
In an alternative, the control instruction may be a voice instruction, a gesture instruction, a remote control instruction, or the like. When the control instruction is a voice instruction, the sound box of this embodiment may be configured with a voice receiver for picking up the user's voice information; when the control instruction is a gesture instruction, the sound box may be provided with a front-facing image acquisition device, such as a camera, for capturing gesture images of the user; when the control instruction is a remote control instruction, the sound box may be provided with a wireless receiver for receiving the remote control signal.
It should be noted that the target instruction may be a keyword in a voice instruction, such as "I want to buy" or "the same style", and a gesture instruction may be one of a series of fixed actions pre-stored in the sound box. Whatever form it takes, the control instruction needs to be matched against the target instruction preset in the sound box to judge whether the control instruction received by the sound box meets the condition for starting the next operation, so as to filter out instructions sent by the user by mistake or instructions issued by an unauthorized third party.
For example, the sound box plays a video on its display screen, and when the user sees his or her idol, the user can say "I want to buy the VR glasses of a certain scientist", make a screen-freeze gesture, for example forming a ring with the fingers, or freeze the picture with a remote controller. After receiving the control instruction, the sound box matches it against the preset target instruction to judge whether to start the next operation.
When the sound box determines that the control instruction contains the preset target instruction, i.e. that the control instruction meets the preset condition, the sound box starts a screen capture program and captures the currently played content to obtain a screenshot image, which is an image containing the relevant commodities.
In an alternative, the method for determining the target object to be pushed may be a search-by-image method; the target object may be one commodity image or a series of commodity images containing the commodity the user desires to purchase, and the results may entirely or partially cover the current screenshot image in the form of a list.
Searching by image means that, based on deep learning technology, a given image is used to find the same or similar images in a database, which improves the convenience of image retrieval.
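A minimal sketch of such search-by-image matching, assuming a pre-computed catalogue of deep-feature embeddings compared by cosine similarity, is given below; the embed() placeholder stands in for any feature extractor and is not specified by this embodiment.

```python
# Minimal sketch of search-by-image matching: compare a deep-feature embedding
# of the query image against pre-computed embeddings of catalogue images using
# cosine similarity. embed() is a placeholder for any feature extractor.
import numpy as np

def embed(image) -> np.ndarray:
    """Stand-in for a CNN/transformer feature extractor returning a 1-D vector."""
    raise NotImplementedError

def most_similar(query_image, catalogue: dict, top_k: int = 5) -> list:
    """catalogue maps product id -> pre-computed 1-D embedding (np.ndarray)."""
    q = embed(query_image)
    q = q / np.linalg.norm(q)
    scored = []
    for product_id, vec in catalogue.items():
        v = vec / np.linalg.norm(vec)
        scored.append((float(np.dot(q, v)), product_id))    # cosine similarity
    scored.sort(reverse=True)
    return [pid for _, pid in scored[:top_k]]
```

For a large catalogue, an approximate nearest-neighbour index would normally replace the linear scan.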
It is easy to note that search engines in the related art generally provide a search-by-image function. However, there is no specific technical solution that combines such a search engine with a television for product search, which makes shopping inconvenient for users. In order to optimize the shopping experience and meet the strong demand of many users for buying the same-style items seen on celebrities, in TV dramas and in movies, this embodiment provides a same-style shopping function on the smart television based on a control instruction: a screenshot is taken of the video or image played by the television, the target object to be pushed is determined according to the captured image, and a convenient same-style shopping service is provided for the user.
In an alternative embodiment, the smart television performs the above steps. While watching a TV drama played on the smart television, the user notices that the tableware used by the protagonist is very exquisite and issues the voice control instruction "I want to buy the tableware from a certain TV drama" to the television. After receiving the voice instruction, the smart television detects that it contains the preset target instruction "I want to buy", captures the currently played content, and forwards the screenshot image to the server. The server performs the computation in the background, determines one or more items of tableware that are the same as or similar to the tableware in the screenshot image, returns them to the television in list form, and overlays them on the currently played television picture for the user to select and confirm.
In the above embodiment, the control instruction received by the sound box is analyzed first, and whether it contains the preset target instruction is detected; when the control instruction is determined to contain the target instruction, the content currently played by the sound box is captured to obtain a screenshot image, and the target object to be pushed is then determined according to the screenshot image. Compared with the prior art, this scheme takes a screenshot of the video or image being played based on the user's control instruction, determines product images similar to the captured product using search-by-image technology, and provides a convenient and fast same-style purchase service for the user, thereby optimizing the shopping experience and solving the technical problem in the related art that a television cannot recognize a product in the played content and therefore cannot provide product information to the user.
It should be noted that the processor in this embodiment may also execute other steps in embodiment 1, which is not described herein again.
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present application, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one type of division of logical functions, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be substantially implemented or contributed to by the prior art, or all or part of the technical solution may be embodied in a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program codes.
The foregoing is only a preferred embodiment of the present application and it should be noted that those skilled in the art can make several improvements and modifications without departing from the principle of the present application, and these improvements and modifications should also be considered as the protection scope of the present application.

Claims (20)

1. A method for processing information, comprising:
analyzing a control instruction received by equipment, and detecting whether the control instruction contains a preset target instruction or not;
determining that the control instruction contains the target instruction, and intercepting the content currently played by the equipment to obtain a screen capture image;
and determining a target object to be pushed according to the screen capture image.
2. The method of claim 1, wherein the control command comprises voice information, analyzing the control command received by the device, and detecting whether the control command contains a preset target command comprises:
performing voice recognition on the voice information to obtain text information corresponding to the voice information;
performing semantic analysis on the text information based on a spoken language understanding model to obtain an analysis result, wherein the spoken language understanding model is used for obtaining semantic representation of the voice information based on the text information;
and determining whether the voice information contains target semantics according to the analysis result, wherein the target semantics are semantics representing the target instructions.
3. The method of claim 1, wherein prior to analyzing the control instructions received by the device, the method further comprises:
receiving a trigger instruction, pausing the content currently played by the equipment according to the trigger instruction, and detecting the control instruction; or
And detecting a control instruction in real time, pausing the content currently played by the equipment when the control instruction is detected, and analyzing the control instruction received by the equipment.
4. The method of claim 1, wherein prior to determining a target object to be pushed based at least on the screenshot image, the method further comprises:
detecting a determination instruction, wherein the determination instruction is used for determining the screen capture image;
if a determination instruction is received, a step of intercepting the content currently played by the equipment to obtain a screen capture image is carried out;
and if a position adjusting instruction is received, adjusting the current playing position of the equipment according to the position adjusting instruction, wherein the position adjusting instruction comprises a forward instruction or a backward instruction.
5. The method of claim 1, wherein determining a target object to be pushed based at least on the screenshot image comprises:
triggering a preset application to start;
and searching according to the screenshot image through the preset application to obtain a search result, wherein the search result comprises image information and attribute information of the target object.
6. The method according to claim 1, wherein the control instruction includes commodity information of a specified commodity, and before determining the target object to be pushed based on at least the screen shot image, the method further comprises:
positioning in the screen shot image according to the commodity information of the specified commodity to obtain a target area of the screen shot image;
determining a target object to be pushed according to at least the screen shot image comprises: and determining a target object to be pushed according to the target area of the screen capture image.
7. The method according to claim 6, wherein the commodity information of the specified commodity includes at least one of: commodity name, commodity type, and commodity use subject.
8. The method of claim 5, wherein after determining a target object to be pushed based at least on the screenshot image, the method further comprises: displaying the search result, wherein displaying the search result comprises:
displaying a hover interface on a current screen of the device;
and displaying the search result in the suspension interface.
9. The method of claim 5, wherein after determining a target object to be pushed based at least on the screenshot image, the method further comprises: displaying the search result, wherein displaying the search result comprises:
and jumping to a display interface corresponding to the preset application, and displaying the search result through the display interface.
10. The method of claim 8 or 9, wherein the search results further include an identification of the target object, and wherein after displaying the search results, the method further comprises:
receiving a selection instruction, wherein the selection instruction comprises an identifier of at least one target object;
and determining the target commodity according to the selection instruction.
11. The method of claim 10, wherein after determining a target item according to the selection instruction, the method further comprises:
judging whether the preset application is logged in or not;
and displaying an information identifier for resource circulation under the condition that the preset application is logged in.
12. A method for controlling a video playback device, comprising:
receiving a control instruction containing a target instruction in the process of playing video information by equipment;
displaying a screenshot image obtained by intercepting the content currently played by the equipment;
and displaying a target object to be pushed, wherein the target object to be pushed is determined at least according to the screen capture image.
13. The method of claim 12, wherein displaying the target object to be pushed comprises:
displaying a hover interface on a current screen of the device;
and displaying the image information and the attribute information of the target object in the suspension interface.
14. The method of claim 12, wherein displaying the target object to be pushed comprises:
and jumping to a display interface corresponding to a preset application, and displaying the image information and the attribute information of the target object through the display interface.
15. The method according to claim 13 or 14,
and displaying a target commodity, wherein the target commodity is determined according to a selection instruction under the condition that the attribute information comprises the identification of the target object, and the selection instruction comprises at least one identification of the target object.
16. An apparatus for processing information, comprising:
the analysis module is used for analyzing the control instruction received by the equipment and detecting whether the control instruction contains a preset target instruction or not;
the intercepting module is used for determining that the control instruction contains the target instruction and intercepting the content currently played by the equipment to obtain a screen-shot image;
and the determining module is used for determining a target object to be pushed according to the screen capture image.
17. A storage medium, characterized in that the storage medium includes a stored program, wherein, when the program runs, a device in which the storage medium is located is controlled to execute the information processing method according to any one of claims 1 to 11.
18. A processor, characterized in that the processor is configured to run a program, wherein the program is configured to execute the information processing method according to any one of claims 1 to 11 when running.
19. A system for processing information, comprising:
a processor; and
a memory coupled to the processor for providing instructions to the processor for processing the following processing steps:
analyzing a control instruction received by equipment, and detecting whether the control instruction contains a preset target instruction or not;
determining that the control instruction contains the target instruction, and intercepting the content currently played by the equipment to obtain a screen capture image;
and determining a target object to be pushed according to the screen capture image.
20. A sound box with a screen is characterized by comprising:
a display screen;
and the processor analyzes the control instruction received by the sound box, detects whether the control instruction contains a preset target instruction, intercepts the currently played content of the sound box to obtain a screen capture image if the control instruction contains the target instruction, and determines a target object to be pushed at least according to the screen capture image.
CN201911014049.1A 2019-10-23 2019-10-23 Information processing method, device and system and control method of video playing equipment Pending CN112702638A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911014049.1A CN112702638A (en) 2019-10-23 2019-10-23 Information processing method, device and system and control method of video playing equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911014049.1A CN112702638A (en) 2019-10-23 2019-10-23 Information processing method, device and system and control method of video playing equipment

Publications (1)

Publication Number Publication Date
CN112702638A true CN112702638A (en) 2021-04-23

Family

ID=75505204

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911014049.1A Pending CN112702638A (en) 2019-10-23 2019-10-23 Information processing method, device and system and control method of video playing equipment

Country Status (1)

Country Link
CN (1) CN112702638A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104699800A (en) * 2015-03-19 2015-06-10 深圳市米家互动网络有限公司 Picture information searching method and system, remote controller and display terminal
CN108174303A (en) * 2017-12-29 2018-06-15 北京陌上花科技有限公司 A kind of data processing method and device for video-frequency playing content
CN109388319A (en) * 2018-10-19 2019-02-26 广东小天才科技有限公司 A kind of screenshot method, screenshot device, storage medium and terminal device
CN109348275A (en) * 2018-10-30 2019-02-15 百度在线网络技术(北京)有限公司 Method for processing video frequency and device
CN110059207A (en) * 2019-04-04 2019-07-26 Oppo广东移动通信有限公司 Processing method, device, storage medium and the electronic equipment of image information

Similar Documents

Publication Publication Date Title
US11288303B2 (en) Information search method and apparatus
CN107613400B (en) Method and device for realizing voice barrage
CN110149549B (en) Information display method and device
CN109829064B (en) Media resource sharing and playing method and device, storage medium and electronic device
CN105657535A (en) Audio recognition method and device
KR20160104635A (en) Methods, systems, and media for generating search results based on contextual information
CN105874451A (en) Methods, systems, and media for presenting supplemental information corresponding to on-demand media content
KR101511297B1 (en) Apparatus and method for generating information about object and, server for shearing information
CN103929666B (en) A kind of continuous speech exchange method and device
US10628469B2 (en) Information processing method and electronic device
CN112653902A (en) Speaker recognition method and device and electronic equipment
CN105847874A (en) Live broadcasting device and live broadcasting terminal
CN105898413A (en) Television program recommendation method, television and recommendation server
CN109509472A (en) Method, apparatus and system based on voice platform identification background music
CN103369126A (en) Song requesting method
CN107809654A (en) System for TV set and TV set control method
CN108574878B (en) Data interaction method and device
EP2869546B1 (en) Method and system for providing access to auxiliary information
US20160292766A1 (en) Devices And Methods For Acquiring Data Comparison Information
CN112752134B (en) Video processing method and device, storage medium and electronic device
CN111161729A (en) Voice interaction method and device for intelligent self-service equipment
CN112702638A (en) Information processing method, device and system and control method of video playing equipment
KR20200024538A (en) Method of recommending of information related to an image searching and service device thereof
CN112073738B (en) Information processing method and device
CN114090896A (en) Information display method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210423