WO2020088048A1 - Method and apparatus for processing information
- Publication number
- WO2020088048A1 (PCT/CN2019/101686)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- video
- feature vector
- presentation
- target
- historical
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/74—Browsing; Visualisation therefor
Definitions
- the embodiments of the present application relate to the field of computer technology, for example, to a method and apparatus for processing information.
- the videos presented to a user may include similar videos. Browsing similar videos within a short time may annoy the user, so the presentation frequency of similar videos needs to be controlled.
- the embodiments of the present application propose a method and device for processing information.
- an embodiment of the present application provides a method for processing information, including: acquiring a historical presentation video corresponding to a target user, where the historical presentation video is a video that was output, within a historical time period, to the target user terminal used by the target user for the target user to browse; extracting video frames from the historical presentation video, and inputting the extracted video frames into a pre-trained vector conversion model to obtain feature vectors; determining, based on the obtained feature vectors, the feature vector corresponding to the historical presentation video; determining, from a predetermined candidate feature vector set, a candidate feature vector whose similarity to the feature vector corresponding to the historical presentation video is greater than or equal to a preset threshold, and using that candidate feature vector as the target feature vector, where the candidate feature vectors in the candidate feature vector set correspond to the presentation videos in a predetermined presentation video set; and selecting, from the presentation video set, the presentation video corresponding to the target feature vector.
- an embodiment of the present application provides an apparatus for processing information, including: a video acquisition unit configured to acquire a historical presentation video corresponding to a target user, where the historical presentation video is a video that was output, within a historical time period, to the target user terminal used by the target user for the target user to browse; a vector generation unit configured to extract video frames from the historical presentation video and input the extracted video frames into a pre-trained vector conversion model to obtain feature vectors; a first determination unit configured to determine, based on the obtained feature vectors, the feature vector corresponding to the historical presentation video; a second determination unit configured to determine, from a predetermined candidate feature vector set, a candidate feature vector whose similarity to the feature vector corresponding to the historical presentation video is greater than or equal to a preset threshold, and to use that candidate feature vector as the target feature vector, where the candidate feature vectors in the candidate feature vector set correspond to the presentation videos in a predetermined presentation video set.
- an embodiment of the present application provides a server, including: at least one processor; and a storage device on which at least one program is stored, where the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method of any embodiment of the foregoing method for processing information.
- an embodiment of the present application provides a computer-readable medium on which a computer program is stored, where the program, when executed by a processor, implements the method of any embodiment of the foregoing method for processing information.
- FIG. 1 is an exemplary system architecture diagram to which an embodiment of the present application can be applied;
- FIG. 2 is a flowchart of an embodiment of a method for processing information according to the present application;
- FIG. 3 is a schematic diagram of an application scenario of a method for processing information according to an embodiment of the present application;
- FIG. 4 is a flowchart of still another embodiment of a method for processing information according to the present application.
- FIG. 5 is a schematic structural diagram of an embodiment of an apparatus for processing information according to the present application.
- FIG. 6 is a schematic structural diagram of a computer system suitable for implementing the server of the embodiment of the present application.
- FIG. 1 shows an exemplary system architecture 100 of an embodiment of an information processing method or an apparatus for processing information to which the present application can be applied.
- the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105.
- the network 104 is a medium used to provide a communication link between the terminal devices 101, 102, 103 and the server 105.
- the network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, and so on.
- the user can use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages, and so on.
- Various communication client applications such as video processing software, web browser applications, search applications, social platform software, instant communication tools, etc., can be installed on the terminal devices 101, 102, and 103.
- the terminal devices 101, 102, and 103 may be hardware or software.
- when the terminal devices 101, 102, and 103 are hardware, they may be various electronic devices that have display screens and support video processing, including but not limited to smartphones, tablets, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptops, and desktop computers.
- when the terminal devices 101, 102, and 103 are software, they can be installed in the electronic devices listed above. They can be implemented as multiple pieces of software or software modules (for example, software or software modules used to provide distributed services), or as a single piece of software or software module. No specific limitation is imposed here.
- the server 105 may be a server that provides various services, for example, a back-end server that supports the video for presentation displayed on the terminal devices 101, 102, and 103.
- the back-end server can acquire the historical presentation video displayed on a terminal device, and analyze and process the acquired historical presentation video and other data to obtain a processing result (for example, the presentation video corresponding to the target feature vector).
- the method for processing information provided by the embodiments of the present application is generally executed by the server 105, and accordingly, the device for processing information is generally provided in the server 105.
- the server can be hardware or software.
- when the server is hardware, it can be implemented as a distributed server cluster composed of multiple servers, or as a single server.
- when the server is software, it can be implemented as multiple pieces of software or software modules (for example, software or software modules for providing distributed services), or as a single piece of software or software module. No specific limitation is imposed here.
- the numbers of terminal devices, networks, and servers in FIG. 1 are only schematic. According to the implementation needs, there can be any number of terminal devices, networks and servers.
- the above system architecture may not include the network and terminal equipment, but only include the server.
- FIG. 2 shows a flow 200 of an embodiment of a method for processing information according to the present application.
- the method for processing information includes steps 201 to 205.
- Step 201: the historical presentation video corresponding to the target user is obtained.
- the execution subject of the information processing method may obtain the video for historical presentation corresponding to the target user through a wired connection or a wireless connection.
- the video for historical presentation may be a video that is output to the target user terminal used by the target user within the historical time period for the target user to browse.
- the historical time period may be a preset time period, for example, March 2018; or it may be the time period that starts when a presentation video was first output to the target user terminal used by the target user and ends when a presentation video was last output to that terminal.
- the target user is a user for whom a presentation video similar to that user's historical presentation video is to be determined from the presentation video set.
- the video set for presentation may be a predetermined video set.
- a presentation video is a video that is output to a communicatively connected terminal to be presented to a user.
- the above-mentioned execution subject can obtain the historical presentation video corresponding to the target user stored in advance locally; or, the above-mentioned execution subject can acquire the historical presentation video corresponding to the target user sent by the target user terminal.
- the above-mentioned execution subject can obtain at least one historical presentation video corresponding to the target user and, for each of the acquired historical presentation videos, perform the subsequent steps 202 to 205.
- Step 202: video frames are extracted from the historical presentation video, and the extracted video frames are input into a pre-trained vector conversion model to obtain a feature vector.
- the above-mentioned execution subject may extract video frames from the video for historical presentation, and input the extracted video frames into a pre-trained vector conversion model to obtain a feature vector.
- the obtained feature vector may be used to characterize the input video frame.
- the video is essentially a sequence of video frames arranged in chronological order.
- the above-mentioned execution subject may use various methods to extract video frames from the historical presentation video. For example, it may extract video frames at random from the historical presentation video; or it may extract the video frames located at preset positions in the video frame sequence corresponding to the historical presentation video.
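The two extraction strategies named above can be sketched as follows. This is a minimal illustration, not the application's implementation: the function names `sample_random_frames` and `sample_frames_at_positions` are hypothetical, and a video is assumed to be already decoded into an in-memory sequence of frames.

```python
import random
from typing import List, Optional, Sequence, TypeVar

Frame = TypeVar("Frame")


def sample_random_frames(
    frames: Sequence[Frame], count: int, seed: Optional[int] = None
) -> List[Frame]:
    """Randomly pick `count` distinct frames, returned in chronological order."""
    if count > len(frames):
        raise ValueError("count exceeds the number of available frames")
    rng = random.Random(seed)
    # Sample indices without replacement, then sort to keep the time order.
    indices = sorted(rng.sample(range(len(frames)), count))
    return [frames[i] for i in indices]


def sample_frames_at_positions(
    frames: Sequence[Frame], positions: Sequence[int]
) -> List[Frame]:
    """Pick the frames at preset positions, skipping positions past the end."""
    return [frames[p] for p in positions if 0 <= p < len(frames)]
```

Random sampling needs no knowledge of the video length in advance, while preset positions make the extraction reproducible across runs; which one suits a deployment is not specified by the source.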
- the vector conversion model is a model for extracting the features of the video frame, which can be used to characterize the correspondence between the video frame and the feature vector corresponding to the video frame.
- the vector conversion model may include a structure for extracting image features (such as a convolutional layer), and of course may include other structures (such as a pooling layer).
- the above-mentioned execution subject may extract at least two video frames from the historical presentation video, and input each of the at least two video frames into the pre-trained vector conversion model to obtain at least two feature vectors.
- this implementation supports the subsequent use of at least two feature vectors to determine the feature vector corresponding to the historical presentation video, which can improve the accuracy of the determined feature vector.
- Step 203: based on the feature vector, the feature vector corresponding to the historical presentation video is determined.
- the execution subject may determine the feature vector corresponding to the video for historical presentation.
- the feature vector corresponding to the video for historical presentation can be used to characterize the feature of the video for historical presentation.
- the above-mentioned execution subject may determine the feature vector corresponding to the video for historical presentation by various methods.
- when a single feature vector is obtained, the execution subject may directly determine that feature vector as the feature vector corresponding to the historical presentation video, or may process the feature vector (for example, multiply it by a preset value) and determine the processed feature vector as the feature vector corresponding to the historical presentation video; when at least two feature vectors are obtained, the execution subject may process the at least two feature vectors (for example, compute their mean) and determine the processing result as the feature vector corresponding to the historical presentation video.
- the above-mentioned execution subject may sum the at least two feature vectors, and use the summed result as the feature vector corresponding to the video for historical presentation.
- determining the feature vector corresponding to the historical presentation video from the feature vectors of at least two video frames adds reference data and thereby improves the accuracy of the determined feature vector.
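The combination of per-frame feature vectors into one video-level vector, by summation or by mean as described above, can be sketched as follows. The function name `aggregate_feature_vectors` and the `mode` parameter are illustrative assumptions; feature vectors are assumed to be plain lists of floats of equal length.

```python
from typing import List, Sequence


def aggregate_feature_vectors(
    frame_vectors: Sequence[Sequence[float]], mode: str = "sum"
) -> List[float]:
    """Combine per-frame feature vectors into one video-level feature vector."""
    if not frame_vectors:
        raise ValueError("at least one feature vector is required")
    dim = len(frame_vectors[0])
    if any(len(v) != dim for v in frame_vectors):
        raise ValueError("all feature vectors must have the same dimension")
    # Element-wise sum across all frame vectors.
    totals = [sum(v[i] for v in frame_vectors) for i in range(dim)]
    if mode == "sum":
        return totals
    if mode == "mean":
        return [t / len(frame_vectors) for t in totals]
    raise ValueError("mode must be 'sum' or 'mean'")
```

For example, `aggregate_feature_vectors([[1.0, 2.0], [3.0, 4.0]])` returns `[4.0, 6.0]`, and the same call with `mode="mean"` returns `[2.0, 3.0]`.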
- Step 204: a candidate feature vector whose similarity to the feature vector corresponding to the historical presentation video is greater than or equal to a preset threshold is determined from a predetermined candidate feature vector set, and that candidate feature vector is used as the target feature vector.
- the execution subject may determine, from the predetermined candidate feature vector set, a candidate feature vector whose similarity to the feature vector corresponding to the historical presentation video is greater than or equal to a preset threshold, as the target feature vector.
- the preset threshold may be a value preset by a technician.
- the above-mentioned execution subject may adopt various methods to determine the target feature vector. For example, it can compute the similarity between the feature vector corresponding to the historical presentation video and each candidate feature vector in the candidate feature vector set, compare each result with the preset threshold, and use every candidate feature vector whose result is greater than or equal to the preset threshold as a target feature vector.
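The compare-against-threshold procedure above can be sketched as follows. The source does not name a similarity measure, so the inner product used here is an assumption, as are the names `dot_similarity` and `select_target_vectors` and the use of a dictionary keyed by a hypothetical video identifier.

```python
from typing import Dict, List, Sequence


def dot_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two vectors (an assumed similarity measure)."""
    return sum(x * y for x, y in zip(a, b))


def select_target_vectors(
    query: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    threshold: float,
) -> List[str]:
    """Return the ids of candidates whose similarity to `query` meets the threshold."""
    return [
        video_id
        for video_id, vector in candidates.items()
        if dot_similarity(query, vector) >= threshold
    ]
```

Because each candidate is keyed by the identifier of its presentation video, the returned identifiers can be used directly to select the matching presentation videos from the presentation video set.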
- the candidate feature vector set is a predetermined set, and the candidate feature vector in the candidate feature vector set corresponds to the presentation video in the presentation video set.
- the candidate feature vector is used to characterize the feature of the video for presentation corresponding to the candidate feature vector in the video set for presentation.
- for each presentation video in the presentation video set, video frames can be extracted from the presentation video and input into the vector conversion model to obtain the feature vectors corresponding to the video frames; those feature vectors are then used to determine the candidate feature vector corresponding to the presentation video, and the candidate feature vectors so determined form the candidate feature vector set.
- the candidate feature vector set may be obtained by the following generation step: based on a target presentation video and an initial candidate feature vector set, perform the following determination step: extract video frames from the target presentation video and input the extracted video frames into the above vector conversion model to obtain the feature vectors corresponding to the video frames of the target presentation video; based on those feature vectors, determine the feature vector corresponding to the target presentation video; use the feature vector corresponding to the target presentation video as a candidate feature vector and add it to the predetermined initial candidate feature vector set to generate the candidate feature vector set after the addition; determine whether a new presentation video has been acquired; and, in response to determining that no new presentation video has been acquired, determine the candidate feature vector set after the addition as the candidate feature vector set.
- the target presentation video is the presentation video acquired in advance by the execution subject of the above-mentioned generation step.
- the initial candidate feature vector set may be a set without added candidate feature vectors, or may be a set with added candidate feature vectors.
- the above generation step may further include: in response to determining that a new presentation video has been acquired, using the new presentation video as the target presentation video, using the candidate feature vector set after the addition as the initial candidate feature vector set, and continuing to perform the above determination step.
- the acquired feature vector corresponding to the presentation video can be added to the initial candidate feature vector set in real time to generate a candidate feature vector set, which improves the comprehensiveness of the candidate feature vector set.
- the execution body of the above generation step may be the same as or different from the execution body of the method for processing information. If they are the same, the execution subject of the generating step may store the candidate feature vector set locally after obtaining the candidate feature vector set. If different, the execution subject of the generating step may send the candidate feature vector set to the execution subject of the method for processing information after obtaining the candidate feature vector set.
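The generation step can be read as a loop that keeps adding one candidate feature vector per presentation video until no new video arrives. The sketch below assumes a hypothetical `embed_video` callable that stands in for frame extraction, the vector conversion model, and the per-video aggregation; the name `build_candidate_set` and the id-to-vector dictionary are also illustrative.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Vector = List[float]


def build_candidate_set(
    videos: Iterable[Tuple[str, object]],
    embed_video: Callable[[object], Vector],
    initial: Optional[Dict[str, Vector]] = None,
) -> Dict[str, Vector]:
    """Add one candidate feature vector per presentation video to the initial set."""
    # Copy so that the caller's initial set is left unchanged.
    candidates: Dict[str, Vector] = dict(initial or {})
    # The loop ends when the iterable is exhausted, which plays the role of
    # "no new presentation video is acquired" in the generation step.
    for video_id, video in videos:
        candidates[video_id] = embed_video(video)
    return candidates
```

Passing a generator as `videos` lets newly acquired presentation videos be appended as they arrive, and an empty `initial` corresponds to an initial set with no candidate feature vectors added yet.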
- Step 205: the presentation video corresponding to the target feature vector is selected from the presentation video set.
- the execution subject may select the video for presentation corresponding to the target feature vector from the video set for presentation.
- because the target feature vector is a feature vector whose similarity to the feature vector corresponding to the historical presentation video is greater than or equal to the preset threshold, and the feature vector corresponding to a presentation video characterizes the features of that presentation video, the presentation video selected in this step is similar to the historical presentation video.
- the above-mentioned execution subject may further perform the following steps. First, the execution subject may determine the browsing time at which the target user browsed the historical presentation video on the target user terminal (for example, September 1, 2018). Then, after a preset duration (for example, one month) has elapsed from the browsing time, the execution subject may output the selected presentation video to the target user terminal.
- within the preset duration (for example, one month) from the browsing time, the execution subject may refrain from outputting the selected presentation video to the target user terminal.
- outputting the selected presentation video only after the preset duration controls the presentation frequency of similar presentation videos, which helps improve the user experience and the diversity of information processing.
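The delayed output described above reduces to a comparison between the current time and the browsing time plus the preset duration. A minimal sketch follows; the function names are hypothetical, and the "one month" in the example is approximated as 30 days, which is an assumption.

```python
from datetime import datetime, timedelta


def earliest_output_time(browsing_time: datetime, preset_duration: timedelta) -> datetime:
    """Earliest moment at which a similar presentation video may be output."""
    return browsing_time + preset_duration


def may_output(now: datetime, browsing_time: datetime, preset_duration: timedelta) -> bool:
    """True once the preset duration has elapsed since the browsing time."""
    return now >= earliest_output_time(browsing_time, preset_duration)
```

With a browsing time of September 1, 2018 and a 30-day duration, `may_output` is false on September 15, 2018 and true from October 1, 2018 onward.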
- FIG. 3 is a schematic diagram of an application scenario of the method for processing information according to this embodiment.
- the server 301 can obtain the historical presentation video 303 corresponding to the target user sent by the target user terminal 302, where the historical presentation video 303 is a video that the server 301 output to the target user terminal 302 within the past day for the target user to browse. Then, the server 301 can extract the video frame 3031 and the video frame 3032 from the historical presentation video 303, and input the extracted video frame 3031 and video frame 3032 into the pre-trained vector conversion model 304, respectively, to obtain the feature vector 3051 corresponding to the video frame 3031 and the feature vector 3052 corresponding to the video frame 3032.
- the server 301 may determine the feature vector 306 corresponding to the video 303 for historical presentation based on the obtained feature vector 3051 and feature vector 3052.
- the server 301 may acquire a predetermined candidate feature vector set 307, and determine, from the candidate feature vector set 307, a candidate feature vector whose similarity to the feature vector 306 corresponding to the historical presentation video 303 is greater than or equal to a preset threshold (e.g., "10") as the target feature vector 308.
- the server 301 can obtain the above-mentioned presentation video set 309 and select the presentation video 3091 corresponding to the target feature vector 308 from the presentation video set 309.
- the method provided by the above embodiment of the present application effectively uses the candidate feature vector set to determine the presentation video similar to the historical presentation video corresponding to the target user. This facilitates subsequent processing of the determined presentation video, such as controlling its presentation time, and improves the pertinence and diversity of information processing.
- FIG. 4 shows a flow 400 of yet another embodiment of a method of processing information.
- the flow 400 of the method for processing information includes steps 401 to 405.
- Step 401: the historical presentation video corresponding to the target user is obtained.
- the execution subject of the information processing method may obtain the video for historical presentation corresponding to the target user through a wired connection or a wireless connection.
- Step 402: video frames are extracted from the historical presentation video, and the extracted video frames are input into a pre-trained vector conversion model to obtain a feature vector.
- the above-mentioned execution subject may extract video frames from the historical presentation video, and input the extracted video frames into a pre-trained vector conversion model to obtain a feature vector.
- the obtained feature vector may be used to characterize the input video frame.
- Step 403: based on the feature vector, the feature vector corresponding to the historical presentation video is determined.
- the execution subject may determine the feature vector corresponding to the video for historical presentation.
- the feature vector corresponding to the video for historical presentation can be used to characterize the feature of the video for historical presentation.
- Step 404: the vector retrieval engine corresponding to the candidate feature vector set is used to run a retrieval for the feature vector corresponding to the historical presentation video, and the retrieved candidate feature vector is determined as the target feature vector.
- the execution subject may use a predetermined candidate feature vector set to construct a vector retrieval engine, then use the vector retrieval engine corresponding to the candidate feature vector set to run a retrieval for the feature vector corresponding to the historical presentation video, and determine the retrieved candidate feature vector as the target feature vector.
- a search engine refers to a system that uses specific computer programs to collect information according to a certain strategy, organizes and processes the information, and provides a search service.
- the vector retrieval engine refers to an engine that takes candidate feature vectors as retrieval targets.
- the above-mentioned execution subject can use the candidate feature vector set to construct the vector retrieval engine in various ways, for example, using the IVFADC algorithm.
- the conditions that the retrieval results of the vector retrieval engine must satisfy can be set, so that the similarity between each retrieved candidate feature vector and the feature vector corresponding to the historical presentation video is greater than or equal to a preset threshold.
- the candidate feature vector set may be generated by the above-mentioned execution subject, or may be generated by another electronic device and sent to the above-mentioned execution subject.
- when the candidate feature vector set is updated (that is, the feature vector corresponding to a new presentation video is added), the above-mentioned execution subject may update the vector retrieval engine.
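To make the build, retrieve, and update cycle of a vector retrieval engine concrete, the sketch below implements only a simplified inverted-file index in pure Python. It is not IVFADC: IVFADC additionally compresses vectors with product quantization and compares them by asymmetric distance computation, both omitted here. The class name `InvertedFileIndex`, the fixed `centroids`, the `nprobe` parameter, and the inner-product threshold are all assumptions for illustration.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def _dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def _sq_dist(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


class InvertedFileIndex:
    """Toy inverted-file index: vectors are bucketed by their nearest centroid."""

    def __init__(self, centroids: List[Vector]) -> None:
        if not centroids:
            raise ValueError("at least one centroid is required")
        self.centroids = centroids
        self.cells: List[List[Tuple[str, Vector]]] = [[] for _ in centroids]

    def _nearest_cells(self, vector: Vector, n: int) -> List[int]:
        # Rank cells by squared distance between the vector and each centroid.
        order = sorted(
            range(len(self.centroids)),
            key=lambda i: _sq_dist(vector, self.centroids[i]),
        )
        return order[:n]

    def add(self, video_id: str, vector: Vector) -> None:
        """Update the index with the candidate vector of a new presentation video."""
        cell = self._nearest_cells(vector, 1)[0]
        self.cells[cell].append((video_id, vector))

    def search(self, query: Vector, threshold: float, nprobe: int = 1) -> List[str]:
        """Scan only the `nprobe` nearest cells and keep hits meeting the threshold."""
        hits: List[str] = []
        for cell in self._nearest_cells(query, nprobe):
            for video_id, vector in self.cells[cell]:
                if _dot(query, vector) >= threshold:
                    hits.append(video_id)
        return hits
```

Scanning a few cells instead of the whole candidate set is what gives such an index its speed advantage over exhaustive comparison, at the cost of possibly missing candidates that fall in cells that were not probed.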
- Step 405: the presentation video corresponding to the target feature vector is selected from the presentation video set.
- the execution subject may select the video for presentation corresponding to the target feature vector from the video set for presentation.
- step 401, step 402, step 403, and step 405 are respectively the same as step 201, step 202, step 203, and step 205 in the foregoing embodiment; the above descriptions of step 201, step 202, step 203, and step 205 also apply to step 401, step 402, step 403, and step 405 and are not repeated here.
- the flow 400 of the method for processing information in this embodiment highlights the step of determining the target feature vector by using the vector retrieval engine. Therefore, this embodiment provides another solution for obtaining a target feature vector, and using a vector retrieval engine for information processing can obtain a faster processing speed and improve the efficiency of information processing.
- the present application provides an embodiment of an apparatus for processing information, which corresponds to the method embodiment shown in FIG. 2, and the apparatus may specifically be used in various electronic devices.
- the information processing apparatus 500 of this embodiment includes: a video acquisition unit 501, a vector generation unit 502, a first determination unit 503, a second determination unit 504, and a video selection unit 505.
- the video acquisition unit 501 is configured to acquire the historical presentation video corresponding to the target user, where the historical presentation video is a video that was output, within a historical time period, to the target user terminal used by the target user for the target user to browse;
- the vector generation unit 502 is configured to extract video frames from the historical presentation video, and input the extracted video frames into a pre-trained vector conversion model to obtain a feature vector;
- the first determination unit 503 is configured to determine, based on the feature vector, the feature vector corresponding to the historical presentation video;
- the second determination unit 504 is configured to determine, from a predetermined candidate feature vector set, a candidate feature vector whose similarity to the feature vector corresponding to the historical presentation video is greater than or equal to a preset threshold, and to use the candidate feature vector whose similarity is greater than or equal to the preset threshold as the target feature vector.
- the video acquisition unit 501 of the apparatus 500 for processing information may acquire the video for historical presentation corresponding to the target user through a wired connection or a wireless connection.
- the video for historical presentation may be a video that is output to the target user terminal used by the target user within the historical time period and used for browsing by the target user.
- the target user is a user for whom a presentation video similar to that user's historical presentation video is to be determined from the presentation video set.
- the video set for presentation may be a predetermined video set.
- a presentation video is a video that is output to a communicatively connected terminal to be presented to a user.
- the vector generation unit 502 can extract video frames from the historical presentation video, and input the extracted video frames into a pre-trained vector conversion model to obtain a feature vector.
- the obtained feature vector may be used to characterize the input video frame.
- the vector conversion model is a model for extracting the features of the video frame, which can be used to characterize the correspondence between the video frame and the feature vector corresponding to the video frame.
- the vector conversion model may include a structure for extracting image features (such as a convolutional layer), and of course may include other structures (such as a pooling layer).
- the first determining unit 503 may determine the feature vector corresponding to the video for historical presentation.
- the feature vector corresponding to the video for historical presentation can be used to characterize the feature of the video for historical presentation.
- the second determination unit 504 may determine, from a predetermined candidate feature vector set, a candidate feature vector whose similarity to the feature vector corresponding to the historical presentation video is greater than or equal to a preset threshold, and use the candidate feature vector whose similarity is greater than or equal to the preset threshold as the target feature vector.
- the preset threshold may be a value preset by a technician.
- the candidate feature vector set is a predetermined set, and the candidate feature vector in the candidate feature vector set corresponds to the presentation video in the presentation video set.
- the candidate feature vector is used to characterize the feature of the video for presentation corresponding to the candidate feature vector in the video set for presentation.
- the video selection unit 505 may select the video for presentation corresponding to the target feature vector from the video set for presentation.
- the candidate feature vector set corresponds to a vector retrieval engine; and the second determination unit 504 may be configured to: use the vector retrieval engine corresponding to the candidate feature vector set to search with the feature vector corresponding to the video for historical presentation, and determine the retrieved candidate feature vectors as target feature vectors.
- the apparatus 500 may further include: a time determination unit (not shown in the figure) configured to determine the browsing time at which the target user browsed the video for historical presentation using the target user terminal; and a video output unit (not shown in the figure) configured to output the selected video for presentation to the target user terminal after a preset duration has elapsed from the browsing time.
- the vector generation unit 502 may be configured to: extract at least two video frames from the video for historical presentation, and input the extracted video frames separately into the pre-trained vector conversion model to obtain at least two feature vectors.
- the first determining unit 503 may be configured to sum at least two feature vectors, and use the summed result as the feature vector corresponding to the video for historical presentation.
- the candidate feature vector set may be obtained by the following generation step: based on a target presentation video and an initial candidate feature vector set, perform the following determination step: extract video frames from the target presentation video, and input the extracted video frames into the vector conversion model to obtain the feature vectors corresponding to the video frames of the target presentation video; based on those feature vectors, determine the feature vector corresponding to the target presentation video; add the feature vector corresponding to the target presentation video, as a candidate feature vector, to the predetermined initial candidate feature vector set to generate an updated candidate feature vector set; determine whether a new presentation video has been acquired; and, in response to determining that no new presentation video has been acquired, determine the updated candidate feature vector set as the candidate feature vector set.
- the generation step may further include: in response to determining that a new presentation video has been acquired, using the new presentation video as the target presentation video and the updated candidate feature vector set as the initial candidate feature vector set, and continuing to perform the determination step.
- the apparatus 500 provided by the above embodiment of the present application effectively uses the candidate feature vector set to determine the presentation video similar to the historical presentation video corresponding to the target user, which is helpful for subsequent processing of the determined presentation video, For example, controlling the presentation time of the determined presentation video improves the pertinence and diversity of information processing.
- FIG. 6 shows a schematic structural diagram of a computer system 600 suitable for implementing the server of the embodiment of the present application.
- the server shown in FIG. 6 is only an example, and should not bring any limitation to the functions and usage scope of the embodiments of the present application.
- the computer system 600 includes a central processing unit (Central Processing Unit, CPU) 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (Read-Only Memory, ROM) 602 or a program loaded from the storage section 608 into a random access memory (Random Access Memory, RAM) 603.
- the RAM 603 also stores various programs and data necessary for the operation of the system 600.
- the CPU 601, ROM 602, and RAM 603 are connected to each other through a bus 604.
- An input / output (Input / Output, I / O) interface 605 is also connected to the bus 604.
- the following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, etc.; an output section 607 including a cathode ray tube (Cathode Ray Tube, CRT) or liquid crystal display (Liquid Crystal Display, LCD), etc., and a speaker, etc.; a storage section 608 including a hard disk, etc.; and a communication section 609 including a network interface card such as a local area network (Local Area Network, LAN) card, a modem, etc.
- the communication section 609 performs communication processing via a network such as the Internet.
- the drive 610 is also connected to the I/O interface 605 as needed.
- a removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like, is installed on the drive 610 as necessary, so that the computer program read out therefrom is installed into the storage section 608 as needed.
- the process described above with reference to the flowchart may be implemented as a computer software program.
- embodiments of the present disclosure include a computer program product that includes a computer program carried on a computer-readable medium, the computer program containing program code for performing the method shown in the flowchart.
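- the computer program may be installed in at least one of two ways: downloaded and installed from the network through the communication section 609, or installed from the removable medium 611.
- when the computer program is executed by the central processing unit (CPU) 601, the functions defined in the method of the present application are performed.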
- the computer-readable medium described in this application may be a computer-readable signal medium or a computer-readable storage medium or any combination of the two.
- the computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above.
- More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (Erasable Programmable Read-Only Memory, EPROM or flash memory), an optical fiber, portable compact disc read-only memory (Compact Disc Read-Only Memory, CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
- the computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in combination with an instruction execution system, apparatus, or device.
- the computer-readable signal medium may include a data signal that is propagated in a baseband or as part of a carrier wave, in which a computer-readable program code is carried.
- This propagated data signal can take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
- the computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and the computer-readable medium may send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device.
- the program code contained on the computer-readable medium may be transmitted on any appropriate medium, including but not limited to: wireless, wire, optical cable, radio frequency (Radio Frequency, RF), etc., or any suitable combination of the foregoing.
- each block in the flowchart or block diagram may represent a module, a program segment, or a part of code, and the module, program segment, or part of code contains at least one executable instruction for implementing the specified logical function.
- the functions noted in the block may occur out of the order noted in the figures. For example, two blocks represented in succession may actually be executed in parallel, and they may sometimes be executed in reverse order, depending on the functions involved.
- each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented with a dedicated hardware-based system that performs the specified functions or operations, or with a combination of dedicated hardware and computer instructions.
- the units described in the embodiments of the present application may be implemented in software or hardware.
- the described unit may also be provided in the processor.
- a processor includes a video acquisition unit, a vector generation unit, a first determination unit, a second determination unit, and a video selection unit.
- the names of these units do not constitute a limitation on the unit itself under certain circumstances.
- the video acquisition unit can also be described as a “unit for acquiring video for historical presentation”.
- the present application also provides a computer-readable medium, which may be contained in the server described in the foregoing embodiments; or may exist alone without being assembled into the server.
- the above computer-readable medium carries at least one program, and when the at least one program is executed by the server, the server is caused to: acquire the video for historical presentation corresponding to the target user, wherein the video for historical presentation is a video that was output, within a historical time period, to the target user terminal used by the target user for the target user to browse; extract video frames from the video for historical presentation, and input the extracted video frames into a pre-trained vector conversion model to obtain feature vectors; determine, based on the obtained feature vectors, the feature vector corresponding to the video for historical presentation; determine, from a predetermined set of candidate feature vectors, the candidate feature vectors whose similarity to the feature vector corresponding to the video for historical presentation is greater than or equal to a preset threshold as target feature vectors, wherein each candidate feature vector in the candidate feature vector set corresponds to a video for presentation in the predetermined video set for presentation; and select, from the video set for presentation, the video for presentation corresponding to the target feature vector.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
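The embodiments of the present application disclose a method and apparatus for processing information. The method includes: acquiring a historical presentation video corresponding to a target user; extracting video frames from the historical presentation video, and inputting the extracted video frames into a pre-trained vector conversion model to obtain feature vectors; determining, based on the feature vectors, the feature vector corresponding to the historical presentation video; determining, from a predetermined candidate feature vector set, the candidate feature vectors whose similarity to the feature vector corresponding to the historical presentation video is greater than or equal to a preset threshold, and using those candidate feature vectors as target feature vectors, where each candidate feature vector in the candidate feature vector set corresponds to a presentation video in a predetermined presentation video set; and selecting, from the presentation video set, the presentation video corresponding to the target feature vector.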
Description
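This application claims priority to Chinese Patent Application No. 201811289810.8, filed with the Chinese Patent Office on October 31, 2018, the entire contents of which are incorporated herein by reference.
The embodiments of the present application relate to the field of computer technology, for example to a method and apparatus for processing information.
With the development of technology, people can now browse videos on electronic devices such as mobile phones. The videos presented to a given user may include similar videos. Because browsing similar videos within a short time may annoy the user, the display frequency of similar videos needs to be controlled.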
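Summary of the Invention
The embodiments of the present application propose a method and apparatus for processing information.
In a first aspect, an embodiment of the present application provides a method for processing information, including: acquiring a historical presentation video corresponding to a target user, where the historical presentation video is a video that was output, within a historical time period, to the target user terminal used by the target user for the target user to browse; extracting video frames from the historical presentation video, and inputting the extracted video frames into a pre-trained vector conversion model to obtain feature vectors; determining, based on the feature vectors, the feature vector corresponding to the historical presentation video; determining, from a predetermined candidate feature vector set, the candidate feature vectors whose similarity to the feature vector corresponding to the historical presentation video is greater than or equal to a preset threshold, and using those candidate feature vectors as target feature vectors, where each candidate feature vector in the candidate feature vector set corresponds to a presentation video in a predetermined presentation video set; and selecting, from the presentation video set, the presentation video corresponding to the target feature vector.
In a second aspect, an embodiment of the present application provides an apparatus for processing information, including: a video acquisition unit configured to acquire a historical presentation video corresponding to a target user, where the historical presentation video is a video that was output, within a historical time period, to the target user terminal used by the target user for the target user to browse; a vector generation unit configured to extract video frames from the historical presentation video and input the extracted video frames into a pre-trained vector conversion model to obtain feature vectors; a first determination unit configured to determine, based on the feature vectors, the feature vector corresponding to the historical presentation video; a second determination unit configured to determine, from a predetermined candidate feature vector set, the candidate feature vectors whose similarity to the feature vector corresponding to the historical presentation video is greater than or equal to a preset threshold, and to use those candidate feature vectors as target feature vectors, where each candidate feature vector in the candidate feature vector set corresponds to a presentation video in a predetermined presentation video set; and a video selection unit configured to select, from the presentation video set, the presentation video corresponding to the target feature vector.
In a third aspect, an embodiment of the present application provides a server, including: at least one processor; and a storage device storing at least one program which, when executed by the at least one processor, causes the at least one processor to implement the method of any embodiment of the above method for processing information.
In a fourth aspect, an embodiment of the present application provides a computer-readable medium storing a computer program which, when executed by a processor, implements the method of any embodiment of the above method for processing information.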
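FIG. 1 is a diagram of an exemplary system architecture to which an embodiment of the present application may be applied;
FIG. 2 is a flowchart of an embodiment of the method for processing information according to the present application;
FIG. 3 is a schematic diagram of an application scenario of the method for processing information according to an embodiment of the present application;
FIG. 4 is a flowchart of another embodiment of the method for processing information according to the present application;
FIG. 5 is a schematic structural diagram of an embodiment of the apparatus for processing information according to the present application;
FIG. 6 is a schematic structural diagram of a computer system suitable for implementing the server of the embodiments of the present application.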
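The present application is described in detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the related invention, not to limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the invention.
It should be noted that, where no conflict arises, the embodiments of the present application and the features in the embodiments may be combined with each other. The present application is described in detail below with reference to the drawings and in conjunction with the embodiments.
FIG. 1 shows an exemplary system architecture 100 to which embodiments of the method or apparatus for processing information of the present application may be applied.
As shown in FIG. 1, the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105. The network 104 is the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables.
Users may use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages and the like. Various communication client applications may be installed on the terminal devices 101, 102, 103, such as video processing software, web browser applications, search applications, social platform software, and instant messaging tools.
The terminal devices 101, 102, 103 may be hardware or software. When they are hardware, they may be various electronic devices that have a display screen and support video processing, including but not limited to smartphones, tablet computers, e-book readers, Moving Picture Experts Group Audio Layer III (MP3) players, Moving Picture Experts Group Audio Layer IV (MP4) players, laptop computers, and desktop computers. When they are software, they may be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module. No specific limitation is made here.
The server 105 may be a server that provides various services, for example a background server that supports the presentation videos displayed on the terminal devices 101, 102, 103. The background server may acquire the historical presentation videos displayed on a terminal device, analyze and otherwise process the acquired historical presentation videos and related data, and obtain a processing result (for example, the presentation video corresponding to the target feature vector).
It should be noted that the method for processing information provided by the embodiments of the present application is generally executed by the server 105, and accordingly the apparatus for processing information is generally provided in the server 105.
It should be noted that the server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster composed of multiple servers, or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module. No specific limitation is made here.
It should be understood that the numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers, as required by the implementation. When the data used in obtaining the presentation video corresponding to the target feature vector does not need to be acquired remotely, the above system architecture may include only the server, without the network and terminal devices.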
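With continued reference to FIG. 2, FIG. 2 shows a flow 200 of an embodiment of the method for processing information according to the present application. The method for processing information includes steps 201 to 205.
In step 201, a historical presentation video corresponding to a target user is acquired.
In this embodiment, the execution body of the method for processing information (for example, the server shown in FIG. 1) may acquire the historical presentation video corresponding to the target user through a wired or wireless connection. The historical presentation video may be a video that was output, within a historical time period, to the target user terminal used by the target user for the target user to browse. The historical time period may be a preset time period, for example March 2018; it may also be the period that starts when a presentation video was first output to the target user terminal used by the target user and ends when a presentation video was most recently output to that terminal.
The target user is the user for whom presentation videos similar to the user's historical presentation videos are to be determined from the presentation video set. The presentation video set may be a predetermined video set. A presentation video is a video that is output to a communicatively connected terminal in order to be presented to a user.
In practice, the execution body may acquire a historical presentation video corresponding to the target user that is pre-stored locally; alternatively, the execution body may acquire a historical presentation video corresponding to the target user that is sent by the target user terminal.
It should be understood that the execution body may acquire at least one historical presentation video corresponding to the target user, and may then perform the subsequent steps 202 to 205 for each of the acquired historical presentation videos.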
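In step 202, video frames are extracted from the historical presentation video, and the extracted video frames are input into a pre-trained vector conversion model to obtain feature vectors.
In this embodiment, based on the historical presentation video obtained in step 201, the execution body may extract video frames from the historical presentation video and input the extracted video frames into the pre-trained vector conversion model to obtain feature vectors. Each obtained feature vector may be used to characterize the features of the input video frame.
It should be understood that a video is essentially a sequence of video frames arranged in chronological order. The execution body may therefore use various methods to extract video frames from the historical presentation video. For example, it may extract video frames at random; or it may extract the video frames located at preset positions in the video frame sequence corresponding to the historical presentation video.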
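The two extraction strategies named above (random extraction, and extraction at preset positions in the frame sequence) can be sketched as an index-selection helper. This is a minimal illustration only: the function name `pick_frame_indices`, the use of relative positions in the range 0 to 1, and the default sample size are assumptions, not details from the application, and decoding the frames themselves is left out.

```python
import random


def pick_frame_indices(frame_count, positions=None, sample_size=2, seed=None):
    """Choose which frames of a video to extract.

    frame_count: number of frames in the video's frame sequence.
    positions:   optional relative positions in [0, 1] (hypothetical
                 convention); when given, frames at these preset
                 positions are chosen.
    sample_size: number of frames to draw at random otherwise.
    """
    if frame_count <= 0:
        return []
    if positions is not None:
        # Preset positions: map each relative position to a valid index.
        chosen = {min(frame_count - 1, int(p * frame_count)) for p in positions}
        return sorted(chosen)
    # Random extraction: sample distinct indices without replacement.
    rng = random.Random(seed)
    return sorted(rng.sample(range(frame_count), min(sample_size, frame_count)))
```

For a 100-frame video, `pick_frame_indices(100, positions=[0.0, 0.5, 1.0])` selects the first, middle, and last frames.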
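In this embodiment, the vector conversion model is a model for extracting the features of video frames, and may be used to characterize the correspondence between a video frame and the feature vector corresponding to that video frame. For example, because a video frame is essentially an image, the vector conversion model may include a structure for extracting image features (for example, a convolutional layer), and may of course also include other structures (for example, a pooling layer).
It should be noted that methods of training a vector conversion model are well-known techniques that are widely studied and applied, and are not described here.
In some example implementations of this embodiment, the execution body may extract at least two video frames from the historical presentation video and input them separately into the pre-trained vector conversion model to obtain at least two feature vectors. This implementation supports the subsequent use of at least two feature vectors to determine the feature vector corresponding to the historical presentation video, which can improve the accuracy of the determined feature vector.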
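In step 203, the feature vector corresponding to the historical presentation video is determined based on the feature vectors.
In this embodiment, based on the feature vectors obtained in step 202, the execution body may determine the feature vector corresponding to the historical presentation video. The feature vector corresponding to the historical presentation video may be used to characterize the features of the historical presentation video. The execution body may determine this feature vector in various ways.
As an example, when only one feature vector is obtained, the execution body may directly determine that feature vector as the feature vector corresponding to the historical presentation video, or it may process that feature vector (for example, multiply it by a preset value) and determine the processed feature vector as the feature vector corresponding to the historical presentation video. When at least two feature vectors are obtained, the execution body may process them (for example, compute their mean) and determine the result as the feature vector corresponding to the historical presentation video.
In some example implementations of this embodiment, when there are at least two feature vectors, the execution body may sum the at least two feature vectors and use the sum as the feature vector corresponding to the historical presentation video.
In this implementation, determining the feature vector corresponding to the historical presentation video from the feature vectors of at least two video frames increases the reference data and can improve the accuracy of the determined feature vector.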
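The aggregation described for step 203 (element-wise sum, or mean, of the per-frame feature vectors) can be illustrated as follows. This is a minimal sketch in plain Python; the function name `aggregate_frame_vectors` and the `mode` parameter are illustrative assumptions.

```python
def aggregate_frame_vectors(vectors, mode="sum"):
    """Combine per-frame feature vectors into one video-level vector.

    vectors: non-empty list of equal-length lists of numbers.
    mode:    "sum" (the summation variant) or "mean" (the averaging variant).
    """
    if not vectors:
        raise ValueError("at least one feature vector is required")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all feature vectors must have the same length")
    # Element-wise sum across frames.
    summed = [sum(v[i] for v in vectors) for i in range(dim)]
    if mode == "sum":
        return summed
    if mode == "mean":
        return [x / len(vectors) for x in summed]
    raise ValueError("mode must be 'sum' or 'mean'")
```

With a single input vector, the "sum" mode returns that vector unchanged, which matches the single-vector case described above.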
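In step 204, the candidate feature vectors whose similarity to the feature vector corresponding to the historical presentation video is greater than or equal to a preset threshold are determined from a predetermined candidate feature vector set, and those candidate feature vectors are used as target feature vectors.
In this embodiment, based on the feature vector corresponding to the historical presentation video obtained in step 203, the execution body may determine, from the predetermined candidate feature vector set, the candidate feature vectors whose similarity to that feature vector is greater than or equal to the preset threshold as target feature vectors. The preset threshold may be a value set in advance by a technician. The execution body may do this in various ways. For example, it may compute the similarity between the feature vector corresponding to the historical presentation video and each candidate feature vector in the candidate feature vector set, compare each result with the preset threshold, and determine the candidate feature vectors whose results are greater than or equal to the preset threshold as target feature vectors.
It should be noted that similarity computation is a well-known technique that is widely studied and applied, and is not described here.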
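The exhaustive comparison described for step 204 can be sketched as below. The application does not name a similarity measure, so cosine similarity is an assumption here, as are the function names and the choice to key the candidate set by a video identifier (which also makes the later lookup of the corresponding presentation video trivial).

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_similar_videos(query, candidates, threshold):
    """Return ids of candidates whose similarity to `query` is >= threshold.

    candidates: dict mapping a (hypothetical) video id to its candidate
                feature vector.
    """
    return [
        video_id
        for video_id, vector in candidates.items()
        if cosine_similarity(query, vector) >= threshold
    ]
```

This scans every candidate, so its cost grows linearly with the size of the candidate set; the vector retrieval engine variant in the embodiment of FIG. 4 addresses that.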
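In this embodiment, the candidate feature vector set is a predetermined set, and each candidate feature vector in it corresponds to a presentation video in the presentation video set. That is, each candidate feature vector in the candidate feature vector set characterizes the features of the presentation video in the presentation video set that corresponds to it.
In practice, for each presentation video in the presentation video set, video frames may be extracted from that presentation video and input into the vector conversion model to obtain the feature vectors corresponding to the video frames; those feature vectors are then used to determine the candidate feature vector corresponding to the presentation video, and the determined candidate feature vectors together form the candidate feature vector set.
In some example implementations of this embodiment, the candidate feature vector set may be obtained by the following generation step: based on a target presentation video and an initial candidate feature vector set, perform the following determination step: extract video frames from the target presentation video, and input the extracted video frames into the vector conversion model to obtain the feature vectors corresponding to the video frames of the target presentation video. Based on those feature vectors, determine the feature vector corresponding to the target presentation video. Then add the feature vector corresponding to the target presentation video, as a candidate feature vector, to the predetermined initial candidate feature vector set to generate an updated candidate feature vector set. Determine whether a new presentation video has been acquired. In response to determining that no new presentation video has been acquired, determine the updated candidate feature vector set as the candidate feature vector set.
Here, the target presentation video is a presentation video acquired in advance by the execution body of the generation step. The initial candidate feature vector set may be a set to which no candidate feature vector has yet been added, or a set to which candidate feature vectors have already been added.
In some example implementations of this embodiment, the generation step may further include: in response to determining that a new presentation video has been acquired, using the new presentation video as the target presentation video and the updated candidate feature vector set as the initial candidate feature vector set, and continuing to perform the determination step.
With this implementation, the feature vectors corresponding to acquired presentation videos can be added to the initial candidate feature vector set in real time to generate the candidate feature vector set, which makes the candidate feature vector set more comprehensive.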
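The generation step above is essentially a loop: embed the current target presentation video, add its vector to the set, and repeat while new presentation videos keep arriving. A minimal sketch under stated assumptions: `videos` is any iterable of `(video_id, frames)` pairs, and `embed_video` is a caller-supplied stand-in for "extract frames, run the vector conversion model, and aggregate"; both names are hypothetical.

```python
def build_candidate_set(videos, embed_video, initial=None):
    """Incrementally build the candidate feature vector set.

    videos:      iterable of (video_id, frames) pairs; exhaustion of the
                 iterable stands for "no new presentation video acquired".
    embed_video: callable mapping `frames` to one video-level feature vector
                 (placeholder for the vector conversion model pipeline).
    initial:     optional existing set (dict id -> vector); it is copied,
                 not mutated.
    """
    candidates = dict(initial or {})
    pending = iter(videos)
    target = next(pending, None)
    while target is not None:
        video_id, frames = target
        # Determination step: compute the vector and add it to the set.
        candidates[video_id] = embed_video(frames)
        # Check whether a new presentation video has been acquired.
        target = next(pending, None)
    return candidates
```

Because the loop consumes an iterator, the same function works for a fixed list of videos and for a generator that yields videos as they arrive.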
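It should be noted that the execution body of the generation step may be the same as or different from the execution body of the method for processing information. If they are the same, the execution body of the generation step may store the candidate feature vector set locally after obtaining it. If they are different, the execution body of the generation step may send the candidate feature vector set to the execution body of the method for processing information after obtaining it.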
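In step 205, the presentation video corresponding to the target feature vector is selected from the presentation video set.
In this embodiment, based on the target feature vector obtained in step 204, the execution body may select the presentation video corresponding to the target feature vector from the presentation video set.
It should be understood that, because the target feature vector is a feature vector whose similarity to the feature vector corresponding to the historical presentation video is greater than or equal to the preset threshold, and because the feature vector corresponding to a presentation video characterizes the features of that presentation video, the presentation video selected in this step is a presentation video similar to the historical presentation video.
In some example implementations of this embodiment, after selecting the presentation video corresponding to the target feature vector from the presentation video set, the execution body may also perform the following steps. First, the execution body may determine the browsing time at which the target user browsed the historical presentation video using the target user terminal (for example, September 1, 2018). Then, after a preset duration (for example, one month) has elapsed from the browsing time, the execution body may output the selected presentation video to the target user terminal.
In this implementation, the execution body may refrain from outputting the selected presentation video to the target user terminal within the time range corresponding to the preset duration, and output it only after the preset duration has elapsed. In this way the presentation frequency of similar presentation videos can be controlled, which helps improve the user experience and increases the diversity of information processing.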
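The delayed-output rule above reduces to a time comparison. The sketch below assumes the preset duration is expressed as a `timedelta` (30 days stands in for "one month"); the function name `may_output` is illustrative.

```python
from datetime import datetime, timedelta


def may_output(browsed_at, now, cooldown=timedelta(days=30)):
    """Return True once `cooldown` has elapsed since the user browsed
    the similar historical presentation video at `browsed_at`."""
    return now - browsed_at >= cooldown
```

With a browsing time of September 1, 2018, a similar video would be held back on September 15 and released from October 1 onward.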
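With continued reference to FIG. 3, FIG. 3 is a schematic diagram of an application scenario of the method for processing information according to this embodiment. In the application scenario of FIG. 3, the server 301 may acquire a historical presentation video 303, corresponding to the target user, sent by the target user terminal 302, where the historical presentation video 303 is a video that the server 301 output to the target user terminal 302 within the past day for the target user to browse. The server 301 may then extract video frame 3031 and video frame 3032 from the historical presentation video 303, and input them separately into the pre-trained vector conversion model 304 to obtain the feature vector 3051 corresponding to video frame 3031 and the feature vector 3052 corresponding to video frame 3032. Next, the server 301 may determine the feature vector 306 corresponding to the historical presentation video 303 based on the obtained feature vectors 3051 and 3052. The server 301 may then acquire a predetermined candidate feature vector set 307 and determine, from the candidate feature vector set 307, the candidate feature vector whose similarity to the feature vector 306 corresponding to the historical presentation video 303 is greater than or equal to a preset threshold (for example, "10") as the target feature vector 308, where each candidate feature vector in the candidate feature vector set 307 corresponds to a presentation video in a predetermined presentation video set. Finally, the server 301 may acquire the presentation video set 309 and select from it the presentation video 3091 corresponding to the target feature vector 308.
The method provided by the above embodiment of the present application makes effective use of the candidate feature vector set to determine presentation videos similar to the historical presentation video corresponding to the target user. This facilitates subsequent processing of the determined presentation videos, for example controlling when they are presented, and improves the pertinence and diversity of information processing.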
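Referring to FIG. 4, a flow 400 of another embodiment of the method for processing information is shown. The flow 400 of the method for processing information includes steps 401 to 405.
In step 401, a historical presentation video corresponding to a target user is acquired.
In this embodiment, the execution body of the method for processing information (for example, the server shown in FIG. 1) may acquire the historical presentation video corresponding to the target user through a wired or wireless connection.
In step 402, video frames are extracted from the historical presentation video, and the extracted video frames are input into a pre-trained vector conversion model to obtain feature vectors.
In this embodiment, based on the historical presentation video obtained in step 401, the execution body may extract video frames from the historical presentation video and input the extracted video frames into the pre-trained vector conversion model to obtain feature vectors. Each obtained feature vector may be used to characterize the features of the input video frame.
In step 403, the feature vector corresponding to the historical presentation video is determined based on the feature vectors.
In this embodiment, based on the feature vectors obtained in step 402, the execution body may determine the feature vector corresponding to the historical presentation video. The feature vector corresponding to the historical presentation video may be used to characterize the features of the historical presentation video.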
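In step 404, the vector retrieval engine corresponding to the candidate feature vector set is used to search with the feature vector corresponding to the historical presentation video, and the retrieved candidate feature vectors are determined as target feature vectors.
In this embodiment, the execution body may build a vector retrieval engine from the predetermined candidate feature vector set, and may then use the vector retrieval engine corresponding to the candidate feature vector set to search with the feature vector corresponding to the historical presentation video and determine the retrieved candidate feature vectors as target feature vectors.
In practice, a search engine is a system that collects information according to a certain strategy using specific computer programs and, after organizing and processing the information, provides a retrieval service. Here, a vector retrieval engine is an engine whose retrieval targets are the candidate feature vectors. The execution body may build the vector retrieval engine from the candidate feature vector set in various ways, for example using the IVFADC algorithm.
It should be understood that, when the vector retrieval engine is built, the condition that its retrieval results must satisfy can be set (namely, that the similarity between a retrieval result and the query is greater than or equal to the preset threshold), so that the similarity between each retrieved candidate feature vector and the feature vector corresponding to the historical presentation video is greater than or equal to the preset threshold.
It should be noted that, in practice, the candidate feature vector set may be generated by the execution body, or generated by another electronic device and sent to the execution body. When the candidate feature vector set is updated (that is, when the feature vector corresponding to a new presentation video is added), the execution body may update the vector retrieval engine.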
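To make the idea of a vector retrieval engine concrete, the following sketch implements only the inverted-file (coarse partitioning) half of an IVFADC-style index: candidates are bucketed by their nearest coarse centroid, and a query scans only the closest buckets. It deliberately omits the product quantization and asymmetric distance computation that full IVFADC uses, takes the centroids as given rather than learning them, and assumes cosine similarity; the class and method names are illustrative, not from the application.

```python
import math


def _cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class CoarseInvertedIndex:
    """Simplified inverted-file index (no product quantization)."""

    def __init__(self, centroids):
        # Centroids are assumed to be supplied, e.g. from a prior k-means run.
        self.centroids = centroids
        self.cells = {i: [] for i in range(len(centroids))}

    def _nearest_cells(self, vector, n):
        order = sorted(
            range(len(self.centroids)),
            key=lambda i: -_cosine(vector, self.centroids[i]),
        )
        return order[:n]

    def add(self, video_id, vector):
        # Adding a vector is how the engine is updated for a new video.
        self.cells[self._nearest_cells(vector, 1)[0]].append((video_id, vector))

    def search(self, query, threshold, nprobe=1):
        """Scan the `nprobe` closest cells; keep hits with similarity >= threshold."""
        hits = []
        for cell in self._nearest_cells(query, nprobe):
            for video_id, vector in self.cells[cell]:
                score = _cosine(query, vector)
                if score >= threshold:
                    hits.append((video_id, score))
        return sorted(hits, key=lambda hit: -hit[1])
```

The speed-up comes from scanning only `nprobe` cells instead of the whole set; the trade-off is that a similar vector sitting in an unprobed cell is missed, so `nprobe` balances speed against recall.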
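In step 405, the presentation video corresponding to the target feature vector is selected from the presentation video set.
In this embodiment, based on the target feature vector obtained in step 404, the execution body may select the presentation video corresponding to the target feature vector from the presentation video set.
Steps 401, 402, 403, and 405 are consistent with steps 201, 202, 203, and 205 of the foregoing embodiment, respectively; the above descriptions of steps 201, 202, 203, and 205 also apply to steps 401, 402, 403, and 405 and are not repeated here.
As can be seen from FIG. 4, compared with the embodiment corresponding to FIG. 2, the flow 400 of the method for processing information in this embodiment highlights the step of determining the target feature vector using a vector retrieval engine. This embodiment therefore provides another scheme for obtaining the target feature vector; using a vector retrieval engine for information processing can achieve a faster processing speed and improves the efficiency of information processing.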
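Referring to FIG. 5, as an implementation of the methods shown in the above figures, the present application provides an embodiment of an apparatus for processing information. This apparatus embodiment corresponds to the method embodiment shown in FIG. 2, and the apparatus can be applied to various electronic devices.
As shown in FIG. 5, the apparatus 500 for processing information of this embodiment includes: a video acquisition unit 501, a vector generation unit 502, a first determination unit 503, a second determination unit 504, and a video selection unit 505. The video acquisition unit 501 is configured to acquire a historical presentation video corresponding to a target user, where the historical presentation video is a video that was output, within a historical time period, to the target user terminal used by the target user for the target user to browse; the vector generation unit 502 is configured to extract video frames from the historical presentation video and input the extracted video frames into a pre-trained vector conversion model to obtain feature vectors; the first determination unit 503 is configured to determine, based on the feature vectors, the feature vector corresponding to the historical presentation video; the second determination unit 504 is configured to determine, from a predetermined candidate feature vector set, the candidate feature vectors whose similarity to the feature vector corresponding to the historical presentation video is greater than or equal to a preset threshold, and to use those candidate feature vectors as target feature vectors, where each candidate feature vector in the candidate feature vector set corresponds to a presentation video in a predetermined presentation video set; and the video selection unit 505 is configured to select, from the presentation video set, the presentation video corresponding to the target feature vector.
In this embodiment, the video acquisition unit 501 of the apparatus 500 for processing information may acquire the historical presentation video corresponding to the target user through a wired or wireless connection. The historical presentation video may be a video that was output, within a historical time period, to the target user terminal used by the target user for the target user to browse. The target user is the user for whom presentation videos similar to the user's historical presentation videos are to be determined from the presentation video set. The presentation video set may be a predetermined video set. A presentation video is a video that is output to a communicatively connected terminal in order to be presented to a user.
In this embodiment, based on the historical presentation video obtained by the video acquisition unit 501, the vector generation unit 502 may extract video frames from the historical presentation video and input the extracted video frames into the pre-trained vector conversion model to obtain feature vectors. Each obtained feature vector may be used to characterize the features of the input video frame.
In this embodiment, the vector conversion model is a model for extracting the features of video frames, and may be used to characterize the correspondence between a video frame and the feature vector corresponding to that video frame. For example, because a video frame is essentially an image, the vector conversion model may include a structure for extracting image features (for example, a convolutional layer), and may of course also include other structures (for example, a pooling layer).
In this embodiment, based on the feature vectors obtained by the vector generation unit 502, the first determination unit 503 may determine the feature vector corresponding to the historical presentation video. The feature vector corresponding to the historical presentation video may be used to characterize the features of the historical presentation video.
In this embodiment, based on the feature vector corresponding to the historical presentation video obtained by the first determination unit 503, the second determination unit 504 may determine, from the predetermined candidate feature vector set, the candidate feature vectors whose similarity to that feature vector is greater than or equal to the preset threshold, and use those candidate feature vectors as target feature vectors. The preset threshold may be a value set in advance by a technician.
In this embodiment, the candidate feature vector set is a predetermined set, and each candidate feature vector in it corresponds to a presentation video in the presentation video set. Each candidate feature vector in the candidate feature vector set characterizes the features of the presentation video in the presentation video set that corresponds to it.
In this embodiment, based on the target feature vector obtained by the second determination unit 504, the video selection unit 505 may select the presentation video corresponding to the target feature vector from the presentation video set.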
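In some implementations of this embodiment, the candidate feature vector set corresponds to a vector retrieval engine; and the second determination unit 504 may be configured to: use the vector retrieval engine corresponding to the candidate feature vector set to search with the feature vector corresponding to the historical presentation video, and determine the retrieved candidate feature vectors as target feature vectors.
In some implementations of this embodiment, the apparatus 500 may further include: a time determination unit (not shown in the figure) configured to determine the browsing time at which the target user browsed the historical presentation video using the target user terminal; and a video output unit (not shown in the figure) configured to output the selected presentation video to the target user terminal after a preset duration has elapsed from the browsing time.
In some implementations of this embodiment, the vector generation unit 502 may be configured to: extract at least two video frames from the historical presentation video, and input the extracted video frames separately into the pre-trained vector conversion model to obtain at least two feature vectors.
In some implementations of this embodiment, the first determination unit 503 may be configured to: sum the at least two feature vectors, and use the sum as the feature vector corresponding to the historical presentation video.
In some implementations of this embodiment, the candidate feature vector set may be obtained by the following generation step: based on a target presentation video and an initial candidate feature vector set, perform the following determination step: extract video frames from the target presentation video, and input the extracted video frames into the vector conversion model to obtain the feature vectors corresponding to the video frames of the target presentation video; based on those feature vectors, determine the feature vector corresponding to the target presentation video; add the feature vector corresponding to the target presentation video, as a candidate feature vector, to the predetermined initial candidate feature vector set to generate an updated candidate feature vector set; determine whether a new presentation video has been acquired; and, in response to determining that no new presentation video has been acquired, determine the updated candidate feature vector set as the candidate feature vector set.
In some implementations of this embodiment, the generation step may further include: in response to determining that a new presentation video has been acquired, using the new presentation video as the target presentation video and the updated candidate feature vector set as the initial candidate feature vector set, and continuing to perform the determination step.
It should be understood that the units described in the apparatus 500 correspond to the steps of the method described with reference to FIG. 2. The operations and features described above for the method therefore also apply to the apparatus 500 and the units it contains, and are not repeated here.
The apparatus 500 provided by the above embodiment of the present application makes effective use of the candidate feature vector set to determine presentation videos similar to the historical presentation video corresponding to the target user. This facilitates subsequent processing of the determined presentation videos, for example controlling when they are presented, and improves the pertinence and diversity of information processing.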
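Referring now to FIG. 6, a schematic structural diagram of a computer system 600 suitable for implementing the server of the embodiments of the present application is shown. The server shown in FIG. 6 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.
As shown in FIG. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from the storage section 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data necessary for the operation of the system 600. The CPU 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a cathode ray tube (CRT) or liquid crystal display (LCD), a speaker, and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a local area network (LAN) card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage section 608 as needed.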
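In particular, according to the embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, the embodiments of the present disclosure include a computer program product that includes a computer program carried on a computer-readable medium, the computer program containing program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be installed in at least one of two ways: downloaded and installed from a network through the communication section 609, or installed from the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the above functions defined in the method of the present application are performed. It should be noted that the computer-readable medium described in the present application may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present application, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by, or in combination with, an instruction execution system, apparatus, or device. In the present application, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by, or in combination with, an instruction execution system, apparatus, or device. The program code contained on a computer-readable medium may be transmitted over any appropriate medium, including but not limited to wireless, wire, optical cable, radio frequency (RF), or any suitable combination of the above.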
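The flowcharts and block diagrams in the drawings illustrate the possible architectures, functions, and operations of systems, methods, and computer program products according to various embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a part of code that contains at least one executable instruction for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented with a dedicated hardware-based system that performs the specified functions or operations, or with a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present application may be implemented in software or in hardware. The described units may also be provided in a processor; for example, a processor may be described as including a video acquisition unit, a vector generation unit, a first determination unit, a second determination unit, and a video selection unit. In some cases the names of these units do not limit the units themselves; for example, the video acquisition unit may also be described as "a unit for acquiring a historical presentation video".
As another aspect, the present application also provides a computer-readable medium, which may be included in the server described in the above embodiments, or may exist separately without being assembled into the server. The computer-readable medium carries at least one program which, when executed by the server, causes the server to: acquire a historical presentation video corresponding to a target user, where the historical presentation video is a video that was output, within a historical time period, to the target user terminal used by the target user for the target user to browse; extract video frames from the historical presentation video, and input the extracted video frames into a pre-trained vector conversion model to obtain feature vectors; determine, based on the obtained feature vectors, the feature vector corresponding to the historical presentation video; determine, from a predetermined candidate feature vector set, the candidate feature vectors whose similarity to the feature vector corresponding to the historical presentation video is greater than or equal to a preset threshold as target feature vectors, where each candidate feature vector in the candidate feature vector set corresponds to a presentation video in a predetermined presentation video set; and select, from the presentation video set, the presentation video corresponding to the target feature vector.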
Claims (16)
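- A method for processing information, comprising: acquiring a historical presentation video corresponding to a target user, wherein the historical presentation video is a video that was output, within a historical time period, to a target user terminal used by the target user for the target user to browse; extracting video frames from the historical presentation video, and inputting the extracted video frames into a pre-trained vector conversion model to obtain feature vectors; determining, based on the feature vectors, the feature vector corresponding to the historical presentation video; determining, from a predetermined candidate feature vector set, the candidate feature vectors whose similarity to the feature vector corresponding to the historical presentation video is greater than or equal to a preset threshold, and using the candidate feature vectors whose similarity is greater than or equal to the preset threshold as target feature vectors, wherein each candidate feature vector in the candidate feature vector set corresponds to a presentation video in a predetermined presentation video set; and selecting, from the presentation video set, the presentation video corresponding to the target feature vector.
- The method according to claim 1, wherein the candidate feature vector set corresponds to a vector retrieval engine; and the determining, from a predetermined candidate feature vector set, the candidate feature vectors whose similarity to the feature vector corresponding to the historical presentation video is greater than or equal to a preset threshold, and using the candidate feature vectors whose similarity is greater than or equal to the preset threshold as target feature vectors, comprises: using the vector retrieval engine corresponding to the candidate feature vector set to search with the feature vector corresponding to the historical presentation video, and determining the retrieved candidate feature vectors as target feature vectors.
- The method according to claim 1, further comprising, after the selecting, from the presentation video set, the presentation video corresponding to the target feature vector: determining a browsing time at which the target user browsed the historical presentation video using the target user terminal; and outputting the selected presentation video to the target user terminal after a preset duration has elapsed from the browsing time.
- The method according to claim 1, wherein the extracting video frames from the historical presentation video, and inputting the extracted video frames into a pre-trained vector conversion model to obtain feature vectors, comprises: extracting at least two video frames from the historical presentation video, and inputting the at least two video frames separately into the pre-trained vector conversion model to obtain at least two feature vectors.
- The method according to claim 4, wherein the determining, based on the feature vectors, the feature vector corresponding to the historical presentation video comprises: summing the at least two feature vectors, and using the sum as the feature vector corresponding to the historical presentation video.
- The method according to any one of claims 1 to 5, wherein the candidate feature vector set is obtained by the following generation step: based on a target presentation video and an initial candidate feature vector set, performing the following determination step: extracting video frames from the target presentation video, and inputting the extracted video frames into the vector conversion model to obtain the feature vectors corresponding to the video frames of the target presentation video; determining, based on the feature vectors corresponding to the video frames of the target presentation video, the feature vector corresponding to the target presentation video; adding the feature vector corresponding to the target presentation video, as a candidate feature vector, to the predetermined initial candidate feature vector set to generate an updated candidate feature vector set; determining whether a new presentation video has been acquired; and, in response to determining that no new presentation video has been acquired, determining the updated candidate feature vector set as the candidate feature vector set.
- The method according to claim 6, wherein the generation step further comprises: in response to determining that a new presentation video has been acquired, using the new presentation video as the target presentation video and the updated candidate feature vector set as the initial candidate feature vector set, and continuing to perform the determination step.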
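- An apparatus for processing information, comprising: a video acquisition unit configured to acquire a historical presentation video corresponding to a target user, wherein the historical presentation video is a video that was output, within a historical time period, to a target user terminal used by the target user for the target user to browse; a vector generation unit configured to extract video frames from the historical presentation video and input the extracted video frames into a pre-trained vector conversion model to obtain feature vectors; a first determination unit configured to determine, based on the feature vectors, the feature vector corresponding to the historical presentation video; a second determination unit configured to determine, from a predetermined candidate feature vector set, the candidate feature vectors whose similarity to the feature vector corresponding to the historical presentation video is greater than or equal to a preset threshold, and to use the candidate feature vectors whose similarity is greater than or equal to the preset threshold as target feature vectors, wherein each candidate feature vector in the candidate feature vector set corresponds to a presentation video in a predetermined presentation video set; and a video selection unit configured to select, from the presentation video set, the presentation video corresponding to the target feature vector.
- The apparatus according to claim 8, wherein the candidate feature vector set corresponds to a vector retrieval engine; and the second determination unit is configured to: use the vector retrieval engine corresponding to the candidate feature vector set to search with the feature vector corresponding to the historical presentation video, and determine the retrieved candidate feature vectors as target feature vectors.
- The apparatus according to claim 8, further comprising: a time determination unit configured to determine a browsing time at which the target user browsed the historical presentation video using the target user terminal; and a video output unit configured to output the selected presentation video to the target user terminal after a preset duration has elapsed from the browsing time.
- The apparatus according to claim 8, wherein the vector generation unit is configured to: extract at least two video frames from the historical presentation video, and input the at least two video frames separately into the pre-trained vector conversion model to obtain at least two feature vectors.
- The apparatus according to claim 11, wherein the first determination unit is configured to: sum the at least two feature vectors, and use the sum as the feature vector corresponding to the historical presentation video.
- The apparatus according to any one of claims 8 to 12, wherein the candidate feature vector set is obtained by the following generation step: based on a target presentation video and an initial candidate feature vector set, performing the following determination step: extracting video frames from the target presentation video, and inputting the extracted video frames into the vector conversion model to obtain the feature vectors corresponding to the video frames of the target presentation video; determining, based on the feature vectors corresponding to the video frames of the target presentation video, the feature vector corresponding to the target presentation video; adding the feature vector corresponding to the target presentation video, as a candidate feature vector, to the predetermined initial candidate feature vector set to generate an updated candidate feature vector set; determining whether a new presentation video has been acquired; and, in response to determining that no new presentation video has been acquired, determining the updated candidate feature vector set as the candidate feature vector set.
- The apparatus according to claim 13, wherein the generation step further comprises: in response to determining that a new presentation video has been acquired, using the new presentation video as the target presentation video and the updated candidate feature vector set as the initial candidate feature vector set, and continuing to perform the determination step.
- A server, comprising: at least one processor; and a storage device storing at least one program which, when executed by the at least one processor, causes the at least one processor to implement the method according to any one of claims 1 to 7.
- A computer-readable medium storing a computer program, wherein the program, when executed by a processor, implements the method according to any one of claims 1 to 7.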
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811289810.8 | 2018-10-31 | ||
CN201811289810.8A CN109446379A (zh) | 2018-10-31 | 2018-10-31 | Method and apparatus for processing information |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020088048A1 (zh) | 2020-05-07 |
Family
ID=65549590
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/101686 WO2020088048A1 (zh) | Method and apparatus for processing information | 2019-08-21 | 2018-10-31 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN109446379A (zh) |
WO (1) | WO2020088048A1 (zh) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
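CN109446379A (zh) * | 2018-10-31 | 2019-03-08 | 北京字节跳动网络技术有限公司 | Method and apparatus for processing information |
CN112182290A (zh) * | 2019-07-05 | 2021-01-05 | 北京字节跳动网络技术有限公司 | Information processing method, apparatus, and electronic device |
CN111836064B (zh) * | 2020-07-02 | 2022-01-07 | 北京字节跳动网络技术有限公司 | Live-stream content identification method and apparatus |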
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130343597A1 (en) * | 2012-06-26 | 2013-12-26 | Aol Inc. | Systems and methods for identifying electronic content using video graphs |
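CN105141903A (zh) * | 2015-08-13 | 2015-12-09 | 中国科学院自动化研究所 | Method for target retrieval in video based on color information |
CN107016592A (zh) * | 2017-03-08 | 2017-08-04 | 美的集团股份有限公司 | Home appliance recommendation method and apparatus based on application guide page |
CN107577737A (zh) * | 2017-08-25 | 2018-01-12 | 北京百度网讯科技有限公司 | Method and apparatus for pushing information |
CN109446379A (zh) * | 2018-10-31 | 2019-03-08 | 北京字节跳动网络技术有限公司 | Method and apparatus for processing information |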
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150073931A1 (en) * | 2013-09-06 | 2015-03-12 | Microsoft Corporation | Feature selection for recommender systems |
CN106407401A (zh) * | 2016-09-21 | 2017-02-15 | 乐视控股(北京)有限公司 | Video recommendation method and apparatus |
CN106547908B (zh) * | 2016-11-25 | 2020-03-17 | 三星电子(中国)研发中心 | Information push method and system |
CN107105349A (zh) * | 2017-05-17 | 2017-08-29 | 东莞市华睿电子科技有限公司 | Video recommendation method |
CN108307240B (zh) * | 2018-02-12 | 2019-10-22 | 北京百度网讯科技有限公司 | Video recommendation method and apparatus |
- 2018-10-31 CN CN201811289810.8A patent/CN109446379A/zh active Pending
- 2019-08-21 WO PCT/CN2019/101686 patent/WO2020088048A1/zh active Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN109446379A (zh) | 2019-03-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
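KR102342604B1 (ko) | Method and apparatus for generating neural network | |
CN109460514B (zh) | Method and apparatus for pushing information | |
CN108830235B (zh) | Method and apparatus for generating information | |
WO2020087979A1 (zh) | Method and apparatus for generating model | |
WO2020088048A1 (zh) | Method and apparatus for processing information | |
JP2020042784A (ja) | Method and apparatus for operating intelligent terminal | |
CN111738010B (zh) | Method and apparatus for generating semantic matching model | |
CN109862100B (zh) | Method and apparatus for pushing information | |
WO2020093724A1 (zh) | Method and apparatus for generating information | |
CN110046571B (zh) | Method and apparatus for identifying age | |
US20220385739A1 (en) | Method and apparatus for generating prediction information, electronic device, and computer readable medium | |
CN110866040A (zh) | User profile generation method, apparatus, and system | |
CN108600780B (zh) | Method for pushing information, electronic device, and computer-readable medium | |
JP7504192B2 (ja) | Method and apparatus for searching images | |
CN112800276A (zh) | Video cover determination method, apparatus, medium, and device | |
CN109710939B (zh) | Method and apparatus for determining topic | |
CN113450434A (zh) | Method and apparatus for generating dynamic images | |
CN111027495A (zh) | Method and apparatus for detecting human body key points | |
CN110598049A (zh) | Method, apparatus, electronic device, and computer-readable medium for retrieving video | |
WO2020078049A1 (zh) | User information processing method and apparatus, server, and readable medium | |
CN111949860B (zh) | Method and apparatus for generating relevance determination model | |
CN110309425B (zh) | Method and apparatus for storing data | |
CN111125501B (zh) | Method and apparatus for processing information | |
CN113609397A (zh) | Method and apparatus for pushing information | |
CN111949819A (zh) | Method and apparatus for pushing video |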
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 19880424 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 22/06/2021) |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 19880424 Country of ref document: EP Kind code of ref document: A1 |