CN117591697A - Text recommendation method and system based on artificial intelligence and video processing - Google Patents
- Publication number: CN117591697A (application CN202410078311.3A)
- Authority: CN (China)
- Prior art keywords: book, paragraph, user, target, interest
- Prior art date
- Legal status: Granted (the listed status is an assumption, not a legal conclusion)
Classifications
- G06F16/7844 — Retrieval of video data characterised by metadata automatically derived from the content, using original textual content or text extracted from visual content or transcripts of audio data
- G06F16/735 — Querying video data; filtering based on additional data, e.g. user or group profiles
- G06F40/30 — Handling natural language data; semantic analysis
- G06N3/044 — Neural networks; recurrent networks, e.g. Hopfield networks
- G06N3/0464 — Convolutional networks [CNN, ConvNet]
- G06N3/0499 — Feedforward networks
- G06N3/08 — Learning methods
Abstract
The invention provides a text recommendation method and system based on artificial intelligence and video processing, and relates to the technical field of text recommendation. If a book reading operation by the user is detected, the front camera is opened to capture a video of the user reading the book, and the phone screen is recorded at the same time to obtain a screen-recorded video; the reading video and the screen-recorded video are input into an interest paragraph determination model to determine a plurality of initial interest paragraphs in the book; a target paragraph is determined based on the text content corresponding to the plurality of initial interest paragraphs; a paragraph description image is generated from the text content of the target paragraph using a generative adversarial network; the paragraph description image is input into a cover determination model to obtain a target book cover; and the target book corresponding to the target book cover is recommended to the user. The method can accurately recommend book text suited to the user.
Description
Technical Field
The invention relates to the technical field of text recommendation, in particular to a text recommendation method and system based on artificial intelligence and video processing.
Background
With the continuous progress of technology, artificial intelligence has become a hot topic in today's society. In the field of book recommendation, conventional methods generally recommend based on factors such as a user's historical reading records and the popularity of books. However, these methods often fail to accurately meet users' personalized needs. First, although recommendation based on a user's reading history can reflect reading interest to some extent, that interest changes continuously; the history only reflects past behavior and cannot predict future reading needs, so recommendation accuracy is often low. Second, although popularity-based recommendation can surface popular books, popularity does not necessarily reflect how much a given user will like a book, so this method also has low accuracy and cannot accurately meet personalized needs.
How to accurately recommend book text suitable for users is thus an urgent problem to be solved.
Disclosure of Invention
The invention mainly solves the technical problem of how to accurately recommend the book text suitable for the user.
According to a first aspect, the present invention provides a text recommendation method based on artificial intelligence and video processing, comprising: detecting whether a user starts a book reading operation; if the book reading operation is detected, opening the front camera to capture a video of the user reading the book while recording the phone screen to obtain a screen-recorded video; inputting the reading video and the screen-recorded video into an interest paragraph determination model to determine a plurality of initial interest paragraphs in the book; determining a target paragraph based on the text content corresponding to the plurality of initial interest paragraphs; generating a paragraph description image from the text content of the target paragraph using a generative adversarial network; inputting the paragraph description image into a cover determination model to obtain a target book cover; and recommending the target book corresponding to the target book cover to the user.
Further, the interest paragraph determination model is a Transformer model; its input is the video of the user reading the book together with the screen-recorded video, and its output is a plurality of initial interest paragraphs in the book.
Still further, the method further comprises: obtaining an operation indicating that the user is not interested in the target book; in response to that operation, removing the target paragraph from the plurality of initial interest paragraphs to obtain a plurality of culled paragraphs; generating a plurality of paragraph description images using the generative adversarial network based on the text content of the culled paragraphs; obtaining the paragraph description image selected by the user from the plurality of paragraph description images; inputting the selected paragraph description image into the cover determination model to obtain a book cover to be recommended; and recommending the book corresponding to the book cover to be recommended to the user.
Still further, detecting whether the user starts the book reading operation includes detecting whether the user clicks a start-reading button.
Still further, the input of the generative adversarial network is the text content of the target paragraph, and its output is the paragraph description image.
According to a second aspect, the present invention provides a text recommendation system based on artificial intelligence and video processing, comprising: the detection module is used for detecting whether a user starts book reading operation or not;
the acquisition module is used for opening the front camera to acquire the video when the user reads the book and recording the mobile phone screen to acquire the screen recorded video if the user is detected to open the book reading operation;
an initial paragraph determination module for inputting the video of the user reading the book and the screen recorded video into an interest paragraph determination model to determine a plurality of initial interest paragraphs in the book;
a target paragraph determining module, configured to determine a target paragraph based on text contents corresponding to a plurality of initial interest paragraphs in the book;
a paragraph description image generation module for generating a paragraph description image based on the text content of the target paragraph using a generative adversarial network;
the target book cover determining module is used for inputting the paragraph description image into a cover determining model to obtain a target book cover;
and the recommending module is used for recommending the target book corresponding to the target book cover to the user.
Further, the interest paragraph determination model is a Transformer model; its input is the video of the user reading the book together with the screen-recorded video, and its output is a plurality of initial interest paragraphs in the book.
Still further, the system is further configured to:
obtaining an operation indicating that the user is not interested in the target book;
in response to that operation, removing the target paragraph from the plurality of initial interest paragraphs to obtain a plurality of culled paragraphs;
generating a plurality of paragraph description images using the generative adversarial network based on the text content of the culled paragraphs;
acquiring a paragraph description image selected by a user, wherein the paragraph description image selected by the user is a paragraph description image selected by the user from a plurality of paragraph description images;
inputting the paragraph description image selected by the user into the cover determination model to obtain a book cover to be recommended;
and recommending the books to be recommended corresponding to the book covers to be recommended to the user.
Still further, the detection module is further configured to detect whether the user clicks a start-reading button.
Still further, the input of the generative adversarial network is the text content of the target paragraph, and its output is the paragraph description image.
The invention provides a text recommendation method and system based on artificial intelligence and video processing. The method comprises: detecting whether a user starts a book reading operation; if so, opening the front camera to capture a video of the user reading the book while recording the phone screen to obtain a screen-recorded video; inputting the reading video and the screen-recorded video into an interest paragraph determination model to determine a plurality of initial interest paragraphs in the book; determining a target paragraph based on the text content corresponding to the plurality of initial interest paragraphs; generating a paragraph description image from the text content of the target paragraph using a generative adversarial network; inputting the paragraph description image into a cover determination model to obtain a target book cover; and recommending the target book corresponding to the target book cover to the user. The method can accurately recommend book text suited to the user.
Drawings
FIG. 1 is a schematic flow chart of a text recommendation method based on artificial intelligence and video processing according to an embodiment of the present invention;
fig. 2 is a schematic flow chart of recommending books to be recommended according to an embodiment of the present invention;
fig. 3 is a schematic diagram of a text recommendation system based on artificial intelligence and video processing according to an embodiment of the present invention.
Detailed Description
In an embodiment of the invention, a text recommendation method based on artificial intelligence and video processing is provided, as shown in fig. 1. The method comprises the following steps S1 to S7:
step S1, detecting whether a user starts book reading operation.
Whether the user is performing a book reading operation is determined by monitoring the user's behavior or operation, such as clicking a specific button or entering a specific mode. In some embodiments, whether the user initiates the book reading operation may be determined by detecting whether the user clicks a start reading button. As an example, a button of "start reading" is displayed in the mobile phone screen, and if the user clicks the button of "start reading", it is indicated that the user opens the book reading operation.
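As a minimal illustrative sketch (the event dictionary shape, field names, and button identifier below are assumptions for illustration, not taken from the invention), the button-based detection in this step might look like:

```python
def is_reading_started(event):
    """Return True when the event is a tap on the 'start reading' button.

    The event shape and the button identifier are illustrative assumptions;
    a real app would hook its UI framework's click handler instead.
    """
    return event.get("type") == "click" and event.get("target") == "start_reading_button"

# A tap on the "start reading" button opens the book reading operation.
started = is_reading_started({"type": "click", "target": "start_reading_button"})
```

Any other event (a different button, a scroll) would leave the reading operation closed.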
And S2, if the fact that the user starts the book reading operation is detected, opening a front camera to obtain videos when the user reads the book, and recording a mobile phone screen to obtain screen recorded videos.
The front camera is the camera on the user-facing side of the device and is used to capture the user's face.
The video of the user reading the book is a continuous image sequence containing the user's face and actions acquired by the front camera. The video of the user reading the book records the user's behavior and response during the reading process.
The screen-recorded video is a recording of the on-screen content obtained through the phone's screen-recording function. It can capture operations such as scrolling, page turning, and annotating while the user reads the book.
And S3, inputting the video of the user reading the book and the screen recorded video into an interest paragraph determining model to determine a plurality of initial interest paragraphs in the book.
The interest paragraph determination model is a Transformer model; its input is the video of the user reading the book together with the screen-recorded video, and its output is a plurality of initial interest paragraphs in the book. The Transformer model is one implementation of artificial intelligence.

The Transformer establishes global dependencies within the input sequence through a self-attention mechanism and can process the relevance among different elements simultaneously. It can therefore understand the semantics of each element in the reading video and the screen-recorded video and capture the associations between the two. The Transformer model consists of an Encoder and a Decoder, each formed by stacking multiple identical layers. The encoder learns a representation of the input sequence using self-attention and a Feed-Forward Network; the decoder adds a further multi-head attention sub-layer on top of the encoder structure, which attends to the encoder output when generating the target sequence.

The Transformer model can capture both global information and local associations in the input sequence, and so better understand the context of the sequence. The reading video and the screen-recorded video can be treated as time-series data, containing information such as the user's actions and facial expressions during reading and the sequential display of text and pictures on the screen. From these videos, the Transformer can learn the text content and the user's behavioral feedback and map them into a semantic space related to the book content for understanding and analysis. As an example, a user may briefly return to a paragraph or page several times; such repeated gazing or dwelling behavior can be captured effectively by the Transformer model and helps it determine which paragraphs interest the user.
In some embodiments, the interest paragraph determination model includes a video matching layer, a paragraph action determination layer, an interest degree determination layer, and an interest paragraph screening layer, each of which contains a Transformer structure. The video matching layer takes the reading video and the screen-recorded video as input and outputs, for each paragraph of the book, a segmented reading video and a segmented screen-recorded video. The paragraph action determination layer takes these segmented videos as input and outputs, for each paragraph, the reading duration, the facial expression sequence, the user's gesture operations, and the eye action sequence. The interest degree determination layer takes these per-paragraph signals as input and outputs the degree of interest in each paragraph. The interest paragraph screening layer takes the per-paragraph interest degrees as input and outputs the plurality of initial interest paragraphs.
The reading duration, facial expression sequence, gesture operations, and eye action sequence corresponding to each paragraph can be used to judge the user's degree of interest in that paragraph, and thereby determine the plurality of initial interest paragraphs. As an example, the reading duration for a paragraph is how long the user stays on it: a long dwell may indicate interest in the paragraph content, while a short dwell may indicate a lack of interest. The facial expression sequence is the sequence of changes in the user's facial expressions while reading a paragraph and can likewise reflect interest; for instance, if the user shows excited or highly focused expressions, they may be considered interested in the paragraph content. The gesture operations record actions such as page turning, marking, and swiping during reading; if the user marks or highlights a paragraph, this indicates a high level of interest in it. The eye action sequence is the sequence of the user's eye movements. By analyzing it, one can learn where and how intently the user's attention is focused on each paragraph. For example, gazing at a paragraph for a long time may indicate interest in its content, while frequent glances and rapid jumps may indicate that the paragraph is of low interest or difficult to understand.
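The interest-degree determination and screening layers described above are Transformer-based. Purely as a simplified stand-in for the idea they implement (combine per-paragraph reading signals into a score, then keep the top-k paragraphs), one can sketch a fixed weighted sum; the weights, feature names, and signal values below are illustrative assumptions, not taken from the patent:

```python
def interest_score(dwell_seconds, expression_score, gesture_marks, long_fixations):
    """Weighted combination of per-paragraph reading signals (weights assumed)."""
    return (0.4 * dwell_seconds
            + 0.3 * expression_score
            + 0.2 * gesture_marks
            + 0.1 * long_fixations)

def screen_interest_paragraphs(paragraph_signals, k=2):
    """Return the ids of the k paragraphs with the highest interest score."""
    ranked = sorted(paragraph_signals,
                    key=lambda pid: interest_score(*paragraph_signals[pid]),
                    reverse=True)
    return ranked[:k]

# Per-paragraph signals: (dwell seconds, expression score, gesture marks, long fixations).
signals = {
    "para_1": (12.0, 0.2, 0, 1),
    "para_2": (45.0, 0.8, 2, 5),
    "para_3": (30.0, 0.6, 1, 3),
}
top = screen_interest_paragraphs(signals, k=2)
```

In the real model the scoring function is learned rather than hand-weighted, but the screening step, keeping the highest-scoring paragraphs as the initial interest paragraphs, is the same shape.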
And S4, determining a target paragraph based on the text contents corresponding to the initial interest paragraphs in the book.
The target paragraph is the paragraph, screened from the plurality of initial interest paragraphs in the book, whose content is most representative; it is used for the book recommendation. As an example, one target paragraph may be: "a beautiful sunset, an afterglow of the setting sun, is scattered on the lake surface, showing a golden-yellow brilliance".
In some embodiments, the target paragraph may be determined by a target paragraph determination model, which is a deep neural network model. Its input is the text content corresponding to the plurality of initial interest paragraphs in the book, and its output is the target paragraph. Deep neural networks (Deep Neural Networks, DNN) are one implementation of artificial intelligence and may include recurrent neural networks (Recurrent Neural Network, RNN), convolutional neural networks (Convolutional Neural Networks, CNN), and so on. During training, the model optimizes its weights and biases over many iterations, gradually improving its ability to understand text data. The learned representation can capture similarity, relevance, and semantic information between different paragraphs, enabling the model to infer the most representative target paragraph from the initial interest paragraphs it is given. When a new set of initial interest paragraphs is input, the model maps them to the most representative associated target paragraph based on the knowledge learned during training. This mapping rests on the model's abstraction and understanding of text data: it can identify the most representative paragraph through pattern matching, semantic association, and similar mechanisms.
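The patent does not specify how the deep neural network identifies the most representative paragraph. As a hedged, dependency-free sketch of one plausible reading of "most representative", a paragraph can be chosen whose bag-of-words vector is, on average, most similar to all the others (a learned model would use dense embeddings instead):

```python
import math
from collections import Counter

def bow(text):
    """Bag-of-words vector of a paragraph."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity of two sparse count vectors (missing keys count as 0)."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def most_representative(paragraphs):
    """Index of the paragraph most similar, on average, to all the others."""
    vecs = [bow(p) for p in paragraphs]
    def avg_sim(i):
        return sum(cosine(vecs[i], vecs[j])
                   for j in range(len(vecs)) if j != i) / (len(vecs) - 1)
    return max(range(len(paragraphs)), key=avg_sim)

paras = [
    "the sunset spread golden light over the quiet lake",
    "golden sunset light shimmered on the lake surface",
    "the stock market closed lower on tuesday",
]
idx = most_representative(paras)
```

The two sunset paragraphs reinforce each other, so one of them is selected over the unrelated third paragraph; that is the centroid-like behavior the deep model's learned representation would approximate.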
And step S5, generating paragraph description images based on the text content of the target paragraphs by using a generation countermeasure network.
The input of the generative adversarial network is the text content of the target paragraph, and its output is the paragraph description image. A Generative Adversarial Network (GAN) comprises a generator and a discriminator that play a game against each other and are continuously optimized, so that the network comes to generate realistic data.

The generative adversarial network learns a feature representation of the data through training. The generator produces realistic image samples, while the discriminator attempts to distinguish generated images from real ones. As training proceeds, the generator gradually learns to produce more realistic images, and the discriminator becomes more accurate at judging which images are generated. In this way, the network can translate the textual content of the target paragraph into a paragraph description image.
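The generator/discriminator game described above corresponds to the standard GAN minimax objective. For the text-to-image case it may be written in conditional form, where t denotes the target paragraph's text (or its embedding), z a noise vector, and p_data the distribution of real images; the conditional notation is an assumption about the setup, not quoted from the patent:

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x \mid t)\big]
+ \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z \mid t) \mid t)\big)\big]
```

The discriminator D is trained to maximize this value while the generator G is trained to minimize it; conditioning both networks on t is what steers the generated image toward the paragraph's content.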
The paragraph description image is the image, generated by the generative adversarial network, that corresponds to the textual content of the target paragraph.
And S6, inputting the paragraph description image into a cover determination model to obtain the target book cover.
The cover determination model is a deep neural network model. Its input is the paragraph description image, and its output is the target book cover, i.e., the book cover that is most similar to or best matches the paragraph description image.
As an example, suppose the paragraph description image depicts: "on the afternoon of a sunny day, a girl plants bright carnations in a garden with her grandfather, both faces overflowing with happy smiles". The target book cover determined by the cover determination model is then the cover of a related storybook named "Memory in the Sun", which presents a similar scene, a warm moment between the girl and her grandfather, together with the title and author information. The cover determination model learns the association between paragraph description images and book covers through training. Because the paragraph description image generated by the generative adversarial network reflects the content and characteristics of the paragraph comparatively accurately, matching target book covers against paragraph description images improves matching accuracy and precision. Converting the paragraph description into an image and generating a corresponding book cover also provides the user with a more visual and intuitive reading experience: the user can grasp the theme, emotion, and style of a book directly from its cover image, making it easier to select and read books of interest.
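The cover determination model itself is a learned deep neural network. Purely as an illustrative sketch of the matching step it performs, cover selection can be viewed as a nearest-neighbor search over feature vectors; the cover names, the 4-dimensional features, and all values below are hypothetical:

```python
import math

def cosine(u, v):
    """Cosine similarity of two dense feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def best_matching_cover(query_features, cover_library):
    """Id of the cover whose feature vector best matches the query image."""
    return max(cover_library,
               key=lambda cid: cosine(query_features, cover_library[cid]))

# Hypothetical 4-dimensional embeddings standing in for learned image features.
covers = {
    "memory_in_the_sun": [0.9, 0.8, 0.1, 0.0],
    "deep_sea_mystery":  [0.1, 0.0, 0.9, 0.8],
}
query = [0.85, 0.75, 0.2, 0.1]   # features of the paragraph description image
best = best_matching_cover(query, covers)
```

In the actual model the feature extractor and the similarity judgment are learned end to end, but the outcome is the same kind of decision: the cover closest to the paragraph description image in feature space is returned.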
And S7, recommending the target book corresponding to the target book cover to a user.
The target book corresponding to the target book cover is a book corresponding to the target book cover.
In some embodiments, if the user is not interested in the target book, a book to be recommended may be recommended through the flow of fig. 2. Fig. 2 is a schematic flow chart of recommending a book to be recommended according to an embodiment of the present invention; the flow includes steps S21 to S26:
step S21, the uninteresting operation of the user on the target book is obtained.
This step refers to obtaining operations or feedback that the user is not interested in the target book representation. As an example, a user clicking on a "no interest" button or other corresponding operation on an application or platform indicates that the target book is not of interest.
Step S22, in response to the user's not-interested operation on the target book, removing the target paragraph from the plurality of initial interest paragraphs to obtain a plurality of culled paragraphs.

That is, based on the operation indicating that the user is not interested in the target book, the target paragraph is culled from the plurality of initial interest paragraphs that contain it, yielding the plurality of culled paragraphs.
As an example, the plurality of initial interest paragraphs include an "a" segment, a "B" segment, a "C" segment, a "D" segment, and the target paragraph is a "B" segment, and the plurality of culled paragraphs are an "a" segment, a "C" segment, and a "D" segment.
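Using the document's own example, the culling in step S22 reduces to removing the rejected target paragraph from the list while preserving the order of the rest:

```python
def cull_target(initial_interest_paragraphs, target_paragraph):
    """Remove the rejected target paragraph; keep the remaining ones in order."""
    return [p for p in initial_interest_paragraphs if p != target_paragraph]

# The "B" segment is the rejected target paragraph.
remaining = cull_target(["A", "B", "C", "D"], "B")
```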
Step S23, generating a plurality of paragraph description images using the generative adversarial network based on the text content of the plurality of culled paragraphs.

For a description of the generative adversarial network, refer to step S5. The network generates one paragraph description image per culled paragraph from its text content.
Step S24, acquiring a paragraph description image selected by a user, wherein the paragraph description image selected by the user is a paragraph description image selected by the user from a plurality of paragraph description images.
This step refers to the user selecting and confirming a paragraph description image of interest from the generated plurality of paragraph description images. As an example, a user selects one of the presented plurality of paragraph description images, indicating an interest in the corresponding paragraph.
Step S25, the user-selected paragraph description image is input into the cover determination model to obtain a book cover to be recommended.
The cover determination model takes the user-selected paragraph description image as input and generates the corresponding book cover to be recommended. For a description of the cover determination model, refer to step S6.
Step S26, the book to be recommended corresponding to the book cover to be recommended is recommended to the user.
That is, the book cover to be recommended, which was generated from the user-selected paragraph description image, is associated with the corresponding book to be recommended, and that book is recommended to the user.
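Steps S21 to S26 can be sketched end to end as follows. The generator, cover model, user selection, and book lookup are stubbed as callables; every name is a placeholder for illustration, not the claimed implementation.

```python
def rerecommend(initial_paragraphs, target_paragraph, generator, cover_model,
                user_select, recommend):
    # S22: remove the target paragraph the user rejected.
    remaining = [p for p in initial_paragraphs if p != target_paragraph]
    # S23: generate one description image per remaining paragraph.
    images = [generator(p) for p in remaining]
    # S24: the user picks one of the generated images.
    chosen = user_select(images)
    # S25: the cover model turns the chosen image into a book cover.
    cover = cover_model(chosen)
    # S26: recommend the book associated with that cover.
    return recommend(cover)

result = rerecommend(
    ["A", "B", "C"], "B",
    generator=lambda p: f"img({p})",
    cover_model=lambda img: f"cover({img})",
    user_select=lambda imgs: imgs[0],   # stand-in for the user's choice
    recommend=lambda cover: f"book-for-{cover}",
)
print(result)  # book-for-cover(img(A))
```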
Based on the same inventive concept, Fig. 3 is a schematic diagram of a text recommendation system based on artificial intelligence and video processing according to an embodiment of the present invention, where the text recommendation system based on artificial intelligence and video processing includes:
a detection module 31, configured to detect whether a user starts a book reading operation;
an acquiring module 32, configured to, if it is detected that the user starts the book reading operation, open the front camera to capture a video of the user reading the book and simultaneously record the mobile phone screen to obtain a screen-recorded video;
an initial paragraph determination module 33, configured to input the video of the user reading the book and the screen-recorded video into an interest paragraph determination model to determine multiple initial interest paragraphs in the book;
a target paragraph determination module 34, configured to determine a target paragraph based on the text content corresponding to the multiple initial interest paragraphs in the book;
a paragraph description image generation module 35, configured to generate a paragraph description image based on the text content of the target paragraph using a generative adversarial network;
a target book cover determination module 36, configured to input the paragraph description image into a cover determination model to obtain a target book cover;
and a recommendation module 37, configured to recommend the target book corresponding to the target book cover to the user.
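A hypothetical wiring of modules 31 to 37 into a single pipeline, with each model stubbed as a callable. The class and field names are illustrative only; the patent does not prescribe this structure.

```python
from dataclasses import dataclass

@dataclass
class TextRecommendationSystem:
    interest_model: callable   # modules 31-33: (reading video, screen video) -> interest paragraphs
    select_target: callable    # module 34: paragraphs -> target paragraph
    generator: callable        # module 35: paragraph text -> description image
    cover_model: callable      # module 36: description image -> target book cover
    lookup_book: callable      # module 37: cover -> target book

    def recommend(self, reading_video, screen_video):
        paragraphs = self.interest_model(reading_video, screen_video)
        target = self.select_target(paragraphs)
        image = self.generator(target)
        cover = self.cover_model(image)
        return self.lookup_book(cover)

# Stubbed models; string transforms stand in for the neural networks.
system = TextRecommendationSystem(
    interest_model=lambda rv, sv: ["A", "B"],
    select_target=lambda ps: ps[0],
    generator=lambda t: f"img({t})",
    cover_model=lambda i: f"cover({i})",
    lookup_book=lambda c: f"book-for-{c}",
)
print(system.recommend("reading.mp4", "screen.mp4"))  # book-for-cover(img(A))
```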
Claims (10)
1. A text recommendation method based on artificial intelligence and video processing, comprising:
detecting whether a user starts a book reading operation;
if it is detected that the user starts the book reading operation, opening a front camera to capture a video of the user reading the book, and recording a mobile phone screen to obtain a screen-recorded video;
inputting the video of the user reading the book and the screen-recorded video into an interest paragraph determination model to determine multiple initial interest paragraphs in the book;
determining a target paragraph based on text content corresponding to the multiple initial interest paragraphs in the book;
generating a paragraph description image based on the text content of the target paragraph by using a generative adversarial network;
inputting the paragraph description image into a cover determination model to obtain a target book cover;
and recommending the target book corresponding to the target book cover to the user.
2. The text recommendation method based on artificial intelligence and video processing of claim 1, wherein the interest paragraph determination model is a Transformer model, the input of the interest paragraph determination model is the video of the user reading the book and the screen-recorded video, and the output of the interest paragraph determination model is the multiple initial interest paragraphs in the book.
3. The artificial intelligence and video processing based text recommendation method according to claim 1, wherein the method further comprises:
acquiring a not-interested operation of the user on the target book;
in response to the not-interested operation of the user on the target book, removing the target paragraph from the multiple initial interest paragraphs to obtain multiple removed paragraphs;
generating multiple paragraph description images by using the generative adversarial network based on the text content of the multiple removed paragraphs;
acquiring a paragraph description image selected by the user, wherein the user-selected paragraph description image is a paragraph description image selected by the user from the multiple paragraph description images;
inputting the user-selected paragraph description image into the cover determination model to obtain a book cover to be recommended;
and recommending the book to be recommended corresponding to the book cover to be recommended to the user.
4. The text recommendation method based on artificial intelligence and video processing according to claim 1, wherein the detecting whether a user starts a book reading operation comprises: detecting whether the user clicks a start-reading button.
5. The text recommendation method based on artificial intelligence and video processing of claim 1, wherein the input of the generative adversarial network is the text content of the target paragraph, and the output of the generative adversarial network is the paragraph description image.
6. A text recommendation system based on artificial intelligence and video processing, comprising:
a detection module, configured to detect whether a user starts a book reading operation;
an acquisition module, configured to, if it is detected that the user starts the book reading operation, open a front camera to capture a video of the user reading the book and record a mobile phone screen to obtain a screen-recorded video;
an initial paragraph determination module, configured to input the video of the user reading the book and the screen-recorded video into an interest paragraph determination model to determine multiple initial interest paragraphs in the book;
a target paragraph determination module, configured to determine a target paragraph based on text content corresponding to the multiple initial interest paragraphs in the book;
a paragraph description image generation module, configured to generate a paragraph description image based on the text content of the target paragraph using a generative adversarial network;
a target book cover determination module, configured to input the paragraph description image into a cover determination model to obtain a target book cover;
and a recommendation module, configured to recommend the target book corresponding to the target book cover to the user.
7. The text recommendation system based on artificial intelligence and video processing of claim 6, wherein the interest paragraph determination model is a Transformer model, the input of the interest paragraph determination model is the video of the user reading the book and the screen-recorded video, and the output of the interest paragraph determination model is the multiple initial interest paragraphs in the book.
8. The artificial intelligence and video processing based text recommendation system according to claim 6, wherein the system is further configured to:
acquiring a not-interested operation of the user on the target book;
in response to the not-interested operation of the user on the target book, removing the target paragraph from the multiple initial interest paragraphs to obtain multiple removed paragraphs;
generating multiple paragraph description images by using the generative adversarial network based on the text content of the multiple removed paragraphs;
acquiring a paragraph description image selected by the user, wherein the user-selected paragraph description image is a paragraph description image selected by the user from the multiple paragraph description images;
inputting the user-selected paragraph description image into the cover determination model to obtain a book cover to be recommended;
and recommending the book to be recommended corresponding to the book cover to be recommended to the user.
9. The text recommendation system based on artificial intelligence and video processing of claim 6, wherein the detection module is further configured to: detect whether the user clicks a start-reading button.
10. The text recommendation system based on artificial intelligence and video processing of claim 6, wherein the input of the generative adversarial network is the text content of the target paragraph, and the output of the generative adversarial network is the paragraph description image.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410078311.3A CN117591697B (en) | 2024-01-19 | 2024-01-19 | Text recommendation method and system based on artificial intelligence and video processing |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117591697A true CN117591697A (en) | 2024-02-23 |
CN117591697B CN117591697B (en) | 2024-03-29 |
Family
ID=89922412
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410078311.3A Active CN117591697B (en) | 2024-01-19 | 2024-01-19 | Text recommendation method and system based on artificial intelligence and video processing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117591697B (en) |
Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103839062A (en) * | 2014-03-11 | 2014-06-04 | 东方网力科技股份有限公司 | Image character positioning method and device |
CN104298682A (en) * | 2013-07-18 | 2015-01-21 | 广州华久信息科技有限公司 | Information recommendation effect evaluation method and mobile phone based on facial expression images |
CN107169002A (en) * | 2017-03-31 | 2017-09-15 | 咪咕数字传媒有限公司 | A kind of personalized interface method for pushing and device recognized based on face |
CN107679070A (en) * | 2017-08-22 | 2018-02-09 | 科大讯飞股份有限公司 | Intelligent reading recommendation method and device and electronic equipment |
CN109213932A (en) * | 2018-08-09 | 2019-01-15 | 咪咕数字传媒有限公司 | Information pushing method and device |
CN111930667A (en) * | 2020-07-09 | 2020-11-13 | 上海连尚网络科技有限公司 | Method and device for book recommendation in reading application |
CN111931062A (en) * | 2020-08-28 | 2020-11-13 | 腾讯科技(深圳)有限公司 | Training method and related device of information recommendation model |
CN113642673A (en) * | 2021-08-31 | 2021-11-12 | 北京字跳网络技术有限公司 | Image generation method, device, equipment and storage medium |
US20220253721A1 (en) * | 2021-01-30 | 2022-08-11 | Walmart Apollo, Llc | Generating recommendations using adversarial counterfactual learning and evaluation |
US20220343100A1 (en) * | 2021-04-23 | 2022-10-27 | Ping An Technology (Shenzhen) Co., Ltd. | Method for cutting video based on text of the video and computing device applying method |
CN115461793A (en) * | 2020-02-29 | 2022-12-09 | 具象有限公司 | System and method for interactive multimodal book reading |
CN115797488A (en) * | 2022-11-28 | 2023-03-14 | 科大讯飞股份有限公司 | Image generation method and device, electronic equipment and storage medium |
US20230196000A1 (en) * | 2021-12-21 | 2023-06-22 | Woongjin Thinkbig Co., Ltd. | System and method for providing personalized book |
CN116485948A (en) * | 2023-04-30 | 2023-07-25 | 上海芯赛云计算科技有限公司 | Text image generation method and system based on recommendation algorithm and diffusion model |
CN117216535A (en) * | 2023-02-16 | 2023-12-12 | 腾讯科技(深圳)有限公司 | Training method, device, equipment and medium for recommended text generation model |
Non-Patent Citations (2)
Title |
---|
QU W et al.: "A novel approach based on multi-view content analysis and semi-supervised enrichment for movie recommendation", Journal of Computer Science and Technology, 30 September 2013 (2013-09-30), page 776, XP035351298, DOI: 10.1007/s11390-013-1376-7 *
CAO Bin et al.: "E-book recommendation service based on combining user implicit feedback and collaborative filtering", Journal of Chinese Computer Systems (《小型微型计算机系统》), vol. 38, no. 2, 30 June 2015 (2015-06-30), pages 252-255 *
Also Published As
Publication number | Publication date |
---|---|
CN117591697B (en) | 2024-03-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP6431231B1 (en) | Imaging system, learning apparatus, and imaging apparatus | |
Del Molino et al. | Summarization of egocentric videos: A comprehensive survey | |
Newman et al. | Multimodal memorability: Modeling effects of semantics and decay on video memorability | |
CN102207950B (en) | Electronic installation and image processing method | |
CN103988202A (en) | Image attractiveness based indexing and searching | |
CN101783886A (en) | Information processing apparatus, information processing method, and program | |
CN114390217B (en) | Video synthesis method, device, computer equipment and storage medium | |
CN103959227A (en) | Life-logging and memory sharing | |
US10755087B2 (en) | Automated image capture based on emotion detection | |
WO2019196795A1 (en) | Video editing method, device and electronic device | |
KR20190053481A (en) | Apparatus and method for user interest information generation | |
Maybury | Multimedia information extraction: Advances in video, audio, and imagery analysis for search, data mining, surveillance and authoring | |
Sharma et al. | Emotion-based music recommendation system | |
CN115909390B (en) | Method, device, computer equipment and storage medium for identifying low-custom content | |
CN113079420A (en) | Video generation method and device, electronic equipment and computer readable storage medium | |
Jishan et al. | Hybrid deep neural network for bangla automated image descriptor | |
CN117591697B (en) | Text recommendation method and system based on artificial intelligence and video processing | |
US20230066331A1 (en) | Method and system for automatically capturing and processing an image of a user | |
CN114443916A (en) | Supply and demand matching method and system for test data | |
Yang et al. | Learning the synthesizability of dynamic texture samples | |
CN111931510B (en) | Intention recognition method and device based on neural network and terminal equipment | |
Venkatesh et al. | “You Tube and I Find”—Personalizing multimedia content access | |
Hoy | Deep learning and online video: Advances in transcription, automated indexing, and manipulation | |
Vayadande et al. | The Rise of AI‐Generated News Videos: A Detailed Review | |
Shah et al. | Video to text summarisation and timestamp generation to detect important events |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||