WO2021164255A1 - Presentation generation method, apparatus, computer device, and storage medium - Google Patents

Presentation generation method, apparatus, computer device, and storage medium

Info

Publication number
WO2021164255A1
WO2021164255A1 PCT/CN2020/118004 CN2020118004W WO2021164255A1 WO 2021164255 A1 WO2021164255 A1 WO 2021164255A1 CN 2020118004 W CN2020118004 W CN 2020118004W WO 2021164255 A1 WO2021164255 A1 WO 2021164255A1
Authority
WO
WIPO (PCT)
Prior art keywords
paragraph
topic
presentation
keywords
text
Prior art date
Application number
PCT/CN2020/118004
Inventor
谢静文
阮晓雯
徐亮
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021164255A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/43 Querying
    • G06F16/435 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/186 Templates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/189 Automatic justification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Definitions

  • This application relates to the field of artificial intelligence technology, and in particular to a method, device, computer equipment and storage medium for generating a presentation.
  • This application provides a presentation generation method, device, computer equipment and storage medium to solve the problem of low presentation generation efficiency.
  • A method for generating a presentation includes: receiving a main keyword of a presentation input by a user through a client; using the main keyword to search a text material library to obtain a plurality of text materials; splicing and integrating the plurality of text materials to obtain an overall text material; performing topic recognition and paragraph disassembly on the overall text material to obtain at least one subtopic and the topic paragraph corresponding to each subtopic; performing manuscript style analysis using the main keyword and the subtopics to obtain a style analysis result corresponding to each subtopic; determining the overall style information of the presentation according to the style analysis result corresponding to each subtopic; inputting each subtopic and its corresponding topic paragraph into a keyword extraction model to extract related words, obtaining paragraph keywords related to the topic paragraph; inputting the paragraph keywords into a picture library to search for the target picture corresponding to the paragraph keywords; and typesetting according to the target picture, the overall style information of the presentation, the subtopics, and the topic paragraphs corresponding to the subtopics, to generate a presentation corresponding to the main keyword.
  • A presentation generating device includes: a receiving module, which receives the main keyword of the presentation input by a user through a client; a first search module, which uses the main keyword to search a text material library to obtain a plurality of text materials; a splicing and integration module, which splices and integrates the plurality of text materials to obtain an overall text material; a recognition and disassembly module, which performs topic recognition and paragraph disassembly on the overall text material to obtain at least one subtopic and the topic paragraph corresponding to each subtopic; an analysis module, which uses the main keyword and the subtopics to analyze the style of the manuscript and obtain the style analysis result corresponding to each subtopic; a determination module, which determines the overall style information of the presentation according to the style analysis result corresponding to each subtopic; an extraction module, which inputs each subtopic and its corresponding topic paragraph into the keyword extraction model to extract related words and obtain paragraph keywords related to the topic paragraph; and a second search module, which inputs the paragraph keywords into the picture library to search for the target picture corresponding to the paragraph keywords.
  • A computer device includes a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor. When the processor executes the computer-readable instructions, the following steps are implemented: receiving the main keyword of the presentation input by a user through a client; using the main keyword to search a text material library to obtain a plurality of text materials; splicing and integrating the plurality of text materials to obtain an overall text material; performing topic recognition and paragraph disassembly on the overall text material to obtain at least one subtopic and the topic paragraph corresponding to each subtopic; using the main keyword and the subtopics to analyze the style of the manuscript and obtain the style analysis result corresponding to each subtopic; determining the overall style information of the presentation according to the style analysis result corresponding to each subtopic; inputting each subtopic and its corresponding topic paragraph into the keyword extraction model to extract related words and obtain paragraph keywords related to the topic paragraph; inputting the paragraph keywords into the picture library to search for the target picture corresponding to the paragraph keywords; and typesetting according to the target picture, the overall style information of the presentation, the subtopics, and the topic paragraphs corresponding to the subtopics, to generate a presentation corresponding to the main keyword.
  • One or more readable storage media store computer-readable instructions. When the computer-readable instructions are executed by one or more processors, the one or more processors perform the following steps: receiving the main keyword of the presentation input by a user through a client; using the main keyword to search a text material library to obtain a plurality of text materials; splicing and integrating the plurality of text materials to obtain an overall text material; performing topic recognition and paragraph disassembly on the overall text material to obtain at least one subtopic and the topic paragraph corresponding to each subtopic; using the main keyword and the subtopics to analyze the style of the manuscript and obtain the style analysis result corresponding to each subtopic; determining the overall style information of the presentation according to the style analysis result corresponding to each subtopic; inputting each subtopic and its corresponding topic paragraph into the keyword extraction model to extract related words and obtain paragraph keywords related to the topic paragraph; inputting the paragraph keywords into the picture library to search for the target picture corresponding to the paragraph keywords; and typesetting according to the target picture, the overall style information of the presentation, the subtopics, and the topic paragraphs corresponding to the subtopics, to generate a presentation corresponding to the main keyword.
  • In one of the solutions implemented by the above presentation generation method, device, computer device, and storage medium, the main keyword input by the user through the client is received; the main keyword is used to search for text materials in the text material library; the text materials are spliced and integrated; the main keyword and the subtopics are used to analyze the presentation style; the overall style information of the presentation is determined; the subtopics and their corresponding topic paragraphs are input into the keyword extraction model to extract related words; and the related words are input into the picture library for searching, so as to generate a presentation corresponding to the main keyword.
  • With only a simple main keyword input by the user, this application can complete the intelligent search for text material and picture information.
  • FIG. 1 is a schematic diagram of an application environment of a method for generating a presentation in an embodiment of the present application.
  • FIG. 2 is a flowchart of a method for generating a presentation in an embodiment of the present application.
  • FIG. 3 is another flowchart of a method for generating a presentation in an embodiment of the present application.
  • FIG. 4 is another flowchart of a method for generating a presentation in an embodiment of the present application.
  • FIG. 5 is another flowchart of a method for generating a presentation in an embodiment of the present application.
  • FIG. 6 is another flowchart of a method for generating a presentation in an embodiment of the present application.
  • FIG. 7 is another flowchart of a method for generating a presentation in an embodiment of the present application.
  • FIG. 8 is another flowchart of a method for generating a presentation in an embodiment of the present application.
  • FIG. 9 is a schematic block diagram of a presentation generating device in an embodiment of the present application.
  • FIG. 10 is a schematic diagram of a computer device in an embodiment of the present application.
  • The presentation generation method provided in this application can be applied in the application environment shown in FIG. 1, in which the server communicates with the client through a network.
  • A method for generating a presentation is provided. Taking its application to the server in FIG. 1 as an example, the method includes the following steps:
  • S10 Receive the main keyword of the presentation entered by the user through the client.
  • The client lets the user enter the main keyword, and the server receives the main keyword sent by the client.
  • The main keyword can be any word.
  • For example, the main keyword may be "5G", "blockchain", etc.
  • S20 Use the main keyword to search for text materials in a text material library to obtain multiple text materials.
  • The text material library is a material library that stores text materials. For example, searching the text material library with the main keyword "5G" returns related text materials such as "5G concept" and "5G development status".
  • S30 Splice and integrate the multiple text materials to obtain an overall text material.
  • The overall text material refers to the text content obtained by combining the text materials.
  • For example, the previously obtained text materials such as "5G concept" and "5G development status" are spliced and integrated into the overall text material, that is, the text content about 5G obtained by integrating the multiple text materials.
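  • Steps S20 and S30 can be illustrated with a minimal sketch. The in-memory `TEXT_LIBRARY`, the substring match, and the function names `search_text_materials` and `splice_materials` are all assumptions made for illustration; the source does not describe how the text material library is stored or searched.

```python
from typing import Dict, List

# Hypothetical in-memory text material library (title -> body); the source
# does not say how the real library is stored or indexed.
TEXT_LIBRARY: Dict[str, str] = {
    "5G concept": "5G is the fifth generation of mobile communication technology.",
    "5G development status": "5G networks are being rolled out commercially.",
    "Blockchain basics": "A blockchain is a distributed ledger.",
}


def search_text_materials(main_keyword: str, library: Dict[str, str]) -> List[str]:
    """S20: return the body of every material that mentions the main keyword."""
    key = main_keyword.lower()
    return [body for title, body in library.items()
            if key in title.lower() or key in body.lower()]


def splice_materials(materials: List[str]) -> str:
    """S30: join the materials into one overall text, one natural paragraph each."""
    return "\n\n".join(m.strip() for m in materials if m.strip())


overall_text = splice_materials(search_text_materials("5G", TEXT_LIBRARY))
print(overall_text)
```

  • Keeping one natural paragraph per material preserves the paragraph boundaries that the later disassembly step relies on.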
  • S40 Perform topic recognition and paragraph disassembly on the overall text material to obtain at least one subtopic and a topic paragraph corresponding to the subtopic.
  • The overall text material includes a plurality of natural paragraphs. As shown in FIG. 3, step S40, that is, performing topic recognition and paragraph disassembly on the overall text material to obtain at least one subtopic and the topic paragraph corresponding to each subtopic, specifically includes the following steps:
  • The unsupervised clustering model used for topic recognition can be K-means clustering, which divides the overall text material into N subtopics.
  • For example, the overall text material under the main keyword "5G" includes subtopics such as "historical development", "existing applications", and "future trends".
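  • The K-means division into N subtopics can be sketched in plain Python as follows. This is only an illustration under stated assumptions: paragraphs are represented as bag-of-words count vectors, N is supplied by the caller through `init_indices`, and the initial centroids are fixed so the run is reproducible. None of these choices come from the source.

```python
from collections import Counter
from typing import List, Sequence


def vectorize(paragraphs: Sequence[str]) -> List[List[float]]:
    """Turn each paragraph into a bag-of-words count vector (assumed features)."""
    vocab = sorted({w for p in paragraphs for w in p.lower().split()})
    rows = []
    for p in paragraphs:
        counts = Counter(p.lower().split())
        rows.append([float(counts[w]) for w in vocab])
    return rows


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[List[float]], init_indices: Sequence[int],
           iterations: int = 10) -> List[int]:
    """Plain K-means; the initial centroids are the points at init_indices."""
    centroids = [list(points[i]) for i in init_indices]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each paragraph joins its nearest centroid.
        labels = [min(range(len(centroids)),
                      key=lambda c: squared_distance(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


paragraphs = [
    "5g history 1g 2g history",
    "5g applications driving grid applications",
    "history of 1g and 2g networks",
    "applications in driving and smart grid",
]
labels = kmeans(vectorize(paragraphs), init_indices=[0, 1])
print(labels)  # [0, 1, 0, 1]
```

  • Here the two history paragraphs fall into one cluster and the two application paragraphs into the other; naming each cluster (for example "historical development") is a separate step that this sketch does not cover.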
  • S42 Perform topic paragraph recognition on the overall text material according to the N subtopics, and identify a topic paragraph corresponding to each of the subtopics.
  • For example, the overall text material under the main keyword "5G" includes subtopics such as "historical development", "existing applications", and "future trends", and these subtopics are used as subtopic keywords.
  • The subtopic keywords are used to identify the topic paragraphs of the overall text material. The topic paragraph corresponding to "historical development" is identified as paragraph 1: "In the 1G era, the first generation of mobile communication technology (1st Generation, abbreviated as 1G) was born in Chicago in 1986, etc."
  • The topic paragraph corresponding to "existing applications" is identified as paragraph 2: "Existing 5G applications are in fields such as the Internet of Vehicles and autonomous driving, surgery, and the smart grid, etc."
  • Each subtopic and its corresponding topic paragraph are associated and stored in a structured form. Understandably, each subtopic in the overall text material is stored together with its topic paragraph; for example, the subtopic "historical development" is stored in association with its corresponding paragraph 1.
  • Storing each subtopic with its corresponding paragraph makes it easier to subsequently generate the typesetting template corresponding to the subtopic and then generate the slide corresponding to the subtopic from that template. Because the overall text material is divided into N subtopics, each subtopic can correspond to a different typesetting template, and the presentation is then generated from the slides corresponding to the N subtopics.
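  • One possible structured form for the associated storage is a JSON record, sketched below. The field names and the `store_subtopics` helper are hypothetical; the source only states that each subtopic is stored in association with its topic paragraph.

```python
import json
from typing import Dict


def store_subtopics(main_keyword: str, topic_paragraphs: Dict[str, str]) -> str:
    """Serialize each subtopic together with its topic paragraph."""
    record = {
        "main_keyword": main_keyword,
        "subtopics": [
            {"subtopic": subtopic, "topic_paragraph": paragraph}
            for subtopic, paragraph in topic_paragraphs.items()
        ],
    }
    return json.dumps(record, ensure_ascii=False, indent=2)


stored = store_subtopics("5G", {
    "historical development": "Text of paragraph 1",
    "existing applications": "Text of paragraph 2",
})
print(stored)
```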
  • The unsupervised clustering model performs topic recognition on the overall text material and divides it into N subtopics. Paragraph recognition is then performed based on the N subtopics, so that the overall text material is disassembled into the topic paragraphs corresponding to the different subtopics. Finally, each subtopic in the overall text material is stored in association with its corresponding topic paragraph, forming a clearer logical framework for the text.
  • This solution intelligently splits the logical structure of the overall text material, ensures a reasonable distribution of paragraph content, and makes the subsequently generated presentation more readable.
  • Step S42, that is, performing topic paragraph recognition on the overall text material according to the N subtopics to identify the topic paragraph corresponding to each subtopic, specifically includes the following steps:
  • S421 Perform summary extraction on each natural paragraph in the topic paragraph using the TextRank algorithm to obtain multiple summaries.
  • S422 From the multiple summaries extracted from the natural paragraph, select those that exceed a preset importance value as the associated sentences of the natural paragraph.
  • The TextRank algorithm, which incorporates semantic information, is used to perform summary extraction on each natural paragraph in the topic paragraphs to obtain multiple summaries. TextRank is a graph-based extractive summarization method that uses the semantic information between words in a document to extract the document's summary.
  • The principle of summary extraction with TextRank is as follows: the natural paragraph is divided into its constituent sentences, the similarity between sentences is used as the edge weight, the TextRank value of each sentence is computed by loop iteration, and multiple summaries are selected for each natural paragraph. Then, among the multiple summaries extracted from the natural paragraph, those exceeding the preset importance value are selected as the associated sentences of the natural paragraph.
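  • The iteration described above can be sketched as follows. The word-overlap similarity, the damping factor 0.85, the iteration count, and the threshold value are assumptions for illustration; the source does not specify the similarity measure or the preset importance value.

```python
import math
from typing import List


def similarity(a: str, b: str) -> float:
    """Word-overlap similarity normalized by sentence lengths (assumed measure)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    denom = math.log(len(wa)) + math.log(len(wb))
    return len(wa & wb) / denom if denom > 0 else 0.0


def textrank(sentences: List[str], damping: float = 0.85,
             iterations: int = 50) -> List[float]:
    """Score sentences by iterating over the similarity-weighted sentence graph."""
    n = len(sentences)
    weights = [[similarity(sentences[i], sentences[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0)
            for i in range(n)
        ]
    return scores


def associated_sentences(sentences: List[str], threshold: float) -> List[str]:
    """S422: keep the sentences whose TextRank value exceeds the preset value."""
    scores = textrank(sentences)
    return [s for s, score in zip(sentences, scores) if score > threshold]


sentences = [
    "5G networks offer high speed and low latency",
    "low latency makes 5G networks useful for autonomous driving",
    "autonomous driving needs high speed networks",
    "bananas are yellow",
]
scores = textrank(sentences)
associated = associated_sentences(sentences, threshold=0.5)
print(associated)
```

  • The unrelated fourth sentence shares no words with the others, so it receives only the base score of 1 - damping and falls below the threshold.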
  • S423 Use the MMR model to filter the associated sentences, removing redundant sentences with high semantic relevance, to obtain the target sentences corresponding to the natural paragraph.
  • MMR is the abbreviation of Maximal Marginal Relevance, rendered in Chinese as the maximum boundary relevance algorithm or the maximum marginal relevance algorithm.
  • The purpose of the MMR algorithm is to reduce the redundancy of ranked results while ensuring their relevance.
  • The target sentences are sentences that are not semantically redundant with one another: the associated sentences obtained in step S422 are screened with the MMR model, and redundant sentences with high semantic relatedness are removed to obtain the target sentences.
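  • A minimal sketch of greedy MMR selection follows. The Jaccard similarity, the trade-off weight `lam`, the number of sentences kept `k`, and the example relevance scores are assumptions; in practice the relevance of each associated sentence could be its TextRank value.

```python
from typing import Dict, List


def jaccard(a: str, b: str) -> float:
    """Word-set overlap used here as a stand-in for semantic similarity."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def mmr_select(relevance: Dict[str, float], k: int, lam: float = 0.5) -> List[str]:
    """Greedily pick k sentences, trading relevance against redundancy."""
    selected: List[str] = []
    remaining = list(relevance)
    while remaining and len(selected) < k:
        def mmr_score(candidate: str) -> float:
            # Redundancy is the highest similarity to any sentence already kept.
            redundancy = max((jaccard(candidate, s) for s in selected), default=0.0)
            return lam * relevance[candidate] - (1 - lam) * redundancy
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


relevance = {
    "5G offers very low latency": 0.9,
    "5G offers low latency": 0.85,
    "5G enables remote surgery": 0.6,
}
targets = mmr_select(relevance, k=2)
print(targets)  # ['5G offers very low latency', '5G enables remote surgery']
```

  • The second sentence is more relevant than the third but nearly duplicates the first, so MMR skips it in favor of the less redundant one.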
  • S424 Integrate all target sentences corresponding to the natural paragraphs of the subtopic to obtain a topic paragraph corresponding to the subtopic.
  • The TextRank algorithm containing semantic information is used for summary extraction, and the sentences of high importance in the natural paragraph are obtained as summaries.
  • The summaries that exceed the preset importance value are used as associated sentences, and the MMR model is then used to remove redundant sentences with high semantic relevance, yielding the text material corresponding to each subtopic. This prevents the text material from having low relevance or from being excessive and redundant, and thereby improves the readability of the presentation.
  • After step S40, that is, after the unsupervised clustering model is used to perform topic recognition and paragraph disassembly on the overall text material to obtain at least one subtopic and the topic paragraph corresponding to each subtopic, the topic paragraphs are further disassembled by hierarchical heading:
  • An unsupervised clustering model is used to identify hierarchical headings in the topic paragraph corresponding to each subtopic, so as to detect whether the topic paragraph contains hierarchical headings. If it does, the hierarchical heading corresponding to each part of the topic paragraph is detected, and the topic paragraph is disassembled by hierarchical heading to obtain the hierarchical paragraph corresponding to each hierarchical heading. Finally, each hierarchical heading in the topic paragraph and its corresponding hierarchical paragraph are stored in association.
  • It is also detected whether the hierarchical paragraphs contain sub-level headings. If they do, the hierarchical paragraphs are disassembled again to obtain the sub-level paragraph corresponding to each sub-level heading.
  • In this way, the topic paragraph corresponding to each subtopic is disassembled into the hierarchical paragraphs corresponding to different hierarchical headings, and each hierarchical heading in the topic paragraph is stored in association with its corresponding hierarchical paragraph, forming a more complete overall logical framework.
  • The hierarchical headings and hierarchical paragraphs obtained from this disassembly make it possible to generate a more readable presentation.
  • S50 Perform manuscript style analysis using the main keyword and the subtopics to obtain a style analysis result corresponding to each subtopic.
  • Manuscript style analysis refers to the process of analyzing, processing, and extracting emotionally colored subjective text using natural language processing and text mining techniques. Understandably, text sentiment analysis is applied to the main keyword and the subtopics to extract their emotions, such as technology, romance, and seriousness.
  • The main keyword and the subtopics are used to analyze the style of the manuscript and obtain the style analysis result corresponding to each subtopic, where the style analysis results include but are not limited to: technological style, romantic style, serious style, fresh style, and simple style.
  • The style analysis result corresponding to each subtopic can then be determined. If the emotion of the main keyword is technology and the emotion of the subtopic is serious, the style analysis result is the technological and serious style. If the emotion of the main keyword is technology and the emotion of the subtopic is also technology, the style analysis result is the technological style.
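  • The combination rule in the example above can be sketched as follows. The `EMOTION_LEXICON` lookup is a hypothetical stand-in for the text sentiment analysis, which the source does not detail; only the rule for combining the two emotions follows the text.

```python
from typing import Dict

# Hypothetical emotion lookup standing in for text sentiment analysis.
EMOTION_LEXICON: Dict[str, str] = {
    "5G": "technology",
    "historical development": "serious",
    "future trends": "technology",
}


def style_for_subtopic(main_keyword: str, subtopic: str,
                       lexicon: Dict[str, str]) -> str:
    """Combine the emotions of the main keyword and the subtopic into a style."""
    keyword_emotion = lexicon[main_keyword]
    subtopic_emotion = lexicon[subtopic]
    if keyword_emotion == subtopic_emotion:
        return f"{keyword_emotion} style"
    return f"{keyword_emotion} and {subtopic_emotion} style"


print(style_for_subtopic("5G", "historical development", EMOTION_LEXICON))
print(style_for_subtopic("5G", "future trends", EMOTION_LEXICON))
```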
  • S60 Determine the overall style information of the presentation according to the style analysis result corresponding to each subtopic.
  • Step S60, that is, determining the overall style information of the presentation according to the style analysis result corresponding to each subtopic, specifically includes the following steps:
  • S61 Use the style analysis result to determine the template color information corresponding to each sub-theme and the text format information corresponding to the topic paragraphs, where the text format information includes text font information and text font size information.
  • S62 Determine the overall style information of the presentation based on the template color matching information, text font information, and text font size information.
  • The template color matching information refers to the color matching information corresponding to each style template.
  • The template color matching information corresponding to a subtopic is looked up in the template sample database using the style analysis result. For example, if the style analysis result is the romantic style, the template color information corresponding to the subtopic is determined to be a pink romantic template.
  • The topic paragraph corresponding to each subtopic is found according to the subtopic (see step S40 for details, which will not be repeated here), and the style analysis result is used to determine the text format information of that topic paragraph.
  • The text font information refers to the typeface of the text, and the text font size information refers to the size of the text.
  • For example, in the romantic style the text format information is a font size of No. 5 and a font of Song Ti, and the text color is light yellow. Finally, the overall style information of the presentation is determined according to the template color information, text font information, and text font size information.
  • The template color information corresponding to each subtopic and the text format information corresponding to its topic paragraph are obtained from the style analysis result, and the template color information, text font information, and text font size information are then used to determine the overall style information of the presentation, so that the style of the subsequently generated presentation is highly relevant to its content.
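  • The lookup in steps S61 and S62 can be pictured as a small table keyed by style. In this sketch the romantic entry uses the values given in the text, while the `StyleInfo` type, the `TEMPLATE_SAMPLES` table, and the technological entry are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class StyleInfo:
    template_color: str
    font_name: str
    font_size: str
    text_color: str


# Stand-in for the template sample database. Only the romantic entry comes
# from the text; the technological entry is an invented placeholder.
TEMPLATE_SAMPLES: Dict[str, StyleInfo] = {
    "romantic": StyleInfo("pink romantic template", "Song Ti", "No. 5", "light yellow"),
    "technological": StyleInfo("blue technology template", "Hei Ti", "No. 4", "white"),
}


def style_info_per_subtopic(style_results: Dict[str, str]) -> Dict[str, StyleInfo]:
    """S61: map each subtopic's style analysis result to its style information."""
    return {subtopic: TEMPLATE_SAMPLES[style]
            for subtopic, style in style_results.items()}


info = style_info_per_subtopic({"future trends": "romantic"})
print(info["future trends"])
```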
  • S70 Input the subtopic and the topic paragraph corresponding to the subtopic into a keyword extraction model to extract related words to obtain paragraph keywords related to the topic paragraph.
  • The keyword extraction model may be, for example, TF-IDF (term frequency-inverse document frequency) or TextRank.
  • The TF-IDF algorithm is a commonly used weighting technique for information retrieval and data mining. It is a statistical method for evaluating how important a word is to one document in a document set or corpus. The importance of a word increases in proportion to the number of times it appears in the document, but decreases with the frequency of its appearance across the corpus.
  • TextRank is a graph-based ranking algorithm. It divides the text into several units (words or sentences) and builds a graph model. A voting mechanism is used to rank the important components of the text, so that keyword extraction and summarization can be achieved using only the information in the single document itself.
  • The subtopic and its corresponding topic paragraph are input into the keyword extraction model to extract the related words of the topic paragraph. For example, for a long paragraph about frontier topics, the related words may be "5G", "blockchain", "big data", etc.
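  • The TF-IDF weighting described above can be sketched as follows. The smoothed inverse document frequency used here is one common variant and is an assumption, as are the pre-tokenized example paragraphs and the function names.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(paragraph: List[str], corpus: List[List[str]]) -> Dict[str, float]:
    """Score each word of one tokenized paragraph against the whole corpus."""
    counts = Counter(paragraph)
    n_docs = len(corpus)
    scores = {}
    for word, count in counts.items():
        tf = count / len(paragraph)
        df = sum(1 for doc in corpus if word in doc)
        # Smoothed IDF: words found in fewer paragraphs weigh more.
        idf = math.log((1 + n_docs) / (1 + df)) + 1
        scores[word] = tf * idf
    return scores


def top_keywords(paragraph: List[str], corpus: List[List[str]], k: int) -> List[str]:
    """Return the k highest-scoring words as paragraph keywords."""
    scores = tf_idf(paragraph, corpus)
    return sorted(scores, key=scores.get, reverse=True)[:k]


corpus = [
    ["the", "5g", "network", "uses", "5g", "base", "stations"],
    ["the", "smart", "grid", "uses", "sensors"],
    ["the", "history", "of", "mobile", "networks"],
]
keywords = top_keywords(corpus[0], corpus, k=1)
print(keywords)  # ['5g']
```

  • The word "the" appears in every paragraph and so scores lowest, while "5g" is frequent in the first paragraph and absent elsewhere, which makes it the top keyword.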
  • S80 Input a plurality of the paragraph keywords into a picture library for searching, and obtain a target picture corresponding to the paragraph keywords.
  • Step S80, that is, inputting a plurality of the paragraph keywords into a picture library for searching to obtain the target picture corresponding to the paragraph keywords, specifically includes the following steps:
  • S81 Input a plurality of the paragraph keywords into a picture library for searching, and obtain a target picture corresponding to the paragraph keywords.
  • The pictures stored in the picture library carry related topic tags. Multiple paragraph keywords are input into the picture library for searching, and the target picture corresponding to the paragraph keywords is obtained.
  • The typesetting template of the presentation refers to the standard style template corresponding to the overall style information of the presentation determined in step S60, and different style templates place certain restrictions on picture size. Because the pictures in the picture library differ in shape and resolution, the pixel dimensions of a target picture may differ from those required by the typesetting template when a presentation with pictures is generated. Therefore, the size of the target picture corresponding to the paragraph keywords needs to be adjusted.
  • For example, if the target picture found in the picture library is 400*500 pixels and the typesetting template requires 300*400 pixels, the target picture is first compressed to 300*375, and the remaining 25 vertical pixels are temporarily filled with a transparent color.
  • Users can also adjust the size of the target picture themselves.
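  • The arithmetic in the 400*500 example can be checked with a short sketch. The function name `fit_to_template` and the choice to scale by the smaller of the two ratios are assumptions that reproduce the numbers in the text; only the dimensions are computed, and no image data is processed.

```python
from typing import Tuple


def fit_to_template(picture: Tuple[int, int],
                    template: Tuple[int, int]) -> Tuple[Tuple[int, int], Tuple[int, int]]:
    """Scale (width, height) to fit the template while keeping the aspect ratio.

    Returns the scaled size and the leftover (horizontal, vertical) pixels
    that would be filled with a transparent color.
    """
    pw, ph = picture
    tw, th = template
    scale = min(tw / pw, th / ph)
    new_w, new_h = round(pw * scale), round(ph * scale)
    return (new_w, new_h), (tw - new_w, th - new_h)


scaled, padding = fit_to_template((400, 500), (300, 400))
print(scaled, padding)  # (300, 375) (0, 25)
```

  • Note that the same rule enlarges a picture that is smaller than the template, whereas the text only describes compression.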
  • The pictures in the picture library may have various shapes, for example, a five-pointed star, circle, triangle, or polygon. If the picture shape required by the typesetting template is a fixed square, the target picture corresponding to the paragraph keywords needs to be cropped to obtain a cropped target picture that matches the required shape.
  • The typesetting template of the presentation contains a preset display position for the picture, from which the position of the target picture can be determined. If the position of the target picture deviates from the preset display position, the position is adjusted to obtain the adjusted target picture.
  • Users can edit or modify the content displayed on each page of the presentation, for example, adjust text, pictures, fonts, colors, and text boxes, and add pictures at a specified location in the presentation.
  • The size and position of the target picture corresponding to the paragraph keywords are adjusted according to the presentation template, so that the target picture in the subsequently generated presentation is not too large or too small, does not extend beyond the slide, and is not badly off-center, making the presentation more readable and more attractive.
  • S90 Typeset according to the target picture, the overall style information of the presentation, the subtopics, and the topic paragraphs corresponding to the subtopics, to generate a presentation corresponding to the main keyword.
  • Step S90, that is, typesetting according to the target picture, the overall style information of the presentation, the subtopics, and the topic paragraphs corresponding to the subtopics to generate the presentation corresponding to the main keyword, specifically includes the following steps:
  • S91 Perform feature extraction on the target picture, the overall style information of the presentation, and the topic paragraph to obtain the corresponding picture feature, style feature, and topic paragraph feature.
  • The materials used to generate the presentation can include at least one of text, pictures, audio, and video, and different types of material have different features.
  • The topic paragraph feature refers to the number of lines of text in the paragraph, the font, and the format of each line of text; the picture feature, that is, the feature of the target picture, refers to the format, type, and other characteristics of the picture; the style feature refers to the overall style corresponding to the presentation (see step S60 for details, which will not be repeated here).
  • S92 Match the picture feature, the topic paragraph feature, and the style feature against the typesetting rules of the pre-stored typesetting templates to obtain the successfully matched typesetting template corresponding to each subtopic.
  • A pre-stored database holds different types of presentation typesetting templates and their corresponding typesetting rules.
  • For example, the typesetting rule of a typesetting template may target a one-page slide with three lines of text, a one-page slide containing a title and body, or a slide containing pictures and text that is typeset according to the sizes of the pictures and text and their proportions of the page.
  • Different typesetting templates may share the same typesetting rule, that is, one typesetting rule may correspond to multiple different typesetting templates; alternatively, different typesetting templates may have different typesetting rules. The typesetting templates stored in the database are continuously updated according to users' needs.
  • Because one typesetting rule may correspond to multiple typesetting templates, matching the picture features and topic paragraph features against the pre-stored typesetting rules proceeds as follows: first, the typesetting rule corresponding to each typesetting template is extracted from the database; next, the typesetting rules that match the picture features and topic paragraph features are selected from all the extracted rules; finally, the typesetting templates corresponding to the successfully matched rules are looked up in the database.
  • For example, the topic paragraph feature holds three lines of text, and the subtopic corresponding to the topic paragraph indicates that the topic needs primary highlighting.
  • The first line of text in the topic paragraph feature is "A company's 2018 spring new product launch", which receives secondary highlighting.
  • The second line of text is "2018.04.09 14:30", which gives the time and may be non-highlighted.
  • The third line of text is "Gymnasium of a university in Beijing", which gives the location and may also be non-highlighted.
  • The corresponding topic paragraph feature is therefore a one-page slide containing three lines of text, in which the subtopic receives primary highlighting, the first line receives secondary highlighting, and the second and third lines are non-highlighted.
  • If the picture feature includes one square picture, the typesetting rules extracted from the database are matched against the topic paragraph feature and the picture feature to find the rule suited to a one-page slide containing a subtopic, three lines of text, and one square picture, together with the templates that follow that rule; several templates may qualify.
  • The style feature is therefore used to narrow the choice of typesetting template; for example, if the style feature is romantic, the template whose style feature is romantic is matched.
  • In addition, the typesetting templates corresponding to the successfully matched typesetting rule are displayed to the user, who can pre-select among them.
  • S93 Use the successfully matched typesetting template corresponding to each subtopic to typeset the target picture, the subtopic, and the topic paragraphs corresponding to the subtopic to generate a presentation corresponding to the main keyword.
  • The typesetting template corresponding to the successfully matched typesetting rule is used to typeset the target picture, the subtopic, and the topic paragraph corresponding to the subtopic. Each subtopic, its topic paragraph, and the target picture corresponding to that paragraph are laid out with the matched template to generate a slide for the subtopic, and the presentation corresponding to the main keyword is then generated automatically from the slides of the N subtopics.
  • In the embodiment corresponding to Figure 8, the picture feature, the manuscript style feature, and the topic paragraph feature are matched against the typesetting rules of the pre-stored typesetting templates, and the templates corresponding to the successfully matched rules are used for typesetting. The presentation corresponding to the main keyword can thus be generated automatically, which gives a smarter and more automated way of generating presentations, saves time and effort, and improves both the user experience and the efficiency of presentation generation.
  • In the embodiment corresponding to Figure 2, this application completes the intelligent search of text material and picture information from a simple main keyword input by the user and, based on the type of the main keyword, recommends a presentation template with a fitting style. Material search, picture search, style recommendation, and format typesetting are all driven by the main keyword, which saves much of the time spent on early-stage information search and integration, so that the corresponding presentation is generated quickly and automatically after the client inputs the main keyword. This solves the problem of low presentation generation efficiency in the prior art.
  • In one embodiment, a presentation generation apparatus is provided, which corresponds one-to-one to the presentation generation method in the above embodiments.
  • As shown in Figure 9, the presentation generation apparatus includes a receiving module 10, a first search module 20, a splicing and integration module 30, an identification and decomposition module 40, an analysis module 50, a determination module 60, an extraction module 70, a second search module 80, and a generating module 90.
  • The functional modules are described in detail as follows: the receiving module 10 receives a main keyword of a presentation input by a user through a client; the first search module 20 searches a text material library for text materials using the main keyword to obtain a plurality of text materials; the splicing and integration module 30 splices and integrates the plurality of text materials to obtain an overall text material; the identification and decomposition module 40 performs topic identification and paragraph decomposition on the overall text material to obtain at least one subtopic and a topic paragraph corresponding to the subtopic; the analysis module 50 performs manuscript style analysis using the keyword and the subtopic to obtain a style analysis result corresponding to each subtopic; the determination module 60 determines overall style information of the presentation according to the style analysis result corresponding to each subtopic; the extraction module 70 inputs the subtopic and the topic paragraph corresponding to the subtopic into a keyword extraction model for related-word extraction to obtain paragraph keywords related to the topic paragraph; the second search module 80 inputs the plurality of paragraph keywords into a picture library for searching to obtain target pictures corresponding to the paragraph keywords; and the generating module 90 typesets according to the target pictures, the overall style information of the presentation, the subtopic, and the topic paragraph corresponding to the subtopic to generate the presentation corresponding to the main keyword.
  • Each module in the above-mentioned presentation generating device can be implemented in whole or in part by software, hardware, and a combination thereof.
  • The above modules may be embedded in, or independent of, the processor of the computer device in hardware form, or stored in the memory of the computer device in software form, so that the processor can call them and perform the operations corresponding to each module.
  • In one embodiment, as shown in Figure 10, a computer device is provided, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor.
  • When the processor executes the computer-readable instructions, the following steps are implemented: receiving a main keyword of a presentation input by a user through a client; searching a text material library for text materials using the main keyword to obtain a plurality of text materials; splicing and integrating the plurality of text materials to obtain an overall text material; performing topic identification and paragraph decomposition on the overall text material to obtain at least one subtopic and a topic paragraph corresponding to the subtopic; performing manuscript style analysis using the keyword and the subtopic to obtain a style analysis result corresponding to each subtopic; determining overall style information of the presentation according to the style analysis result corresponding to each subtopic; inputting the subtopic and the topic paragraph corresponding to the subtopic into a keyword extraction model for related-word extraction to obtain paragraph keywords related to the topic paragraph; inputting the plurality of paragraph keywords into a picture library for searching to obtain target pictures corresponding to the paragraph keywords; and typesetting according to the target pictures, the overall style information of the presentation, the subtopic, and the topic paragraph corresponding to the subtopic to generate the presentation corresponding to the main keyword.
  • In one embodiment, one or more readable storage media storing computer-readable instructions are provided.
  • The readable storage media provided in this embodiment include non-volatile readable storage media and volatile readable storage media.
  • When the computer-readable instructions are executed by one or more processors, the one or more processors implement the following steps: receiving a main keyword of a presentation input by a user through a client; searching a text material library for text materials using the main keyword to obtain a plurality of text materials; splicing and integrating the plurality of text materials to obtain an overall text material; performing topic identification and paragraph decomposition on the overall text material to obtain at least one subtopic and a topic paragraph corresponding to the subtopic; performing manuscript style analysis using the keyword and the subtopic to obtain a style analysis result corresponding to each subtopic; determining overall style information of the presentation according to the style analysis result corresponding to each subtopic; inputting the subtopic and the topic paragraph corresponding to the subtopic into a keyword extraction model for related-word extraction to obtain paragraph keywords related to the topic paragraph; inputting the plurality of paragraph keywords into a picture library for searching to obtain target pictures corresponding to the paragraph keywords; and typesetting according to the target pictures, the overall style information of the presentation, the subtopic, and the topic paragraph corresponding to the subtopic to generate the presentation corresponding to the main keyword.
  • Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

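This application discloses a presentation generation method, apparatus, computer device, and storage medium, and relates to artificial intelligence. The method includes: receiving a main keyword of a presentation input by a user through a client; searching a text material library for text materials; splicing and integrating the text materials; performing manuscript style analysis; determining overall style information of the presentation; inputting each subtopic and its corresponding topic paragraph into a keyword extraction model for related-word extraction; and inputting the paragraph keywords into a picture library for searching, and generating the presentation corresponding to the keyword. From a simple main keyword, this application performs material search, picture search, style recommendation, and format typesetting, which saves much of the time spent on early-stage information search and integration, so that the corresponding presentation is generated quickly and automatically after the client inputs the main keyword. This solves the problem of low presentation generation efficiency in the prior art.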

Description

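Presentation generation method, apparatus, computer device, and storage medium
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on July 28, 2020, with application number 202010737234.X and entitled "Presentation generation method, apparatus, computer device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of artificial intelligence technology, and in particular to a presentation generation method, apparatus, computer device, and storage medium.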
 
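Background
With the continuous development of Internet technology, the quality of presentations has steadily improved and their fields of application keep widening. Presentations are becoming an important part of work and daily life and play a central role in work reports, corporate publicity, product promotion, wedding celebrations, project bidding, management consulting, education and training, and other fields. As presentations are used more and more widely, the demand for producing slides also grows. Presentations have become an indispensable form of expression in modern working life; report-style presentations usually have a regular, uniform form and relatively fixed content.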
 
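Technical Problem
The applicant has recognized that, at present, a user producing a presentation must manually search for relevant information with a search engine and manually filter the required material from thousands of documents; the required material has to include a large amount of text, pictures, and the like. The user then builds the presentation framework by hand, fills the framework with the material, and finally polishes the layout. Organizing presentation material manually in this way is time-consuming and laborious, which harms the user experience and lowers the efficiency of presentation generation.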
 
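Technical Solution
This application provides a presentation generation method, apparatus, computer device, and storage medium to solve the problem of low presentation generation efficiency.
A presentation generation method, including: receiving a main keyword of a presentation input by a user through a client; searching a text material library for text materials using the main keyword to obtain a plurality of text materials; splicing and integrating the plurality of text materials to obtain an overall text material; performing topic identification and paragraph decomposition on the overall text material to obtain at least one subtopic and a topic paragraph corresponding to the subtopic; performing manuscript style analysis using the keyword and the subtopic to obtain a style analysis result corresponding to each subtopic; determining overall style information of the presentation according to the style analysis result corresponding to each subtopic; inputting the subtopic and the topic paragraph corresponding to the subtopic into a keyword extraction model for related-word extraction to obtain paragraph keywords related to the topic paragraph; inputting the plurality of paragraph keywords into a picture library for searching to obtain target pictures corresponding to the paragraph keywords; and typesetting according to the target pictures, the overall style information of the presentation, the subtopic, and the topic paragraph corresponding to the subtopic to generate the presentation corresponding to the main keyword.
A presentation generation apparatus, including: a receiving module that receives a main keyword of a presentation input by a user through a client; a first search module that searches a text material library for text materials using the main keyword to obtain a plurality of text materials; a splicing and integration module that splices and integrates the plurality of text materials to obtain an overall text material; an identification and decomposition module that performs topic identification and paragraph decomposition on the overall text material to obtain at least one subtopic and a topic paragraph corresponding to the subtopic; an analysis module that performs manuscript style analysis using the keyword and the subtopic to obtain a style analysis result corresponding to each subtopic; a determination module that determines overall style information of the presentation according to the style analysis result corresponding to each subtopic; an extraction module that inputs the subtopic and the topic paragraph corresponding to the subtopic into a keyword extraction model for related-word extraction to obtain paragraph keywords related to the topic paragraph; a second search module that inputs the plurality of paragraph keywords into a picture library for searching to obtain target pictures corresponding to the paragraph keywords; and a generating module that typesets according to the target pictures, the overall style information of the presentation, the subtopic, and the topic paragraph corresponding to the subtopic to generate the presentation corresponding to the main keyword.
A computer device, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer-readable instructions: receiving a main keyword of a presentation input by a user through a client; searching a text material library for text materials using the main keyword to obtain a plurality of text materials; splicing and integrating the plurality of text materials to obtain an overall text material; performing topic identification and paragraph decomposition on the overall text material to obtain at least one subtopic and a topic paragraph corresponding to the subtopic; performing manuscript style analysis using the keyword and the subtopic to obtain a style analysis result corresponding to each subtopic; determining overall style information of the presentation according to the style analysis result corresponding to each subtopic; inputting the subtopic and the topic paragraph corresponding to the subtopic into a keyword extraction model for related-word extraction to obtain paragraph keywords related to the topic paragraph; inputting the plurality of paragraph keywords into a picture library for searching to obtain target pictures corresponding to the paragraph keywords; and typesetting according to the target pictures, the overall style information of the presentation, the subtopic, and the topic paragraph corresponding to the subtopic to generate the presentation corresponding to the main keyword.
One or more readable storage media storing computer-readable instructions, wherein the computer-readable instructions, when executed by one or more processors, cause the one or more processors to perform the following steps: receiving a main keyword of a presentation input by a user through a client; searching a text material library for text materials using the main keyword to obtain a plurality of text materials; splicing and integrating the plurality of text materials to obtain an overall text material; performing topic identification and paragraph decomposition on the overall text material to obtain at least one subtopic and a topic paragraph corresponding to the subtopic; performing manuscript style analysis using the keyword and the subtopic to obtain a style analysis result corresponding to each subtopic; determining overall style information of the presentation according to the style analysis result corresponding to each subtopic; inputting the subtopic and the topic paragraph corresponding to the subtopic into a keyword extraction model for related-word extraction to obtain paragraph keywords related to the topic paragraph; inputting the plurality of paragraph keywords into a picture library for searching to obtain target pictures corresponding to the paragraph keywords; and typesetting according to the target pictures, the overall style information of the presentation, the subtopic, and the topic paragraph corresponding to the subtopic to generate the presentation corresponding to the main keyword.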
 
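Beneficial Effects
In one solution implemented by the above presentation generation method, apparatus, computer device, and storage medium, a main keyword input by a user through a client is received; a text material library is searched for text materials using the main keyword; the text materials are spliced and integrated; presentation style analysis is performed using the keyword and the subtopics; overall style information of the presentation is determined; each subtopic and its corresponding topic paragraph are input into a keyword extraction model for related-word extraction; and the extracted related words are input into a picture library for searching, and the presentation corresponding to the keyword is generated. From a simple main keyword input by the user, this application completes the intelligent search of text material and picture information and, based on the type of the main keyword, recommends a presentation template with a fitting style. Material search, picture search, style recommendation, and format typesetting are all driven by the keyword, which saves much of the time spent on early-stage information search and integration, so that the corresponding presentation is generated quickly and automatically after the client inputs the main keyword. This solves the problem of low presentation generation efficiency in the prior art.
Details of one or more embodiments of this application are set forth in the drawings and description below; other features and advantages of this application will become apparent from the description, the drawings, and the claims.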
 
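Description of the Drawings
To explain the technical solutions of this application more clearly, the drawings needed in the description are briefly introduced below. The drawings described below are only some embodiments of this application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Figure 1 is a schematic diagram of an application environment of the presentation generation method in an embodiment of this application;
Figure 2 is a flowchart of the presentation generation method in an embodiment of this application;
Figure 3 is another flowchart of the presentation generation method in an embodiment of this application;
Figure 4 is another flowchart of the presentation generation method in an embodiment of this application;
Figure 5 is another flowchart of the presentation generation method in an embodiment of this application;
Figure 6 is another flowchart of the presentation generation method in an embodiment of this application;
Figure 7 is another flowchart of the presentation generation method in an embodiment of this application;
Figure 8 is another flowchart of the presentation generation method in an embodiment of this application;
Figure 9 is a functional block diagram of the presentation generation apparatus in an embodiment of this application;
Figure 10 is a schematic diagram of the computer device in an embodiment of this application.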
 
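Detailed Description
The technical solutions of this application are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of this application without creative effort fall within the scope of protection of this application.
The presentation generation method provided by this application can be applied in the application environment shown in Figure 1, in which a server communicates with a client over a network.
In an embodiment, as shown in Figure 2, a presentation generation method is provided. Taking the method as applied to the server in Figure 1 as an example, it includes the following steps:
S10: Receive a main keyword of a presentation input by a user through a client.
Understandably, when a presentation needs to be generated, the client accepts the main keyword input by the user, and the server receives the main keyword sent by the client. The main keyword can be any word; for example, for a long news release on cutting-edge technology, the main keyword might be "5G" or "blockchain".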
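S20: Search the text material library for text materials using the main keyword to obtain a plurality of text materials.
Understandably, the text material library is a library that stores text materials. For example, searching the text material library with the main keyword "5G" returns related text materials such as "5G concept" and "current state of 5G development".
S30: Splice and integrate the plurality of text materials to obtain an overall text material.
Understandably, the overall text material is the text content assembled from the retrieved text materials. For example, the text materials obtained above, such as "5G concept" and "current state of 5G development", are spliced and integrated into the overall text material; in other words, the overall text material is the text content about 5G obtained by integrating the plurality of text materials.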
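S40: Perform topic identification and paragraph decomposition on the overall text material to obtain at least one subtopic and a topic paragraph corresponding to the subtopic.
In an embodiment, the overall text material includes a plurality of natural paragraphs. As shown in Figure 3, step S40, that is, performing topic identification and paragraph decomposition on the overall text material to obtain at least one subtopic and a topic paragraph corresponding to the subtopic, specifically includes the following steps:
S41: Taking the keywords in each natural paragraph as features, perform topic identification on the overall text material with an unsupervised clustering model to obtain N subtopics.
In this step, fine-grained topics are identified in the overall text material. For example, the keywords in each sentence of each natural paragraph are used as features for unsupervised clustering with an unsupervised clustering model, which may be K-means clustering, to divide the overall text material into N subtopics. For instance, the overall text material under the main keyword "5G" includes subtopics such as "historical development", "existing applications", and "future trends".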
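The clustering in step S41 can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the bag-of-keywords vectors, the use of the first k paragraphs as initial centroids, the sample keywords, and the names `vectorize` and `kmeans` are all hypothetical; the source only states that K-means may serve as the unsupervised clustering model.

```python
import math


def vectorize(keyword_lists):
    """Turn each paragraph's keyword list into a bag-of-keywords count vector."""
    vocab = sorted({word for keywords in keyword_lists for word in keywords})
    return [[keywords.count(word) for word in vocab] for keywords in keyword_lists]


def kmeans(points, k, max_iterations=20):
    """Plain K-means; returns one cluster label per point.

    Assumption: the first k points seed the centroids, which keeps the
    sketch deterministic. A real system would choose seeds more carefully.
    """
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iterations):
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical keywords extracted from four natural paragraphs about 5G.
paragraph_keywords = [
    ["1G", "history", "1986"],
    ["application", "autonomous driving", "smart grid"],
    ["history", "2G", "1991"],
    ["application", "telemedicine", "smart grid"],
]
labels = kmeans(vectorize(paragraph_keywords), k=2)
print(labels)
```

Paragraphs that share a label form one candidate subtopic; naming the subtopic (for example "historical development") is a separate step that this sketch does not cover.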
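S42: Perform topic paragraph identification on the overall text material according to the N subtopics to identify the topic paragraph corresponding to each subtopic.
For example, the overall text material under the main keyword "5G" includes subtopics such as "historical development", "existing applications", and "future trends". These subtopics are used as subtopic keywords to identify topic paragraphs in the overall text material. The topic paragraph identified for "historical development" is paragraph 1: "The 1G era: in 1986, the first-generation mobile communication technology (1st Generation, 1G for short) was born in Chicago, USA, and took the stage", and the topic paragraph identified for "existing applications" is paragraph 2: "5G is currently applied in fields such as the Internet of Vehicles and autonomous driving, surgery, and smart grids".
Finally, each subtopic is stored in association with its topic paragraph in a structured form; for example, the subtopic "historical development" is stored in association with paragraph 1. Storing each subtopic with its paragraph makes it easy to later generate the typesetting template for the subtopic and then the slide for the subtopic from that template. Because the overall text material is divided into N subtopics and each subtopic can have a different typesetting template, the presentation is then generated from the slides of the N subtopics.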
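The structured, associated storage of subtopics and topic paragraphs might look like the sketch below. The record type `SubtopicRecord`, the function `build_outline`, and the JSON serialization are assumptions chosen for illustration; the source does not specify a storage format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class SubtopicRecord:
    """One subtopic stored together with its topic paragraph."""
    subtopic: str
    topic_paragraph: str


def build_outline(pairs):
    """Map each subtopic to its record, preserving the input order."""
    return {subtopic: SubtopicRecord(subtopic, paragraph) for subtopic, paragraph in pairs}


outline = build_outline([
    ("historical development", "The 1G era: in 1986, the first-generation mobile "
                               "communication technology was born in Chicago, USA."),
    ("existing applications", "5G is currently applied in the Internet of Vehicles, "
                              "autonomous driving, surgery, and smart grids."),
])

# Serialize the outline so later steps (template matching, slide generation) can load it.
print(json.dumps({key: asdict(record) for key, record in outline.items()}, indent=2))
```

Keeping one record per subtopic means each later slide can be generated from a single lookup by subtopic name.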
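In the embodiment corresponding to Figure 3, topic identification with an unsupervised clustering model divides the overall text material into N subtopics, and paragraph identification according to the N subtopics decomposes the overall text material into the topic paragraphs of the different topics. Each subtopic and its topic paragraph are then stored in association to form a clear logical framework for the text. This solution splits the logical structure of the overall text material intelligently, which keeps the distribution of paragraph content reasonable and makes the subsequently generated presentation more readable.
In an embodiment, as shown in Figure 4, step S42, that is, performing topic paragraph identification on the overall text material according to the N subtopics to identify the topic paragraph corresponding to each subtopic, specifically includes the following steps:
S421: Extract summaries from each natural paragraph in the topic paragraph with the TextRank algorithm to obtain a plurality of summaries.
S422: From the summaries extracted from the natural paragraph, select those whose importance exceeds a preset value as the associated sentences of the natural paragraph.
In these steps, the TextRank algorithm, which takes semantic information into account, extracts summaries from each natural paragraph in the topic paragraph. TextRank is a graph-based extractive summarization method that can extract the summary of a document using only the semantic information among the words within that document. It works as follows: the natural paragraph is split into its component sentences, the similarity between sentences is used as the edge weight, and the TextRank value of each sentence is computed by repeated iteration, from which several summaries are selected for each natural paragraph. The summaries whose importance exceeds the preset value are then selected as the associated sentences of the natural paragraph.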
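Steps S421 and S422 can be sketched as a small sentence-level TextRank. The word-overlap similarity normalized by the logarithm of the sentence lengths, the damping factor 0.85, the punctuation-based sentence splitter, and the threshold value are assumptions for illustration; the source does not specify the similarity measure or the preset importance value.

```python
import math
import re


def split_sentences(paragraph):
    """Naive splitter on sentence-ending punctuation (assumption for the sketch)."""
    return [s.strip() for s in re.split(r"[.!?。！？]+", paragraph) if s.strip()]


def similarity(a, b):
    """Shared-word count normalized by the log of the two sentence lengths."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if len(words_a) < 2 or len(words_b) < 2:
        return 0.0  # log(1) == 0 would make the denominator degenerate
    return len(words_a & words_b) / (math.log(len(words_a)) + math.log(len(words_b)))


def textrank(sentences, damping=0.85, iterations=50):
    """Iteratively compute a TextRank score for every sentence."""
    n = len(sentences)
    weights = [
        [similarity(sentences[i], sentences[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def associated_sentences(paragraph, threshold):
    """Keep the sentences whose TextRank score exceeds the preset threshold."""
    sentences = split_sentences(paragraph)
    scores = textrank(sentences)
    return [s for s, score in zip(sentences, scores) if score > threshold]


paragraph = (
    "5G networks offer low latency. 5G networks support autonomous driving. "
    "Low latency helps autonomous driving. Lunch is served at noon."
)
print(associated_sentences(paragraph, threshold=0.5))
```

In this example the sentence about lunch shares no words with the others, so it receives only the baseline score of `1 - damping` and falls below the threshold.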
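S423: Filter the associated sentences with an MMR model to remove redundant sentences with high semantic overlap and obtain the target sentences corresponding to the natural paragraph.
Understandably, MMR stands for Maximal Marginal Relevance. The MMR algorithm aims to reduce redundancy in ranked results while preserving their relevance.
The target sentences are sentences that do not overlap semantically with one another. The MMR model filters the associated sentences obtained in step S422 and removes the redundant ones with high semantic overlap to obtain the target sentences.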
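The MMR filtering in step S423 can be sketched as a greedy selection that trades relevance against redundancy. The Jaccard word overlap used as the redundancy measure, the trade-off weight `lam`, and the sample relevance scores are assumptions for illustration; in the pipeline described above, the relevance scores would come from the TextRank step.

```python
def jaccard(a, b):
    """Word-level Jaccard overlap between two sentences."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def mmr_select(candidates, relevance, k, lam=0.7):
    """Greedy Maximal Marginal Relevance selection of k sentences.

    Each round picks the candidate that maximizes
    lam * relevance - (1 - lam) * (max similarity to anything already picked).
    """
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        def mmr_score(i):
            redundancy = max(
                (jaccard(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance[i] - (1 - lam) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return [candidates[i] for i in selected]


candidates = [
    "5G is used in autonomous driving",
    "5G is used in autonomous driving today",
    "Smart grids rely on 5G",
]
relevance = [0.9, 0.85, 0.6]  # hypothetical importance scores
print(mmr_select(candidates, relevance, k=2, lam=0.5))
```

With `lam=0.5` the near-duplicate second sentence is skipped in favor of the less relevant but non-redundant third one.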
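S424: Integrate the target sentences of all natural paragraphs under the subtopic to obtain the topic paragraph corresponding to the subtopic.
Understandably, the target sentences of all natural paragraphs under each subtopic are integrated to obtain the topic paragraph corresponding to the subtopic, which serves as the text material for that subtopic.
In the embodiment corresponding to Figure 4, for each natural paragraph in the topic paragraph, the TextRank algorithm extracts the most important sentences of the paragraph as summaries; the summary sentences whose importance exceeds the preset value are selected as associated sentences; and the MMR model then removes redundant sentences with high semantic overlap to obtain the text material for each subtopic. This guards against text material that is weakly related or excessive and redundant, and so improves the readability of the presentation.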
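In an embodiment, as shown in Figure 5, after step S40, that is, after performing topic identification and paragraph decomposition on the overall text material with the unsupervised clustering model to obtain at least one subtopic and a topic paragraph corresponding to the subtopic, the method further includes the following steps:
S43: Detect whether the topic paragraph corresponding to the subtopic contains hierarchical headings.
S44: If the topic paragraph corresponding to the subtopic contains hierarchical headings, decompose the topic paragraph of each subtopic into hierarchical paragraphs to obtain the hierarchical paragraph corresponding to each hierarchical heading.
For example, hierarchical heading identification is performed on the topic paragraph of a subtopic with the unsupervised clustering model to detect whether the topic paragraph contains hierarchical headings. If it does, the hierarchical headings of each topic paragraph are detected and used to decompose the topic paragraph, which yields the hierarchical paragraph corresponding to each hierarchical heading. Finally, each hierarchical heading in the topic paragraph is stored in association with its hierarchical paragraph.
Further, it can be determined whether a hierarchical paragraph contains sub-level headings. If it does, the hierarchical paragraph is decomposed again to obtain the sub-level paragraph corresponding to each sub-level heading.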
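The hierarchical decomposition in steps S43 and S44 can be illustrated with a deliberately simpler technique than the one the text names. The text detects headings with an unsupervised clustering model; the sketch below instead assumes that headings are numbered lines such as "1" or "1.1" and finds them with a regular expression. The pattern, the function names, and the sample lines are hypothetical.

```python
import re

# Assumption: a heading is a line that starts with a dotted number, e.g. "1" or "1.2".
HEADING = re.compile(r"^(\d+(?:\.\d+)*)\s+(\S.*)$")


def has_headings(lines):
    """S43: report whether any line looks like a hierarchical heading."""
    return any(HEADING.match(line.strip()) for line in lines)


def split_by_headings(lines):
    """S44: attach every body line to the most recent heading.

    The heading level is the depth of its number ("1" -> 1, "1.1" -> 2).
    Lines that appear before the first heading are ignored in this sketch.
    """
    sections = []
    for line in lines:
        match = HEADING.match(line.strip())
        if match:
            number, title = match.groups()
            sections.append({"level": number.count(".") + 1, "title": title, "body": []})
        elif sections:
            sections[-1]["body"].append(line.strip())
    return sections


topic_paragraph = [
    "1 Existing applications",
    "Operators have already deployed 5G in several fields.",
    "1.1 Internet of Vehicles",
    "Vehicles exchange data with low latency.",
    "1.2 Smart grid",
    "Grids are monitored in real time.",
]
for section in split_by_headings(topic_paragraph):
    print(section["level"], section["title"], section["body"])
```

Level-2 entries correspond to the sub-level headings and sub-level paragraphs described above.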
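In the embodiment corresponding to Figure 5, if the topic paragraph of a subtopic contains hierarchical headings, the topic paragraph is decomposed to obtain the hierarchical paragraph corresponding to each hierarchical heading, so that the topic paragraph is split into the paragraphs belonging to the different headings. Each hierarchical heading is then stored in association with its hierarchical paragraph to form a more complete overall logical framework. The hierarchical headings and paragraphs obtained in this way also allow a more readable presentation to be generated.
S50: Perform manuscript style analysis using the keyword and the subtopic to obtain a style analysis result corresponding to each subtopic.
Understandably, manuscript style analysis here is text sentiment analysis, that is, the process of analyzing, processing, and extracting subjective, emotionally colored text with natural language processing and text mining techniques. Sentiment analysis is applied to the main keyword and the subtopic to extract the sentiment of each, for example technology, romantic, or serious.
In this step, manuscript style analysis is performed using the keyword and the subtopic to obtain the style analysis result for each subtopic. Style analysis results include, but are not limited to, technology style, romantic style, serious style, fresh style, minimalist style, and other styles. If the sentiment of the main keyword is technology and the sentiment of the subtopic is serious, the style analysis result is a technology-and-serious style; if the sentiment of the main keyword is technology and the sentiment of the subtopic is also technology, the style analysis result is technology style.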
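The way the two sentiments combine into one style analysis result can be sketched as follows. The tiny lexicon-based classifier is only a stand-in for the sentiment analysis model, which the source does not describe; `STYLE_LEXICON`, `classify_style`, and `combine_styles` are hypothetical names, and the word lists are invented for the example.

```python
# Hypothetical lexicon: a real system would use a trained sentiment model instead.
STYLE_LEXICON = {
    "technology": {"5g", "blockchain", "network"},
    "romantic": {"wedding", "love"},
    "serious": {"history", "report", "regulation"},
}


def classify_style(text, default="other"):
    """Return the style whose lexicon shares the most words with the text."""
    words = set(text.lower().split())
    hits = {style: len(words & vocab) for style, vocab in STYLE_LEXICON.items()}
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else default


def combine_styles(keyword_style, subtopic_style):
    """Merge the keyword style and the subtopic style as described for step S50."""
    if keyword_style == subtopic_style:
        return keyword_style
    return f"{keyword_style} and {subtopic_style}"


print(combine_styles(classify_style("5G"), classify_style("history of development")))
```

The same keyword style paired with a different subtopic yields a different result, which is why the analysis is repeated for every subtopic.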
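S60: Determine overall style information of the presentation according to the style analysis result corresponding to each subtopic.
In an embodiment, as shown in Figure 6, step S60, that is, determining the overall style information of the presentation according to the style analysis result corresponding to each subtopic, specifically includes the following steps:
S61: Use the style analysis result to determine the template color information for each subtopic and the text format information for the corresponding topic paragraph, where the text format information includes text font information and text font-size information.
S62: Determine the overall style information of the presentation from the template color information, the text font information, and the text font-size information.
Understandably, template color information is the color scheme associated with a style template. The style analysis result is used to look up the template color information for the subtopic in a template sample database; for example, if the style analysis result is romantic style, the template color information for the subtopic is a pink romantic template. The topic paragraph corresponding to the subtopic is located from the subtopic (see step S40 for details, which are not repeated here), and the style analysis result is used to determine the text format information of that topic paragraph. Text font information is the typeface of the text, and text font-size information is the size of the text. For example, if the style analysis result is romantic style, the text format information under the romantic style is size-five (五号) text in the SimSun (宋体) typeface, and the text color under the romantic style is light yellow. Finally, the overall style information of the presentation is determined from the template color information, the text font information, and the text font-size information.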
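Steps S61 and S62 amount to a lookup from a style analysis result to color and text format settings. The sketch below makes several assumptions: the "romantic" entry follows the example in the text (size five corresponds to 10.5 pt), while the "technology" entry, the default entry, and the majority vote used to pick one overall style from the per-subtopic results are hypothetical, since the source does not say how the results are aggregated.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class StyleInfo:
    template_color: str
    font_name: str
    font_size_pt: float
    text_color: str


STYLE_TABLE = {
    # Values taken from the romantic-style example in the text.
    "romantic": StyleInfo("pink", "SimSun", 10.5, "light yellow"),
    # Hypothetical values, for illustration only.
    "technology": StyleInfo("dark blue", "SimHei", 12.0, "white"),
}
DEFAULT_STYLE = StyleInfo("white", "SimSun", 10.5, "black")  # hypothetical fallback


def overall_style(style_results):
    """Pick the most frequent per-subtopic style and look up its settings."""
    if not style_results:
        return DEFAULT_STYLE
    dominant, _count = Counter(style_results).most_common(1)[0]
    return STYLE_TABLE.get(dominant, DEFAULT_STYLE)


print(overall_style(["romantic", "technology", "romantic"]))
```

A table like `STYLE_TABLE` plays the role of the template sample database mentioned above.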
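In the embodiment corresponding to Figure 6, the template color information for each subtopic and the text format information for the decomposed material are first obtained from the style analysis result, and the overall style information of the presentation is then determined from the template color information, the text font information, and the text font-size information, so that the subsequently generated presentation fits its content well.
S70: Input the subtopic and the topic paragraph corresponding to the subtopic into a keyword extraction model for related-word extraction to obtain paragraph keywords related to the topic paragraph.
Understandably, common algorithms for a keyword extraction model include TF-IDF (term frequency-inverse document frequency) and TextRank. TF-IDF is a weighting technique widely used in information retrieval and data mining. It is a statistical method for assessing how important a word is to one document in a document collection or corpus: the importance of a word increases in proportion to the number of times it appears in the document and decreases in inverse proportion to its frequency across the corpus. TextRank is a graph-based ranking algorithm: it splits the text into units (words or sentences), builds a graph model, and ranks the important components of the text with a voting mechanism, so keyword extraction and summarization need only the information in the single document itself. The subtopic and its topic paragraph are input into the keyword extraction model, which extracts the related words for that topic paragraph; for example, for a long news release on cutting-edge technology, the related words might be "5G", "blockchain", and "big data".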
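The TF-IDF weighting described above can be sketched in a few lines. The whitespace tokenization, the unsmoothed `log(N / df)` form of the inverse document frequency, and the sample paragraphs are assumptions for illustration; the source does not fix a particular TF-IDF variant.

```python
import math
from collections import Counter


def tfidf_keywords(paragraphs, index, top_k=3):
    """Rank the words of paragraphs[index] by TF-IDF against all paragraphs."""
    docs = [p.lower().split() for p in paragraphs]
    doc_freq = Counter(word for doc in docs for word in set(doc))
    term_freq = Counter(docs[index])
    total_terms = len(docs[index])
    scores = {
        word: (count / total_terms) * math.log(len(docs) / doc_freq[word])
        for word, count in term_freq.items()
    }
    # Sort by descending score; break ties alphabetically for a stable order.
    return sorted(scores, key=lambda word: (-scores[word], word))[:top_k]


paragraphs = [
    "5g network slicing enables 5g edge computing",
    "blockchain ledger network",
    "big data network analytics",
]
print(tfidf_keywords(paragraphs, index=0, top_k=1))
```

The word "network" appears in every paragraph, so its inverse document frequency is zero and it never ranks as a keyword, which matches the behavior the text describes.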
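S80: Input the plurality of paragraph keywords into a picture library for searching to obtain target pictures corresponding to the paragraph keywords.
In an embodiment, as shown in Figure 7, step S80, that is, inputting the plurality of paragraph keywords into a picture library for searching to obtain target pictures corresponding to the paragraph keywords, specifically includes the following steps:
S81: Input the plurality of paragraph keywords into the picture library for searching to obtain the target pictures corresponding to the paragraph keywords.
S82: According to the typesetting template of the presentation, process the size of each target picture and adjust its position to obtain the adjusted target picture.
Understandably, every picture stored in the picture library carries topic tags, and searching the library with the paragraph keywords returns the corresponding target pictures. The typesetting template of the presentation is the standard style template corresponding to the overall style information determined in step S60, and different style templates place limits on picture size. Because the pictures in the library differ in shape and resolution, the pixel dimensions of a target picture may differ from those the typesetting template requires when a presentation with pictures is generated, so the size of the target picture has to be processed. For example, if the target picture found in the library is 400*500 pixels and the typesetting template requires 300*400 pixels, the picture is first scaled down to 300*375, and the remaining 25 pixels in the vertical direction are temporarily filled with a transparent color. The user can also adjust the size of the target picture in the generated presentation.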
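The size arithmetic in the example above (400*500 scaled into a 300*400 slot) can be reproduced with an aspect-preserving fit. The function name `fit_picture` and its return layout are hypothetical; only the arithmetic follows the text.

```python
def fit_picture(src_w, src_h, slot_w, slot_h):
    """Scale a picture to fit a template slot while keeping its aspect ratio.

    Returns (new_w, new_h, pad_w, pad_h), where the padding is the leftover
    space in the slot that would be filled with a transparent color.
    """
    scale = min(slot_w / src_w, slot_h / src_h)
    new_w = round(src_w * scale)
    new_h = round(src_h * scale)
    return new_w, new_h, slot_w - new_w, slot_h - new_h


# The example from the text: a 400*500 picture placed in a 300*400 slot.
print(fit_picture(400, 500, 300, 400))
```

The limiting dimension here is the width (scale 0.75), which gives 300*375 and leaves 25 pixels of vertical padding, as in the text.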
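Further, the pictures in the library may come in many shapes, such as five-pointed stars, circles, triangles, and polygons, while the typesetting template may require a fixed square picture. In that case the target picture is cropped to obtain a picture whose shape matches the template's requirement.
Understandably, the typesetting template of the presentation includes a preset display position for each picture, from which the position of the target picture can be determined. If the target picture deviates from the preset display position, its position is adjusted to obtain the adjusted target picture.
In addition, the user can edit or modify the content displayed on each slide, for example by adjusting text, pictures, fonts, colors, and text boxes, or by adding pictures at a specified position in the presentation.
In the embodiment corresponding to Figure 7, the size and position of each target picture are adjusted according to the presentation template, so that the pictures in the generated presentation are neither too large nor too small and do not extend beyond the slide or sit too far off-center, which makes the presentation more readable and more attractive.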
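S90: Typeset according to the target pictures, the overall style information of the presentation, the subtopic, and the topic paragraph corresponding to the subtopic to generate the presentation corresponding to the main keyword.
In an embodiment, as shown in Figure 8, step S90, that is, typesetting according to the target pictures, the overall style information of the presentation, the subtopic, and the topic paragraph corresponding to the subtopic to generate the presentation corresponding to the main keyword, specifically includes the following steps:
S91: Extract features from the target picture, the overall style information of the presentation, and the topic paragraph to obtain the corresponding picture feature, manuscript style feature, and topic paragraph feature.
Understandably, the material used to generate a presentation can include at least one of text, pictures, audio, and video, and the features differ by material type. For a topic paragraph, the topic paragraph feature covers the number of lines of text in the paragraph, the font, and the format of each line; the picture feature of a target picture covers the picture's format, type, and similar properties; and the style feature is the overall style of the presentation (see step S60 for details, which are not repeated here).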
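S92: Match the picture feature, the topic paragraph feature, and the style feature against the typesetting rules of the pre-stored typesetting templates to obtain the successfully matched typesetting template for each subtopic.
Understandably, the picture feature, the topic paragraph feature, and the style feature are matched against the typesetting rules of the pre-stored typesetting templates to obtain the typesetting templates corresponding to the successfully matched rules. The pre-stored database holds different typesetting templates for presentations together with their typesetting rules. For example, the typesetting rule of a template may lay out a one-page slide with three lines of text, a one-page slide containing a title and a body, or a slide containing pictures and text arranged according to the sizes of the pictures and text and the proportion of the page they occupy. Different typesetting templates may share the same typesetting rule, that is, one rule may correspond to several different templates; alternatively, different templates may have different rules. The templates stored in the database are updated continuously according to user needs.
The picture feature and the topic paragraph feature are matched against the typesetting rules of the pre-stored templates to obtain the rules that fit both features. Because typesetting rules correspond to picture features and topic paragraph features, and because one rule in the database may correspond to several different templates, the matching proceeds as follows: first extract the typesetting rule of each template in the database, then select from the extracted rules those that match the picture feature and the topic paragraph feature, and finally query the database for the templates that correspond to the successfully matched rules.
For example, the topic paragraph feature holds three lines of text, and the subtopic corresponding to the topic paragraph indicates that the topic needs primary highlighting. The first line of text is "A company's 2018 spring new product launch", which receives secondary highlighting; the second line is "2018.04.09 14:30", which gives the time and may be non-highlighted; the third line is "Gymnasium of a university in Beijing", which gives the location and may also be non-highlighted. From this topic paragraph, the topic paragraph feature is a one-page slide containing three lines of text, in which the subtopic receives primary highlighting, the first line receives secondary highlighting, and the second and third lines are non-highlighted. If the picture feature includes one square picture, the typesetting rules extracted from the database are matched against the topic paragraph feature and the picture feature to find the rule suited to a one-page slide containing a subtopic, three lines of text, and one square picture, together with the templates that follow it. Several templates may satisfy that rule, so the style feature is used to narrow the choice; for example, if the style feature is romantic, the template whose style feature is romantic is matched.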
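The two-stage matching described for step S92 (rules first, then style) can be sketched as follows. The record types, their fields, and the sample rules and templates are assumptions for illustration; the source does not define how rules and templates are represented in the database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypesettingRule:
    rule_id: str
    text_lines: int
    picture_shape: str  # e.g. "square", or "none" for text-only slides


@dataclass(frozen=True)
class TypesettingTemplate:
    template_id: str
    rule_id: str  # several templates may share one rule
    style: str


def match_templates(rules, templates, text_lines, picture_shape, style):
    """Select rules that fit the paragraph and picture features, then filter by style."""
    matched_rule_ids = {
        rule.rule_id
        for rule in rules
        if rule.text_lines == text_lines and rule.picture_shape == picture_shape
    }
    candidates = [t for t in templates if t.rule_id in matched_rule_ids]
    return [t for t in candidates if t.style == style]


# Hypothetical database contents.
RULES = [
    TypesettingRule("three-lines-square", 3, "square"),
    TypesettingRule("title-body", 2, "none"),
]
TEMPLATES = [
    TypesettingTemplate("T1", "three-lines-square", "romantic"),
    TypesettingTemplate("T2", "three-lines-square", "technology"),
    TypesettingTemplate("T3", "title-body", "romantic"),
]

matches = match_templates(RULES, TEMPLATES, text_lines=3, picture_shape="square", style="romantic")
print([t.template_id for t in matches])
```

Templates T1 and T2 both satisfy the rule for three lines of text with a square picture; the style feature then narrows the result to T1.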
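In addition, the typesetting templates corresponding to the successfully matched typesetting rule are displayed to the user, who can pre-select among them.
S93: Use the successfully matched typesetting template for each subtopic to typeset the target picture, the subtopic, and the topic paragraph corresponding to the subtopic, and generate the presentation corresponding to the main keyword.
Understandably, the typesetting template corresponding to the successfully matched typesetting rule is used to typeset the target picture, the subtopic, and the topic paragraph corresponding to the subtopic. Each subtopic, its topic paragraph, and the target picture corresponding to that paragraph are laid out with the matched template to generate a slide for the subtopic, and the presentation corresponding to the main keyword is then generated automatically from the slides of the N subtopics.
In the embodiment corresponding to Figure 8, the picture feature, the manuscript style feature, and the topic paragraph feature are matched against the typesetting rules of the pre-stored typesetting templates, and the templates corresponding to the successfully matched rules are used for typesetting. The presentation corresponding to the main keyword can thus be generated automatically, which gives a smarter and more automated way of generating presentations, saves time and effort, and improves both the user experience and the efficiency of presentation generation.
In the embodiment corresponding to Figure 2, this application completes the intelligent search of text material and picture information from a simple main keyword input by the user and, based on the type of the main keyword, recommends a presentation template with a fitting style. Material search, picture search, style recommendation, and format typesetting are all driven by the main keyword, which saves much of the time spent on early-stage information search and integration, so that the corresponding presentation is generated quickly and automatically after the client inputs the main keyword. This solves the problem of low presentation generation efficiency in the prior art.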
 
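It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution. The order in which the processes are executed is determined by their functions and internal logic, and the numbering does not limit the implementation of this application in any way.
In an embodiment, a presentation generation apparatus is provided, which corresponds one-to-one to the presentation generation method in the above embodiments. As shown in Figure 9, the presentation generation apparatus includes a receiving module 10, a first search module 20, a splicing and integration module 30, an identification and decomposition module 40, an analysis module 50, a determination module 60, an extraction module 70, a second search module 80, and a generating module 90. The functional modules are described in detail as follows: the receiving module 10 receives a main keyword of a presentation input by a user through a client; the first search module 20 searches a text material library for text materials using the main keyword to obtain a plurality of text materials; the splicing and integration module 30 splices and integrates the plurality of text materials to obtain an overall text material; the identification and decomposition module 40 performs topic identification and paragraph decomposition on the overall text material to obtain at least one subtopic and a topic paragraph corresponding to the subtopic; the analysis module 50 performs manuscript style analysis using the keyword and the subtopic to obtain a style analysis result corresponding to each subtopic; the determination module 60 determines overall style information of the presentation according to the style analysis result corresponding to each subtopic; the extraction module 70 inputs the subtopic and the topic paragraph corresponding to the subtopic into a keyword extraction model for related-word extraction to obtain paragraph keywords related to the topic paragraph; the second search module 80 inputs the plurality of paragraph keywords into a picture library for searching to obtain target pictures corresponding to the paragraph keywords; and the generating module 90 typesets according to the target pictures, the overall style information of the presentation, the subtopic, and the topic paragraph corresponding to the subtopic to generate the presentation corresponding to the main keyword.
For specific limitations on the presentation generation apparatus, see the limitations on the presentation generation method above, which are not repeated here. Each module of the presentation generation apparatus can be implemented in whole or in part by software, hardware, or a combination of the two. The modules may be embedded in, or independent of, the processor of the computer device in hardware form, or stored in the memory of the computer device in software form, so that the processor can call them and perform the operations corresponding to each module.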
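In one embodiment, as shown in Figure 10, a computer device is provided, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor. When the processor executes the computer-readable instructions, the following steps are implemented: receiving a main keyword of a presentation input by a user through a client; searching a text material library for text materials using the main keyword to obtain a plurality of text materials; splicing and integrating the plurality of text materials to obtain an overall text material; performing topic identification and paragraph decomposition on the overall text material to obtain at least one subtopic and a topic paragraph corresponding to the subtopic; performing manuscript style analysis using the keyword and the subtopic to obtain a style analysis result corresponding to each subtopic; determining overall style information of the presentation according to the style analysis result corresponding to each subtopic; inputting the subtopic and the topic paragraph corresponding to the subtopic into a keyword extraction model for related-word extraction to obtain paragraph keywords related to the topic paragraph; inputting the plurality of paragraph keywords into a picture library for searching to obtain target pictures corresponding to the paragraph keywords; and typesetting according to the target pictures, the overall style information of the presentation, the subtopic, and the topic paragraph corresponding to the subtopic to generate the presentation corresponding to the main keyword.
In one embodiment, one or more readable storage media storing computer-readable instructions are provided. The readable storage media provided in this embodiment include non-volatile readable storage media and volatile readable storage media. When the computer-readable instructions are executed by one or more processors, the one or more processors implement the following steps: receiving a main keyword of a presentation input by a user through a client; searching a text material library for text materials using the main keyword to obtain a plurality of text materials; splicing and integrating the plurality of text materials to obtain an overall text material; performing topic identification and paragraph decomposition on the overall text material to obtain at least one subtopic and a topic paragraph corresponding to the subtopic; performing manuscript style analysis using the keyword and the subtopic to obtain a style analysis result corresponding to each subtopic; determining overall style information of the presentation according to the style analysis result corresponding to each subtopic; inputting the subtopic and the topic paragraph corresponding to the subtopic into a keyword extraction model for related-word extraction to obtain paragraph keywords related to the topic paragraph; inputting the plurality of paragraph keywords into a picture library for searching to obtain target pictures corresponding to the paragraph keywords; and typesetting according to the target pictures, the overall style information of the presentation, the subtopic, and the topic paragraph corresponding to the subtopic to generate the presentation corresponding to the main keyword.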
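A person of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be carried out by computer-readable instructions that direct the relevant hardware. The computer-readable instructions can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the method embodiments above. Any reference to memory, storage, a database, or another medium used in the embodiments provided by this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
A person skilled in the art will clearly understand that, for convenience and brevity of description, only the above division into functional units and modules is used as an example. In practice, the functions can be assigned to different functional units and modules as needed; that is, the internal structure of the apparatus can be divided into different functional units or modules to perform all or part of the functions described above.
The above embodiments are intended only to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions recorded in those embodiments can still be modified, or some of their technical features can be replaced by equivalents. Such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application, and they shall all fall within the scope of protection of this application.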

Claims (20)

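  1. A presentation generation method, comprising:
    receiving a main keyword of a presentation input by a user through a client;
    searching a text material library for text materials using the main keyword to obtain a plurality of text materials;
    splicing and integrating the plurality of text materials to obtain an overall text material;
    performing topic identification and paragraph decomposition on the overall text material to obtain at least one subtopic and a topic paragraph corresponding to the subtopic;
    performing manuscript style analysis using the keyword and the subtopic to obtain a style analysis result corresponding to each subtopic;
    determining overall style information of the presentation according to the style analysis result corresponding to each subtopic;
    inputting the subtopic and the topic paragraph corresponding to the subtopic into a keyword extraction model for related-word extraction to obtain paragraph keywords related to the topic paragraph;
    inputting the plurality of paragraph keywords into a picture library for searching to obtain target pictures corresponding to the paragraph keywords;
    typesetting according to the target pictures, the overall style information of the presentation, the subtopic, and the topic paragraph corresponding to the subtopic to generate the presentation corresponding to the main keyword.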
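  2. The presentation generation method according to claim 1, wherein the overall text material includes a plurality of natural paragraphs, and performing topic identification and paragraph decomposition on the overall text material to obtain at least one subtopic and a topic paragraph corresponding to the subtopic comprises:
    taking the keywords in each natural paragraph as features, performing topic identification on the overall text material with an unsupervised clustering model to obtain N subtopics;
    performing topic paragraph identification on the overall text material according to the N subtopics to identify the topic paragraph corresponding to each subtopic.
  3. The presentation generation method according to claim 2, wherein performing topic paragraph identification on the overall text material according to the N subtopics to identify the topic paragraph corresponding to each subtopic comprises:
    extracting summaries from each natural paragraph in the topic paragraph with the TextRank algorithm to obtain a plurality of summaries;
    selecting, from the summaries extracted from the natural paragraph, those whose importance exceeds a preset value as the associated sentences of the natural paragraph;
    filtering the associated sentences with an MMR model to remove redundant sentences with high semantic overlap and obtain the target sentences corresponding to the natural paragraph;
    integrating the target sentences of all natural paragraphs under the subtopic to obtain the topic paragraph corresponding to the subtopic.
  4. The presentation generation method according to claim 1, wherein after performing topic identification and paragraph decomposition on the overall text material to obtain at least one subtopic and a topic paragraph corresponding to the subtopic, the method further comprises:
    detecting whether the topic paragraph corresponding to the subtopic contains hierarchical headings;
    if the topic paragraph corresponding to the subtopic contains hierarchical headings, decomposing the topic paragraph of each subtopic into hierarchical paragraphs to obtain the hierarchical paragraph corresponding to each hierarchical heading.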
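  5. The presentation generation method according to claim 1, wherein determining overall style information of the presentation according to the style analysis result corresponding to each subtopic comprises:
    using the style analysis result to determine the template color information for each subtopic and the text format information for the corresponding topic paragraph, wherein the text format information includes text font information and text font-size information;
    determining the overall style information of the presentation from the template color information, the text font information, and the text font-size information.
  6. The presentation generation method according to claim 1, wherein inputting the plurality of paragraph keywords into a picture library for searching to obtain target pictures corresponding to the paragraph keywords comprises:
    inputting the plurality of paragraph keywords into the picture library for searching to obtain the target pictures corresponding to the paragraph keywords;
    according to the typesetting template of the presentation, processing the size of each target picture corresponding to the paragraph keywords and adjusting its position to obtain the adjusted target picture.
  7. The presentation generation method according to claim 1, wherein typesetting according to the target pictures, the overall style information of the presentation, the subtopic, and the topic paragraph corresponding to the subtopic to generate the presentation corresponding to the main keyword comprises:
    extracting features from the target picture, the overall style information of the presentation, and the topic paragraph to obtain the corresponding picture feature, manuscript style feature, and topic paragraph feature;
    matching the picture feature, the topic paragraph feature, and the style feature against the typesetting rules of the pre-stored typesetting templates to obtain the successfully matched typesetting template for each subtopic;
    using the successfully matched typesetting template for each subtopic to typeset the target picture, the subtopic, and the topic paragraph corresponding to the subtopic, and generating the presentation corresponding to the main keyword.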
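  8. A presentation generation apparatus, comprising:
    a receiving module that receives a main keyword of a presentation input by a user through a client;
    a first search module that searches a text material library for text materials using the main keyword to obtain a plurality of text materials;
    a splicing and integration module that splices and integrates the plurality of text materials to obtain an overall text material;
    an identification and decomposition module that performs topic identification and paragraph decomposition on the overall text material to obtain at least one subtopic and a topic paragraph corresponding to the subtopic;
    an analysis module that performs manuscript style analysis using the keyword and the subtopic to obtain a style analysis result corresponding to each subtopic;
    a determination module that determines overall style information of the presentation according to the style analysis result corresponding to each subtopic;
    an extraction module that inputs the subtopic and the topic paragraph corresponding to the subtopic into a keyword extraction model for related-word extraction to obtain paragraph keywords related to the topic paragraph;
    a second search module that inputs the plurality of paragraph keywords into a picture library for searching to obtain target pictures corresponding to the paragraph keywords;
    a generating module that typesets according to the target pictures, the overall style information of the presentation, the subtopic, and the topic paragraph corresponding to the subtopic to generate the presentation corresponding to the main keyword.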
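  9. A computer device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer-readable instructions:
    receiving a main keyword of a presentation input by a user through a client;
    searching a text material library for text materials using the main keyword to obtain a plurality of text materials;
    splicing and integrating the plurality of text materials to obtain an overall text material;
    performing topic identification and paragraph decomposition on the overall text material to obtain at least one subtopic and a topic paragraph corresponding to the subtopic;
    performing manuscript style analysis using the keyword and the subtopic to obtain a style analysis result corresponding to each subtopic;
    determining overall style information of the presentation according to the style analysis result corresponding to each subtopic;
    inputting the subtopic and the topic paragraph corresponding to the subtopic into a keyword extraction model for related-word extraction to obtain paragraph keywords related to the topic paragraph;
    inputting the plurality of paragraph keywords into a picture library for searching to obtain target pictures corresponding to the paragraph keywords;
    typesetting according to the target pictures, the overall style information of the presentation, the subtopic, and the topic paragraph corresponding to the subtopic to generate the presentation corresponding to the main keyword.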
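  10. The computer device according to claim 9, wherein the overall text material includes a plurality of natural paragraphs, and performing topic identification and paragraph decomposition on the overall text material to obtain at least one subtopic and a topic paragraph corresponding to the subtopic comprises the following steps:
    taking the keywords in each natural paragraph as features, performing topic identification on the overall text material with an unsupervised clustering model to obtain N subtopics;
    performing topic paragraph identification on the overall text material according to the N subtopics to identify the topic paragraph corresponding to each subtopic.
  11. The computer device according to claim 10, wherein performing topic paragraph identification on the overall text material according to the N subtopics to identify the topic paragraph corresponding to each subtopic comprises the following steps:
    extracting summaries from each natural paragraph in the topic paragraph with the TextRank algorithm to obtain a plurality of summaries;
    selecting, from the summaries extracted from the natural paragraph, those whose importance exceeds a preset value as the associated sentences of the natural paragraph;
    filtering the associated sentences with an MMR model to remove redundant sentences with high semantic overlap and obtain the target sentences corresponding to the natural paragraph;
    integrating the target sentences of all natural paragraphs under the subtopic to obtain the topic paragraph corresponding to the subtopic.
  12. The computer device according to claim 9, wherein after performing topic identification and paragraph decomposition on the overall text material to obtain at least one subtopic and a topic paragraph corresponding to the subtopic, the processor further implements the following steps when executing the computer-readable instructions:
    detecting whether the topic paragraph corresponding to the subtopic contains hierarchical headings;
    if the topic paragraph corresponding to the subtopic contains hierarchical headings, decomposing the topic paragraph of each subtopic into hierarchical paragraphs to obtain the hierarchical paragraph corresponding to each hierarchical heading.
  13. The computer device according to claim 9, wherein determining overall style information of the presentation according to the style analysis result corresponding to each subtopic comprises the following steps:
    using the style analysis result to determine the template color information for each subtopic and the text format information for the corresponding topic paragraph, wherein the text format information includes text font information and text font-size information;
    determining the overall style information of the presentation from the template color information, the text font information, and the text font-size information.
  14. The computer device according to claim 9, wherein inputting the plurality of paragraph keywords into a picture library for searching to obtain target pictures corresponding to the paragraph keywords comprises the following steps:
    inputting the plurality of paragraph keywords into the picture library for searching to obtain the target pictures corresponding to the paragraph keywords;
    according to the typesetting template of the presentation, processing the size of each target picture corresponding to the paragraph keywords and adjusting its position to obtain the adjusted target picture.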
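  15. One or more readable storage media storing computer-readable instructions, wherein the computer-readable instructions, when executed by one or more processors, cause the one or more processors to perform the following steps:
    receiving a main keyword of a presentation input by a user through a client;
    searching a text material library for text materials using the main keyword to obtain a plurality of text materials;
    splicing and integrating the plurality of text materials to obtain an overall text material;
    performing topic identification and paragraph decomposition on the overall text material to obtain at least one subtopic and a topic paragraph corresponding to the subtopic;
    performing manuscript style analysis using the keyword and the subtopic to obtain a style analysis result corresponding to each subtopic;
    determining overall style information of the presentation according to the style analysis result corresponding to each subtopic;
    inputting the subtopic and the topic paragraph corresponding to the subtopic into a keyword extraction model for related-word extraction to obtain paragraph keywords related to the topic paragraph;
    inputting the plurality of paragraph keywords into a picture library for searching to obtain target pictures corresponding to the paragraph keywords;
    typesetting according to the target pictures, the overall style information of the presentation, the subtopic, and the topic paragraph corresponding to the subtopic to generate the presentation corresponding to the main keyword.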
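  16. The readable storage media according to claim 15, wherein the overall text material includes a plurality of natural paragraphs, and performing topic identification and paragraph decomposition on the overall text material to obtain at least one subtopic and a topic paragraph corresponding to the subtopic comprises the following steps:
    taking the keywords in each natural paragraph as features, performing topic identification on the overall text material with an unsupervised clustering model to obtain N subtopics;
    performing topic paragraph identification on the overall text material according to the N subtopics to identify the topic paragraph corresponding to each subtopic.
  17. The readable storage media according to claim 16, wherein performing topic paragraph identification on the overall text material according to the N subtopics to identify the topic paragraph corresponding to each subtopic comprises the following steps:
    extracting summaries from each natural paragraph in the topic paragraph with the TextRank algorithm to obtain a plurality of summaries;
    selecting, from the summaries extracted from the natural paragraph, those whose importance exceeds a preset value as the associated sentences of the natural paragraph;
    filtering the associated sentences with an MMR model to remove redundant sentences with high semantic overlap and obtain the target sentences corresponding to the natural paragraph;
    integrating the target sentences of all natural paragraphs under the subtopic to obtain the topic paragraph corresponding to the subtopic.
  18. The readable storage media according to claim 15, wherein after performing topic identification and paragraph decomposition on the overall text material to obtain at least one subtopic and a topic paragraph corresponding to the subtopic, the computer-readable instructions, when executed by one or more processors, cause the one or more processors to further perform the following steps:
    detecting whether the topic paragraph corresponding to the subtopic contains hierarchical headings;
    if the topic paragraph corresponding to the subtopic contains hierarchical headings, decomposing the topic paragraph of each subtopic into hierarchical paragraphs to obtain the hierarchical paragraph corresponding to each hierarchical heading.
  19. The readable storage media according to claim 15, wherein determining overall style information of the presentation according to the style analysis result corresponding to each subtopic comprises the following steps:
    using the style analysis result to determine the template color information for each subtopic and the text format information for the corresponding topic paragraph, wherein the text format information includes text font information and text font-size information;
    determining the overall style information of the presentation from the template color information, the text font information, and the text font-size information.
  20. The readable storage media according to claim 15, wherein inputting the plurality of paragraph keywords into a picture library for searching to obtain target pictures corresponding to the paragraph keywords comprises the following steps:
    inputting the plurality of paragraph keywords into the picture library for searching to obtain the target pictures corresponding to the paragraph keywords;
    according to the typesetting template of the presentation, processing the size of each target picture corresponding to the paragraph keywords and adjusting its position to obtain the adjusted target picture.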
PCT/CN2020/118004 2020-07-28 2020-09-27 Presentation generation method, apparatus, computer device, and storage medium WO2021164255A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010737234.X 2020-07-28
CN202010737234.XA CN111881307B (zh) 2020-07-28 2020-07-28 Presentation generation method, apparatus, computer device, and storage medium

Publications (1)

Publication Number Publication Date
WO2021164255A1 true WO2021164255A1 (zh) 2021-08-26

Family

ID=73201395

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/118004 WO2021164255A1 (zh) 2020-07-28 2020-09-27 Presentation generation method, apparatus, computer device, and storage medium

Country Status (2)

Country Link
CN (1) CN111881307B (zh)
WO (1) WO2021164255A1 (zh)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
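CN113822025A (zh) * 2021-11-25 2021-12-21 深圳市明源云链互联网科技有限公司 Method, apparatus, device, and storage medium for automatically generating office documents
CN113849462A (zh) * 2021-09-16 2021-12-28 广东创意热店互联网科技有限公司 Intelligent recommendation method, system, computer device, and medium for network materials
CN114254610A (zh) * 2021-12-17 2022-03-29 广州金山移动科技有限公司 Slide processing method, apparatus, and electronic device
CN114501076A (zh) * 2022-02-07 2022-05-13 浙江核新同花顺网络信息股份有限公司 Video generation method, device, and medium
CN114707009A (zh) * 2022-04-13 2022-07-05 中国银行股份有限公司 Method and apparatus for producing written-report presentations
CN114912425A (zh) * 2022-05-17 2022-08-16 中国银行股份有限公司 Presentation generation method and apparatus
CN115618852A (zh) * 2022-11-22 2023-01-17 山东天成书业有限公司 Automatic proofreading system for digitized text
CN116842138A (zh) * 2023-07-24 2023-10-03 上海诚狐信息科技有限公司 Document-based retrieval method, apparatus, device, and storage medium
CN117113961A (zh) * 2023-10-20 2023-11-24 中电数创(北京)科技有限公司 Agent-based official document writing method and system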

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
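CN112328149B (zh) * 2020-11-11 2022-03-25 维沃移动通信有限公司 Picture format setting method, apparatus, and electronic device
CN113220200B (zh) * 2021-05-25 2023-11-28 深圳市爱思软件技术有限公司 Method and apparatus for displaying a presentation file, storage medium, and electronic apparatus
CN113221509B (zh) * 2021-06-11 2022-06-17 中国平安人寿保险股份有限公司 Method, apparatus, device, and storage medium for automatic slide generation
CN113268971B (zh) * 2021-06-23 2022-12-13 中国平安人寿保险股份有限公司 Method, apparatus, computer device, and storage medium for intelligent presentation report generation
CN113553450B (zh) * 2021-08-03 2024-01-30 广东新学未科技有限公司 Method, apparatus, computing device, and storage medium for automatic PPT presentation generation
CN113743087B (zh) * 2021-09-07 2024-04-26 珍岛信息技术(上海)股份有限公司 Text generation method and system based on neural-network vocabulary expansion of paragraphs
CN114398883B (zh) * 2022-01-19 2023-07-07 平安科技(深圳)有限公司 Presentation generation method, apparatus, computer-readable storage medium, and server
CN115145452A (zh) * 2022-07-01 2022-10-04 杭州网易云音乐科技有限公司 Post generation method, medium, terminal device, and computing device
CN115203614B (zh) * 2022-07-28 2023-04-28 湖南创研科技股份有限公司 Analysis and processing method for automatic page generation based on web development
CN116561324B (zh) * 2023-07-04 2023-09-01 江苏曙光云计算有限公司 Artificial-intelligence-based system and method for intelligent analysis and regulation of network information
CN116579308B (zh) * 2023-07-06 2023-10-10 之江实验室 Presentation generation method and apparatus
CN116579317B (zh) * 2023-07-13 2023-10-13 中信联合云科技有限责任公司 Method and system for automatically generating publications based on AI content
CN117036203B (zh) * 2023-10-08 2024-01-26 杭州黑岩网络科技有限公司 Intelligent drawing method and system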

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9081765B2 (en) * 2008-08-12 2015-07-14 Abbyy Infopoisk Llc Displaying examples from texts in dictionaries
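CN106021226A (zh) * 2016-05-16 2016-10-12 中国建设银行股份有限公司 Text summary generation method and apparatus
CN107077460A (zh) * 2014-09-30 2017-08-18 微软技术许可有限责任公司 Structured sample authoring content
CN111291210A (zh) * 2020-01-14 2020-06-16 广州视源电子科技股份有限公司 Image material library generation method, image material recommendation method, and related apparatus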

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3531579B2 (ja) * 1999-04-30 2004-05-31 日本電気株式会社 マルチメディア文書生成装置及び方法、及びこれらをコンピュータに実行させるプログラムを記録した記録媒体
CN102256030A (zh) * 2010-05-20 2011-11-23 Tcl集团股份有限公司 可匹配背景音乐的相册演示系统及其背景音乐匹配方法
WO2013163636A1 (en) * 2012-04-28 2013-10-31 Hewlett-Packard Development Company, L.P. Generating a page, assigning sections to a document and generating a slide
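CN105763925A (zh) * 2014-12-17 2016-07-13 珠海金山办公软件有限公司 Presentation video recording method and apparatus
CN110390086A (zh) * 2018-04-19 2019-10-29 北京搜狗科技发展有限公司 Method, apparatus, and storage medium for generating text
CN111259180B (zh) * 2020-01-14 2024-04-19 广州视源电子科技股份有限公司 Image pushing method and apparatus, electronic device, and storage medium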

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
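CN113849462A (zh) * 2021-09-16 2021-12-28 广东创意热店互联网科技有限公司 Intelligent recommendation method, system, computer device, and medium for online materials
CN113822025A (zh) * 2021-11-25 2021-12-21 深圳市明源云链互联网科技有限公司 Method, apparatus, device, and storage medium for automatic generation of office documents
CN114254610A (zh) * 2021-12-17 2022-03-29 广州金山移动科技有限公司 Slide processing method and apparatus, and electronic device
CN114501076A (zh) * 2022-02-07 2022-05-13 浙江核新同花顺网络信息股份有限公司 Video generation method, device, and medium
CN114707009A (zh) * 2022-04-13 2022-07-05 中国银行股份有限公司 Method and apparatus for producing a written-report presentation
CN114912425A (zh) * 2022-05-17 2022-08-16 中国银行股份有限公司 Presentation generation method and apparatus
CN115618852A (zh) * 2022-11-22 2023-01-17 山东天成书业有限公司 Automatic proofreading system for digitized text
CN115618852B (zh) * 2022-11-22 2023-04-07 山东天成书业有限公司 Automatic proofreading system for digitized text
CN116842138A (zh) * 2023-07-24 2023-10-03 上海诚狐信息科技有限公司 Document-based retrieval method, apparatus, device, and storage medium
CN117113961A (zh) * 2023-10-20 2023-11-24 中电数创(北京)科技有限公司 Agent-based official document writing method and system
CN117113961B (zh) * 2023-10-20 2024-02-09 中电数创(北京)科技有限公司 Agent-based official document writing method and system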

Also Published As

Publication number Publication date
CN111881307B (zh) 2024-04-05
CN111881307A (zh) 2020-11-03

Similar Documents

Publication Publication Date Title
WO2021164255A1 (zh) Presentation generation method and apparatus, computer device, and storage medium
US11222167B2 (en) Generating structured text summaries of digital documents using interactive collaboration
US11468550B2 (en) Utilizing object attribute detection models to automatically select instances of detected objects in images
US11657231B2 (en) Capturing rich response relationships with small-data neural networks
US10366093B2 (en) Query result bottom retrieval method and apparatus
US20120158686A1 (en) Image Tag Refinement
US20160085742A1 (en) Automated collective term and phrase index
US20170364495A1 (en) Propagation of changes in master content to variant content
WO2016022822A2 (en) Knowledge automation system
US11182540B2 (en) Passively suggesting text in an electronic document
WO2022262266A1 (zh) Text summary generation method and apparatus, computer device, and storage medium
US20210151038A1 (en) Methods and systems for automatic generation and convergence of keywords and/or keyphrases from a media
US11645095B2 (en) Generating and utilizing a digital knowledge graph to provide contextual recommendations in digital content editing applications
US20200192921A1 (en) Suggesting text in an electronic document
US11868714B2 (en) Facilitating generation of fillable document templates
JP2012221316A (ja) Document topic extraction apparatus, method, and program
US20230237251A1 (en) Deriving global intent from a composite document to facilitate editing of the composite document
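WO2019200699A1 (zh) Document issuing method and apparatus for a government affairs system, computer device, and storage medium
CN114840685A (zh) Method for constructing an emergency plan knowledge graph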
GB2585972A (en) Utilizing object attribute detection models to automatically select instances of detected objects in images
US20240126981A1 (en) Systems and methods for machine-learning-based presentation generation and interpretable organization of presentation library
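CN112529743A (zh) Contract element extraction method and apparatus, electronic device, and medium
CN111681731A (zh) Method for automatic color annotation of examination reports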
Makrynioti et al. PaloPro: a platform for knowledge extraction from big social data and the news
CN114818639A (zh) Presentation generation method, apparatus, device, and storage medium

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20920403; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 20920403; Country of ref document: EP; Kind code of ref document: A1)